Current Students

 

Supervisor: John Calarco, Department of Cell and Systems Biology

A parallelized reporter screen to uncover tissue-specific splicing activator and repressor sequences in a multicellular organism

Alternative splicing is an indispensable layer of gene regulation that contributes to distinct cell and tissue identities within an organism. Many studies have focused on examining the context-dependent regulation of alternative splicing, identifying key cis-regulatory elements and trans-acting RNA binding proteins (RBPs). However, large-scale screening approaches to identify features regulating alternative splicing are typically performed in cell culture. As such, these studies can fail to capture the effects of extracellular cues or varying tissue-specific signatures that dictate the context-dependent behavior of cis-elements in a multicellular organism. Hence, to understand the complex regulation of splicing patterns, it would be ideal to more systematically study the impact of cis-elements on splicing regulation in different tissue contexts in vivo. We have developed a high-throughput parallelized reporter assay (PRA) in Caenorhabditis elegans, capable of simultaneously profiling the activity of several thousand synthetic splicing minigene reporters. These reporter libraries were used to monitor alternative splicing patterns in muscle and neuronal cells in vivo. Using this approach, we introduced ten-nucleotide-long random intronic sequences into our reporter libraries, and from thousands of such sequences, we identified hundreds of putative cis-elements that enhance or repress alternative splicing patterns in neuronal and muscle tissue. We then subjected this catalog of PRA cis-elements to sequence feature analysis. Interestingly, using regression and de novo motif enrichment analysis, we identified multiple short motifs associated with splicing activation. We also scanned these splicing-enhancing and -repressing PRA elements for known motifs bound by RBPs and found that specific RBPs could be mapped to different regulatory outcomes.
Somewhat surprisingly, the majority of 10mer activator sequences identified in neurons also activated splicing in muscle tissue, albeit to a weaker degree, suggesting that highly tissue-specific regulatory sequences may be less prevalent in nature when searching the sequence space of available motifs. To our knowledge, our study represents the first use of a parallelized reporter assay employed in a multicellular animal to identify splicing regulatory elements. This work will be used to derive mechanistic insights and potentially shed light on the interplay of sequence features and their cognate RNA binding proteins in establishing tissue-specific splicing outcomes in vivo.
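
As a rough illustration of what a motif enrichment step over a 10mer library can look like, the sketch below scores short k-mers by how much more often they occur in a set of activator sequences than in the full library. It is not the analysis used in the study: the function names, the choice of k = 4, the pseudocount, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
import math


def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def kmer_enrichment(hits, background, k=4, pseudo=0.5):
    """Log2 ratio of k-mer presence in `hits` versus `background`.

    `hits` stands in for activator 10mers and `background` for the whole
    library (both hypothetical inputs). The pseudocount keeps the ratio
    finite for k-mers that are absent from one of the two sets.
    """
    hit_counts = kmer_presence(hits, k)
    bg_counts = kmer_presence(background, k)
    scores = {}
    for kmer in hit_counts.keys() | bg_counts.keys():
        f_hit = (hit_counts[kmer] + pseudo) / (len(hits) + 2 * pseudo)
        f_bg = (bg_counts[kmer] + pseudo) / (len(background) + 2 * pseudo)
        scores[kmer] = math.log2(f_hit / f_bg)
    return scores


# Toy data, invented for illustration only.
activators = ["TGCATGAAAA", "CCTGCATGCC"]
library = activators + ["AAAAAAAAAA", "CCCCCCCCCC", "GGGGGGGGGG", "TTTTTTTTTT"]
scores = kmer_enrichment(activators, library, k=4)
top = sorted(scores, key=scores.get, reverse=True)[:3]
```

A positive score marks a k-mer that is over-represented among the activators. A real analysis would also need a significance test and multiple-testing correction, which this sketch leaves out.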

 

Supervisor: Benjamin Blencowe, Computational Biology in Molecular Genetics (CBMG)

The unifying theme of my graduate research is the development and application of computational and statistical approaches to uncover regulatory features that globally impact mRNA level and translation efficiency, leveraging large omics datasets. A key focus is modeling codon usage bias, particularly in identifying and characterizing rare codon patches and understanding their roles in RNA regulatory networks linked to critical biological processes. Using sliding window approaches, string-searching algorithms, and Monte Carlo simulations, I have identified subsets of genes with intriguing patterns of codon bias that contribute to the coordinated control of regulatory networks, with implications for understanding gene regulation in fundamental biological processes. Altogether, my project is expected to reveal critical insights about the nature and function of biological information stored in coding sequences through codon bias.
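
To make the sliding-window and Monte Carlo idea concrete, here is a minimal sketch that counts rare codons in each window of a coding sequence and estimates how often an equally dense patch arises when the codon order is shuffled. It is an illustration under stated assumptions, not the project's actual pipeline: the function names, the window size, the rare-codon set, and the shuffle-based null model are all hypothetical choices.

```python
import random


def split_codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rare_codon_windows(cds, rare_codons, window=5):
    """Number of rare codons in each sliding window of `window` codons."""
    flags = [c in rare_codons for c in split_codons(cds)]
    return [sum(flags[i:i + window]) for i in range(len(flags) - window + 1)]


def patch_p_value(cds, rare_codons, window=5, n_shuffles=1000, seed=0):
    """Monte Carlo p-value for the densest rare-codon window.

    Null model (an assumption of this sketch): codon composition is kept
    fixed and codon order is shuffled uniformly at random.
    """
    observed_windows = rare_codon_windows(cds, rare_codons, window)
    if not observed_windows:
        raise ValueError("sequence is shorter than one window")
    observed = max(observed_windows)
    rng = random.Random(seed)
    codons = split_codons(cds)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(codons)
        if max(rare_codon_windows("".join(codons), rare_codons, window)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Toy sequence: a run of four AGG codons flanked by GCT codons.
toy_cds = "GCT" * 10 + "AGG" * 4 + "GCT" * 10
densest, p = patch_p_value(toy_cds, {"AGG"}, window=5, n_shuffles=200, seed=1)
```

Shuffling codons within a gene preserves its overall codon usage, so a small p-value points to clustering of rare codons rather than to their overall frequency.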

Supervisor: Maxwell Shafer, Department of Cell and Systems Biology

My research investigates the evolution and genomics of temporal activity patterns in aquatic lineages. Temporal activity patterns reflect the repeated daily timing of sleep and wakefulness and include diurnal, nocturnal and crepuscular (twilight active) states. My PhD project uses the model clade of Lake Tanganyika cichlid fish to interrogate the molecular and evolutionary mechanisms driving differences in temporal activity patterns.  

Specifically, one of my aims is to perform quantitative trait loci mapping of crepuscular activity preference using a hybrid cross of a crepuscular and a non-crepuscular cichlid species. My second aim is to perform a transcriptomic analysis to look at differential expression of clock genes during different time periods (dawn, day, dusk, night) in a crepuscular species and compare this to a non-crepuscular species, potentially using methods like TRAP-Seq.

Supervisor: Farzad Khalvati, Institute of Medical Science

My Ph.D. thesis is focused on identifying molecular biomarkers of pediatric low-grade glioma (pLGG) using MRI and Artificial Intelligence (AI), as an alternative to the invasive procedure of biopsy. Additionally, I emphasize reproducible research, translational medicine, and Human-AI interaction. To that end, I have introduced OpenRadiomics, which provides the research community with the largest and most comprehensive open-source AI-ready radiomics data. OpenRadiomics also proposes a reproducible research protocol, stressing generalizable training and evaluation pipelines instead of individual trained models. The vision of my research is to make the pipelines end-to-end and extend them beyond pLGG.

Supervisor: Sushant Kumar, Department of Medical Biophysics

The overarching theme of my thesis is the features that underlie DNA repair.

In my first aim, I used structural variants (SVs; ≥50 bp) from long- and short-read resources (i.e., the Human Genome Structural Variation Consortium (HGSVC), the 1000 Genomes Project (1KGP), and the 100,000 Genomes Project (100KGP)) to identify features that differentiate homology-independent and homology-dependent SVs. Through a series of analyses, which included generating simulations and applying an active-learning unsupervised approach, we found that chromatin accessibility and DNA shape features are relevant. Further, we applied our method to probe differences among de novo SVs in individuals with Ultra Rare Disease. This aim is complete.

My second aim focuses on the relationship between DNA shape features and DNA repair. I have started by exploring single nucleotide variants (SNVs) from a CRISPR-Cas9 dataset in which biallelic knock-outs of 42 replication/repair genes were created (Zou et al., Nature Cancer, 2021), characterizing the SNVs acquired in these cell lines by their DNA shape profile. I will train and validate a model that can differentiate between background SNVs and variants attributed to specific repair pathways. In the future, I will apply this model to whole-genome sequenced (WGS) Clinical Proteomic Tumour Analysis Consortium (CPTAC) tumour samples to investigate the downstream consequences of DNA repair via changes to DNA shape. For example, I will search for events where SNVs impact transcription factor binding sites and evaluate cases where expression changes.

The objective of my third aim is to create an interpretable model that describes the landscape of germline and somatic mutations in repair genes across tumours. My goal is to find evidence supporting or refuting cases of mutual exclusion among DNA repair deficiencies (i.e., Homologous Recombination versus Mismatch Repair), and to compare the evolutionary trajectories of tumours with distinct repair backgrounds.

Supervisor: Stephen Wright, Ecology and Evolutionary Biology

Supervisor: Leslie Buck, Department of Cell and Systems Biology

Supervisor: Belinda Chang, Department of Cell and Systems Biology

The infraorder Cetacea (whales and dolphins) has emerged from perhaps the most dramatic evolutionary transformation that has occurred within mammals. While modern cetaceans are obligately aquatic, foraging by breath-hold diving and intermittently resurfacing to breathe, they are in fact descended from terrestrial artiodactyls; their closest extant relatives are the land-dwelling hippopotamids. Following their emergence about 54 million years ago, cetaceans underwent major morphological and sensory changes as they transitioned from a terrestrial to an aquatic habitat. The visual system in particular underwent a number of changes, likely driven in large part by the need to forage via diving. As light availability underwater decreases with increasing depth, the cetacean eye has become highly adapted to dim-light conditions, with features such as a reflective tapetum lucidum and a rod-dominated retina. As well, since the spectrum of visible light deep underwater is blue-shifted relative to the surface, the absorption spectra of some cetacean visual pigments have shifted accordingly, while others have become non-functional. With the sequencing of many cetacean genomes over the past decade, the specific molecular mechanisms underlying the adaptation of cetacean visual systems to their aquatic environments can now be investigated. First, computational analyses can be conducted to detect signatures of positive selection acting on specific codon sites within protein-coding genes. Then, experimental analyses can be conducted to determine how the substitutions observed at positively selected sites affect the function of the encoded protein. Several studies have begun to apply these methods to cetacean visual proteins, and have identified key substitutions that may have contributed greatly to visual adaptation. Much of this research has been focused on rhodopsin, the light-sensitive pigment that initiates the phototransduction cascade in the rod cells of the retina. 
While these studies have provided great insight into the molecular evolution of vision in cetaceans, much more remains unknown. Hundreds of mammalian proteins are known to be involved in phototransduction and the visual cycle, and hundreds more are known to be required for eye development. Amino acid changes in any number of these proteins could have contributed to the adaptation of cetacean visual systems to their aquatic environments, but large-scale computational analyses of selection have not yet been performed on the genes encoding these proteins in cetaceans. Such analyses would provide insight into potential positively selected sites within these proteins, and provide a more complete picture of the molecular basis of this adaptation.

This project will use various computational methods to detect positive selection on a large number of cetacean visual genes, including those involved in rod and cone phototransduction, the visual cycle, photoreceptor cell development, eye structure, and other vision-related functions. Sets of visual genes will first be established using current literature, and mammalian eye-derived gene expression datasets. Genome assemblies for dozens of cetacean species will be obtained from genomic databases, including NCBI and the Zoonomia Project. The protein-coding sequences of the genes of interest will then be extracted from these genomes using existing gene annotation, together with an automated computational pipeline that has recently been developed in the Chang laboratory. The resulting coding sequences will be aligned using a multiple sequence alignment algorithm such as MUSCLE or PRANK. Using these alignments together with phylogenies of the relevant species, patterns of selection will be inferred by estimating dN/dS (denoted ω), the ratio of the nonsynonymous and synonymous substitution rates; ω<1 is indicative of purifying selection, ω=1 indicates neutral evolution, and ω>1 is a signature of positive selection. Codon-based models of sequence evolution, such as those available in the PAML and HyPhy programs, will be fitted to the sequence data and compared against their respective null models to statistically test for positive selection. Models will be implemented that allow ω to vary between codon sites; this will enable the detection of specific amino acid residues that have been the targets of positive selection. Models that additionally allow ω to vary between specified groups of branches of the phylogeny will be implemented to test for differences in selection pressure between cetacean lineages. 
When implementing these models, the phylogeny will be partitioned based on factors such as habitat, diet, feeding mode, and foraging depth, in order to assess the effects of these ecological factors on visual protein evolution in cetaceans. This project will identify visual proteins that have been the targets of positive selection in cetaceans, and will determine the functional effects of amino acid substitutions at positively selected sites within them. The results will provide great insight into the molecular mechanisms behind the adaptation of cetacean visual systems to the aquatic environment; they will also reveal much about the functions of visual proteins, and the functional importance of certain sites within these proteins, in cetaceans and beyond.
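
The comparison of a codon model against its null model described above is conventionally done with a likelihood ratio test. The sketch below shows the arithmetic, assuming the log-likelihoods have already been estimated by a program such as PAML or HyPhy. The function names and the log-likelihood values are hypothetical, and only 1 or 2 degrees of freedom are handled so that the example needs nothing beyond the standard library.

```python
import math


def classify_omega(omega, tol=1e-9):
    """Interpret dN/dS: below 1 purifying, equal to 1 neutral, above 1 positive."""
    if omega > 1 + tol:
        return "positive selection"
    if omega < 1 - tol:
        return "purifying selection"
    return "neutral evolution"


def lrt_p_value(lnl_null, lnl_alt, df):
    """Likelihood ratio test between nested models.

    The statistic 2 * (lnL_alt - lnL_null) is compared with a chi-square
    distribution with `df` degrees of freedom. Closed forms of the
    chi-square survival function are used for df = 1 and df = 2.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only supports df = 1 or df = 2")
    return stat, p


# Hypothetical log-likelihoods for a null and an alternative site model.
stat, p = lrt_p_value(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
label = classify_omega(1.8)
```

In PAML, for example, the site models M1a and M2a are commonly compared with 2 degrees of freedom. When many genes are tested, the resulting p-values also need correction for multiple testing.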

Supervisor: Yan Wang, Ecology and Evolutionary Biology

 

Supervisor: John Calarco, Department of Cell and Systems Biology

Alternative splicing plays a central role in pre-mRNA maturation, influencing gene expression and generating proteomic diversity. Distinct patterns of alternative splicing, largely governed by RNA-binding proteins (RBPs), give rise to unique transcriptomic identities. These RBPs tend to bind functionally related pre-mRNAs, forming dynamic post-transcriptional operons. Notably, neuronal tissue is characterized by an abundance of alternative isoforms and a unique profile of splicing factor expression, which serves as a model to study the regulation of alternative splicing networks by RBPs. One aspect of alternative splicing regulation that remains poorly understood is its connection to intracellular signalling, particularly in neuronal tissue. To address this gap, I aim to conduct a reverse genetics screen within C. elegans neuronal tissue to identify alternative splicing events responsive to intracellular signals. In parallel, I am analyzing available RNA-seq datasets to uncover neuronal alternative splicing influenced by the daf-2 insulin-like signalling pathway. To identify the splicing factors affected by these signalling pathways, I am adapting an RNA-tethered proximity-labelling method coupled with mass spectrometry. This approach will characterize the proteins associated with transcripts undergoing signal-regulated alternative splicing. Neuronal transcriptome profiling in conjunction with transcriptome-wide identification of the targets of signal-regulated splicing factors will then be used to reveal how regulation of the RBPs affects their post-transcriptional operons. This project will offer insights into the mechanisms governing the regulation of alternative splicing networks by intracellular signalling pathways and shed light on the complex interplay between distinct regulatory mechanisms that determine neuronal function and identity.