Supervisor: John Calarco, Department of Cell and Systems Biology
Alternative splicing is a tightly regulated process that forms a crucial layer of gene expression and exerts its effects in a tissue-specific manner. My project is centred on identifying the elements involved in the regulation of alternative splicing. Cis-elements are sequence determinants of alternative splicing; they are recognized by trans-factors, and this recognition produces diverse splicing patterns. We have employed a random library approach to identify these elements and study their effects on splicing: minigene reporters carrying diverse random decamers as potential cis-elements are introduced into C. elegans, and parallel in vivo measurements are made by RNA-seq. This leads to the identification of activators, repressors and cryptic splice site inducers. It is followed by wet-lab validation using reverse transcription PCR, and by bioinformatic analysis to identify interacting trans-factors and locate these elements genome-wide. My main emphasis will be on developing sophisticated computational models to understand the regulation of alternative splicing.
Supervisor: Jared Simpson, Department of Computer Science
My project aims to develop a tool that detects structural variants (SVs) from long-read sequencing data, exploiting the benefits of genome assembly graph-based SV calling. I aim to use structures in genome assembly graphs, such as bubbles, branches and loops, to detect SVs such as indels, translocations, inversions and duplications. I will develop methods to detect each type of variation directly from an assembly graph without the need for a reference, building on the work done for short-read assembly-based SV callers. These methods will be integrated into a single tool for long-read graph-based SV detection. They will subsequently be tested and validated by comparing their SV calls to those of current reference-based approaches, using mixed lineage leukemia samples.
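As an illustration of what detecting a bubble in an assembly graph can look like, here is a minimal sketch. It is not the tool described above: the graph representation (a dict mapping each node to its successors), the function name `find_simple_bubbles`, and the restriction to single-node branches are all assumptions made for the example.

```python
def find_simple_bubbles(graph):
    """Find simple bubbles in a directed graph.

    graph: dict mapping node -> list of successor nodes (hypothetical format).
    A simple bubble here is a source node with two or more branches, where each
    branch is a single node with one incoming and one outgoing edge, and all
    branches rejoin at the same sink node.
    Returns a list of (source, sorted_branches, sink) tuples.
    """
    # Count incoming edges for every node.
    indegree = {}
    for successors in graph.values():
        for node in successors:
            indegree[node] = indegree.get(node, 0) + 1

    bubbles = []
    for source, branches in graph.items():
        if len(branches) < 2:
            continue  # no divergence at this node
        sinks = set()
        is_bubble = True
        for branch in branches:
            next_nodes = graph.get(branch, [])
            # Each branch must be an unbranched single-node path.
            if indegree.get(branch, 0) != 1 or len(next_nodes) != 1:
                is_bubble = False
                break
            sinks.add(next_nodes[0])
        # All branches must reconverge on one sink.
        if is_bubble and len(sinks) == 1:
            bubbles.append((source, sorted(branches), sinks.pop()))
    return bubbles


# Toy graph: s diverges into a and b, which rejoin at t.
toy_graph = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": ["u"], "u": []}
print(find_simple_bubbles(toy_graph))
```

A real assembly graph would need longer branches, branches consisting of a direct source-to-sink edge, and both strand orientations, none of which this sketch handles.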
Supervisor: Melissa Holmes, Department of Cell and Systems Biology
Puberty is an essential developmental process in mammals. Previous studies have identified genes and regulators critical to puberty onset, suggesting this process is regulated epigenetically [1]. However, no gene regulatory network for pubertal onset has been produced. The naked mole-rat (Heterocephalus glaber, NMR) is a unique mammal that exhibits socially mediated reproductive suppression and whose potential for studying puberty is unmet. NMRs reside in large colonies of adult subordinates who remain in a prepubertal state due to the presence of a dominant breeding female [2]. Most NMRs will never go through puberty unless they are removed from the suppressive cues of their colony. Only then do they exhibit the morphological, endocrine, and behavioural hallmarks of mammalian puberty [3,4], providing an exceptional opportunity for experimental control of pubertal timing. The proposed studies will use NMRs to elucidate the genes (and their pathways) involved in reproductive suppression and subsequent activation. By identifying a gene regulatory network associated with pubertal delay, we aim to understand the biological mechanisms controlling pubertal timing in mammals.
Supervisor: Farzad Khalvati, Institute of Medical Science
My Ph.D. thesis is focused on identifying molecular biomarkers of pediatric low-grade glioma (pLGG) using MRI and Artificial Intelligence (AI), as an alternative to the invasive procedure of biopsy. Additionally, I emphasize reproducible research, translational medicine, and Human-AI interaction. To that end, I have introduced OpenRadiomics, which provides the research community with the largest and most comprehensive open-source AI-ready radiomics data. OpenRadiomics also proposes a reproducible research protocol, stressing generalizable training and evaluation pipelines instead of individual trained models. The vision of my research is to make the pipelines end-to-end and extend them beyond pLGG.
Supervisor: Stephen Wright, Ecology and Evolutionary Biology
Supervisor: Leslie Buck, Department of Cell and Systems Biology
Supervisor: Michael Hoffman, Department of Computer Science
Many transcription factors initiate transcription only in particular sequence contexts, providing the means for sequence specificity of transcriptional control. The position weight matrix (PWM) model allows for the computational identification of transcription factor binding sites (TFBSs), by characterizing a transcription factor’s position-specific preference over the DNA alphabet. This four-letter alphabet, however, only partially describes the possible diversity of nucleobases a transcription factor might encounter. For instance, cytosine is often present in a covalently modified form: 5-methylcytosine (5mC). 5mC can be successively oxidized to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC). Just as transcription factors distinguish one unmodified nucleobase from another, some have been shown to distinguish unmodified bases from these covalently modified bases. Modification-sensitive transcription factors provide a mechanism by which widespread changes in methylation and hydroxymethylation can dramatically shift active gene expression programs.
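To make the PWM model concrete, the following minimal sketch scores every window of a short sequence against a log-odds matrix and reports the best-scoring position. The frequencies in `PFM`, the uniform background, and the function names are hypothetical values chosen for illustration; they do not describe any real transcription factor.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif (illustrative only).
# Each entry is one motif position, giving the probability of each base there.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def to_pwm(pfm, background):
    """Convert base frequencies into log2-odds weights against the background."""
    return [
        {base: math.log2(p / background[base]) for base, p in column.items()}
        for column in pfm
    ]


def score_window(pwm, window):
    """Sum the position-specific weights of the bases in one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_site(pwm, sequence):
    """Return (start, score) of the highest-scoring window.

    Assumes the sequence is at least as long as the motif.
    """
    width = len(pwm)
    starts = range(len(sequence) - width + 1)
    best = max(starts, key=lambda i: score_window(pwm, sequence[i:i + width]))
    return best, score_window(pwm, sequence[best:best + width])


pwm = to_pwm(PFM, BACKGROUND)
position, score = best_site(pwm, "TTACGTAA")
print(position, round(score, 2))
```

In practice a score threshold and both DNA strands would also be considered; this sketch scans the forward strand only.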
To understand the effect of modified nucleobases on gene regulation, I developed methods to discover motifs and identify TFBSs in DNA with covalent modifications. My models expand the standard A/C/G/T alphabet, adding m (5mC), h (5hmC), f (5fC), and c (5caC). I created an expanded-alphabet sequence using whole-genome maps of 5mC and 5hmC in naive mouse T cells. Building upon this modified sequence, I discover TFBS motifs, both de novo and using a hypothesis testing approach, in regions implicated by existing chromatin immunoprecipitation-sequencing (ChIP-seq) data. I elucidated various known methylation binding preferences, including the preference of ZFP57 and C/EBPβ for methylated motifs. I demonstrated that my method is robust to parameter perturbations, with transcription factors’ sensitivities for 5mC and 5hmC broadly conserved across a range of modified base calling thresholds. I am now beginning to discover novel transcription factor binding preferences, and am in the process of mining all Mouse ENCODE ChIP-seq data for these modified binding preferences. We plan to follow up with collaborators, who will perform in vivo validation via ChIP-seq experiments for the transcription factors that I predict to have altered 5mC/5hmC binding affinities.
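The idea of an expanded-alphabet sequence can be sketched as follows. This is only an illustration under stated assumptions, not the method itself: the input format (a dict from 0-based position to modification name), the `MOD_SYMBOL` table, and the function `expand_sequence` are hypothetical, and only the forward strand is handled.

```python
# Symbols follow the alphabet described above; the lookup table itself is an
# assumption made for this example.
MOD_SYMBOL = {"5mC": "m", "5hmC": "h", "5fC": "f", "5caC": "c"}


def expand_sequence(sequence, modifications):
    """Replace modified cytosines with their expanded-alphabet symbols.

    sequence: DNA string over A/C/G/T (forward strand only).
    modifications: dict mapping 0-based position -> modification name,
        e.g. {1: "5mC"} (hypothetical input format).
    Unmodified bases are kept in upper case so that 'c' (5caC) stays
    distinct from 'C' (unmodified cytosine).
    """
    bases = list(sequence.upper())
    for position, modification in modifications.items():
        if bases[position] != "C":
            raise ValueError(f"position {position} is not a cytosine")
        bases[position] = MOD_SYMBOL[modification]
    return "".join(bases)


print(expand_sequence("ACGTCG", {1: "5mC", 4: "5hmC"}))  # AmGThG
```

A motif model over this alphabet would then carry weights for the additional symbols at each position, alongside A, C, G and T.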
A pre-print of our preliminary work is available on bioRxiv: http://dx.doi.org/10.1101/043794.
Supervisor: Yan Wang, Ecology and Evolutionary Biology