My long-term research goal is to develop statistical and computational methods to discover the underlying principles of gene expression regulation in eukaryotes, and to explore how variations or defects in gene regulation cause phenotypic variation or disease. How a cell controls its gene expression is one of the most fundamental and interesting questions in biology, spanning processes from intrinsic developmental programs to responses to extrinsic stimuli. Meanwhile, a large majority of the genetic variants reported in genome-wide association studies of common human diseases lie in introns or intergenic regions, suggesting that they act through gene expression regulation rather than through changes to protein-coding sequence.

Given this central role, it is not surprising that gene expression is tightly regulated and coordinated at multiple levels. At the transcriptional level, the interactions between transcription factors and DNA binding motifs play pivotal roles in transcription initiation, as do epigenetic mechanisms, including histone modifications and DNA modifications. At the post-transcriptional level, mRNA processing, nucleus-to-cytosol mRNA transport, mRNA degradation, and translational control all add to the complexity of gene expression regulation. For example, alternative splicing generates multiple transcript isoforms from the same gene locus through different combinations of splice sites. RNA editing can change the genetic information encoded in mRNA by altering its nucleotide sequence. Nucleus-to-cytosol transport determines the fraction of mRNA that is available for translation into protein. Studying each of these steps requires accurate quantification and comparison of transcriptomes.

My current research focuses on transcriptome analysis (e.g., RNA-seq data analysis) and post-transcriptional regulation analysis (e.g., alternative splicing analysis). In addition, I am interested in genome-wide association studies to identify genetic variants underlying diseases or other quantitative traits.