McGill's Seminar Series in Quantitative Life Sciences and Medicine
Sponsored by CAMBAM, QLS, MiCM and the Ludmer Centre
Title: Combining genomics data to predict function of the non-coding genome
Speaker: Sara Mostafavi (University of British Columbia)
When: Tuesday, September 17, 12-1pm
Where: McIntyre Medical Building, room 1034
Abstract: The recent availability of diverse genome-wide assays, including ATAC-Seq, ChIP-Seq, and RNA-Seq, now enables researchers to quantify, at high resolution, the cellular and context-specific activity of every segment of DNA. Combining genetic data with these other genomics assays provides an opportunity to a) decode DNA, for example by inferring the sequence code underlying functional differences between cell types within an individual, and b) predict the impact of variation in a given base of DNA on cellular function. However, interpreting these data to extract biological insights requires disentangling meaningful (and hence reproducible and consequential) associations from mere correlations (i.e., spurious associations). In this talk, I will present statistical and machine learning approaches for integrating heterogeneous data in order to find robust associations. First, focusing on the task of finding associations between genetic variation and cellular (expression) traits in a population-based study, I will review methods for inferring and accounting for hidden confounding factors and then describe new approaches based on latent variable modeling to infer context-specific associations. Second, I’ll describe our efforts to use deep learning approaches for combining genetic and ATAC-Seq data across a large set of immune cells to learn non-coding motifs across the genome.