Fighting Noise with Noise: Causal Inference with Many Candidate Instruments
Dehan Kong, PhD
Associate Professor in Statistics, University of Toronto
Note: Meet and greet with Prof. Dehan Kong from 3:00 to 3:30 p.m. in Room 1140, prior to the seminar (3:30 to 4:30 p.m.)
WHEN: Wednesday, September 10, 2025, from 3:30 to 4:30 p.m.
WHERE: Hybrid | 2001 McGill College Avenue, Rm 1140; Zoom
NOTE: Dehan Kong will be presenting in person
Abstract
Instrumental variable methods provide useful tools for inferring causal effects in the presence of unmeasured confounding. To apply these methods with large-scale data sets, a major challenge is to find valid instruments from a possibly large candidate set. In practice, most of the candidate instruments are often not relevant for studying a particular exposure of interest. Moreover, not all relevant candidate instruments are valid as they may directly influence the outcome of interest. In this article, we propose a data-driven method for causal inference with many candidate instruments that addresses these two challenges simultaneously. A key component of our proposal involves using pseudo variables, known to be irrelevant, to remove variables from the original set that exhibit spurious correlations with the exposure. Synthetic data analyses show that the proposed method performs favourably compared to existing methods. We apply our method to a Mendelian randomization study estimating the effect of obesity on health-related quality of life.
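To make the pseudo-variable idea concrete, here is a minimal sketch in Python. It is not the speaker's procedure: it assumes pseudo variables are built by permuting the rows of the candidate matrix, uses plain marginal correlations, and covers only the relevance screen (it does not address instrument validity). The function name `screen_with_pseudo_variables` and all simulation settings are hypothetical choices made for illustration.

```python
import numpy as np

def screen_with_pseudo_variables(Z, d, rng):
    """Keep columns of Z whose |correlation| with d beats every pseudo variable.

    Illustrative assumption: pseudo variables are a row-permuted copy of Z,
    so they keep each column's distribution but are irrelevant to d by design.
    """
    n = Z.shape[0]
    Z_pseudo = Z[rng.permutation(n), :]

    def abs_corr(X, y):
        # Absolute Pearson correlation of each column of X with y.
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        return np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

    # The largest correlation achieved by pure noise sets the bar.
    threshold = abs_corr(Z_pseudo, d).max()
    return np.flatnonzero(abs_corr(Z, d) > threshold)

# Simulated data (hypothetical): 200 candidates, only the first 5 drive the exposure.
rng = np.random.default_rng(0)
n, p = 2000, 200
Z = rng.standard_normal((n, p))
u = rng.standard_normal(n)  # unmeasured confounder
d = Z[:, :5] @ np.full(5, 0.5) + u + rng.standard_normal(n)

selected = screen_with_pseudo_variables(Z, d, rng)
print("candidates kept:", selected)
```

Candidates whose correlation with the exposure is no stronger than what noise alone produces are dropped; a few irrelevant candidates may still pass by chance, which is one reason a full method needs more than this single screen.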
Speaker Bio
I am currently an associate professor in statistics at the University of Toronto. I received my B.S. in Mathematics from Nankai University in 2008 and my Ph.D. in Statistics from North Carolina State University in 2013. I was a postdoctoral fellow in the Department of Biostatistics at the University of North Carolina, Chapel Hill from 2013 to 2016. My research aims to develop advanced data science tools and methodologies to handle large, complex, multi-scale real-world data. I work on topics including statistical machine learning, neuroimaging data analysis, statistical genetics and genomics, and causal inference. My research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canadian Institutes of Health Research (CIHR), the University of Toronto’s Data Science Institute, the Canadian Statistical Sciences Institute (CANSSI), CANSSI Ontario, and Mitacs.