Knowledge Graph Embedding with Electronic Health Records Data

Wednesday, April 3, 2024 15:30 to 16:30

Junwei Lu, PhD

Assistant Professor of Biostatistics
Department of Biostatistics
Harvard T.H. Chan School of Public Health


Due to the increasing adoption of electronic health records (EHR), large-scale EHRs have become another rich data source for translational clinical research. We propose to infer the conditional dependency structure among EHR features via a latent graphical block model (LGBM). The LGBM has a two-layer structure: the first layer provides a semantic embedding vector (SEV) representation for the EHR features, and the second overlays a graphical block model on the latent SEVs. The block structure of the graphical model also allows us to cluster synonymous features in EHR. We propose to learn the LGBM efficiently, in both the statistical and the computational sense, based on the empirical point mutual information matrix. We establish the statistical rates of the proposed estimators and show perfect recovery of the block structure. Numerical results from simulation studies and real EHR data analyses suggest that the proposed LGBM estimator performs well in finite samples.
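
As a rough illustration of the input the abstract refers to, the sketch below computes an empirical mutual information matrix from a co-occurrence count matrix of EHR features, using the standard pointwise definition log(p(i,j) / (p(i) p(j))). It is not the speaker's estimator: the function name `empirical_pmi`, the toy counts, and the convention of setting unobserved pairs to zero are all assumptions made for this example.

```python
import numpy as np


def empirical_pmi(cooc):
    """Empirical pointwise mutual information from co-occurrence counts.

    cooc[i, j] is the number of times features i and j co-occur.
    Pairs with zero count are set to 0 (a simplifying convention assumed
    here; the true value would be minus infinity).
    """
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)   # marginal counts of feature i
    col = cooc.sum(axis=0, keepdims=True)   # marginal counts of feature j
    expected = row * col / total            # expected count under independence
    pmi = np.zeros_like(cooc)
    mask = cooc > 0
    pmi[mask] = np.log(cooc[mask] / expected[mask])
    return pmi


# Hypothetical co-occurrence counts for three EHR features.
counts = np.array([
    [10, 5, 0],
    [5, 10, 1],
    [0, 1, 8],
])
pmi = empirical_pmi(counts)
```

A symmetric count matrix yields a symmetric matrix here, with positive entries for pairs that co-occur more often than independence would predict.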

Speaker bio

Junwei Lu is an Assistant Professor of Biostatistics in the Department of Biostatistics, Harvard T.H. Chan School of Public Health. His research lies at the intersection of statistical machine learning and clinical studies, revealing scientific associations among clinical treatment strategies and patient phenotyping. He focuses in particular on precision medicine that leverages real-world clinical data, such as electronic health records, for risk prediction and clinical optimization.
