Event

Biostatistics Seminar - "Cluster analysis of genetic sequence data via the Gap Procedure"

Tuesday, February 16, 2016, 15:30 to 16:30
Purvis Hall Room 24, 1020 avenue des Pins Ouest, Montreal, QC, H3A 1A2, CA

Irene Vrbik, PhD

Postdoctoral Fellow, Department of Mathematics and Statistics, McGill University

Cluster analysis of genetic sequence data via the Gap Procedure

ALL ARE WELCOME

Abstract:

Phylogenetic clustering typically involves estimating a phylogenetic tree and identifying groups of sequences that have small pairwise genetic distances and sufficiently high clade support (either bootstrap values or posterior probabilities). In this talk, we explore a simple distance-based clustering algorithm, called the Gap Procedure, which uses gaps in sorted pairwise distances to suggest a natural divide between group members and non-members. We show that the clusters found using the Gap Procedure agree closely with those from computationally expensive gold-standard techniques on well-separated groups of HIV DNA sequence data. Simulation studies are also presented to illustrate the scenarios in which this fast and easy-to-implement algorithm may be employed and, more importantly, those in which more sophisticated methods are required.
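
For readers who want a concrete picture of the "gaps in sorted pairwise distances" idea, the following is a minimal illustrative sketch in Python. It is not the speaker's published algorithm: the function names (`gap_neighbours`, `gap_clusters`), the rule of cutting at the single largest gap, and the step of merging linked points into clusters are all assumptions made for illustration. The toy distance matrix is built from made-up positions on a line, not from genetic data.

```python
def gap_neighbours(dist_row, i):
    """Return indices whose distance to point i lies below the largest
    gap in i's sorted distances (hypothetical helper, illustration only)."""
    others = sorted((d, j) for j, d in enumerate(dist_row) if j != i)
    if len(others) < 2:
        return set()  # no gap can be defined with fewer than two distances
    gaps = [others[k + 1][0] - others[k][0] for k in range(len(others) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)  # position of largest gap
    return {j for _, j in others[:cut + 1]}


def gap_clusters(dist):
    """Link each point to its gap-defined neighbours, then label the
    resulting connected groups 0, 1, 2, ... (assumed merging rule)."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in gap_neighbours(dist[i], i):
            parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy example: two well-separated groups of points on a line.
positions = [0, 1, 2, 50, 51, 52]
dist = [[abs(a - b) for b in positions] for a in positions]
print(gap_clusters(dist))
```

In this toy case each point's sorted distances show one large jump between its own group and the other group, so the cut falls between the two groups. When groups are not well separated, the largest gap may not correspond to a meaningful divide, which is the kind of scenario the abstract says calls for more sophisticated methods.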

Bio:

http://www.math.mcgill.ca/ivrbik/