Join us for a McGill School of Information Studies (SIS) Seminar Series talk on semantic visualization models for data mining with guest speaker Dr. Hady Lauw, Assistant Professor of Information Systems at Singapore Management University.
Visualization of high-dimensional data such as text documents is widely applicable. Classical approaches reduce a document's high-dimensional representation directly to two or three dimensions that can be plotted, using techniques such as multidimensional scaling. While such approaches may preserve the relationships among the data points, they offer limited explanatory power: the resulting axes do not say what the documents are about. A widely used way to model semantics in text documents is topic modelling, a class of probabilistic generative models for word occurrences in documents. In this talk, we will explore semantic visualization models, a recent family of approaches that introduce an intermediate representation in topic space between the input word space and the output visualization space. This has the useful effect of producing a topic model and a visualization simultaneously. The reduced-dimensionality representations are shown to remain effective in some data mining tasks, such as nearest neighbor classification.
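For readers who would like a concrete picture before the talk, the toy sketch below illustrates the idea of linking a 2-D visualization space to a topic space, followed by nearest neighbor classification in the reduced space. It is not the speaker's model: the topic names, all coordinates, the labeled documents, and the choice of a softmax over negative half squared distances as the link between coordinates and topic proportions are assumptions made purely for illustration.

```python
import math

# Hypothetical 2-D coordinates for three topics (assumed, for illustration only).
TOPIC_COORDS = {
    "sports": (0.0, 0.0),
    "politics": (4.0, 0.0),
    "science": (2.0, 3.0),
}

# Hypothetical documents already placed in the 2-D space, with known labels.
LABELED_DOCS = [
    ((0.3, 0.2), "sports"),
    ((3.8, 0.4), "politics"),
    ((2.1, 2.7), "science"),
]


def topic_proportions(doc_xy):
    """Map a document's 2-D position to topic proportions.

    Assumed link: softmax over negative half squared distances, so a
    document close to a topic's coordinate gets a high share of that topic.
    """
    scores = {
        topic: -0.5 * ((doc_xy[0] - x) ** 2 + (doc_xy[1] - y) ** 2)
        for topic, (x, y) in TOPIC_COORDS.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {topic: math.exp(s - top) for topic, s in scores.items()}
    total = sum(exps.values())
    return {topic: e / total for topic, e in exps.items()}


def nearest_neighbor_label(query_xy, labeled_docs):
    """1-nearest-neighbor classification using Euclidean distance in 2-D."""
    _, label = min(labeled_docs, key=lambda doc: math.dist(query_xy, doc[0]))
    return label


if __name__ == "__main__":
    query = (1.9, 2.5)
    proportions = topic_proportions(query)
    print("dominant topic:", max(proportions, key=proportions.get))
    print("1-NN label:", nearest_neighbor_label(query, LABELED_DOCS))
```

In this sketch a document's position doubles as an explanation: its nearest topic coordinates indicate what it is about, which is the kind of explanatory power a plain two-dimensional embedding lacks.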
Hady W. Lauw is an Assistant Professor of Information Systems at Singapore Management University. Previously, he was a postdoctoral researcher at Microsoft Research Silicon Valley, and then a scientist at A*STAR's Institute for Infocomm Research. He obtained his bachelor's and PhD degrees from Nanyang Technological University. His research interests are in data mining, with a focus on Web and social media data. He is an active member of the research community and has served several times as a program committee (PC) member and reviewer for all major data mining and web search conferences and journals. More details can be found on his website: http://www.hadylauw.com.
This talk is free and open to all. Please arrive early and ring for entry to the building.