Benjamin Fung (Information Studies) will give this year's final DH work-in-progress talk.
E-mail Authorship Analysis for Crime Investigation
The cyber world provides an anonymous environment for criminals to conduct malicious activities such as spamming, sending ransom e-mails, and spreading botnet malware. Often, these activities involve textual communication between a criminal and a victim, or between criminals themselves. The forensic analysis of online textual documents to address this anonymity problem, called authorship analysis, is the focus of most cybercrime investigations. Authorship analysis is the statistical study of the linguistic and computational characteristics of documents written by individuals. In this presentation, we will present a unified data mining solution to authorship analysis problems based on the concept of a frequent pattern-based writeprint, and demonstrate a new writeprint visualization tool. Extensive experiments on real-life data suggest that our proposed solution can precisely capture the writing styles of individuals. Furthermore, the writeprint is effective in identifying the author of an anonymous text from a group of suspects and in inferring some sociolinguistic characteristics of the author.