Learning Unhealthy Beverage Demand from Grocery Transaction Data - COMP 396 Undergraduate Research Project Application Form

Supervisor's Name: Joseph Vybihal

Supervisor's Email: jvybihal [at] cs.mcgill.ca

Supervisor's Phone: 514-398-7071

Supervisor's Website: http://www.cs.mcgill.ca/~jvybihal

Supervisor's department: Computer Science

Course number: COMP 396 (Computer Science)

Term: Winter 2018

Project start date: Monday, January 8, 2018

Project end date: Monday, April 16, 2018

Project title: Using Digital Purchasing Data to Generate Public Health Evidence: Learning Unhealthy Beverage Demand from Grocery Transaction Data

Project description (50-100 words suggested): Unhealthy diet is the most important preventable cause of mortality and morbidity due to chronic diseases. Taxation of unhealthy food has been proposed to improve population-level dietary patterns, and its effectiveness can be estimated by predicting the change in unhealthy food purchasing when food prices increase. The recent availability of grocery transaction data from scanner technologies enables accurate prediction of food sales as a function of own-product attributes. However, the very large number of competing product attributes in these data, typically from a few thousand products, prohibits the application of conventional statistical learning algorithms such as Ordinary Least Squares (OLS). In this study, we explored the predictive performance of learning algorithms adapted for high-dimensional data, namely the Least Absolute Shrinkage and Selection Operator (LASSO) and Decision Tree Regressor with Adaptive Boosting (DTR-AdaBoost), in comparison with conventional statistical learning based on OLS. LASSO demonstrated superior predictive accuracy to OLS, possibly due to its ability to reduce overfitting and collinearity across predictive features of food sales. DTR-AdaBoost showed the best predictive accuracy, suggesting the presence of extensive non-linearity between the predictive features in the transaction data and sales.
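
To illustrate why a penalised method such as LASSO suits data with many competing-product features, the sketch below fits a LASSO model by cyclic coordinate descent on synthetic data. It is a minimal illustration, not the project's code: the data are randomly generated, the choice of 20 features with only two relevant ones is an assumption made for the example, and every name (`soft_threshold`, `lasso_coordinate_descent`, `lam`) is hypothetical.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values in [-gamma, gamma] become exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero feature carries no information
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic stand-in for transaction data (an assumption, NOT the project's data):
# 100 observations, 20 features, of which only the first two affect sales.
rng = random.Random(0)
n_obs, n_features = 100, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_obs)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 0.1) for row in X]

# lam = 0 removes the penalty, so the fit approaches the least-squares solution.
w_unpenalised = lasso_coordinate_descent(X, y, lam=0.0)
w_lasso = lasso_coordinate_descent(X, y, lam=0.1)

n_nonzero_unpenalised = sum(1 for w in w_unpenalised if w != 0.0)
n_nonzero_lasso = sum(1 for w in w_lasso if w != 0.0)
print("non-zero coefficients without penalty:", n_nonzero_unpenalised)
print("non-zero coefficients with LASSO:", n_nonzero_lasso)
```

The penalty drives the coefficients of uninformative features to exactly zero, which is the feature-selection behaviour the description credits for reduced overfitting. In practice the penalty strength would be chosen by cross-validation rather than fixed at 0.1 as here.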

Prerequisite: 1 term completed at McGill + CGPA of 3.0 or higher; or permission of instructor.

Grading scheme (The final report must be worth at least 50% of final grade): Final Report: 70%; Verbal Evaluation at bi-weekly meetings: 30%

Other project information: This project will be co-supervised by Dr. David Buckeridge in the Department of Epidemiology, Biostatistics and Occupational Health.

Project status: This project is taken; however, students may contact the professor to discuss other possible '396' projects this term.

How students can apply / Next steps: Bring a printed copy of this application form and your advising transcript to me during office hours.

Ethics, safety, and training: Supervisors are responsible for the ethics and safety compliance of undergraduate students. This project involves NEITHER animal subjects, nor human subjects, nor biohazardous substances, nor radioactive materials, nor handling chemicals, nor using lasers.