Report on Gender Equity in Mercury Course Evaluations at McGill

Published: 12 September 2018


McGill University values quality in the courses it offers its students. End-of-course evaluations provide valuable student feedback and are one of the ways that McGill works towards maintaining and improving the quality of courses and the student learning experience.

In recent years, several articles have examined potential bias in course evaluations [1]. To determine the extent of gender bias in official end-of-course evaluations, Teaching and Learning Services (TLS) and the Office of Analysis, Planning, and Budget (APB) analyzed McGill’s course evaluation data.

Analysis and findings

Data used in the analysis

  • McGill students rate their instructors on a scale from 1 (strongly disagree) to 5 (strongly agree) on two statements: “Overall, this instructor is an excellent teacher” (Q3) and “Overall, I learned a great deal from this instructor” (Q4).
  • Instructor ratings (for lecture-type courses only) from the Fall 2013, Winter 2014, Fall 2014 and Winter 2015 course evaluations were analyzed.
  • Only instructor ratings associated with tenured and tenure-stream academics were included.
  • For each of Q3 and Q4, nearly 125,000 instructor ratings covering over 1,350 academics were used in the analysis.
  • In the absence of accessible data on instructors’ gender identity, information on gender was inferred from the legal sex designation appearing in each instructor’s employee file.


Wilcoxon rank-sum tests showed statistically significant differences in instructor ratings across gender, academic rank, and Faculty [2] in end-of-course evaluations. Although statistically significant, the observed difference in average score was only 0.02 (on a scale of 1 to 5).
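
To illustrate how a difference of about 0.02 in average rating can be statistically significant with a large number of ratings, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test on 1-to-5 ratings. It is not the code used by TLS or APB. The rating counts are invented for illustration and are not McGill data, the function names (mean_rating, rank_sum_test, scaled) are hypothetical, and the test uses the standard large-sample normal approximation with a correction for ties, which is an assumption about how such a test might be carried out.

```python
import math


def mean_rating(counts):
    """Average rating, where counts[i] is the number of responses equal to i + 1."""
    total = sum(counts)
    return sum((i + 1) * c for i, c in enumerate(counts)) / total


def rank_sum_test(counts_a, counts_b):
    """Two-sided Wilcoxon rank-sum test from rating counts.

    Uses the normal approximation with a tie correction, which suits
    large samples of 1-to-5 ratings where ties are unavoidable.
    Returns (z, p_value).
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    n = n_a + n_b
    rank_sum_a = 0.0  # sum of midranks for group A
    tie_term = 0      # sum of t^3 - t over groups of tied ratings
    seen = 0          # number of responses with a lower rating
    for c_a, c_b in zip(counts_a, counts_b):
        tied = c_a + c_b
        if tied == 0:
            continue
        midrank = seen + (tied + 1) / 2  # average rank shared by tied responses
        rank_sum_a += c_a * midrank
        tie_term += tied**3 - tied
        seen += tied
    expected = n_a * (n + 1) / 2
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (rank_sum_a - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


def scaled(counts, factor):
    """Same rating distribution with `factor` times as many responses."""
    return [c * factor for c in counts]


# Invented counts of ratings 1..5 per 1,000 responses (NOT McGill data).
BASE_A = [30, 60, 160, 350, 400]
BASE_B = [32, 62, 162, 352, 392]

mean_gap = mean_rating(BASE_A) - mean_rating(BASE_B)
z_small, p_small = rank_sum_test(BASE_A, BASE_B)
z_large, p_large = rank_sum_test(scaled(BASE_A, 60), scaled(BASE_B, 60))

print(f"difference in average rating: {mean_gap:.2f}")
print(f"1,000 ratings per group:  z = {z_small:.2f}, p = {p_small:.4f}")
print(f"60,000 ratings per group: z = {z_large:.2f}, p = {p_large:.4f}")
```

In this sketch the two groups have the same rating distributions in both runs and only the number of ratings changes. The gap in average rating stays the same, while the p-value falls as the sample grows.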

Male instructors were often rated higher than female instructors, a statistically significant difference that persisted even when controlling for differences in ratings across academic rank and Faculty.


[Table: results for Q3 (“Overall, this instructor is an excellent teacher.”) and Q4 (“Overall, I learned a great deal from this instructor.”). Columns: Gender of Instructor; No. of Instructor Ratings; No. of Instructors Rated; Avg. Instructor Rating.]

Despite these findings, McGill’s Guidelines for Interpreting End-of-Course Evaluation Results state:

“Mercury results are reported to only 1 decimal place to avoid overemphasis on differences that are not meaningful … Small differences that are statistically significant are common with large sample sizes. As a result, it is important to ask whether the difference is large enough to have some practical implication. For example, if two instructors in a department receive average ratings of 4.7 and 4.8 on the question ’Overall, this instructor is an excellent teacher’; it would be difficult to argue that the difference of 0.1, although statistically significant, is large enough to claim that the instructor with a rating of 4.8 is a better teacher.”


This study looked only at differences in ratings between male and female instructors and does not include the experiences of those who do not identify within the gender binary. Because respondents are anonymous, we were unable to look at cross-gender differences in ratings (i.e., male students evaluating female instructors, female students evaluating male instructors). Bias in course evaluations may also arise with other personal attributes, such as sexual orientation, race, ethnicity, religion, and disability. However, these data are confidential to Human Resources and were therefore not included in this analysis. As a result, this analysis does not consider intersectional identities.

Recommendations to the McGill Community

When reviewing course evaluation results for purposes such as tenure, promotion, and merit decisions, academic unit heads should be aware of the possibility of gender and other biases.

Academic unit heads and instructors are encouraged to review McGill’s Guidelines for Interpreting End-of-Course Evaluation Results as they contain suggestions that are particularly meaningful given the findings of the analysis. These include considering the entire pattern of an instructor’s results and avoiding ranking instructors from “best” to “worst.”

For assistance with interpreting course evaluation data, academic unit heads and instructors may request a consultation with a member of TLS.

How McGill is addressing potential bias

The Senate Policy on End-of-Course Evaluations states that course evaluations are “one indicator of teaching effectiveness” (item 3.1). McGill’s guidelines for developing a teaching portfolio offer additional examples of indicators of teaching effectiveness, such as:

  • Measures taken in response to feedback on teaching;
  • Evidence of progress in teaching the same course over time;
  • Evidence of effective postdoctoral, graduate and undergraduate supervision;
  • Formal recognition of teaching accomplishment (e.g., teaching awards); and
  • Comments from peer observers.

Any instructor or Teaching Assistant who receives a comment in a course evaluation that appears to be hateful or discriminatory on the basis of attributes such as gender, sexual or gender identity, race, ethnicity, religion, or disability may request that the student’s response in its entirety (numerical responses and comments) be removed from Mercury. Learn more about this protocol.

TLS has developed resources to raise students’ awareness of the impact of bias in course evaluations.

Instructors and academic unit heads are encouraged to share these resources with their students when encouraging them to complete course evaluations. These resources will be included in email reminders about course evaluations sent to students from the University.


Email tls [at]


[1] Examples include Boring et al., MacNell et al., and Mitchell.
[2] Faculty refers to the Faculty that offers the course.

While this web page is accessible worldwide, McGill University is on land which has served and continues to serve as a site of meeting and exchange amongst Indigenous peoples, including the Haudenosaunee and Anishinabeg nations. Teaching and Learning Services acknowledges and thanks the diverse Indigenous peoples whose footsteps mark this territory on which peoples of the world now gather. This land acknowledgement is shared as a starting point to provide context for further learning and action.
