Tour of research on student evaluations of teaching [B1-B3]: Meltzer and McNulty 2011, Basow et al. 2013, and Chisadza et al. 2019

Let's pause our discussion of the studies that Holman et al. 2019 "Evidence of Bias in Standard Evaluations of Teaching" listed as "finding bias" to discuss three studies of student evaluations of teaching that are not in the Holman et al. 2019 list. I'll use the prefix "B" to refer to these bonus studies.

---

B1.

Meltzer and McNulty 2011 "Contrast Effects of Stereotypes: 'Nurturing' Male Professors Are Evaluated More Positively than 'Nurturing' Female Professors" reported on an experiment in which undergraduates rated a psychology job candidate, with variation in candidate gender (Dr. Michael Smith or Dr. Michelle Smith), variation in whether the candidate was described as "particularly nurturing", and variation in whether the candidate was described as "organized" or "disorganized". Participants responded to items such as "Do you think Dr. Smith's responses to students' questions in class would be helpful?" and "How do you think you would rate Dr. Smith's overall performance in this course?". Results indicated no main effect for gender, but the nurturing male candidate was rated higher than both the control male candidate and the nurturing female candidate, and marginally higher than the control female candidate.

For some reason, results for the "organized"/"disorganized" variation were not reported.

---

B2.

Basow et al. 2013 "The Effects of Professors' Race and Gender on Student Evaluations and Performance" reported on an experiment in which undergraduates from psychology, economics, and mathematics courses evaluated a three-minute engineering lecture from an animated instructor who was Black or White and male or female; participants also took a quiz on lecture content. Results indicated that "student evaluations did not vary by teacher gender", that "students rated the African American professor higher than the White professor on several teaching dimensions", and that students in the male instructor condition and in the White instructor condition did better on the quiz (p. 359).

---

B3.

I don't have access to Chisadza et al. 2019 "Race and Gender Biases in Student Evaluations of Teachers", but the highlights indicate that "We use an RCT to investigate race and gender bias in student evaluations of teachers" and that "We note biases in favor of female lecturers and against black lecturers". The abstract at Semantic Scholar indicates that the experiment was conducted in South Africa and that "Students are randomly assigned to follow video lectures with identical narrated slides and script but given by lecturers of different race and gender".

---

Comments are open if you disagree, but I don't think that there is much in B1 or B2 that would undercut the use of student evaluations in employment decisions. The experiments have high internal validity, but B1 had no main effect for gender and B2 results aren't strong and consistent. Moreover, B1 and B2 use brief stimuli, so I don't know that the results are sufficiently informative about student evaluations at the end of a 15-week course.
