Meta-Psychology has published my manuscript "Perceived Discrimination against Black Americans and White Americans".

---

Norton and Sommers 2011 presented evidence for a claim that has become widely cited, that "Whites have now come to view anti-White bias as a bigger societal problem than anti-Black bias". I was skeptical of that claim, so I checked the data for the American National Election Studies 2012 Time Series Study, which was the most recent ANES Time Series Study available at the time. These ANES data contradicted that claim.

I preregistered an analysis of the then-upcoming ANES 2016 Time Series Study and then preregistered an analysis of an item on a 2017 survey that YouGov conducted for me with different wording for the measure of perceived discrimination.

In each of these three sets of data, the percentage of Whites who perceived more discrimination in the United States today against Blacks than against Whites was higher than the percentage of Whites who perceived the reverse. And it's not particularly close. Here is a new data point: in weighted analyses of data from the ANES 2020 Time Series Study, 63% of non-Hispanic Whites rated discrimination against Blacks as larger than discrimination against Whites, but only 8% of non-Hispanic Whites rated discrimination against Whites as larger than discrimination against Blacks.

The Meta-Psychology article has an explanation for why the Norton and Sommers 2011 claim appears to be incorrect.

---

NOTE

1. Data source: American National Election Studies. 2021. ANES 2020 Time Series Study Preliminary Release: Combined Pre-Election and Post-Election Data [dataset and documentation]. March 24, 2021 version. www.electionstudies.org.


The Journal of Academic Ethics published Kreitzer and Sweet‑Cushman 2021 "Evaluating Student Evaluations of Teaching: A Review of Measurement and Equity Bias in SETs and Recommendations for Ethical Reform".

---

Kreitzer and Sweet‑Cushman 2021 reviewed "a novel dataset of over 100 articles on bias in student evaluations of teaching" (p. 1), later described as "an original database of more than 90 articles on evaluative bias constructed from across academic disciplines" (p. 2), but a specific size of the dataset/database is not provided.

I'll focus on the Kreitzer and Sweet‑Cushman 2021 discussion of evidence for an "equity bias".

---

Footnote 4

Let's start with Kreitzer and Sweet‑Cushman 2021 footnote 4:

Research also finds that the role of attractiveness is more relevant to women, who are more likely to get comments about their appearance (Mitchell & Martin, 2018; Key & Ardoin, 2019). This is problematic given that attractiveness has been shown to be correlated with evaluations of instructional quality (Rosen, 2018)

Mitchell and Martin 2018 reported two findings about comments on instructor appearance. MM2018 Table 1 reported on a content analysis of official university course evaluations, which indicated that 0% of comments for the woman instructor and 0% of comments for the man instructor were appearance-related. MM2018 Table 2 reported on a content analysis of Rate My Professors comments, which indicated that 10.6% of comments for the woman instructor and 0% of comments for the man instructor were appearance-related, with p<0.05 for the difference between the 10.6% and the 0%.

So Kreitzer and Sweet‑Cushman 2021 footnote 4 cited the p<0.05 Rate My Professors finding but not the zero result for the official university course evaluations, even though official course evaluations, as the evaluations actually used in practice, are presumably much more informative about bias in student evaluations than Rate My Professors comments, which are presumably unlikely to be used for faculty tenure, promotion, and end-of-year evaluations.

Note also that Kreitzer and Sweet‑Cushman 2021 reported this Rate My Professors appearance-related finding without indicating the low quality of the research design: Mitchell and Martin 2018 compared comments about one woman instructor (Mitchell herself) to comments about one man instructor (Martin himself), from a non-experimental research design.

Moreover, the p<0.05 evidence for this "appearance" finding from Mitchell and Martin is based on an error by Mitchell and/or Martin. I blogged about the error in 2019, and MM2018 was eventually corrected (26 May 2020) to indicate that there is insufficient evidence (p=0.3063) to infer that the 10.6-percentage-point gender difference in appearance-related comments is inconsistent with chance. However, Kreitzer and Sweet‑Cushman 2021 (accepted 27 Jan 2021) cited this "appearance" finding from the uncorrected version of the article.
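To illustrate the kind of calculation at issue, here is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table. The comment counts below are purely hypothetical (the actual MM2018 counts and the corrected p=0.3063 are not reproduced here); the point is only that a roughly 10.6%-versus-0% split can fail to reach p<0.05 at small sample sizes.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def p_table(k):
        # Probability of k "successes" in row 1, with all margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    return sum(p_table(k) for k in range(lo, hi + 1) if p_table(k) <= p_obs + 1e-12)

# Hypothetical counts: 5 of 47 comments appearance-related for one
# instructor (about 10.6%) versus 0 of 32 for the other instructor
p = fisher_exact_two_sided(5, 42, 0, 32)
```

With these made-up counts, the two-sided p is about 0.08, above the conventional 0.05 threshold despite the 10.6-point gap in percentages.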

---

And I'm not sure what footnote 4 is referencing in Key and Ardoin 2019. The closest Key and Ardoin 2019 passage that I see is below:

Another telling point is whether students comment on the faculty member's teaching and expertise, or on such personal qualities as physical appearance or fashion choices. Among the students who'd received the bias statement, comments on female faculty were substantially more likely to be about the teaching.

But this Key and Ardoin 2019 passage is about a difference between groups in comments about female faculty (involving personal qualities and not merely comments on appearance), and does not compare comments about female faculty to comments about male faculty, which is what would be needed to support the Kreitzer and Sweet‑Cushman 2021 claim in footnote 4.

---

And for the Kreitzer and Sweet‑Cushman 2021 claim that "the role of attractiveness is more relevant to women", consider this passage from Hamermesh and Parker (2005: 373):

The reestimates show, however, that the impact of beauty on instructors' course ratings is much lower for female than for male faculty. Good looks generate more of a premium, bad looks more of a penalty for male instructors, just as was demonstrated (Hamermesh & Biddle, 1994) for the effects of beauty in wage determination.

This finding is the *opposite* of the claim that "the role of attractiveness is more relevant to women".

Kreitzer and Sweet‑Cushman 2021 cited Hamermesh and Parker 2005 elsewhere, so I'm not sure why Kreitzer and Sweet‑Cushman 2021 footnote 4 claimed that "the role of attractiveness is more relevant to women" without at least noting the contrary evidence from Hamermesh and Parker 2005.

---

"...react badly when those expectations aren't met"

From Kreitzer and Sweet‑Cushman 2021 (p. 4):

Students are also more likely to expect special favors from female professors and react badly when those expectations aren't met or fail to follow directions when they are offered by a woman professor (El-Alayli et al., 2018; Piatak & Mohr, 2019).

From what I can tell, neither Piatak and Mohr 2019 nor Study 1 of El-Alayli et al 2018 support the "react badly when those expectations aren't met" part of this claim. I think that this claim refers to the "negative emotions" measure of El-Alayli et al 2018 Study 2, but I don't think that the El-Alayli et al 2018 data support that inference.

El-Alayli et al 2018 *claimed* that there was a main effect of professor gender for the "negative emotions" measure, but I think that that claim is incorrect: the relevant means in El-Alayli et al 2018 Table 1 are 2.38 and 2.28, with a sample size of 121 across two conditions and corresponding standard deviations of 0.93 and 0.93, so that there is insufficient evidence of a main effect of professor gender for that measure.
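As a rough check, here is a minimal sketch of the two-sample t-statistic implied by those summary statistics, assuming the 121 participants split roughly evenly across the two conditions (about 60 and 61; the exact split is an assumption, not reported in the passage quoted above):

```python
from math import sqrt

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's two-sample t statistic computed from summary statistics."""
    return (m1 - m2) / sqrt(s1**2 / n1 + s2**2 / n2)

# "Negative emotions" measure, Table 1: means 2.38 vs 2.28, SDs 0.93 and 0.93
t = welch_t(2.38, 0.93, 60, 2.28, 0.93, 61)
```

The result is t of roughly 0.59, far below the critical value of about 1.98 needed for p<0.05 at around 119 degrees of freedom, consistent with there being insufficient evidence of a main effect.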

---

"...no discipline where women receive higher evaluative scores"

From Kreitzer and Sweet‑Cushman 2021 (p. 4):

Rosen (2018), using a massive (n = 7,800,000) Rate My Professor sample, finds there is no discipline where women receive higher evaluative scores.

I think that the relevant passage from Rosen 2018 is:

Importantly, out of all the disciplines on RateMyProfessors, there are no fields where women have statistically higher overall quality scores than men.

But this claim is based on an analysis limited to instructors rated "not hot", so Rosen 2018 doesn't support the Kreitzer and Sweet‑Cushman 2021 claim, which was phrased without that "not hot" caveat.

My concern with limiting the analysis to "not hot" instructors was that Rosen 2018 indicated that "hot" instructors on average received higher ratings than "not hot" instructors and that a higher percentage of women instructors than of men instructors received a "hot" rating. Thus, it seemed plausible to me that restricting the analysis to "not hot" instructors removed a higher percentage of highly-rated women than of highly-rated men.
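The worry can be made concrete with purely hypothetical numbers (none of these figures are from Rosen 2018): suppose men and women instructors have identical overall mean ratings, but women are rated "hot" more often and "hot" instructors average higher ratings. The "not hot" subsets can then differ even though the full populations do not.

```python
# Purely hypothetical numbers, chosen so that men and women instructors
# have the SAME overall mean rating.
hot_mean = 4.5            # average rating of "hot" instructors, both genders
women_hot_share = 0.20    # women are rated "hot" more often...
men_hot_share = 0.10      # ...than men
women_not_hot_mean = 3.5

# Women's overall mean: a mixture of the hot and not-hot subgroups
overall_mean = women_hot_share * hot_mean + (1 - women_hot_share) * women_not_hot_mean

# Solve for the men's not-hot mean that gives men the SAME overall mean
men_not_hot_mean = (overall_mean - men_hot_share * hot_mean) / (1 - men_hot_share)
```

With these numbers the overall means are identical (3.70), yet among "not hot" instructors men average about 3.61 versus 3.50 for women: the restriction alone manufactures a gap favoring men.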

I asked Andrew S. Rosen about gender comparisons by field for the Rate My Professors ratings for all professors, not limited to "not hot" professors. He indicated that, of the 75 fields with the largest number of Rate My Professors ratings, men faculty had a higher mean overall quality rating than women faculty at p<0.05 in many of these fields, but that, in exactly one of these fields (mathematics), women faculty had a higher mean overall quality rating than men faculty at p<0.05; women faculty in mathematics also had a higher mean clarity rating and a higher mean helpfulness rating than men faculty in mathematics (p<0.05). Thanks to Andrew S. Rosen for the information.

By the way, the 7.8 million sample size cited by Kreitzer and Sweet‑Cushman 2021 is for the number of ratings, but I think that the more relevant sample size is the number of instructors who were rated.

---

"designs", plural

From Kreitzer and Sweet‑Cushman 2021 (p. 4):

Experimental designs that manipulate the gender of the instructor in online teaching environments have even shown that students offered lower evaluations when they believed the instructor was a woman, despite identical course delivery (Boring et al., 2016; MacNell et al., 2015).

The plural "experimental designs" and the citation of two studies suggest that one of these studies replicated the other. But, regarding this "believed the instructor was a woman, despite identical course delivery" research design, Boring et al. 2016 merely re-analyzed data from MacNell et al. 2015, so the two cited studies are not independent of each other in a way that would justify the plural "experimental designs".

And Kreitzer and Sweet‑Cushman 2021 reported the finding without mentioning shortcomings of the research design, such as a sample size small enough (N=43 across four conditions) to raise reasonable questions about the replicability of the result.

---

Discussion

I think that it's plausible that there are unfair equity biases in student evaluations of teaching, but I'm not sure that Kreitzer and Sweet‑Cushman 2021 is convincing about that.

My reading of the literature on unfair bias in student evaluations of teaching is that the research isn't of consistently high enough quality that a credulous review establishes anything: a lot of the research designs don't permit causal inference of unfair bias, and a lot of the research designs that could permit causal inference have other flaws.

Consider the uncorrected Mitchell and Martin 2018: is it plausible that a respectable peer-reviewed journal would publish results from a similar research design that claimed no gender bias in student comments, in which the data were limited to a non-experimental comparison of comments about only two instructors? Or is it plausible that a respectable peer-reviewed journal would publish a four-condition N=43 version of MacNell et al. 2015 that found no gender bias in student ratings? I would love to see these small-N null-finding peer-reviewed publications, if they exist.

Maybe non-experimental "N=2 instructors" studies and experimental "N=43 students" studies that didn't detect gender bias in student evaluations of teaching exist but haven't yet been published. If so, then did Kreitzer and Sweet‑Cushman try to find them? From what I can tell, Kreitzer and Sweet‑Cushman 2021 does not indicate that the authors solicited information about unpublished research through, say, posting requests on listservs or contacting researchers who have published on the topic.

I plan to tweet a link to this post tagging Dr. Kreitzer and Dr. Sweet‑Cushman, and I'm curious to see whether Kreitzer and Sweet‑Cushman 2021 is corrected or otherwise updated to address any of the discussion above.


Racial resentment (also known as symbolic racism) is a common measure of racial attitudes in social science. See this post for items commonly used for racial resentment measures. For this post, I'll report plots about racial resentment, using data from the American National Election Studies 2020 Time Series Study.

---

This first plot reports the percentage of respondents that rated Whites, Blacks, Hispanics, and Asians/Asian-Americans equally on 0-to-100 feeling thermometers, at each level of a 0-to-16 racial resentment index. Respondents at the lowest level of racial resentment had a lower chance of rating the included racial groups equally, compared to respondents at moderate levels of racial resentment or even compared to respondents at the highest level of racial resentment.

---

This next plot reports the mean racial resentment for various groups. The top section is based on responses to 0-to-100 feeling thermometers about Whites, Blacks, Hispanics, and Asians/Asian-Americans. Respondents who rated all four included racial groups equally fell at about the middle of the racial resentment index, and respondents who reported isolated negative ratings about Whites (i.e., rated Whites under 50 but rated Blacks, Hispanics, and Asian/Asian-Americans at 50 or above) fell toward the low end of the racial resentment index.

The bottom two sections of the above plot report mean racial resentment based on responses to the "lazy" and "violent" stereotype items.

---

So I think that the above plots indicate that low levels of racial resentment aren't obviously normatively good.

---

Below is an update on how well racial resentment predicts attitudes about the environment and some other things that I don't expect to have a strong direct causal relationship with racial attitudes. The plot below reports OLS regression coefficients for racial resentment on a 0-to-1 scale, predicting the indicated outcomes on a 0-to-1 scale, with controls for gender, age group, education, marital status, income, partisanship, and ideology, each entered as a categorical predictor.

The estimated effects of racial resentment on attitudes about federal spending on welfare and federal spending on crime (attitudes presumably related to race) are of a "small to moderately small" size similar to the estimated effects of racial resentment on attitudes about greenhouse regulations, climate change causing severe weather, and federal spending on the environment.

Racial resentment predicted attitudes about the environment net of controls in ANES data from as far back as 1986.

---

NOTES

1. Data source: American National Election Studies. 2021. ANES 2020 Time Series Study Preliminary Release: Combined Pre-Election and Post-Election Data [dataset and documentation]. March 24, 2021 version. www.electionstudies.org.

2. Stata and R code. Dataset for Plot 1. Dataset for Plot 2. Dataset for Plot 3.


Sex Roles published El-Alayli et al 2018 "Dancing Backwards in High Heels: Female Professors Experience More Work Demands and Special Favor Requests, Particularly from Academically Entitled Students".

El-Alayli et al 2018 discussed their research design for Study 2 as follows (pp. 141-142):

The name of the professor, as well as the use of gendered pronouns in some of the subsequent survey questions, served as our professor gender manipulation, and participants were randomly assigned to one of the two experimental conditions...After reviewing the profile, participants were given seven scenarios that involved imagining special favor requests that could be asked of the professor...For each scenario, participants were first asked to indicate how likely they would be to ask for the special favor on a scale from 1 (Not at all likely) to 6 (Extremely likely). Using the same response scale, participants were then asked the likelihood that they would expect the professor to say "yes", ...

El-Alayli et al 2018 discussed the results for this item (p. 143, emphasis added):

There was a statistically significant main effect of professor gender on expectations, F(1, 117) = 5.68, p = .019 (b = −.80, SE = .34), such that participants were more likely to expect a "yes" response to the special favor requests when the professor had a woman's name than when the professor had a man's name. (Refer to Table 1 for condition means for all dependent measures.)

El-Alayli et al 2018 Table 1 reports that, for this "Expecting 'Yes'" item, the mean was 2.12 for the female professor and 2.05 for the male professor, with corresponding standard deviations of 0.80 and 0.66. The sample size was 121 total participants after exclusions (p. 141), so it wasn't clear to me how these data could produce a p-value of 0.019 or a b of -0.80 for the main effect of professor gender.
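A quick way to see the tension: the reported F(1, 117) = 5.68 corresponds to |t| = sqrt(5.68), or about 2.38, but the Table 1 summary statistics imply a much smaller t. This sketch assumes the 121 participants split roughly 60/61 across the two conditions, which the paper does not report exactly.

```python
from math import sqrt

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's two-sample t statistic computed from summary statistics."""
    return (m1 - m2) / sqrt(s1**2 / n1 + s2**2 / n2)

# The t that the reported F(1, 117) = 5.68 corresponds to
t_implied_by_F = sqrt(5.68)

# The t implied by the Table 1 "Expecting 'Yes'" summary statistics:
# means 2.12 vs 2.05, SDs 0.80 and 0.66, assumed n of 60 and 61
t_from_table = welch_t(2.12, 0.80, 60, 2.05, 0.66, 61)
```

The Table 1 numbers imply a t of roughly 0.52, nowhere near the t of about 2.38 that the reported F and p=0.019 would require for a simple main effect of professor gender.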

---

I suspect that the -0.80 is not a main effect of professor gender but is instead the predicted effect of professor gender when the other term in the interaction (academic entitlement) is zero (with the lowest level of the academic entitlement scale being 1, see p. 142).

From El-Alayli et al 2018 (p. 143):

...the professor gender × academic entitlement interaction was statistically significant, F(1, 117) = 7.80, p = .006 (b = .38, SE = .14, ΔR2 = .06).

El-Alayli et al 2018 Table 1 indicates that the mean for academic entitlement is 2.27 for the male professor and 2.27 for the female professor, with corresponding standard deviations of 1.00 and 0.87. I'll ballpark 0.93 as the combined standard deviation.

From El-Alayli et al 2018 (p. 143):

Students had a stronger expectation of request approval from the female professor than from the male professor when they had a high level (+1 SD) of academic entitlement, t = 2.37, p = .020 (b = .42, SE = .18, 95% CI [.07, .78]), but not when they had average, t = .54, p = .590 (b = .07, SE = .13, 95% CI [−.18, .32]) or low (−1 SD) levels of entitlement, t = −1.61, p = .111 (b = −.29, SE = .18, 95% CI [−.64, .07]).

So the above passage provides three data points:

X1 = +1 SD = 2.27 + 0.93 = 3.20 || Y1 = 0.42

X2 = average = 2.27 || Y2 = 0.07

X3 = -1 SD = 2.27 - 0.93 = 1.34 || Y3 = -0.29

I used an OLS regression to predict these Ys using the Xs: to two decimal places, the X coefficient was 0.38 (which equals the coefficient on the interaction term), and the constant was -0.80 (which equals the purported "main effect"); however, in this regression -0.80 is the predicted value of Y when X (academic entitlement) is zero.
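That reconstruction can be checked directly. Here is a minimal sketch fitting a least-squares line through the three (X, Y) points listed above:

```python
def ols_line(points):
    """Simple least-squares fit of y = b0 + b1*x through (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# The three simple-slope points recovered from El-Alayli et al 2018 (p. 143)
points = [(3.20, 0.42), (2.27, 0.07), (1.34, -0.29)]
b0, b1 = ols_line(points)
```

To two decimal places the slope is 0.38, matching the reported interaction coefficient, and the intercept is -0.80, matching the purported "main effect", which supports reading -0.80 as the predicted gender difference at academic entitlement of zero.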

I'll add a highlight to the passage quoted above from El-Alayli et al 2018 (p. 143) to indicate what I think the main effect is:

Students had a stronger expectation of request approval from the female professor than from the male professor when they had a high level (+1 SD) of academic entitlement, t = 2.37, p = .020 (b = .42, SE = .18, 95% CI [.07, .78]), but not when they had average, t = .54, p = .590 (*b = .07*, SE = .13, 95% CI [−.18, .32]) or low (−1 SD) levels of entitlement, t = −1.61, p = .111 (b = −.29, SE = .18, 95% CI [−.64, .07]).

The b=0.07 is equal to the difference between the aforementioned means of 2.12 for the female professor and 2.05 for the male professor.

---

The subscripts for El-Alayli et al 2018 Table 1 indicate that p<0.05 for the main effect of professor gender for four of the six items, but I don't think that p<0.05 for the main effect of professor gender for any of those four items.

Moreover, I think that González-Morales 2019 incorrectly described the El-Alayli et al 2018 results as applying to all students (with a stronger effect among academically entitled students), rather than as an effect detected only among academically entitled students:

In a recent experimental study, El-Alayli, Hansen-Brown, and Ceynar (2018) found that when students identified a fictitious professor as a woman, they expected that this professor would respond positively to requests for special favors or accommodations. This effect was stronger among academically entitled students.
