The Journal of Race, Ethnicity, and Politics published Nelson 2021 "You seem like a great candidate, but…: Race and gender attitudes and the 2020 Democratic primary".

Nelson 2021 is an analysis of racial attitudes and gender attitudes that makes inferences about the effect of "gender attitudes" using measures that ask only about women, without any appreciation of the need to assess whether the effect of gender attitudes about women is offset by the effect of gender attitudes about men.

But Nelson 2021 has another element that I thought worth blogging about. From pages 656 and 657:

Importantly, though, I hypothesized that the respondent's race will be consequential for whether these race and gender attitudes matter—specifically, that I expect it is white respondents who are driving these relationships. To test this hypothesis, I reran all 16 logit models from above with some minor adjustments. First, I replaced the IVs "Black" and "Latina/o/x" with the dichotomous variable "white." This variable is coded 1 for those respondents who identify as white and 0 otherwise. I also added interaction terms between the key variables of interest—hostile sexism, modern sexism, and racial resentment—and "white." These interactions will help assess whether white respondents display different patterns than respondents of color...

This seems like a good research design: if, for instance, the p-value is less than p=0.05 for the "Racial resentment X White" interaction term, then we can infer that, net of controls, racial resentment was associated with the outcome differently among White respondents than among respondents of color.
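To illustrate the design (a minimal sketch with simulated data and made-up variable names, not Nelson 2021's data or code), the test comes down to reading the p-value on the interaction term in a logit model:

# Simulate a hypothetical dataset: "resentment" is a 0-1 racial resentment
# scale, "white" is a White-respondent indicator, "female" is a stand-in control
set.seed(123)
n <- 1000
dat <- data.frame(
  white      = rbinom(n, 1, 0.66),
  resentment = runif(n),
  female     = rbinom(n, 1, 0.5)
)
dat$chose_biden <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$resentment * dat$white))

# Logit of the outcome on racial resentment, the White indicator, their
# interaction, and the control; the "resentment:white" row tests whether the
# resentment/outcome association differs by respondent race
fit <- glm(chose_biden ~ resentment * white + female,
           family = binomial(link = "logit"), data = dat)
summary(fit)$coefficients["resentment:white", ]  # estimate, SE, z, p-value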

---

But, instead of reporting the p-value for the interaction terms, Nelson 2021 compared the statistical significance for an estimate among White respondents to the statistical significance for the corresponding estimate among respondents of color, such as:

In seven out of eight cases where racial resentment predicts the likelihood of choosing Biden or Harris, the average marginal effect for white respondents is statistically significant. In those same seven cases, the average marginal effect for respondents of color on the likelihood of choosing Biden or Harris is insignificant...

But the problem with comparing statistical significance across estimates is that a difference in statistical significance doesn't permit an inference that the estimates themselves differ from each other.

For example, Nelson 2021 Table A5 indicates that, for the association of racial resentment and the outcome of Kamala Harris's perceived electability, the 95% confidence interval among White respondents is [-.01, -.001]; this 95% confidence interval doesn't include zero, so that's a statistically significant estimate. The corresponding 95% confidence interval among respondents of color is [-.01, .002]; this 95% confidence interval includes zero, so that's not a statistically significant estimate.

But the corresponding point estimates are reported as -0.01 among White respondents and -0.01 among respondents of color, so there doesn't seem to be sufficient evidence to claim that these estimates differ from each other. Nonetheless, Nelson 2021 counts this as one of the seven cases referenced in the aforementioned passage.
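As a rough illustration in R, approximate standard errors can be backed out of the reported 95% confidence intervals and used to test the difference between the two average marginal effects; this treats the two subgroup estimates as independent and normally distributed and uses the rounded values as printed in Table A5, so it is only a back-of-the-envelope check:

# Reported values from Nelson 2021 Table A5 (rounded as printed)
ame_white <- -0.01; ci_white <- c(-0.01, -0.001)  # White respondents
ame_poc   <- -0.01; ci_poc   <- c(-0.01,  0.002)  # respondents of color

# Back out approximate standard errors from the CI widths
se_white <- diff(ci_white) / (2 * 1.96)
se_poc   <- diff(ci_poc)   / (2 * 1.96)

# z-test for the difference between the two estimates
diff_est <- ame_white - ame_poc
diff_se  <- sqrt(se_white^2 + se_poc^2)
z <- diff_est / diff_se
2 * pnorm(-abs(z))  # with these rounded values, the difference is 0 and p is about 1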

Nelson 2021 Table 1 indicates that the sample had 906 White respondents and 466 respondents of color. The larger sample of White respondents gives the analysis a better chance of detecting statistical significance among White respondents than among respondents of color.

---

Table A5 provides sufficient evidence that some interaction terms had a p-value less than p=0.05, such as for the policy outcome for Joe Biden, with non-overlapping 95% confidence intervals for hostile sexism of [-.02, .0004] for respondents of color and [.002, .02] for White respondents.

But I'm not sure how much this matters, without evidence about how well hostile sexism measured gender attitudes among White respondents, compared to how well hostile sexism measured gender attitudes among respondents of color.


PLOS ONE recently published Gillooly et al 2021 "Having female role models correlates with PhD students' attitudes toward their own academic success".

Colleen Flaherty at Inside Higher Ed quoted Gillooly et al 2021 co-author Amy Erica Smith discussing results from the article. From the Flaherty story, with "she" being Amy Erica Smith:

"When we showed students a syllabus with a low percentage of women authors, men expressed greater confidence than women in their ability to do well in the class" she said. "When we showed students syllabi with more equal gender representation, men's self-confidence declined, but women and men still expressed equal confidence in their ability to do well. So making the curriculum more fair doesn't actually hurt men relative to women."

Figure 1 of Gillooly et al 2021 presented evidence of this male student backlash, with the figure note indicating that the analysis controlled for "orientations toward quantitative and qualitative methods". Gillooly et al 2021 indicated that these "orientation" measures incorporate respondent ratings of their interest and ability in quantitative methods and qualitative methods.

But the "Grad_Experiences_Final Qualtrics Survey" file indicates that these "orientation" measures appeared on the survey after respondents received the treatment. And controlling for such post-treatment "orientation" measures is a bad idea, as discussed in Montgomery et al 2018 "How Conditioning on Posttreatment Variables Can Ruin Your Experiment and What to Do about It".

The "orientation" items were located on the same Qualtrics block as the treatment and the self-confidence/self-efficacy item, so it seems possible that these "orientation" items might have been intended as outcomes and not as controls. I didn't find any preregistration that indicates the Gillooly et al plan for the analysis.

---

I used the Gillooly et al 2021 data to assess whether there is sufficient evidence that this "male backlash" effect occurs in straightforward analyses that omit the post-treatment controls. The p-value is about p=0.20 for the command...

ologit q14recode treatment2 if female==0, robust

...which tests the null hypothesis that male students' course-related self-confidence/self-efficacy as measured on the five-point scale did not differ by the difference in percentage of women authors on the syllabus.

See the output file below for more analysis. For what it's worth, the data provided sufficient evidence at p<0.05 that, among male students, the treatment affected responses to three of the four items that Gillooly et al 2021 used to construct the "orientation" controls.

---

NOTES

1. Data. Stata code. Output file.

2. Prior post discussing a biased benchmark in research by two of the Gillooly et al 2021 co-authors.

3. Figure 1 of Gillooly et al 2021 reports 76% confidence intervals to help assess a p<0.10 difference between estimates, and Figure 2 of Gillooly et al 2021 reports 84% confidence intervals to help assess a p<0.05 difference between estimates. I would be amazed if this p=0.05 / p=0.10 variation was planned before Gillooly et al analyzed the data.
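For readers unfamiliar with the convention: for two independent estimates with roughly equal standard errors, non-overlapping confidence intervals at about the 84% level correspond to a two-sided p<0.05 test of the difference between the estimates, and about the 76% level corresponds to p<0.10. A minimal R sketch of that calculation (my own illustration, not code from Gillooly et al 2021):

# Confidence level at which interval non-overlap corresponds to a two-sided
# test of the difference at level alpha, assuming independent estimates with
# equal standard errors
overlap_level <- function(alpha) {
  z_diff <- qnorm(1 - alpha / 2)  # critical value for the difference
  z_ci   <- z_diff * sqrt(2) / 2  # critical value implied for each interval
  2 * pnorm(z_ci) - 1             # implied confidence level
}
overlap_level(0.05)  # about 0.834: roughly the 84% intervals in Figure 2
overlap_level(0.10)  # about 0.755: roughly the 76% intervals in Figure 1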


PS: Political Science & Politics published Utych 2020 "Powerless Conservatives or Powerless Findings?", which responded to arguments in my 2019 "Left Unchecked" PS symposium entry. From Utych 2020:

Zigerell (2019) presented arguments that research supporting a conservative ideology is less likely to be published than research supporting a liberal ideology, focusing on the most serious accusations of ideological bias and research malfeasance. This article considers another less sinister explanation—that research about issues such as anti-man bias may not be published because it is difficult to show conclusive evidence that it exists or has an effect on the political world.

I wasn't aware of the Utych 2020 PS article until I saw a tweet that it was published, but the PS editors kindly permitted me to publish a reply, which discussed evidence that anti-man bias exists and has an effect on the political world.

---

One of the pieces of evidence for anti-man bias mentioned in my PS reply was the Schwarz and Coppock meta-analysis of candidate choice experiments involving male candidates and female candidates. This meta-analysis was accepted at the Journal of Politics, and Steve Utych indicated on Twitter that it was a "great article" and that he was a reviewer of the article. The meta-analysis detected a bias favoring female candidates over male candidates, so I asked Steve Utych whether it is reasonable to characterize the results from the meta-analysis as reasonably good evidence that anti-man bias exists and has an effect in the political realm.

I thought that the exchange that I had with Steve Utych was worth saving (archived: https://archive.is/xFQvh). According to Steve Utych, this great meta-analysis of candidate choice experiments "doesn't present information about discrimination or biases". In the thread, Steve Utych wouldn't describe what he would accept as evidence of anti-man bias in the political realm, but he was willing to equate anti-man bias with alien abduction.

---

Suzanne Schwarz, who is the lead author of the Schwarz and Coppock meta-analysis, issued a series of tweets (archived: https://archive.is/pFSJ0). The thread was locked before I could respond, so I thought that I would blog about my comments on her points, which she labeled "first" through "third".

Her first point, about majority preference, doesn't seem to be relevant to whether anti-man bias exists and has an effect in the political realm.

For her second point, that voting in candidate choice experiments might differ from voting in real elections, I think that it's within reason to dismiss results from survey experiments, and I think that it's within reason to interpret results from survey experiments as offering evidence about the real world. But I think that each person should hold no more than one of those positions at a given time.

So if Suzanne Schwarz doesn't think that the meta-analysis provides evidence about voter behavior in real elections, there might still be time for her and her co-author to remove language from their JOP article that suggests that results from the meta-analysis provide evidence about voter behavior in real elections, such as:

Overall, our findings offer evidence against demand-side explanations of the gender gap in politics. Rather than discriminating against women who run for office, voters on average appear to reward women.

And instead of starting the article with "Do voters discriminate against women running for office?", maybe the article could instead start by quoting language from Suzanne Schwarz's tweets. Something such as:

Do "voters support women more in experiments that simulate hypothetical elections with hypothetical candidates"? And should anyone care, given that this "does not necessarily mean that those voters would support female politicians in real elections that involve real candidates and real stakes"?

I think that Suzanne Schwarz's third point is that a person's preference for A relative to B cannot be interpreted as an "anti" bias against B, without information about that person's attitudinal bias, stereotypes, or animus regarding B.

Suzanne Schwarz claimed that we would not interpret a preference for orange packaging over green packaging as evidence of an "anti-green" bias, but let's use a hypothetical involving people: an employer who always hires White applicants over equally qualified Black applicants. I think that it would be at least as reasonable to describe that employer as having an anti-Black bias as it would be to apply the Schwarz and Coppock language quoted above and describe that employer as "appear[ing] to reward" White applicants.

---

The Schwarz and Coppock meta-analysis of 67 survey experiments seems like it took a lot of work, was published in one of the top political science journals, and, according to its abstract, was based on an experimental methodology that "[has] become a standard part of the political science toolkit for understanding the effects of candidate characteristics on vote choice", with results that add to the evidence that "voter preferences are not a major factor explaining the persistently low rates of women in elected office".

So it's interesting to see the "doesn't present information about discrimination or biases" and "does not necessarily mean that those voters would support female politicians in real elections that involve real candidates and real stakes" reactions on Twitter archived above, respectively from a peer reviewer who described the work as "great" and from one of the co-authors.

---

NOTES

1. Zach Goldberg and I have a manuscript presenting evidence that anti-man bias exists and has a political effect, based on participant feeling thermometer ratings about men and about women in data from the 2019 wave of the Democracy Fund Voter Study Group VOTER survey. Zach tweeted about a prior version of the manuscript. The idea for the manuscript goes back at least to a Twitter exchange from March 2020 (Zach, me).

Steve Utych reported on the 2019 wave of this VOTER survey in his 2021 Electoral Studies article about sexism against women, but neither his 2021 Electoral Studies article nor his PS article questioning the idea of anti-man bias reported results from the feeling thermometer ratings about men and about women.


This plot reports disaggregated results from the American National Election Studies 2020 Time Series Study pre-election survey item:

On another topic: How much do you feel it is justified for people to use violence to pursue their political goals in this country?

Not shown is that 83% of White Democrats and 92% of White Republicans selected "Not at all" for this item.

Regression output controlling for party identification, gender, and race is in the Stata output file, along with uncertainty estimates for the plot percentages.

---

NOTES

1. Data source: American National Election Studies. 2021. ANES 2020 Time Series Study Preliminary Release: Pre-Election Data [dataset and documentation]. February 11, 2021 version. www.electionstudies.org.

2. Stata code for the analysis and R code for the plot. Dataset for the R plot.


The Open Science Framework has a preregistration for the Election Research Preacceptance Competition posted in March 2017 for contributors Erin Cassese and Tiffany D. Barnes, for a planned analysis of data from the 2016 American National Election Studies Time Series Study. The preregistration was titled "Unpacking White Women's Political Loyalties".

The Cassese and Barnes 2019 Political Behavior article "Reconciling Sexism and Women's Support for Republican Candidates: A Look at Gender, Class, and Whiteness in the 2012 and 2016 Presidential Races" reported results from analyses of data for the 2016 American National Election Studies Time Series Study that addressed content similar to that of the aforementioned preregistration: 2016 presidential vote choice, responses on a scale measuring sexism, a comparison of how vote choice associates with sexism among men and among women, perceived discrimination against women, and a comparison of 2016 patterns to 2012 patterns.

But, from what I can tell, the analysis in the Political Behavior article did not follow the preregistered plan, and the article did not even reference the preregistration.

---

Moreover, some of the hypotheses in the preregistration appear to differ from hypotheses in the article. For example, the preregistration did not expect vote choice to associate with sexism differently in 2012 compared to 2016, but the article did. From the preregistration (emphasis added):

H5: When comparing the effects of education and modern sexism on abstentions, candidate evaluations, and vote choice in the 2012 and 2016 ANES data, we expect comparable patterns and effect sizes to emerge. (This is a non-directional hypothesis; essentially, we expect to see no difference and conclude that these relationships are relatively stable across election years. The alternative is that the direction and significance of the estimated effects in these models varies across the two election years.)

From the article (emphasis added):

To evaluate our expectations, we compare analysis of the 2012 and 2016 ANES surveys, with the expectation that hostile sexism and perceptions of discrimination had a larger impact on voters in 2016 due to the salience of sexism in the campaign (Hypothesis 5).

I don't think that the distinction between modern sexism and hostile sexism in the above passages matters: for example, the preregistration placed the "women complain" item in a modern sexism measure, but the article placed the "women complain" item in a hostile sexism measure.

---

Another instance, from the preregistration (emphasis added):

H3: The effect of modern sexism differs for white men and women. (This is a non-directional hypothesis. For women, modern sexism is an ingroup orientation, pertaining to women’s own group or self-interest, while for men it is an outgroup orientation. For this reason, the connection between modern sexism, candidate evaluations, and vote choice may vary, but we do not have strong a priori assumptions about the direction of the difference.)

From the article (emphasis added):

Drawing on the whiteness and system justification literatures, we expect these beliefs about gender will influence vote choice in a similar fashion for both white men and women (Hypothesis 4).

---

I think that readers of the Political Behavior article should be informed of the preregistration because preregistration, as I understand it, is intended to remove flexibility in research design, and preregistration won't be effective at removing that flexibility if researchers retain the flexibility to not inform readers of the preregistration. I can imagine a circumstance in which the analyses reported in a publication do not need to follow the associated preregistration, but I can't think of a good justification for Cassese and Barnes 2019 readers not being informed of the Cassese and Barnes 2017 preregistration.

---

NOTES

1. Cassese and Barnes 2019 indicated that (footnote omitted and emphasis added):

To evaluate our hypothesis that disadvantaged white women will be most likely to endorse hostile sexist beliefs and more reluctant to attribute gender-based inequality to discrimination, we rely on the hostile sexism scale (Glick and Fiske 1996). The ANES included two questions from this scale: (1) Do women demanding equality seek special favors? and (2) Do women complaining about discrimination cause more problems than they solve? Items were combined to form a mean-centered scale. We also rely on a single survey item asking respondents how much discrimination women face in the United States. Responses were given on a 5-point Likert scale ranging from none to a great deal. This item taps modern sexism (Cassese et al. 2015). Whereas both surveys contain other items gauging gender attitudes (e.g., the 2016 survey contains a long battery of hostile sexism items), the items we use here are the only ones found in both surveys and thus facilitate direct comparisons, with accurate significance tests, between 2012 and 2016.

However, from what I can tell, the ANES 2012 Time Series Codebook and the ANES 2016 Time Series Codebook both contain a modern sexism item about media attention (modsex_media in 2012, MODSEXM_MEDIAATT in 2016) and a gender attitudes item about bonding (women_bond in 2012, and WOMEN_WKMOTH in 2016). The media attention item is listed in the Cassese and Barnes 2017 preregistration as part of the modern sexism dependent variable / mediating variable, and the preregistration indicates that:

We have already completed the analysis of the 2012 ANES data and found support for hypotheses H1-H4a in the pre-Trump era. The analysis plan presented here is informed by that analysis.

2. Some of the content from this post is from a "Six Things Peer Reviewers Can Do To Improve Political Science" manuscript. In June 2018, I emailed the Cassese and Barnes 2019 corresponding author a draft of the manuscript, which redacted criticism of the work of other authors that I had not yet informed of my criticism of their work. For another example from the "Six Things" manuscript, click here.


The ANES (American National Election Studies) has released the pre- and post-election questionnaires for its 2020 Time Series Study. I thought that it would be useful or at least interesting to review the survey for political bias. I think that the survey is remarkably well done on net, but I do think that ANES 2020 contains unnecessary political bias.

---

1

ANES 2020 has two gender resentment items on the pre-election survey and two modern sexism items on the post-election survey. These four items are phrased to measure negative attitudes about women, but ANES 2020 has no parallels to these four items regarding negative attitudes about men.

Even if researchers cared about only sexism against women, parallel measures of attitudes about men would still be necessary. Evidence indicates and theory suggests that participants sexist against men would cluster at the low end of a measure of sexism against women, so that the effect of sexism against women can't properly be estimated as the change from the low end to the high end of these measures.
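Here is a toy R simulation of that concern, with arbitrary effect sizes that I chose for illustration and nothing drawn from ANES data: respondents biased against men sit at the bottom of the anti-woman measure, so the low-to-high contrast on that measure mixes the effect of anti-woman bias with the offsetting effect of anti-man bias:

set.seed(1)
n <- 10000
latent <- runif(n, -1, 1)            # -1 = strong anti-man bias, +1 = strong anti-woman bias
anti_woman_scale <- pmax(latent, 0)  # survey items record only the anti-woman side

# Assume support for a woman candidate falls with anti-woman bias and rises
# with anti-man bias (arbitrary sizes, for illustration only)
support <- 0.5 - 0.3 * pmax(latent, 0) + 0.3 * pmax(-latent, 0) + rnorm(n, 0, 0.05)

# The assumed effect of moving from no anti-woman bias to maximal anti-woman
# bias is -0.3, but the low-to-high contrast on the measured scale is larger
# in magnitude, because the "low" group also contains respondents whose
# anti-man bias raises their support
mean(support[anti_woman_scale > 0.8]) - mean(support[anti_woman_scale < 0.2])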

This lack of parallel items about men will plausibly produce a political bias in research that uses these four items as measures of sexism, because, while a higher percentage of Republicans than of Democrats is biased against women, a higher percentage of Democrats than of Republicans is biased against men (evidence about partisanship is in in-progress research, but check here for patterns in the 2016 presidential vote).

ANES 2020 has a feeling thermometer for several racial groups, so hopefully future ANES surveys include feeling thermometers about men and women.

---

2

Another type of political bias involves response options constructed so that an item can detect only errors that are more common on the political right. Consider this post-election item labeled "misinfo":

1. Russia tried to interfere in the 2016 presidential election

2. Russia did not try to interfere in the 2016 presidential election

So the large percentage of Hillary Clinton voters who reported the belief that Russia tampered with vote tallies to help Donald Trump don't get coded as misinformed on this misinformation item about Russian interference. The only error that the item can detect is underestimating Russian interference.

Another "misinfo" example:

Which of these two statements do you think is most likely to be true?

1. World temperatures have risen on average over the last 100 years.

2. World temperatures have not risen on average over the last 100 years.

The item permits climate change "deniers" to be coded as misinformed, but does not permit coding as misinformed "alarmists" who drastically overestimate how much the climate has changed over the past 100 years.

Yet another "misinfo" example:

1. There is clear scientific evidence that the anti-malarial drug hydroxychloroquine is a safe and effective treatment for COVID-19.

2. There is not clear scientific evidence that the anti-malarial drug hydroxychloroquine is a safe and effective treatment for COVID-19.

In April 2020, the FDA indicated that "Hydroxychloroquine and chloroquine...have not been shown to be safe and effective for treating or preventing COVID-19", so the "deniers" who think that there is zero evidence available to support HCQ as a covid-19 treatment will presumably not be coded as "misinformed".

One more example (not labeled "misinfo"), from the pre-election survey:

During the past few months, would you say that most of the actions taken by protestors to get the things they want have been violent, or have most of these actions by protesters been peaceful, or have these actions been equally violent and peaceful?

[If the response is "mostly violent" or "mostly peaceful":]

Have the actions of protestors been a lot more or only a little more [violent/peaceful]?

I think that this item might refer to the well-publicized finding that "about 93% of racial justice protests in the US have been peaceful", so that the correct response combination is "mostly peaceful"/"a lot more peaceful" and, thus, the only error that the item permits is overestimating how violent the protests were.

For the above items, I think that the response options disfavor the political right, because I expect that a higher percentage of persons on the political right than the political left will deny Russian interference in the 2016 presidential election, deny climate change, overestimate the evidence for HCQ as a covid-19 treatment, and overestimate how violent recent pre-election protests were.

But I also think that persons on the political left will be more likely than persons on the political right to make the types of errors that the items do not permit to be measured, such as overestimating climate change over the past 100 years.

Other items marked "misinfo" involved vaccines causing autism, covid-19 being developed intentionally in a lab, and whether the Obama administration or the Trump administration deported more unauthorized immigrants during its first three years.

I didn't see an ANES 2020 item about whether the Obama administration or the Trump administration built the temporary holding enclosures ("cages") for migrant children, which I think would be similar to the deportations item, in that people not paying close attention to the news might get the item incorrect.

Maybe a convincing case could be made that ANES 2020 contains as many items whose limited response options disfavor the political left as items whose limited response options disfavor the political right, but I don't think that it matters whether political bias in individual items cancels out, because any political bias in individual items is worth eliminating, if possible.

---

3

ANES 2020 has an item that I think alludes to President Trump's phone call with the Ukrainian president. Here is a key passage from the transcript of the call:

The other thing, There's a lot of talk about Biden's son, that Biden stopped the prosecution and a lot of people want to find out about that so whatever you can do with the Attorney General would be great. Biden went around bragging that he stopped the prosecution so if you can look into it...It sounds horrible to me.

Here is an ANES 2020 item:

As far as you know, did President Trump ask the Ukrainian president to investigate President Trump's political rivals, did he not ask for an investigation, or are you not sure?

I'm presuming that the intent of the item is that a correct response is that Trump did ask for such an investigation. But, if this item refers to only Trump asking the Ukrainian president to look into a specific thing that Joe Biden did, it's inaccurate to phrase the item as if Trump asked the Ukrainian president to investigate Trump's political rivals *in general*, which is what the plural "rivals" indicates.

---

4

I think that the best available evidence indicates that immigrants do not increase the crime rate in the United States (pre-2020 citation) and that illegal immigration reduces the crime rate in the United States (pre-2020 citation). Here is an "agree strongly" to "disagree strongly" item from ANES 2020:

Immigrants increase crime rates in the United States.

Another ANES 2020 item:

Does illegal immigration increase, decrease, or have no effect on the crime rate in the U.S.?

I think that the correct responses to these items are the responses that a stereotypical liberal would be more likely to *want* to be true, compared to a stereotypical Trump supporter.

But I don't think that the U.S. violent crime statistics by race reflect the patterns that a stereotypical liberal would be more likely to want to be true, compared to a stereotypical Trump supporter.

Perhaps coincidentally, instead of an item about racial differences in violent crime rates for which responses could be correctly described as consistent or inconsistent with available mainstream research, ANES 2020 has stereotype items about how "violent" different racial groups are in general, which I think survey researchers will be much less likely to perceive to be addressed in mainstream research and will instead use to measure racism.

---

The above examples of what I think are political biases are relatively minor in comparison to the value that ANES 2020 looks like it will provide. For what it's worth, I think that the ANES is preferable to the CCES Common Content.


Participants in studies reported on in Regina Bateson's 2020 Perspectives on Politics article "Strategic Discrimination" were asked to indicate the percentage of other Americans that the participant thought would not vote for a woman for president and the percentage of other Americans that the participant thought would not vote for a black person for president.

Bateson 2020 Figure 1 reports that, in the nationally representative Study 1 sample, mean participant estimates were that 47% of other Americans would not vote for a woman for president and that 42% of other Americans would not vote for a black person for president. I was interested in the distribution of responses, so I used the Bateson 2020 data for Study 1 to plot participant estimates for these items in the histograms below.

This first set of histograms is for all participants:

This second set of histograms is for only participants who passed the attention check:

---

I was also interested in estimates from participants with a graduate degree, given that so many people in political science have a graduate degree. Bateson 2020 Appendix Table 1.33 indicates that, among participants with a graduate degree, estimates were that 58.3% of other Americans would not vote for a woman for president and that 56.6% of other Americans would not vote for a black person for president.

But these estimates differ depending on whether the participant correctly responded to the attention check item: for the item about the percentage of other Americans who would not vote for a woman for president, the mean estimate was 47% [42, 52] for the 84 graduate degree participants who correctly responded to the attention check and was 68% [63, 73] for the 97 graduate degree participants who did not correctly respond to the attention check; for the item about the percentage of other Americans who would not vote for a black person for president, respective estimates were 44% [39, 49] and 67% [62, 73].

Participants who reported having a graduate degree were 20 percentage points more likely to fail the attention check than participants who did not report having a graduate degree, p<0.001.

---

These data were collected in May 2019, after Barack Obama had been elected president twice and after Hillary Clinton won the popular vote for president, and each aforementioned mean estimate seems to be a substantial overestimate of discrimination against women presidential candidates and Black presidential candidates, compared to point estimates from relevant list experiments reported in Carmines and Schmidt 2020 and compared to point estimates from list experiments and direct questions cited in Bateson 2020 footnote 8.

---

NOTES

1. Stata code for my analysis.

2. R code for the first histogram.
