Politics & Gender published Deckman and Cassese 2021 "Gendered nationalism and the 2016 US presidential election", which, in 2022, shared an award for the best article published in Politics & Gender the prior year.

---

1.

So what is gendered nationalism? From Deckman and Cassese 2021 (p. 281):

Rather than focus on voters' sense of their own masculinity and femininity, we consider whether voters characterized American society as masculine or feminine and whether this macro-level gendering, or gendered nationalism as we call it, had political implications in the 2016 presidential election.

So how is this characterization of American society as masculine or feminine measured? The Deckman and Cassese 2021 online appendix indicates that gendered nationalism is...

Measured with a single survey item asking whether "Society as a whole has become too soft and feminine." Responses were provided on a four-point Likert scale ranging from strongly disagree to strongly agree.

So the measure of "whether voters characterized American society as masculine or feminine" (p. 281) ranged from the characterization that American society is (too) feminine to the characterization that American society is...not (too) feminine. The "(too)" is because I suspect that respondents might interpret the "too" in "too soft and feminine" as also applying to "feminine", but I'm not sure it matters much.

Regardless, there are at least three potentially relevant characterizations: American society is feminine, American society is masculine, or American society is neither feminine nor masculine. It seems like a poor research design to collapse two of these characterizations (masculine and neither) into a single end of the response scale.

---

2.

Deckman and Cassese 2021 also described gendered nationalism as (p. 278):

Our project diverges from this work by focusing on beliefs about the gendered nature of American society as a whole—a sense of whether society is 'appropriately' masculine or has grown too soft and feminine.

But disagreement with the characterization that "Society as a whole has become too soft and feminine" doesn't necessarily indicate a characterization that society is "appropriately" masculine, because a respondent could believe that society is too masculine or that society is neither feminine nor masculine.

Omission of a response option indicating a belief that American society is (too) masculine might have made it easier for Deckman and Cassese 2021 to claim that "we suppose that those who rejected gendered nationalism were likely more inclined to vote for Hillary Clinton" (p. 282), as if only the measured "too soft and feminine" characterization is acceptance of "gendered nationalism" and not the unmeasured characterization that American society is (too) masculine.

---

3.

Regression results in Table 2 of Deckman and Cassese 2021 indicate that gendered nationalism predicts a vote for Trump over Clinton in 2016, net of controls for political party, a single measure of political ideology, and demographics such as class, race, and education.

Gendered nationalism is the only specific belief in the regression, and Deckman and Cassese 2021 reports no evidence about whether "beliefs about the gendered nature of American society as a whole" have any explanatory power beyond other beliefs about gender, such as beliefs about gender roles and animus toward particular genders.

---

4.

Deckman and Cassese 2021 reported on four categories of class: lower class, working class, middle class, and upper class. Deckman and Cassese 2021 hypothesis H2 is that:

Gendered nationalism is more common among working-class men and women than among men and women with other socioeconomic class identifications.

For situations like this, in which the hypothesis is that one of four categories is distinctive, the most straightforward approach is to omit the hypothesized distinctive category from the regressions: the p-value and coefficient for each of the three included categories then provide information about the evidence that that included category differs from the omitted category.
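For example, here is a minimal sketch in R of that approach, using MASS::polr for the ordered logit and a hypothetical data frame d with variable names that I made up (the actual replication variable names may differ):

library(MASS)
# make the hypothesized distinctive category the omitted/reference category
d$class_id <- relevel(factor(d$class_id), ref = "working class")
d$gendered_nationalism <- ordered(d$gendered_nationalism)
fit <- polr(gendered_nationalism ~ class_id + party + ideology, data = d, Hess = TRUE)
summary(fit)  # each class_id coefficient now compares that class identification to working class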

But the regressions in Deckman and Cassese 2021 omitted middle class, and, based on the middle model in Table 1, Deckman and Cassese 2021 concluded that:

Working-class Democrats were significantly more likely to agree that the United States has grown too soft and feminine, consistent with H2.

But the coefficients and standard errors were 0.57 and 0.26 for working class and 0.31 and 0.40 for lower class, so I'm not sure that the analysis in Table 1 contained enough evidence that the 0.57 estimate for working class differs from the 0.31 estimate for lower class.
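As a rough check that ignores the covariance between the two estimates (which Table 1 doesn't report but which a proper test would need), the z statistic for the difference between the two coefficients would be something like:

(0.57 - 0.31) / sqrt(0.26^2 + 0.40^2)  # about 0.55, nowhere near conventional thresholds for statistical significance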

---

5.

I think that Deckman and Cassese 2021 might have also misdescribed the class results in the Conclusions section, in the passage below, which doesn't seem limited to Democrat participants. From p. 295:

In particular, the finding that working-class voters held distinctive views on gendered nationalism is compelling given that many accounts of voting behavior in 2016 emphasized support for Donald Trump among the (white) working class.

For that "distinctive" claim, Deckman and Cassese 2021 seemed to reference differences in statistical significance (p. 289, footnote omitted):

The upper- and lower-class respondents did not differ from middle-class respondents in their endorsement of gendered nationalism beliefs. However, people who identified as working class were significantly more likely to agree that the United States has grown too soft and feminine, though the effect was marginally significant (p = .09) in a two-tailed test. This finding supports the idea that working-class voters hold a distinctive set of beliefs about gender and responded to the gender dynamics in the campaign with heightened support for Donald Trump’s candidacy, consistent with H2.

In the Table 1 baseline model predicting gendered nationalism without interactions, the ologit coefficients are 0.25 for working class and 0.26 for lower class, so I'm not sure that there is sufficient evidence that working-class views on gendered nationalism were distinctive from lower-class views, even though the evidence that the 0.25 working-class coefficient differs from zero is stronger than the evidence that the 0.26 lower-class coefficient differs from zero.

It looks like the survey's pre-election wave had at least twice as many working-class respondents as lower-class respondents. If that ratio was similar for the post-election wave, that would explain the difference in statistical significance and why the standard error was smaller for the working class (0.15) than for the lower class (0.23). To check, search for "class" at the PRRI site and use the PRRI/The Atlantic 2016 White Working Class Survey.
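A back-of-the-envelope check, using only the fact that a standard error shrinks roughly with the square root of the group sample size: with about twice as many working-class respondents as lower-class respondents, the working-class standard error would be expected to be smaller by a factor of about 1/sqrt(2).

0.23 / sqrt(2)  # about 0.16, close to the reported 0.15 standard error for working class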

---

6.

At least Deckman and Cassese 2021 correctly interpreted the positive coefficient on the interaction of college and Republican, as an estimate of how the association between college and the outcome among Republicans differed from the corresponding association in the omitted category.

But I'm not sure of the justification for "largely" in Deckman and Cassese 2021 (p. 293):

Thus, in accordance with our mediation hypothesis (H5), gender differences in beliefs that the United States has grown too soft and feminine largely account for the gender gap in support for Donald Trump in 2016.

Including the predictor for gendered nationalism roughly halves the logit coefficient for "female", from 0.80 to 0.42, and, in Figure 3, the gender gap in predicted probability of a Trump vote is also roughly cut in half. I wouldn't call about half "largely", especially without addressing the obvious confound of attitudes about men and women that have nothing to do with "gendered nationalism".
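The arithmetic behind "about half", taking the reported logit coefficients at face value:

(0.80 - 0.42) / 0.80  # 0.475: roughly half of the "female" coefficient, which seems short of "largely"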

---

7.

Deckman and Cassese 2021 was selected for a best article award by the editorial board of Politics & Gender. From my prior posts on publications in Politics & Gender: p < .000, misinterpreted interaction terms, and an example of a difference in statistical significance being used to infer a difference in effect.

---

NOTES

1. Prior post mentioning Deckman and Cassese 2021.

2. Prior post on deviations from a preregistration plan, for Cassese and Barnes 2017.

3. "Gendered nationalism" is an example of use of a general term when a better approach would be specificity, such as a measure that separates "masculine nationalism" from "feminine nationalism". Another example is racial resentment, in which a general term is used to describe only the type of racial resentment directed at Blacks. Feel free to read through participant comments in the Kam and Burge survey, in which plenty of comments from respondents who score low on the racial resentment scale indicate resentment directed at Whites.


The Journal of Social and Political Psychology recently published Young et al 2022 "'I feel it in my gut:' Epistemic motivations, political beliefs, and misperceptions of COVID-19 and the 2020 U.S. presidential election", which reported in its abstract that:

Results from a US national survey from Nov-Dec 2020 illustrate that Republicans, conservatives, and those favorable towards President Trump held greater misperceptions about COVID and the 2020 election.

Young et al 2022 exhibits two shortcomings found in too much social science: bias and error.

---

1.

In Young et al 2022, the selection of items measuring misperceptions is biased toward things that the political right is more likely than the political left to indicate a misperception about, so that the most that we can conclude from Young et al 2022 is that the political right more often reported misperceptions about things that the political right is more likely to report misperceptions about.

Young et al 2022 seems to acknowledge this research design flaw in the paragraph starting with:

Given the political valence of both COVID and election misinformation, these relationships might not apply to belief in liberal-serving misinformation.

But it's not clear to me why some misinformation about covid can't be liberal-serving. At least, there are misperceptions about covid that are presumably more common among the political left than among the political right.

For example, the eight-item Young et al 2022 covid misperceptions battery contains two items that permit respondents to underestimate the seriousness of covid-19: "Coronavirus (COVID-19 is a hoax" [sic for the unmatched parenthesis], and "The flu is more lethal than coronavirus (COVID-19)". But the battery doesn't contain corresponding items that permit respondents to overestimate the seriousness of covid-19.

Presumably, a higher percentage of the political left than the political right overestimated the seriousness of covid-19 at the time of the survey in late 2020, given that, in a different publication, a somewhat different Young et al team indicated that:

Results from a national survey of U.S. adults from Nov-Dec 2020 suggest that Trump favorability was...negatively associated with self-reported mask-wearing.

Another misperception measured in the survey is that "Asian American people are more likely to carry the virus than other people", which was not a true statement at the time. But, from what I can tell, at the time of the survey, covid rates in the United States were higher among Hispanics than among Whites, which presumably means that Hispanic Americans were more likely to carry the virus than White Americans. It's not clear to me why misinformation about the covid rate among Asians should be prioritized over misinformation about the covid rate among Hispanics, although, if someone wanted to bias the research design against the political right, that priority would make sense.

---

There is a similar flaw with the Young et al 2022 battery of misperceptions about the 2020 election, which had an item that permits overestimation of detected voter fraud ("There was widespread voter fraud in the 2020 Presidential election") but no item that permits underestimation of voter fraud in 2020 (e.g., "There was no voter fraud in the 2020 Presidential election"), which is the type of error that the political left would presumably be more likely to make.

For another example, Young et al 2022 had a reverse-coded misperceptions item for "We can never be sure that Biden's win was legitimate", but had no item about whether we can be sure that Trump's 2016 win was legitimate, which would be an obvious item to pair with the Biden item to assess whether the political right and the political left are equally misinformed or at least equally likely to give insincere responses to surveys that have items such as "The coronavirus (COVID-19) vaccine will be used to implant people with microchips".

---

So I think it's less, as Young et al 2022 suggested, that "COVID misinformation and election misinformation both served Republican political goals", and more that the selection of misinformation items in Young et al 2022 was biased toward a liberal-serving conclusion.

Of course, it's entirely possible that the political right is more misinformed than the political left in general or on selected topics. But it's not clear to me how Young et al 2022 can provide a valid inference about that.

---

2.

For error, Young et al 2022 Table 3 has an unstandardized coefficient for Black race, indicating that, in the age 50 and older group, being Black corresponded to higher levels of Republicanism. I'm guessing that this coefficient is missing a negative sign, given that there is a negative sign on the standardized coefficient... The Table 2 income predictor for the age 18-49 group has an unstandardized coefficient of .04 and a standard error of .01, but no statistical significance asterisk, and has a standardized coefficient of .00, which I think might be too low... And the appendix indicates that "The analysis yielded two factors with Eigenvalues < 1.", but I think that should be a greater-than symbol.

None of those potential errors are particularly important, except perhaps for inferences about phenomena such as the rigor of the peer and editorial review that Young et al 2022 went through.

---

NOTES

1. Footnotes 3 and 4 of Young et al 2022 indicate that:

Consistent with Vraga and Bode (2020), misperceptions were operationalized as COVID-related beliefs that contradicted the "best available evidence" and/or "expert consensus" at the time data were gathered.

If the purpose is to assess whether "I feel it in my gut" people are incorrect, then the perceptions should be shown to be incorrect and not merely in contradiction to expert consensus or, for that matter, in contradiction to the best available evidence.

2. The funding statement for Young et al 2022 indicates that the study was funded by the National Institute on Aging.

3. Prior posts on politically biased selection of misinformation items, in Abrajano and Lajevardi 2021 and in the American National Election Studies 2020 Time Series Study.

4. After I started drafting the above post, Social Science Quarterly published Benegal and Motta 2022 "Overconfident, resentful, and misinformed: How racial animus motivates confidence in false beliefs", which used the politically biased ANES misinformation items, in which, for example, respondents who agree that "World temperatures have not risen on average over the last 100 years" get coded as misinformed (an error presumably more common on the political right) but respondents who wildly overestimate the amount of climate change over the past 100 years don't get coded as misinformed (an error presumably more common on the political left).

5. I might be crazy, but I think that research about the correlates of misperceptions should identify respondents who have correct perceptions instead of merely identifying respondents who have particular misperceptions.

And I don't think that researchers should place particular misperceptions into the same category as the correct perception, such as by asking respondents merely whether world temperatures have risen on average over the last 100 years, any more than researchers should ask respondents merely whether world temperatures have risen on average by at least 3 degrees Celsius over the last 100 years, for which agreement would be the misperception.


I reached ten new publications to comment on that I didn't think were worth a separate blog post, so here goes:

---

1.

The Twitter account for the journal Politics, Groups, and Identities retweeted R.G. Cravens linking to two of his articles in Politics, Groups, and Identities. I blogged about one of these articles, discussing, among other things, the article's erroneous interpretation of interaction terms. The other article that R.G. Cravens linked to in that tweet ("The view from the top: Social acceptance and ideological conservatism among sexual minorities") also misinterpreted an interaction term:

However, the coefficient estimate for the interaction term between racial minority identity and racial identity group consciousness (β = −.312, p = .000), showing the effect of racial identity group consciousness only among racial minority respondents, indicates a negative relationship between racial minority group consciousness and conservatism at the 99% confidence level.

The corresponding Table 1 coefficient for RI Consciousness is 0.117, indicating the estimated effect of racial identity consciousness when the "Racial minority" variable is set to zero. The -0.312 interaction term indicates how much the estimated effect of racial identity consciousness *differs* between non-racial minorities and racial minorities, so that the estimated effect of racial identity consciousness among racial minorities is 0.117 plus -0.312, which is -0.195.
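A minimal sketch in R of that calculation, assuming a fitted model object fit and predictor names that I made up (the actual names are in the article's replication materials); the standard error of the subgroup effect needs the covariance of the two estimates, which is why the published table alone isn't enough:

b <- coef(fit)
b["ri_consciousness"] + b["racial_minority:ri_consciousness"]  # 0.117 + (-0.312) = -0.195 among racial minorities
V <- vcov(fit)
sqrt(V["ri_consciousness", "ri_consciousness"] + V["racial_minority:ri_consciousness", "racial_minority:ri_consciousness"] + 2 * V["ri_consciousness", "racial_minority:ri_consciousness"])  # standard error of that -0.195 estimate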

Two articles by one author in the same journal within three years, and each article misinterpreted an interaction term.

---

2.

PS: Political Science & Politics published another article about student evaluations of teaching: Foster 2022 "Instructor name preference and student evaluations of instruction". The key finding seems plausible, that "SEIs were higher for instructors who preferred going by their first name...than for instructors who preferred going by 'Dr. Moore'" (p. 4).

But there are a few shortcomings in the reporting on the experiment in Study 2, which manipulated the race of the instructor, the gender of the instructor, and the instructor's stated preference for using his/her first name versus his/her title and last name:

* Hypothesis 5 is about conservative Republicans:

Moderated mediation: We predict that female instructors who express a preference for going by "Dr. Moore" will have lower teacher ratings through decreased perceived immediacy, but only for students who identify as conservative and Republican.

But, as far as I can tell, the article doesn't report any data about Hypothesis 5.

* Table 2 indicates a positive p<0.05 correlation between the race of the instructor and SEIs (student evaluations of instruction) and a positive p<0.05 correlation between the race of the instructor and course evaluations. But, as far as I can tell, the article doesn't report how the race variable was coded, so it's not clear whether the White instructors or the Black instructors had the higher SEIs and course evaluations.

* The abstract indicates that:

Study 2 found the highest SEIs for Black male instructors when instructors asked students to call them by their first name, but there was a decrease in SEI scores if they went by their professional title.

But, as far as I can tell, the article doesn't report sufficient evidence about whether the estimated influence of the name preference among the Black male instructor targets differed from the estimated influence of the name preference among any of the comparison instructors. The p-value being under p=0.05 for the Black male instructor targets and not being under p=0.05 for the other instructor targets isn't enough evidence to infer at p<0.05 that participants treated the Black male instructor targets differently than participants treated the comparison instructor targets, so that the article doesn't report sufficient evidence to permit an inference of racial discrimination.
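A minimal sketch in R of the kind of test that could support such an inference, with made-up variable names for the Study 2 participant-level data:

# the interaction coefficient tests whether the first-name effect for the Black male instructor differs from the first-name effect for the comparison instructors
fit <- lm(sei ~ first_name_condition * black_male_instructor, data = d)
summary(fit)  # look at the interaction term, not at which subgroup's p-value happens to fall below .05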

---

---

5.

I wasn't the only person to notice this next one (see tweets from Tom Pepinsky and Brendan Nyhan), but Politics & Gender recently published Forman-Rabinovici and Mandel 2022 "The prevalence and implications of gender blindness in quantitative political science research", which indicated that:

Our findings show that gender-sensitive analysis yields more accurate and useful results. In two out of the three articles we tested, gender-sensitive analysis indeed led to different outcomes that changed the ramifications for theory building as a result.

But the inferential technique in the analysis reflected a common error.

For the first of the three aforementioned articles (Gandhi and Ong 2019), Table 1a of Forman-Rabinovici and Mandel 2022 reported results with a key coefficient that was -.308 across the sample, -.294 (p=.003) among men in the sample, and -.334 (p=.154) among women in the sample. These estimates are from a linear probability model predicting a dichotomous "Support PH" outcome, so the point estimates correspond to decreases of about 29 percentage points among men and about 33 percentage points among women.

The estimate was more extreme among women than among men, but the estimate was less precise among women than among men, at least partly because the sample size among men (N=1902) was about three times the sample size among women (N=652).

Figure 1 of Forman-Rabinovici and Mandel 2022 described these results as:

Male voters leave PH coalition

Female voters continue to vote for PH coalition

But, in my analysis of the data, the 95% confidence interval for the estimate among women ran from an 82 percentage point decrease to a 15 percentage point increase [-0.82, +0.15], which is not nearly enough evidence to infer a lack of an effect among women.
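For reference, that confidence interval can be roughly recovered from the reported coefficient and p-value alone (a normal approximation that ignores degrees of freedom):

se_women <- 0.334 / qnorm(1 - 0.154/2)  # implied standard error, about 0.23
-0.334 + c(-1.96, 1.96) * se_women  # roughly -0.79 to +0.13, consistent with the [-0.82, +0.15] from the data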

---

6.

Politics & Gender published another article that has at least a misleading interpretation of interaction terms: Kreutzer 2022 "Women's support shaken: A study of women's political trust after natural disasters".

Table 1 reports results for three multilevel mixed-effects linear regressions, with coefficients on a "Number of Disasters Present" predictor of 0.017, 0.009, and 0.022. The models have a predictor for "Female" and an interaction of "Female" and "Number of Disasters Present" with interaction coefficients of –0.001, –0.002, and –0.001. So the combination of coefficients indicates that the associations of "Number of Disasters Present" and the "trust" outcomes are positive among women, but not as positive as the associations are among men.

Kreutzer 2022 discusses this correctly in some places, such as indicating that the interaction term "allows a comparison of how disasters influence women's political trust compared with men's trust" (p. 15). But in other places the interpretation is, I think, incorrect or at least misleading, such as in the abstract (emphasis added):

I investigate women's trust in government institutions when natural disasters have recently occurred and argue that because of their unique experiences and typical government responses, women's political trust will decline when there is a natural disaster more than men's. I find that when there is a high number of disasters and when a larger percentage of the population is affected by disasters, women's political trust decreases significantly, especially institutional trust.

Or from page 23:

I have demonstrated that natural disasters create unique and vulnerable situations for women that cause their trust in government to decline.

And discussing Figure 5, referring to a different set of three regressions (reference to footnote 12 omitted):

The figure shows a small decline in women's trust (overall, institutional, organizational) as the percentage of the population affected by disasters in the country increases. The effect is significantly different from 0, but the percentage affected seems not to make a difference.

That seems to say that the percentage of the population affected has an effect that is simultaneously not zero and does not seem to make a difference. I think the Figure 5 marginal effects plots indicate that women have lower trust than men (which is why each point estimate line falls in the negative range), but that this gender difference in trust does not vary much by the percentage of the population affected (which is why each point estimate line is pretty much flat).

---

The "Women's Political Empowerment Index" coefficient and standard error are –0.017 and 0.108 in Model 4, so maybe the ** indicating a two-tailed p<0.01 is an error.

Tweet to the author (Oct 3). No reply yet.

---

7, 8.

Let's return to Politics, Groups, and Identities, for Ditonto 2019 "Direct and indirect effects of prejudice: sexism, information, and voting behavior in political campaigns". From the abstract:

I also find that subjects high in sexism search for less information about women candidates...

At least in the reported analyses, the comparison for "less" is to participants low in sexism instead of to male candidates. So we get this result discussing Table 2 (pp. 598-599):

Those who scored lowest in sexism are predicted to look at approximately 13 unique information boxes for the female candidate, while those who scored highest are predicted to access about 10 items, or almost 1/3 less.

It should be obvious to peer reviewers and editors that a comparison to the male candidates in the experiment would be more useful for assessing the effect of sexism because, for all we know, respondents high in sexism might search for less information than respondents low in sexism do, no matter the gender of the candidate.

Ditonto has another 2019 article in a different journal (Political Psychology) based on the same experiment: "The mediating role of information search in the relationship between prejudice and voting behavior". From that abstract:

I also find that subjects high in prejudice search for less information about minority candidates...

But, again, Table 2 in that article merely indicates that symbolic racism negatively associates with information search for a minority candidate, with no information provided about information search for a non-minority candidate.

---

And I think that the Ditonto 2019 abstracts include claims that aren't supported by results reported in the article. The PGI abstract claims that "I find that subjects with higher scores on items measuring modern sexism...rate female candidates more negatively than their male counterparts", and the PP abstract claims that "I find that subjects higher in symbolic racism...rate minority candidates more negatively than their white counterparts".

By the way, claims about respondents high in sexism or racism should be assessed using data only from respondents high in sexism or racism, because the association of a sexism or racism measure with an outcome might be completely due to respondents low in sexism or racism.

Tweet to the author (Oct 9). No reply yet.

---

9.

Below is a passage from "Lower test scores from wildfire smoke exposure", by Jeff Wen and Marshall Burke, published in 2022 in Nature Sustainability:

When we consider the cumulative losses over all study years and across subgroups (Fig. 4b), we estimate the net present value of lost future income to be roughly $544 million (95% CI: −$999 million to −$100 million) from smoke PM2.5 exposure in 2016 for districts with low economic disadvantage and low proportion of non-White students. For districts with high economic disadvantage and high proportion of non-White students, we estimate cumulative impacts to be $1.4 billion (95% CI: −$2.3 billion to −$477 million) from cumulative smoke PM2.5 exposure in 2016. Thus, of the roughly $1.7 billion in total costs during the smokiest year in our sample, 82% of the costs we estimate were borne by economically disadvantaged communities of colour.

So, in 2016, the lost future income was about $0.5 billion for low economic disadvantage / low non-White districts and $1.4 billion for high economic disadvantage / high non-White districts; that gets us to $1.9 billion, without even including the costs from low/high districts and high/low districts. But total costs were cited as roughly $1.7 billion.

From what I can tell from Figure 4b, the percentage of total costs attributed to economically disadvantaged communities of color (the high / high category) is 59%. It's not a large inferential difference from 82%, in that both estimates are a majority, but it's another example of an error that could have been caught by careful reading.

Tweet to the authors about this (Oct 17). No reply yet.

---

10.

Political Research Quarterly published "Opening the Attitudinal Black Box: Three Dimensions of Latin American Elites' Attitudes about Gender Equality", by Amy Alexander, Asbel Bohigues, and Jennifer M. Piscopo.

I was curious about the study's measurement of attitudes about gender equality, and, not unexpectedly, the measurement was not good, using items such as "In general, men make better political leaders than women", in which respondents can agree that men make better political leaders, can disagree that men make better political leaders, and can be neutral about the claim that men make better political leaders...but respondents cannot report the belief that, in general, women make better political leaders than men do.

I checked the data, in case almost no respondent disagreed with the statement that "In general, men make better political leaders than women", in which case presumably no respondent would think that women make better political leaders than men do. But disagreement with the statement was pretty high, with 69% strongly disagreeing, another 15% disagreeing, and another 11% selecting neither agree nor disagree.

I tweeted a question about this to some of the authors (Oct 21). No reply yet.


Social Science & Medicine published Skinner-Dorkenoo et al 2022 "Highlighting COVID-19 racial disparities can reduce support for safety precautions among White U.S. residents", with data for Study 1 fielded in September 2020. Stephens-Dougan had a similar Time-sharing Experiments for the Social Sciences study "Backlash effect? White Americans' response to the coronavirus pandemic", fielded starting in late April 2020 according to the TESS page for the study.

You can check tweets about Skinner-Dorkenoo et al 2022 and what some tweeters said about White people. But you can't tell from the Skinner-Dorkenoo et al 2022 publication or the Stephens-Dougan 2022 APSR article whether any detected effect is distinctive to White people.

Limiting samples to Whites doesn't seem to be a good idea if the purpose is to understand racial bias. But it might be naive to think that all social science research is designed to understand.

---

There might be circumstances in which it's justified to limit a study of racial bias to White participants, but I don't think such circumstances include:

* The Kirgios et al 2022 audit study that experimentally manipulated the race and gender of an email requester, but for which "Participants were 2,476 White male city councillors serving in cities across the United States". In late April, I tweeted a question to the first author of Kirgios et al 2022 about why the city councilor sample was limited to White men, but I haven't yet gotten a reply.

* Studies that collect sufficient data on non-White participants but do not report results from these data in the eventual publications (examples here and here).

* Proposals for federally funded experiments that request that the sample be limited to White participants, such as in the Stephens-Dougan 2020 proposal: "I want to test whether White Americans may be more resistant to efforts to curb the virus and more supportive of protests to reopen states when the crisis is framed as disproportionately harming African Americans".

---

One benefit of not limiting the subject pool by race is to limit unfair criticism of entire racial groups. For example, according to the analysis below from Bracic et al 2022, White nationalism among non-Whites was at least as influential as White nationalism among Whites in predicting support for a family separation policy net of controls:

So, to the extent that White nationalism is responsible for support for the family separation policy, that applies to White respondents and to non-White respondents.

Of course, Bracic et al. 2022 doesn't report how the association for White nationalism compares to the association for, say, Black nationalism or Hispanic nationalism or how the association for the gendered nationalist belief that "the nation has gotten too soft and feminine" compares to the association for the gendered nationalist belief that, say, "the nation is too rough and masculine".

---

And consider this suggestion from Rice et al 2022 to use racial resentment items to screen Whites for jury service:

At the practical level, our research raises important empirical and normative questions related to the use of racial resentment items during jury selection in criminal trials. If racial resentment affects jurors' votes and reasoning, should racial resentment items be used to screen white potential jurors?

Given evidence suggesting that Black juror bias is on average at least as large as White juror bias, I don't perceive a good justification to limit this suggestion to White potential jurors, although I think that the Rice et al decision to not report results for Black mock jurors makes it easier to limit this suggestion to White potential jurors.

---

NOTES

1. I caught two flaws in Skinner-Dorkenoo et al 2022, which I discussed on Twitter: [1] For the three empathy items, more than 700 respondents selected "somewhat agree", more than 500 selected "strongly agree", but no respondent selected "agree", suggesting that the data were miscoded. [2] The p-value under p=0.05 for the empathy inference appears to be because the analysis controlled for a post-treatment measure; see the second model referred to by the lead author in the Twitter thread. I didn't conduct a full check of the Skinner-Dorkenoo et al 2022 analysis. Stata code and output for my analyses of Skinner-Dorkenoo et al 2022, with data here. Note the end of the output, indicating that the post-treatment control was affected by the treatment.

2. I have a prior post about the Stephens-Dougan TESS survey experiment reported on in the APSR that had substantial deviations from the pre-analysis plan. On May 31, I contacted the APSR about that and the error discussed at the post. I received an update in September, but the Stephens-Dougan 2022 APSR article hasn't been corrected as of Oct 2.


PS: Political Science & Politics published Dietrich and Hayes 2022 "Race and Symbolic Politics in the US Congress" as part of a "Research on Race and Ethnicity in Legislative Studies" section with guest editors Tiffany D. Barnes and Christopher J. Clark.

---

1.

Dietrich and Hayes 2022 reported on an experiment in which a representative was randomized to be White or Black, the representative's speech was randomized to be about civil rights or renewable energy, and the representative's speech was randomized to include or not include symbolic references to the Civil Rights Movement. Dietrich and Hayes 2022 noted (p. 283) that:

When those same symbols were used outside of the domain of civil rights, however, white representatives received a significant punishment. That is, Black respondents were significantly more negative in their evaluations of white representatives who (mis-)used civil rights symbolism to advance renewable energy than in any other experimental condition.

The only numeric results that Dietrich and Hayes 2022 reported for this in the main text are in Figure 1, for an approval rating outcome. But the data file seems to have at least four potential outcomes: the symbolic_approval outcome (strongly disapprove to strongly approve), and the next three listed variables: symbolic_vote (extremely likely to extremely unlikely), symbolic_care (none to a lot), and symbolic_thermometer (0 to 100). The supplemental files have a figure that reports results for a dv_therm variable, but that figure doesn't report results for renewable energy separately by symbolic and non-symbolic.

---

2.

Another result reported in Dietrich and Hayes 2022 involved Civil Rights Movement symbolism in U.S. House of Representatives floor speeches that mentioned civil rights:

In addition to influencing African Americans' evaluation of representatives, our research shows that symbolic references to the civil rights struggle are linked to Black voter turnout. Using an analysis of validated voter turnout from the 2006–2018 Cooperative Election Study, our analyses suggest that increases in the number of symbolic speeches given by a member of Congress during a given session are associated with an increase in Black turnout in the subsequent congressional election. Our model predicts that increasing from the minimum of symbolic speeches in the previous Congress to the maximum in the current Congress is associated with a 65.67-percentage-point increase in Black voter turnout compared to the previous year.

This estimated 66 percentage point increase is at the congressional district level. Dietrich and Hayes 2022 calculated this estimate using a linear regression that predicted the change in Black turnout in a congressional district with a lagged symbolism percentage of 0 and a symbolism percentage of 1. From their code:

mod1<-lm(I(black_turnout-lag_black_turnout)~I(symbolic_percent-lag_symbolic_percent),data=cces)

print(round(predict(mod1,data.frame(symbolic_percent=1,lag_symbolic_percent=0))*100,2))

The estimated change in Black turnout was 85 percentage points when I modified the code to have a lagged symbolism percentage of 1 and a symbolism percentage of 0.
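That modification would be something like the line below, reusing the authors' mod1 object from the code above:

print(round(predict(mod1,data.frame(symbolic_percent=0,lag_symbolic_percent=1))*100,2))  # reversed values, which returned the roughly 85 percentage point change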

---

These estimated changes in Black turnout of 66 and 85 percentage points seemed implausible as causal estimates, and I'm not even sure that these are correct correlational estimates, based on the data in the "cces_turnout_results.csv" dataset in the hayes_dietrich_replication.zip file.

For one thing, the dataset lists symbolic_percent values for Alabama's fourth congressional district by row as 0.017857143, 0.047619048, 0.047619048, 0.013157895, 0.013157895, 0.004608295, 0.004608295, 0.00990099, 0.00990099, 1, 1, 1, and 1. For speeches that mentioned civil rights, that's a relatively large jump in the percentage of such speeches that used Civil Rights Movement symbolism, from several values under 5% all the way to 100%. And this large jump to 100% is not limited to this congressional district: the mean symbolic_percent values across the full dataset were 0.14 (109th Congress), 0.02 (110th), 0.02 (111th), 0.03 (112th), 0.09 (113th), 1 (114th), and 1 (115th).

Moreover, the repetition in symbolic_percent within a congressional district is consistent across the data that I checked. So, for the above district, 0.017857143 is for the 109th Congress; the first 0.047619048 is for one year of the 110th Congress, and the second 0.047619048 is for the other year of the 110th Congress; the two 0.013157895 values are for the two years of the 111th Congress; and so forth. From what I can tell, each dataset case is for a given district-year, but symbolic_percent is calculated only every two years, so a large percentage of the "I(symbolic_percent-lag_symbolic_percent)" predictors are zero because of a research design decision to calculate the percentage of symbolic speeches per Congress rather than per year; from what I can tell, these zeros might not be true zeros in which the percentage of symbolic speeches was the same in the given year and the lagged year.

---

For another thing, the "inline_calculations.R" file in the Dietrich and Hayes 2022 replication materials indicates that Black turnout values were based on CCES surveys and indicates that survey sample sizes might be very low for some congressional districts. The file describes a bootstrapping process that was used to produce the Black turnout values, which were then standardized to range from 0 to 1, but, from the description, I'm not sure how that standardization process works.

For instance, if, in one year the CCES has 2 Black participants for a certain congressional district and neither voted (0% turnout), and the next year is a presidential election year and the CCES had 3 Black participants in that district and all three voted (100% turnout), I'm not sure what the bootstrapping process would do to adjust that to get these congressional district Black turnout estimates to be closer to their true values, which presumably are between 0% and 100%. For what it's worth, of the 4,373 rows in the dataset, black_turnout is NA in 545 rows (12%), is 0 in 281 rows (6%), and is 1 in 1,764 rows (40%).

So I'm not sure how the described bootstrapping process adequately addresses the concern that the range of Black turnout values for a congressional district in the dataset is more extreme than the range of true Black turnout values for the congressional district. Maybe the standardization process addresses this in a way that I don't understand, so that 0 and 1 for black_turnout don't represent 0% turnout and 100% turnout, but, if that's the case, then I'm not sure how it would be justified for Dietrich and Hayes 2022 to interpret the aforementioned output of 65.67 as a 65.67 percentage-point increase.

---

NOTES

1. Dietrich and Hayes 2022 indicated that, in the survey experiment, participants were asked "to evaluate a representative on the basis of his or her floor speech", and Dietrich and Hayes 2022 indicated that the experimental manipulation for the representative's race involved "accompanying images of either a white or a Black representative". But the use of "his or her" makes me curious if the representative's gender was also experimentally manipulated.

2. Dietrich and Hayes 2022 Figure 1 reports [approval of the representative in the condition involving Civil Rights Movement symbolism in a speech about civil rights] in the same panel as [approval of the representative in the condition involving Civil Rights symbolism in a speech about renewable energy]. However, for assessing a penalty for use of Civil Rights Movement symbolism in the renewable energy speech, I think that it is more appropriate to compare [approval of the representative in the condition in which the renewable energy speech used Civil Rights Movement symbolism] to [approval of the representative in the condition in which the renewable energy speech did not use Civil Rights Movement symbolism].

If there is a penalty for using Civil Rights Movement symbolism in the speech about renewable energy, that penalty can be compared to the difference in approval between using and not using Civil Rights Movement symbolism in the speech about civil rights, to see whether the penalty in the renewable energy speech condition reflects a generalized penalty for the use of Civil Rights Movement symbolism.

3. On June 27, I emailed Dr. Dietrich and Dr. Hayes a draft of this blog post with an indication that "I thought that, as a courtesy, I would send the draft to you, if you would like to indicate anything in the draft that is unfair or incorrect". I have not yet received a reply, although it's possible that I used incorrect email addresses or my email went to a spam box.


I'll hopefully at some point write a summary that refers to a lot of my "comments" posts. But I have at least a few more to release before then, so here goes...

---

Politics, Groups, and Identities recently published Peay and McNair II 2022 "Concurrent pressures of mass protests: The dual influences of #BlackLivesMatter on state-level policing reform adoption". Peay and McNair II 2022 reported regressions that predicted a count of the number of police reform policies enacted by a state from August 2014 through 2020, using a key predictor of the number of Black Lives Matter protests in a state in the year after the killing of Michael Brown in August 2014.

An obvious concern is that the number of protests in a state is capturing the population size of the state. That's a concern because it's plausible that higher population states have legislatures that are more active than smaller population states, so that we would expect these high population states to tend to enact more policies per se, and not merely to enact more police reform policies. But the Peay and McNair II 2022 analysis does not control for the population size of the state.

I checked the correlation between [1] the number of Black Lives Matter protests in a state in the year after the killing of Michael Brown in August 2014 (data from Trump et al. 2018) and [2] the first list of the number of bills enacted by a state that I happened upon, which was the number of bills a state enacted from 2006 to 2009 relating to childhood obesity. The R-squared was 0.22 for a bivariate OLS regression using the state-level count of BLM protests to predict the state-level count of childhood obesity bills enacted. In comparison, Peay and McNair II 2022 Table 2 indicated that the R-squared was 0.19 in a bivariate OLS regression that used the state-level count of BLM protests to predict the state-level count of police reform policies enacted. So the concern about population size seems at least plausible.
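For what it's worth, the check I ran amounts to something like the sketch below, with made-up names for the state-level variables:

fit <- lm(obesity_bills ~ blm_protests, data = states)  # childhood obesity bills enacted 2006-2009 on BLM protests in the year after August 2014
summary(fit)$r.squared  # about 0.22 in my data, versus the 0.19 that Table 2 reports for police reform policies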

---

This is a separate concern, but Figure 6 of Peay and McNair II 2022 reports predicted probabilities, with an x-axis of the number of protests. My analysis indicated that the number of protests ranged from 0 to 87, with only three states having more than 40 protests: New York at 67, Missouri at 74, and California at 87. Yet the widest the 95% confidence interval gets in Figure 6 is about 1 percentage point, at 87, which is a pretty precise estimate given data for only 50 states and only one state past 74.

Maybe the tight 95% confidence interval is a function of the network analysis for Figure 6, if the analysis, say, treats each potential connection between California and the other 49 states as 49 independent observations. Table 2 of Peay and McNair II 2022 doesn't have a sample size for this analysis, but reports 50 as the number of observations for the other analyses in that table.

---

NOTES

1. Data for my analysis.

2. No reply yet from the authors on Twitter.


Homicide Studies recently published Schildkraut and Turanovic 2022 "A New Wave of Mass Shootings? Exploring the Potential Impact of COVID-19". From the abstract:

Results show that total, private, and public mass shootings increased following the declaration of COVID-19 as a national emergency in March of 2020.

I was curious how Schildkraut and Turanovic 2022 addressed the possible confound of the 25 May 2020 killing of George Floyd.

---

Below is my plot of data used in Schildkraut and Turanovic 2022, for total mass shootings:

My read of the plot is that, until after the killing of George Floyd, there is insufficient evidence that mass shootings were higher in 2020 than in 2019.

Table 1 of Schildkraut and Turanovic 2022 reports an interrupted time series analysis that does not address the killing of George Floyd, with a key estimate of 0.409 and a standard error of 0.072. Schildkraut and Turanovic 2022 reports a separate analysis about George Floyd...

However, since George Floyd's murder occurred after the onset of the COVID-19 declaration, we conducted ITSA using only the post-COVID time period (n = 53 weeks) and used the week of May 25, 2020 as the point of interruption in each time series. These results indicated that George Floyd's murder had no impact on changes in overall mass shootings (b = 0.354, 95% CI [−0.074, 0.781], p = .105) or private mass shootings (b = 0.125, 95% CI [−0.419, 0.669], p = .652), but that Floyd's murder was linked to increases in public mass shootings (b = 0.772, 95% CI [0.062, 1.483], p = .033).

...but Schildkraut and Turanovic 2022 does not report any attempt to assess whether there is sufficient evidence to attribute the increase in mass shootings to covid once the 0.354 estimate for Floyd is addressed. The lack of statistical significance for the 0.354 Floyd estimate can't be used to conclude "no impact", especially given that the analysis for the covid declaration had data for 52 weeks pre-declaration and 53 weeks post-declaration, but the analysis for Floyd had data for only 11 weeks pre-Floyd and 42 weeks post-Floyd.
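One way to address this would be a single segmented regression that includes interruption terms for both the covid declaration and the Floyd killing, something like the sketch below (variable names are mine: weekly is a hypothetical data frame of weekly shooting counts, with post_covid and post_floyd as 0/1 indicators and weeks_since_covid and weeks_since_floyd as counters that equal 0 before each interruption):

fit <- lm(shootings ~ week + post_covid + weeks_since_covid + post_floyd + weeks_since_floyd, data = weekly)
summary(fit)  # the post_covid terms now estimate the covid-associated change net of the Floyd-associated change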

Schildkraut and Turanovic 2022 also disaggregated mass shootings into public mass shootings and private mass shootings. Corresponding plots by me are below. It doesn't look like the red line for the covid declaration is the break point for the increase in 2020 relative to 2019.

Astral Codex Ten discussed methods for trying to disentangle the effect of covid from the effect of Floyd, such as using prior protests and other countries as reference points.

---

NOTES

1. In the Schildkraut and Turanovic 2022 data, some dates appeared in different weeks, such as 2019 Week 11 running from March 11 to March 17, but 2020 Week 11 running from March 9 to March 15.

2. The 13 March 2020 covid declaration occurred in the middle of Week 11, but the Floyd killing occurred at the start of Week 22, which ran from 25 May 2020 to 31 May 2020.

3. Data. R code for the "total" plot above.
