This Brian Schaffner post at Data for Progress indicates that, on 9 June 2020, during the protests over the death of George Floyd, only 57% of Whites and about 83% of Blacks agreed that "White people in the U.S. have certain advantages because of the color of their skin". It might be worth considering why not everyone agreed with that statement.

---

Let's check data from the Nationscape survey, focusing on the survey conducted 11 June 2020 (two days after the aforementioned Data for Progress survey) and the items that ask: "How much discrimination is there in the United States today against...", with response options of "A great deal", "A lot", "A moderate amount", "A little", and "None at all".

For rating discrimination against Blacks, 95% of Whites selected a level from "A great deal" through "A little", with missing responses counted in the remaining 5%. It could be that the difference between this 95% and the Data for Progress 57% arises because about 38% of Whites think that discrimination against Blacks advantages only persons who are neither White nor Black. But the 57% Data for Progress estimate was pretty close to the 59% of Whites in the Nationscape data who rated the discrimination against Blacks higher than they rated the discrimination against Whites (see the Stata code in note [3]).

The pattern is similar for Blacks: about 83% of Blacks in the Data for Progress data agreed that "White people in the U.S. have certain advantages because of the color of their skin", and 85% of Blacks in the Nationscape data rated the discrimination against Blacks higher than the discrimination against Whites. But, in the Nationscape data, 98% of Blacks selected a level from "A great deal" through "A little" for the amount of discrimination that Blacks face in the United States today.

---

So this seems to be suggestive evidence that many people who do not agree that "White people in the U.S. have certain advantages because of the color of their skin" might not be indicating a lack of "acknowledgement of racism", in Schaffner's terms, but might instead be signaling a belief closer to the idea that the discrimination against Blacks does not outweigh the discrimination against Whites, at least as measured on a five-point scale.

---

NOTES:

[1] The "certain advantages" item has appeared on the CCES; here is evidence that another CCES item does not well measure what the item presumably is supposed to measure.

[2] Data citation:

Chris Tausanovitch and Lynn Vavreck. 2020. Democracy Fund + UCLA Nationscape, October 10-17, 2019 (version 20200814). Retrieved from: https://www.voterstudygroup.org/downloads?key=e6ce64ec-a5d0-4a7b-a916-370dc017e713.

Note: "the original collectors of the data, UCLA, LUCID, and Democracy Fund, and all funding agencies, bear no responsibility for the use of the data or for interpretations or inferences based upon such issues".

[3] Code for my analysis:

* Stata code for the Data for Progress data

tab acknowledgement_1                                  // response distribution for the "certain advantages" item
tab starttime if wave==8                               // check the field dates for wave 8
svyset [pw=nationalweight]                             // declare the survey weight
svy: prop acknowledgement_1 if ethnicity==1 & wave==8  // weighted proportions among Whites
svy: prop acknowledgement_1 if ethnicity==2 & wave==8  // weighted proportions among Blacks

* Stata code for the Nationscape data [ns20200611.dta]

recode discrimination_blacks (1/4=1) (5 .=0), gen(discB) // 1 = "A great deal" through "A little"; 0 = "None at all" or missing
recode discrimination_whites (1/4=1) (5 .=0), gen(discW)
tab discrimination_blacks discB, mi                      // check the recodes
tab discrimination_whites discW, mi

* Responses run from 1 "A great deal" to 5 "None at all", so a lower value
* indicates more perceived discrimination: discBW = 1 flags respondents who
* rated discrimination against Blacks higher than discrimination against
* Whites, with missing responses coded 0
gen discBW = 0
replace discBW = 1 if discrimination_blacks < discrimination_whites & discrimination_blacks!=. & discrimination_whites!=.
tab discrimination_blacks discrimination_whites if discBW==1, mi
tab discrimination_blacks discrimination_whites if discBW==0, mi

svyset [pw=weight] // declare the survey weight

svy: prop discB if race_ethnicity==2  // Blacks
svy: prop discBW if race_ethnicity==2

svy: prop discB if race_ethnicity==1  // Whites
svy: prop discBW if race_ethnicity==1


The Ellis and Faricy 2020 Political Behavior article "Race, Deservingness, and Social Spending Attitudes: The Role of Policy Delivery Mechanism" discussed results from Figure 2:

This graph illustrates that while the mean support for this program does not differ significantly by spending mode, racial attitudes strongly affect the type of spending that respondents would prefer: those lowest in symbolic racism are expected to prefer the direct spending program to the tax expenditure program, while those high in symbolic racism are expected to prefer the opposite (p. 833).

Data for Study 2 indicated that, based on a linear regression using symbolic racism to predict non-Black participants' support for the programs, controlling for party identification, income, trust, egalitarianism, White race, and male, as coded in the Ellis and Faricy 2020 analyses, the predicted level of support at the lowest level of symbolic racism, with the other predictors at their means, was 3.37 for the tax expenditure program and 3.87 for the direct spending program; but the predicted level of support at the highest level of symbolic racism was 3.44 for the tax expenditure program and 3.24 for the direct spending program.
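Below is a minimal Stata sketch of how predictions of this type can be generated. The variable names are my assumptions for illustration (support for program support, direct coded 1 for the direct spending program and 0 for the tax expenditure program, sym8 for a symbolic racism index running from 0 at the lowest level to 8 at the highest, and short names for the controls), not the names in the Ellis and Faricy 2020 replication files:

reg support i.direct##c.sym8 pid income trust egal white male // hypothetical variable names
margins direct, at(sym8=(0 8)) atmeans // predicted support for each program at the lowest and highest levels of symbolic racism, with the other predictors at their means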

However, linear regression can misestimate treatment effects. Below is a plot of the treatment effect estimated at individual levels of symbolic racism, with no controls (left panel) and with the aforementioned controls (right panel).

There does not appear to be much evidence in these data that participants high in symbolic racism preferred one program to the other. For example, in the left panel, at the highest level of symbolic racism, the estimated support was 2.76 for the tax expenditure program and 2.60 for the direct spending program (p=0.41 for the difference). Moreover, the p-value for the difference did not drop below p=0.4 when participants from adjacent high levels of symbolic racism were included (7 and 8, or 6 through 8, or 5 through 8, or 4 through 8), with or without the controls.
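And below is a minimal sketch of such level-specific checks (the version without controls), using the same hypothetical variable names as in the sketch above:

ttest support if sym8==8, by(direct) // difference estimated using only cases at the highest level
ttest support if inrange(sym8, 7, 8), by(direct) // pooling levels 7 and 8
ttest support if inrange(sym8, 4, 8), by(direct) // pooling levels 4 through 8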

---

NOTES

1. Code for my analyses and plot. Data for the plot.


The plot below is from the Burge et al. 2020 Journal of Politics article "A Certain Type of Descriptive Representative? Understanding How the Skin Tone and Gender of Candidates Influences Black Politics":

I thought that the plot could be improved. Some superficial shortcomings of the plot:

[1] Placing dependent variable information in the legend unnecessarily forces readers to decipher the dot, triangle, and X symbols.

[2] The y-axis text is unnecessarily vertical, and vertical text is more difficult to read than horizontal text.

[3] The panels are a lot taller than necessary, so the top estimate sits farther from the x-axis labels than it needs to.

Some other flaws are better understood with information about the experiment. Black participants were randomly assigned to groups and asked to rate a candidate, in which candidate characteristics varied, such as being female and dark-skinned (Dark Julie) or male and light-skinned (Light James). Participants responded to items about the candidate, such as reporting their willingness to vote for the candidate. The key result, indicated in the abstract, is that "darker-skinned candidates are evaluated more favorably than lighter-skinned candidates" (p. 1).

[4] The estimates of interest unnecessarily consume too little of the plot space. The dependent variables were placed on a 0-to-1 scale, and the plotted estimates are differences on this scale, so estimates from -1 to +1 are possible; but the x-axes do not need to run from -0.5 to +0.5. The estimate of interest is the difference in responses between candidates and not the absolute values of the responses, so I think that it is fine to zoom in on the estimates and to not show the full potential scale on the x-axis.

Below is a plot that addresses these points:

I also changed the dependent variables from a 0-to-1 scale to a 0-to-100 scale, to avoid decimals on the x-axis, because decimals involve unnecessary periods and sometimes unnecessary leading zeros. For example, for the difference between Dark James and Light James in the middle panel, I would prefer to have the relevant tick labeled "5" rather than ".05" or "0.05".

And I removed information that I thought could be placed into a figure note or dropped from the figure altogether (such as the sample size and model numbers). The note on the data source could also be placed into the figure note for journal publication, but I'm including it in this plot, in case I tweet the plot.

---

Another potential improvement is to revise the plot to emphasize the key finding, about the skin tone difference. The original Burge et al. 2020 plot includes a comparison of Dark Julie to Dark James but does not include a comparison of Light Julie to Light James (all three comparisons of Light Julie to Light James are nulls). And the inclusion of the third panel in the original Burge et al. 2020 plot dilutes the focus on the skin color comparison. Here is a plot focusing on only the dark/light comparison:

Potential shortcomings of the above plot are the absence of the absolute values for the estimates and an inability to make across-sex comparisons of, say, Light Julie to Dark James. The plot below includes absolute values, permits comparisons across sex, and still permits the key finding about skin color to be relatively easily discerned:

The plot below uses shading to encourage by-color comparison of candidate pairs within panel:

Maybe it would be better to emphasize the dark/light finding by using a light dot for the "Light" candidates and a dark dot for the "Dark" candidates. And, for a stand-alone plot, maybe it would be better to add a title summarizing the key pattern, such as "Black participants tended to prefer the darker-skinned Black candidates". Feel free to comment on any other improvements that could be made.

---

NOTES

1. Code and data for the 3-panel plot.

2. Code and data for the 2-panel plot.

3. Code and data for the unshaded 1-panel plot.

4. Code and data for the shaded 1-panel plot.


1.

Researchers reporting results from an experiment often report estimates of the treatment effect at particular levels of a predictor. For example, a panel of Figure 2 in Barnes et al. 2018 plotted, over a range of hostile sexism, the estimated difference in the probability of reporting being very unlikely to vote for a female representative target involved in a sex scandal relative to the probability of reporting being very unlikely to vote for a male representative target involved in a sex scandal. For another example, Chudy 2020 plotted, over a range of racial sympathy, estimated punishments for a Black culprit target and a White culprit target. Both of these plots report estimates derived from a regression. However, as indicated in Hainmueller et al. 2020, regression can nontrivially misestimate a treatment effect at particular levels of a predictor.

This post presents another example of this phenomenon, based on data from the experiment in Costa et al. 2020 "How partisanship and sexism influence voters' reactions to political #MeToo scandals" (link to a correction to Costa et al. 2020).

---

2.

The Costa et al. 2020 experiment had a control condition, two treatment conditions, and multiple outcome variables, but my illustration will focus on only two conditions and only one outcome variable. Participants were asked to respond to four items measuring participant sexism and to rate a target male senator on a 0-to-10 scale. Participants who were then randomized to the "sexual assault" condition were provided a news story indicating that the senator had been accused of groping two women without consent. Participants who were instead randomized to the control condition were provided a news story about the senator visiting a county fair. The outcome variable of interest for this illustration is the percent change in the favorability of the senator, from the pretest to the posttest.

Estimates in the left panel of Figure 1 are based on a linear regression predicting the outcome variable of interest, using as predictors a pretest measure of participant sexism running from 0 for low sexism to 16 for high sexism, a dichotomous variable coded 1 for participants in the sexual assault condition and 0 for participants in the control condition, and the interaction of these two predictors. The panel plots the point estimates and 95% confidence intervals for the estimated difference in the outcome variable, between the control condition and the sexual assault condition, at each observed level of the participant sexism index.
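Below is a minimal Stata sketch of this type of specification, under assumed variable names: perchange for the percent change in the senator's rating (assumed to be 100*(posttest-pretest)/pretest), assault coded 1 for the sexual assault condition and 0 for the control condition, and sexism16 for the 0-to-16 sexism index:

reg perchange i.assault##c.sexism16 // interaction of condition and pretest sexism
margins, dydx(assault) at(sexism16=(0(1)16)) // regression-based treatment effect at each level of the index
marginsplot // point estimates with 95% confidence intervals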

The leftmost point indicates that the "least sexist" participants in the sexual assault condition were estimated to have a value of the outcome variable about 52 units less than that of the "least sexist" participants in the control condition: the "least sexist" participants in the control were estimated to have increased their rating of the senator by 4.6 percent, and the "least sexist" participants in the sexual assault condition were estimated to have reduced their rating of the senator by 47.6 percent.

The rightmost point of the plot indicates that the "most sexist" participants in the sexual assault condition were estimated to have a value of the outcome variable about the same as that of the "most sexist" participants in the control condition: the "most sexist" participants in the control were estimated to have increased their rating of the senator by 1.7 percent, and the "most sexist" participants in the sexual assault condition were estimated to have increased their rating of the senator by 2.1 percent. Based on this rightmost point, a reader could conclude about the sexual assault allegations, as Costa et al. 2020 suggested, that:

...the most sexist subjects react about the same way to sexual assault and sexist jokes allegations as they do to the control news story about the legislator attending a county fair.

However, the numbers at the inside bottom of the Figure 1 panels indicate the sample size at each level of the sexism index, pooled across the control condition and the sexual assault condition. These numbers indicate that the regression-based estimate for the "most sexist" participants was nontrivially based on the behavior of other participants.

Estimates in the right panel of Figure 1 are instead based on t-tests conducted for participants at only the indicated level of the sexism index. As in the left panel, the estimate for the "least sexist" participants falls between -50 and -60, and, for the next few higher observed values of the sexism index, estimates tend to rise and/or tend to get closer to zero. But the tendency does not persist above the midpoint of the sexism index. Moreover, the point estimates in the right panel for the three highest values of the sexism index do not fall within the corresponding 95% confidence intervals in the left panel.
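Below is a minimal sketch of these level-specific t-tests, using the same assumed variable names as in the sketch above:

forvalues s = 0/16 {
    display _newline "sexism index = `s'"
    ttest perchange if sexism16==`s', by(assault) // difference using only cases at this level
}
ttest perchange if inrange(sexism16, 15, 16), by(assault) // pooling the top two levels, as reported below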

The p-value fell below p=0.05 for the 28 participants at 15 or 16 on the sexism index, with a point estimate of -22. The sample size was 1,888 across these two conditions, so participants at 15 or 16 on the sexism index represent the top 1.5% of participants on the sexism index across these two conditions. Therefore, the sexual assault treatment appears to have had an effect on these "very sexist" participants.

---

3.

Regression can reveal patterns in data. For example, linear regression estimates correctly indicated that, in the Costa et al. 2020 experiment, the effect of the sexual assault treatment relative to the control was closer to zero for participants at higher levels of a sexism index than for participants at lower levels of the sexism index. However, as indicated in the illustration above, regression can produce misestimates of an effect at particular levels of a predictor. Therefore, inferences about an estimated effect at a particular level of a predictor should be based only on cases at or around that level of the predictor and should not be influenced by other cases.

---

NOTES

1. Costa et al. 2020 data.

2. Stata code for the analysis.

3. R code for the plot. CSV file for the R plot.

4. The interflex R package (Hainmueller et al. 2020) produced the plot below, using six bins. The leveling off at higher values of the sexism index also appears in this interflex plot:

R code to add to the corrected Costa et al. 2020 code:

library(interflex) # assumes the interflex version that provides inter.binning()

# rescale the pretest sexism measure to run from 0 to 16
dat$sexism16 <- (dat$pre_sexism - 1)*4
summary(dat$sexism16)

# binning estimator with six bins, treatment effects relative to the control condition
p1 <- inter.binning(data=dat, Y="perchange_vote", D="condition2", X="sexism16", nbins=6, base="Control")
plot(p1)


In a Monkey Cage post and Chapter 6 of their Ignored Racism book, Mark D. Ramirez and David A.M. Peterson reported on a conjoint experiment in which White adult U.S. citizens were shown profiles of two target persons and were asked "Which of these citizens do you prefer to keep registered to vote?". The experiment manipulated profile characteristics such as race, gender, and criminal status.

Latina/o racism-ethnicism (LRE) was measured with responses to four "modern racism"-type items, such as "Many other ethnic groups have successfully integrated into American culture. Latinos and Hispanics should do the same without any special favors".

Results in Figure 6.7 indicated that high LRE participants favored White targets over Hispanic targets. But Figure 6.7 results also indicated that low LRE participants favored Hispanic targets over White targets. This experiment thus provided further evidence that a nontrivial percentage of participants at low levels of modern racism / modern sexism items have racial bias and/or gender bias. Here is a prior post on a study indicating that persons at low levels of hostile sexism discriminated against men.


In "Gendered Nationalism and the 2016 US Presidential Election: How Party, Class, and Beliefs about Masculinity Shaped Voting Behavior" (Politics & Gender 2019), Melissa Deckman and Erin Cassese reported a Table 2 model that had a sample size of 750 and a predictor for college degree that had a logit coefficient of -0.57 and a standard error of 0.28, so the associated t-statistic is -0.57/28, or about -2.0, which produces a p-value of about 0.05.

The college degree coefficient fell to -0.27 when a "gendered nationalism" predictor was added to the model, and Deckman and Cassese 2019 indicated (pp. 17-18) that:

A post hoc Wald test comparing the size of the coefficients between the two models suggests that the coefficient for college was significantly reduced by the inclusion of the mediator [F(1,678) = 7.25; p < .0072]...

From what I can tell, this means that there is stronger evidence for the -0.57 coefficient differing from the -0.27 coefficient (p<0.0072) than for the -0.57 coefficient differing from zero (p≈0.05).

This type of odd result has been noticed before.

---

For more explanation, below are commands that can be posted into Stata to produce a similar result:

clear all
set seed 123
set obs 500
gen Y = runiform(0,10)
gen X1 = 0.01*(Y + runiform(0,10)^2)    // X1 is correlated with Y
gen X2 = 0.01*(Y + 2*runiform(0,10))    // X2 is also correlated with Y
reg Y X1
egen weight = fill(1 1 1 1 1)           // constant weight of 1 for each case
svyset [pw=weight]
svy: reg Y X1
estimates store X1alone                 // model with X1 alone
svy: reg Y X1 X2
estimates store X1paired                // model with X1 and X2
suest X1alone X1paired                  // combine the stored results for cross-model tests
lincom _b[X1alone:X1] - 0               // test the X1 coefficient against zero
di _b[X1paired:X1]                      // display the X1 coefficient from the paired model
lincom _b[X1alone:X1] - 0.4910762       // test the X1 coefficient against the *number* 0.4910762
lincom _b[X1alone:X1] - _b[X1paired:X1] // test the X1 coefficient against the *estimated* coefficient

The X1 coefficient is 0.8481948 in the "reg Y X1" model and is 0.4910762 in the "reg Y X1 X2" model. Results for the "lincom _b[X1alone:X1] - _b[X1paired:X1]" command indicate that the p-value is 0.040 for the test that the 0.8481948 coefficient differs from the 0.4910762 coefficient. But results for the "lincom _b[X1alone:X1] - 0.4910762" command indicate that the p-value is 0.383 for the test that the 0.8481948 coefficient differs from the number 0.4910762.

So, from what I can tell, there is stronger evidence that the 0.8481948 X1 coefficient differs from an imprecisely estimated coefficient that has the value of 0.4910762 than from the value of 0.4910762.

---

As indicated in the link above, this odd result appears attributable to the variance sum law:

Variance(X-Y) = Variance(X) + Variance(Y) - 2*Covariance(X,Y)

For the test of whether the 0.8481948 X1 coefficient differs from the 0.4910762 X1 coefficient, the formula is:

Variance(X-Y) = Variance(X) + Variance(Y) - 2*Covariance(X,Y)

But for the test of whether a coefficient differs from a fixed number, such as the test of whether the -0.57 coefficient differs from zero, the fixed number has zero variance and zero covariance with the coefficient estimate, so the formula reduces to:

Variance(X-Y) = Variance(X) + 0 - 0

For the simulated data, subtracting 2*Covariance(X,Y) reduces Variance(X-Y) more than adding the Variance(Y) increases Variance(X-Y), which explains how the p-value can be lower for comparing the two coefficients to each other than for comparing one coefficient to the value of the other coefficient.

See the code below:

suest X1alone X1paired
matrix list e(V) // Var(X1alone:X1) = .16695974, Var(X1paired:X1) = .14457114, Cov = .14071065
di (.8481948-.4910762)/sqrt(.16695974)                       // z ≈ 0.87: coefficient vs. fixed number
di (.8481948-.4910762)/sqrt(.16695974+.14457114-2*.14071065) // z ≈ 2.06: coefficient vs. estimated coefficient
test _b[X1alone:X1] = _b[X1paired:X1]                        // matches the coefficient-vs-coefficient lincom test

Stata output here.
