The Monkey Cage published a post, "Racial prejudice is driving opposition to paying college athletes. Here's the evidence." I tweeted about this post in several threads, but I'm posting the information here for possible future reference and for anyone who reads the blog.

Here's the key figure from the post. The left side of the figure indicates that white respondents expressed more opposition to paying college athletes after exposure to a picture of black athletes than in a control condition with no picture.

After reading the post, I noted two oddities about the figure. First, based on the logic of an experiment -- change only one thing to assess the effect of that thing -- the proper comparison for assessing racial bias among white respondents would have been the effect of a photo of black athletes against the effect of a photo of white athletes; that comparison would have removed the alternate explanations that respondents expressed more opposition because a photo was shown, or because a photo of athletes was shown, and not necessarily because a photo of *black* athletes was shown. Second, the data were from the CCES, which typically has team samples of 1,000 respondents; these samples are presumably intended to be representative of the national population, so a 1,000-respondent sample should contain substantially more than 411 whites (roughly 600 to 700, if the sample mirrors the adult population).

Putting two and two together suggested that there was an unreported condition in which respondents were shown a photo of white athletes. I emailed the three authors of the blog post, and to their credit I received substantive replies to my questions about the experiment. Based on the team's responses, the experiment did have a condition in which respondents were shown a photo of white athletes, and opposition to paying college athletes in this "white athletes" photo condition did not differ at p<0.05 (two-tailed test) from opposition to paying college athletes in the "black athletes" photo condition.
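
For illustration, here is a minimal R sketch of the comparison that the experimental logic calls for (opposition in the "black athletes" photo condition versus opposition in the "white athletes" photo condition), using simulated data rather than the CCES data from the post:

# Hypothetical illustration of the comparison that the experimental logic
# calls for: opposition to paying college athletes in the black-athletes
# photo condition versus the white-athletes photo condition. The data
# below are simulated, not the CCES data from the post.
set.seed(1)
oppose_black_photo <- rbinom(200, 1, 0.55)   # 1 = opposes paying athletes
oppose_white_photo <- rbinom(200, 1, 0.50)

# Two-tailed test of the difference in the proportion opposed between the
# two photo conditions.
prop.test(
  x = c(sum(oppose_black_photo), sum(oppose_white_photo)),
  n = c(length(oppose_black_photo), length(oppose_white_photo))
)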


There is a common practice of discussing inequality in the United States without reference to Asian Americans, which permits the suggestion that the inequality is due to race or racial bias. Here's a recent example:

The graph reported results for Hispanics disaggregated into Cubans, Puerto Ricans, Mexicans, and other Hispanics, but the graph omitted results for Asians and Pacific Islanders, even though the note for the graph indicates that Asians/Pacific Islanders were included in the model. Here are data on Asian American poverty rates (source):

[Figure: Asian American poverty rates, American Community Survey (ACS)]

The omission of Asian Americans from discussions of inequality is a common enough practice [1, 2, 3, 4, 5] that it deserves a name. The Asian American Exclusion is as good as any.


Here is the manuscript that I plan to present at the 2015 American Political Science Association conference in September; a revised version is here. The manuscript contains links to the locations of the data, and a file of the reproduction code for the revised manuscript is here.

Comments are welcome!

Abstract and the key figure are below:

Racial bias is a persistent concern in the United States, but polls have indicated that whites and blacks on average report very different perceptions of the extent and aggregate direction of this bias. Meta-analyses of results from a population of sixteen federally-funded survey experiments, many of which have never been reported on in a journal or academic book, indicate the presence of a moderate aggregate black bias against whites but no aggregate white bias against blacks.

[Figure: meta-analysis results]

NOTE:

I made a few changes since submitting the manuscript: [1] removing all cases in which the target was not black or white (e.g., Hispanics, Asians, and control conditions in which the target did not have a race); [2] estimating meta-analyses without removing cases based on a racial manipulation check; and [3] estimating meta-analyses without the Cottrell and Neuberg 2004 survey experiment, given that that survey experiment concerned perceptions of racial groups rather than racial bias against particular targets.

Numeric values in the figure are for a meta-analysis that reflects [1] above:

* For white respondents: the effect size point estimate was 0.039 (p=0.375), with a 95% confidence interval of [-0.047, 0.124].
* For black respondents: the effect size point estimate was 0.281 (p=0.016), with a 95% confidence interval of [0.053, 0.509].

---

The meta-analysis graph includes five studies for which a racial manipulation check was used to restrict the sample: Pager 2006, Rattan 2010, Stephens 2011, Pedulla 2011, and Powroznik 2014. Inferences from the meta-analysis were the same when these five studies included respondents who failed the racial manipulation checks:

* For white respondents: the effect size point estimate was 0.027 (p=0.499), with a 95% confidence interval of [-0.051, 0.105].
* For black respondents: the effect size point estimate was 0.268 (p=0.017), with a 95% confidence interval of [0.047, 0.488].

---

Inferences from the meta-analysis were the same when the Cottrell and Neuberg 2004 survey experiment was removed. For the remaining 15 studies, with the racial manipulation check restriction applied:

* For white respondents: the effect size point estimate was 0.063 (p=0.114), with a 95% confidence interval of [-0.015, 0.142].
* For black respondents: the effect size point estimate was 0.210 (p=0.010), with a 95% confidence interval of [0.050, 0.369].

---

For the remaining 15 studies, without the racial manipulation check restriction:

* For white respondents: the effect size point estimate was 0.049 (p=0.174), with a 95% confidence interval of [-0.022, 0.121].
* For black respondents: the effect size point estimate was 0.194 (p=0.012), with a 95% confidence interval of [0.044, 0.345].
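
For readers unfamiliar with how pooled estimates like those above are computed, here is a minimal sketch of a random-effects meta-analysis in R using the metafor package; the effect sizes and standard errors below are placeholder values for illustration, not the study-level estimates from the manuscript (which may also have been analyzed with different software).

# Minimal sketch of a random-effects meta-analysis with the metafor package.
# The effect sizes (yi) and standard errors (sei) are placeholders, not the
# study-level estimates from the manuscript.
library(metafor)

yi  <- c(0.10, -0.05, 0.20, 0.02, 0.35)   # standardized effect size estimates
sei <- c(0.08,  0.12, 0.15, 0.06, 0.20)   # corresponding standard errors

# Random-effects model; REML is the default estimator of the
# between-study variance.
res <- rma(yi = yi, sei = sei, method = "REML")

# Pooled point estimate, p-value, and 95% confidence interval, analogous to
# the values reported in the bullet points above.
summary(res)

# Forest plot of the study-level and pooled estimates.
forest(res)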


Here is a passage from Pigliucci 2013.

Steele and Aronson (1995), among others, looked at IQ tests and at ETS tests (e.g. SATs, GREs, etc.) to see whether human intellectual performance can be manipulated with simple psychological tricks priming negative stereotypes about a group that the subjects self-identify with. Notoriously, the trick worked, and as a result we can explain almost all of the gap between whites and blacks on intelligence tests as an artifact of stereotype threat, a previously unknown testing situation bias.

Racial gaps are a common and perennial concern in public education, but this passage suggests that such gaps are an artifact. However, when I looked up Steele and Aronson (1995) to check the evidence for this claim, I found that the black participants and the white participants in the study were all Stanford undergraduates and that the students' test performances were adjusted for the students' SAT scores. Given that the analysis involved both a selected sample and statistical control, it does not seem reasonable to draw inferences about national populations from it. This error in reporting the results of Steele and Aronson (1995) is apparently common enough to deserve its own article.

---

Here's a related passage from Brian at Dynamic Ecology:

A neat example on the importance of nomination criteria for gender equity is buried in this post about winning Jeopardy (an American television quiz show). For a long time only 1/3 of the winners were women. This might lead Larry Summers to conclude men are just better at recalling facts (or clicking the button to answer faster). But a natural experiment (scroll down to the middle of the post to The Challenger Pool Has Gotten Bigger) shows that nomination criteria were the real problem. In 2006 Jeopardy changed how they selected the contestants. Before 2006 you had to self-fund a trip to Los Angeles to participate in try-outs to get on the show. This required a certain chutzpah/cockiness to lay out several hundred dollars with no guarantee of even being selected. And 2/3 of the winners were male because more males were making the choice to take this risk. Then they switched to an online test. And suddenly more participants were female and suddenly half the winners were female. [emphasis added]

I looked up the 538 post linked to in the passage, which reported: "Almost half of returning champions this season have been women. In the year before Jennings's streak, fewer than 1 in 3 winners were female." That passage provides two data points: "this season" appears to be 2015 (the year of the 538 post), and "the year before Jennings's streak" appears to be 2003 (the 538 post noted that Jennings's streak occurred in 2004). The 538 post reported that the rule change for the online test occurred in 2006.

So here's the relevant information from the 538 post:

* In 2003, fewer than 1 in 3 Jeopardy winners were women.
* In 2006, the selection process was changed to an online test.
* Presumably in 2015, through early May, almost half of Jeopardy winners have been women.

It does not seem that comparison of a data point from 2003 to a partial data point from 2015 permits use of the descriptive term "suddenly."

It's entirely possible -- and perhaps probable -- that the switch to an online test for qualification reduced gender inequality in Jeopardy winners. But that inference needs more support than the minimal data reported in the 538 post.


I left this as a comment here.

For what it's worth, here are questions that I ask when evaluating research:

1. Did the researchers preregister their research design choices, so that we can be sure that these choices were not made based on the data? If not, are the choices consistent with those that the researchers have made in other research?

2. Have the researchers publicly posted documentation and all the data that were collected, so that other researchers can check the analysis for errors and assess the robustness of the reported results?

3. Did the researchers declare that there are no file drawer studies, no unreported manipulations, and no unreported measured variables?

4. Were the data collected by an independent third party?

5. Is the sample representative of the population of interest?


Here's a tweet that I happened upon:

The graph is available here. The idea of the graph appears to be that the average 2012 science scores on the PISA test were similar for boys and girls, so the percentage of women should be similar to the percentage of men among university science graduates in 2010.

The graph would be more compelling if STEM workers were drawn equally from the left half and the right half of the bell curve of science and math ability. But that's probably not what happens. It's more likely that college graduates who work in STEM fields have on average more science and math ability than the average person. If that's true, then it is not a good idea to compare average PISA scores for boys and girls in this case; it would be a better idea to compare PISA scores for boys and girls in the right tail of science and math ability because that is where the bulk of STEM workers likely come from.

Stoet and Geary 2013 reported on sex distributions in the right tail of math ability on the PISA:

For the 33 countries that participated in all four of the PISA assessments (i.e., 2000, 2003, 2006, and 2009), a ratio of 1.7–1.9:1 [in mathematics performance] was found for students achieving above the 95th percentile, and a 2.3–2.7:1 ratio for students scoring above the 99th percentile.

So there is a substantial sex difference in mathematics scores to the advantage of boys in the PISA data. There is also a substantial sex difference in reading scores to the advantage of girls in the PISA data, but reading ability is less useful than math ability for success in most or all STEM fields.
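
To illustrate how modest differences in means and standard deviations can translate into much larger imbalances in the right tail, here is a minimal R sketch; the means, standard deviations, and cutoff below are illustrative assumptions, not PISA estimates.

# Illustrative sketch: how modest mean and variance differences translate
# into larger boys-to-girls ratios in the right tail. The distributions
# below are hypothetical, not PISA score distributions.
mean_b <- 505; sd_b <- 100   # hypothetical boys' score distribution
mean_g <- 495; sd_g <- 95    # hypothetical girls' score distribution

# Cutoff near the 95th percentile of a rough pooled distribution
# (approximated by averaging the two means and standard deviations).
cutoff <- qnorm(0.95, mean = (mean_b + mean_g) / 2, sd = (sd_b + sd_g) / 2)

# Proportion of each group scoring above the cutoff.
p_b <- pnorm(cutoff, mean = mean_b, sd = sd_b, lower.tail = FALSE)
p_g <- pnorm(cutoff, mean = mean_g, sd = sd_g, lower.tail = FALSE)

# Boys-to-girls ratio above the cutoff.
p_b / p_g

With these made-up numbers, the ratio above the cutoff comes out to roughly 1.5 to 1, even though the difference in means is only about a tenth of a standard deviation.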

There is a smaller advantage for boys over girls in the right tail of science scores on the 2012 PISA, according to this report:

Across OECD countries, 9.3% of boys are top performers in science (performing at Level 5 or 6), but only 7.4% of girls are.

That works out to a boys-to-girls ratio of roughly 1.3 to 1 among top performers in science, smaller than the ratios for mathematics reported above. I'm not sure what percentile a Level 5 or 6 score is equivalent to. I'm also not sure whether math scores or science scores are more predictive of future science careers. But I am sure that it's better to examine the right tails of the distributions than the means for understanding representation in STEM.


Here are posts on R graphs, some of which are for barplots. I recently graphed a barplot with labels on the bars, so I'll post the code here. I cropped the leftmost set of y-axis numbers for the printed figure; from what I can tell, it takes a little more code and a lot more trouble (for me, at least) to produce the plot without that left set of numbers.

[Figure: barplot of the percentage who preferred the Georgia flag with the Confederate battle emblem]

---

This loads the Hmisc package, which has the mgp.axis command that will be used later:

require(Hmisc)

---

This command tells R to make a set of graphs with 1 row and 1 column (mfrow), to use margins of 6, 6, 2, and 1 lines for the bottom, left, top, and right sides (mar), and to set the axis margin lines to 0, 4, and 0 for the axis title, the tick-mark labels, and the axis line (mgp):

par(mfrow=c(1,1), mar=c(6, 6, 2, 1), mgp=c(0,4,0))

---

This command enters the data for the barplot:

heritage <- c(66, 0, 71, 10, 49, 36)

---

This command enters the colors for the barplot bars:

colors <- c("royalblue4", "royalblue4", "cornflowerblue", "cornflowerblue", "navyblue", "navyblue")

---

This command enters the labels for the barplot bars, with \n indicating a new line:

names <- c("Flag reminds of\nSouthern heritage\nmore than\nwhite supremacy", "Flag reminds of\nwhite supremacy\nmore than\nSouthern heritage", "Proud of what\nthe Confederacy\nstood for", "Not proud of what\nthe Confederacy\nstood for", "What happens to\nSoutherners\naffects my life\na lot", "What happens to\nSoutherners\naffects my life\nnot very much")

---

This command plots a barplot of heritage with the indicated main title, no y-axis title, a y-axis running from 0 to 90, horizontal axis labels (las=1), bar colors from the "colors" vector, and bar names from the "names" vector; the result is assigned to bp so that the bar midpoints can be reused later when placing the text labels.

bp <- barplot(heritage, main="Percentage who Preferred the Georgia Flag with the Confederate Battle Emblem", ylab=NA, ylim=c(0,90), las=1, col=colors, names.arg=names)

---

This command plots a y-axis on the left side (side 2), with tick labels at the indicated values drawn horizontally (las=2):

mgp.axis(2, at=c(0, 20, 40, 60, 80), las=2)

The above code is for the rightmost set of y-axis labels in the figure.

---

This command enters the text for the barplot bars:

labels <-c("66%\n(n=301)", "0%\n(n=131)", "71%\n(n=160)", "10%\n(n=104)", "49%\n(n=142)", "36%\n(n=122)")

---

This command plots the labels at x-coordinates given by the bar midpoints stored in bp and y-coordinates of the heritage values plus 2, with pos=3 placing the text just above those points:

text(bp, heritage+2, labels, cex=1, pos=3)

---

Full code is here.
