Here's a tweet that I happened upon:

The graph is available here. The idea of the graph appears to be that the average 2012 science scores on the PISA test were similar for boys and girls, so the percentage of women should be similar to the percentage of men among university science graduates in 2010.

The graph would be more compelling if STEM workers were drawn equally from the left half and the right half of the bell curve of science and math ability. But that's probably not what happens. It's more likely that college graduates who work in STEM fields have on average more science and math ability than the average person. If that's true, then it is not a good idea to compare average PISA scores for boys and girls in this case; it would be a better idea to compare PISA scores for boys and girls in the right tail of science and math ability because that is where the bulk of STEM workers likely come from.
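To illustrate the point with made-up numbers (these are not PISA estimates), the sketch below assumes two hypothetical, normally distributed groups that have the same mean but slightly different spreads, and an arbitrary right-tail cutoff. The means, standard deviations, and cutoff are all assumptions chosen only for illustration:

```r
# Hypothetical illustration only: none of these parameters come from PISA data.
mean_a <- 500; sd_a <- 100   # group A (assumed)
mean_b <- 500; sd_b <- 90    # group B (assumed): same mean, smaller spread

cutoff <- 700                # arbitrary right-tail threshold (assumed)

# Share of each group scoring above the cutoff
tail_a <- pnorm(cutoff, mean = mean_a, sd = sd_a, lower.tail = FALSE)
tail_b <- pnorm(cutoff, mean = mean_b, sd = sd_b, lower.tail = FALSE)

# Ratio of group A to group B above the cutoff
tail_ratio <- tail_a / tail_b
print(c(tail_a = tail_a, tail_b = tail_b, ratio = tail_ratio))
```

In this toy setup the two groups have identical averages, but group A is overrepresented above the cutoff, so a comparison of means alone would miss the difference in the right tail.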

Stoet and Geary 2013 reported on sex distributions in the right tail of math ability on the PISA:

For the 33 countries that participated in all four of the PISA assessments (i.e., 2000, 2003, 2006, and 2009), a ratio of 1.7–1.9:1 [in mathematics performance] was found for students achieving above the 95th percentile, and a 2.3–2.7:1 ratio for students scoring above the 99th percentile.

So there is a substantial sex difference in the right tail of mathematics scores, to the advantage of boys, in the PISA data. There is also a substantial sex difference in reading scores to the advantage of girls in the PISA data, but reading ability is less useful than math ability for success in most or all STEM fields.

There is a smaller advantage for boys over girls in the right tail of science scores on the 2012 PISA, according to this report:

Across OECD countries, 9.3% of boys are top performers in science (performing at Level 5 or 6), but only 7.4% of girls are.

I'm not sure what percentile a Level 5 or 6 score is equivalent to. I'm also not sure whether math scores or science scores are more predictive for future science careers. But I am sure that it's better to examine right tail distributions than mean distributions for understanding representation in STEM.
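Some rough arithmetic on the percentages quoted above, as a sketch. The boy-to-girl ratio uses only the two reported numbers; the approximate percentile additionally assumes that boys and girls are equally represented among test takers, which is my assumption and not something stated in the report:

```r
boys_top  <- 9.3   # % of boys at Level 5 or 6 (from the quoted report)
girls_top <- 7.4   # % of girls at Level 5 or 6 (from the quoted report)

# Boy-to-girl ratio among top performers in science
ratio_science <- boys_top / girls_top        # about 1.26

# Assumption: equal numbers of boys and girls among test takers
overall_top <- (boys_top + girls_top) / 2    # about 8.35% of all students
approx_percentile <- 100 - overall_top       # roughly the 92nd percentile

print(round(c(ratio = ratio_science, percentile = approx_percentile), 2))
```

Under that assumption, Level 5 or 6 is a lower threshold than the 95th and 99th percentile cutoffs in the Stoet and Geary quote, so the science ratio and the math ratios are not directly comparable.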


Here are posts on R graphs, some of which are for barplots. I recently graphed a barplot with labels on the bars, so I'll post the code here. I cropped the y-axis numbers for the printed figure; from what I can tell, it takes a little more code and a lot more trouble (for me, at least) to code a plot without the left set of numbers.

[Figure: Heritage barplot]

---

This loads the Hmisc package, which has the mgp.axis function that will be used later:

library(Hmisc)

---

This command tells R to make a set of graphs with 1 row and 1 column (mfrow), to have margins of 6, 6, 2, and 1 lines for the bottom, left, top, and right (mar), and to place the axis title, tick-mark labels, and axis line at margin lines 0, 4, and 0 (mgp):

par(mfrow=c(1,1), mar=c(6, 6, 2, 1), mgp=c(0,4,0))

---

This command enters the data for the barplot:

heritage <- c(66, 0, 71, 10, 49, 36)

---

This command enters the colors for the barplot bars:

colors <- c("royalblue4", "royalblue4", "cornflowerblue", "cornflowerblue", "navyblue", "navyblue")
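Because each color is used for two adjacent bars, the same vector could also be built with rep(). This is just an alternative sketch, not the code used for the figure:

```r
# Each of the three colors is repeated twice, in order
colors <- rep(c("royalblue4", "cornflowerblue", "navyblue"), each = 2)
print(colors)
```

The each=2 argument repeats every element before moving on to the next one, which keeps the pairs of bars together.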

---

This command enters the labels for the barplot bars, with \n indicating a new line:

names <- c("Flag reminds of\nSouthern heritage\nmore than\nwhite supremacy", "Flag reminds of\nwhite supremacy\nmore than\nSouthern heritage", "Proud of what\nthe Confederacy\nstood for", "Not proud of what\nthe Confederacy\nstood for", "What happens to\nSoutherners\naffects my life\na lot", "What happens to\nSoutherners\naffects my life\nnot very much")

---

This command plots a barplot of heritage with the indicated main title, no y-axis title (ylab=NA), a y-axis from 0 to 90 (ylim), horizontal axis labels (las=1), colors from the "colors" set, and bar names from the "names" set. The bar midpoints that barplot returns are saved as bp for placing text later:

bp <- barplot(heritage, main="Percentage who Preferred the Georgia Flag with the Confederate Battle Emblem", ylab=NA, ylim=c(0,90), las=1, col=colors, names.arg=names)

---

This command draws a y-axis (side 2) with tick marks at the indicated values and with horizontal labels (las=2):

mgp.axis(2, at=c(0, 20, 40, 60, 80), las=2)

The above code is for the rightmost set of y-axis labels in the figure.
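One possible way to avoid the left set of numbers without cropping, as a sketch that I have not checked against the printed figure: barplot() has an axes argument, and setting axes=FALSE suppresses the default y-axis so that only the axis drawn afterward appears. The example below uses base R's axis() and default par settings so that it runs without Hmisc:

```r
# Draw to a null device so the sketch also runs in a non-interactive session;
# drop this line (and dev.off() below) to see the plot on screen.
pdf(NULL)

heritage <- c(66, 0, 71, 10, 49, 36)

# axes = FALSE suppresses the default y-axis that barplot() would draw
bp <- barplot(heritage, ylim = c(0, 90), axes = FALSE)

# Draw a single y-axis with horizontal tick labels
axis(2, at = c(0, 20, 40, 60, 80), las = 2)

invisible(dev.off())
```

The mgp.axis call from the post could presumably be used in place of axis() here; where the tick labels land depends on the mgp setting in effect.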

---

This command enters the text for the barplot bars:

labels <- c("66%\n(n=301)", "0%\n(n=131)", "71%\n(n=160)", "10%\n(n=104)", "49%\n(n=142)", "36%\n(n=122)")
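The same labels could also be generated from the data instead of typed by hand, which avoids a mismatch if a percentage changes. This is an alternative sketch; the vector name n is mine, and its values are the sample sizes from the labels above:

```r
heritage <- c(66, 0, 71, 10, 49, 36)
n <- c(301, 131, 160, 104, 142, 122)   # sample sizes, taken from the labels above

# %d inserts a whole number, %% is a literal percent sign, \n is a new line
labels <- sprintf("%d%%\n(n=%d)", heritage, n)
print(labels)
```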

---

This command plots the labels above the coordinates (bar midpoint from bp, heritage value + 2):

text(bp, heritage+2, labels, cex=1, pos=3)
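For anyone wondering where the x-coordinates come from: barplot() returns the midpoints of the bars, which is why bp can be passed straight to text(). A minimal sketch with default bar width and spacing, and with simplified labels in place of the ones above:

```r
# Draw to a null device so the sketch also runs in a non-interactive session;
# drop this line (and dev.off() below) to see the plot on screen.
pdf(NULL)

heritage <- c(66, 0, 71, 10, 49, 36)

# barplot() returns the x-coordinates of the bar midpoints
bp <- barplot(heritage, ylim = c(0, 90))
print(bp)

# pos = 3 places each label above its (x, y) coordinate
text(bp, heritage + 2, labels = paste0(heritage, "%"), pos = 3)

invisible(dev.off())
```

The "+ 2" and pos=3 both push the text upward, so the labels sit a little above the tops of the bars instead of overlapping them.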

---

Full code is here.


The post is here.

Data for the Hutchings and Walton study are here and code is here.

Data for the Southern Focus Poll are here and code is here.

Here are factor analysis results for the Hutchings and Walton study and for the Southern Focus Poll.

---

UPDATE (July 7, 2015)

Corrected the code link for the Southern Focus Poll.

The Monkey Cage post is discussed in a scatterplot post.

Here is more code to support this claim about Southerners' choice of words to describe food, as mentioned here.
