The plot below is from Strickler and Lawson 2020 "Racial conservatism, self-monitoring, and perceptions of police violence":

I thought that the plot might be improved:

---

Key differences between the plots:

1. The original plot has a legend, which requires readers to match colors in a legend to colors of estimates. The revised plot labels the estimates without using a legend.

2. The original plot reports treatment effects on a relative scale. The revised plot reports estimates on an absolute scale, so that readers can directly see the mean percentages that rated the shooting justified, for each group in each condition.

3. The revised plot uses 83% confidence intervals, so that readers can use non-overlaps in the confidence intervals to get a sense of whether the p-value is p<0.05 for a given comparison.

4. The revised plot reverses the axes and stacks the plots vertically, so that, for instance, it's easier to perceive that the percentage of nonWhite respondents in the control that rated the shooting as justified is lower than the percentage of White respondents in the control that rated the shooting as justified, at about p=0.05.
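The 83% level in point 3 can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions (two independent estimates, equal standard errors, normal approximation); the value se = 1 is a placeholder and not a number from the article.

```r
# Critical value for a two-sided 83% confidence interval
z83 <- qnorm(1 - 0.17 / 2)

# Assumption for illustration: two independent estimates with the same standard error
se <- 1

# Distance between the two estimates when their 83% intervals just touch
gap <- 2 * z83 * se

# Test statistic and two-sided p-value for the difference at that distance
z_diff <- gap / sqrt(se^2 + se^2)
p_value <- 2 * pnorm(-z_diff)

round(c(z83 = z83, z_diff = z_diff, p_value = p_value), 3)
```

Under these assumptions the p-value at the point where the intervals just touch is close to 0.05, which is why non-overlap is a rough guide and not an exact test; unequal standard errors would shift the result.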

---

The plot below repeats the plot above (left) and adds the same plot but with x-axes for each panel (right):

---

NOTES

1. Thanks to Ryan Strickler for sending me data and code for the article.

2. Code for the paired plot. Data for the plots.

3. Prior discussion of Strickler and Lawson 2020.

4. Other plot improvement posts.


Here are posts on R graphs, some of which are for barplots. I recently graphed a barplot with labels on the bars, so I'll post the code here. I cropped the y-axis numbers for the printed figure; from what I can tell, it takes a little more code and a lot more trouble (for me, at least) to code a plot without the left set of numbers.

Heritage

---

This loads the Hmisc package, which has the mgp.axis command that will be used later:

require(Hmisc)

---

This command tells R to make a set of graphs with 1 row and 1 column (mfrow), to have margins of 6, 6, 2, and 1 for the bottom, left, top, and right (mar), and to have margins for the axes with characteristics of 0, 4, and 0, for the location of the labels, tick-mark labels, and tick marks (mgp):

par(mfrow=c(1,1), mar=c(6, 6, 2, 1), mgp=c(0,4,0))
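Because par changes settings for every later plot on the same device, it can help to save the old values and restore them afterward. This is a minimal sketch, not part of the original code; the names old_par, new_mar, and restored_mar are made up for illustration, and pdf(NULL) is used only so the sketch runs without opening a window.

```r
pdf(NULL)  # null graphics device, so nothing is drawn on screen or written to disk

# par() returns the previous values of the settings that it changes
old_par <- par(mfrow = c(1, 1), mar = c(6, 6, 2, 1), mgp = c(0, 4, 0))
new_mar <- par("mar")

# ... plotting commands would go here ...

# Restore the earlier settings
par(old_par)
restored_mar <- par("mar")

dev.off()
```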

---

This command enters the data for the barplot:

heritage <- c(66, 0, 71, 10, 49, 36)

---

This command enters the colors for the barplot bars:

colors <- c("royalblue4", "royalblue4", "cornflowerblue", "cornflowerblue", "navyblue", "navyblue")
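Since each color is used for a pair of bars, the same vector can be built with rep. This is an optional shortcut and not how the original code was written; colors_rep is a name made up for the sketch.

```r
# Each of the three colors is repeated twice, in order
colors_rep <- rep(c("royalblue4", "cornflowerblue", "navyblue"), each = 2)

# Same vector typed out, as in the post
colors <- c("royalblue4", "royalblue4", "cornflowerblue", "cornflowerblue", "navyblue", "navyblue")

identical(colors_rep, colors)
```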

---

This command enters the labels for the barplot bars, with \n indicating a new line:

names <- c("Flag reminds of\nSouthern heritage\nmore than\nwhite supremacy", "Flag reminds of\nwhite supremacy\nmore than\nSouthern heritage", "Proud of what\nthe Confederacy\nstood for", "Not proud of what\nthe Confederacy\nstood for", "What happens to\nSoutherners\naffects my life\na lot", "What happens to\nSoutherners\naffects my life\nnot very much")

---

This command plots a barplot of heritage with the indicated main title, no y-axis labels, from 0 to 90 on the y-axis, with horizontal labels, with colors from the "colors" set and names from the "names" set.

bp <- barplot (heritage, main="Percentage who Preferred the Georgia Flag with the Confederate Battle Emblem", ylab=NA, ylim=c(0,90), las=1, col=colors, names=names)

---

This command plots a y-axis (2) with the indicated labels being horizontal (las=2):

mgp.axis(2, at=c(0, 20, 40, 60, 80), las=2)

The above code is for the rightmost set of y-axis labels in the figure.

---

This command enters the text for the barplot bars:

labels <-c("66%\n(n=301)", "0%\n(n=131)", "71%\n(n=160)", "10%\n(n=104)", "49%\n(n=142)", "36%\n(n=122)")
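The bar labels can also be assembled from the numbers, which avoids retyping the percentages. This is a sketch of an alternative, assuming the sample sizes shown in the labels above; the names n and labels_built are made up for illustration.

```r
# Percentages from the heritage vector and sample sizes from the typed labels
heritage <- c(66, 0, 71, 10, 49, 36)
n <- c(301, 131, 160, 104, 142, 122)

# paste0 joins the pieces element by element; \n starts a new line in the label
labels_built <- paste0(heritage, "%\n(n=", n, ")")

labels_built
```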

---

This command plots the labels at the coordinates (barplot value, heritage value + 2):

text(bp, heritage+2, labels, cex=1, pos=3)

---

Full code is here.


This R lesson is for confidence intervals on point estimates. See here for other lessons.

---

Here are the first three lines of code:

pe <- c(2.48, 1.56, 2.96)
y.axis <- c(1:3)
plot(pe, y.axis, type="p", axes=T, pch=19, xlim=c(1,4), ylim=c(1,3))

The first line places 2.48, 1.56, and 2.96 into a vector called "pe" for point estimates; you can call the vector anything that you want, as long as R recognizes the vector name.

The second line sends the integers from 1 to 3 into the vector "y.axis"; instead of y.axis <- c(1:3), you could have written y.axis <- c(1,2,3) to do the same thing.
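The two ways of writing the y.axis vector give the same values for plotting, with one small difference in storage type. The sketch below is only an illustration; y_colon, y_typed, and y_along are names made up for it, and seq_along is an extra option not used in the original code.

```r
y_colon <- c(1:3)      # integers 1, 2, 3
y_typed <- c(1, 2, 3)  # same values, stored as doubles

# seq_along builds 1, 2, 3 from the length of the point-estimate vector
y_along <- seq_along(c(2.48, 1.56, 2.96))

typeof(y_colon)
typeof(y_typed)
all(y_colon == y_typed)
identical(y_colon, y_along)
```

For plotting, the storage type makes no difference; seq_along just saves editing the second line if the number of estimates changes.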

The third line plots a graph with pe on the x-axis and y.axis on the y-axis; type="p" tells R to plot points, axes=T tells R to draw axes, pch=19 indicates what type of points to draw, xlim=c(1,4) indicates that the x-axis extends from 1 to 4, and ylim=c(1,3) indicates that the y-axis extends from 1 to 3.

Here's the graph so far:

ci1

---

Let's make the points a bit larger by adding cex=1.2 to the end of the plot command.

Let's also add a title, using a new line of code: title(main="Negative Stereotype Disagreement > 3").

ci2

---

Let's add the 95% confidence interval lines.

lower <- c(2.26, 1.17, 2.64)
upper <- c(2.70, 1.94, 3.28)
segments(lower, y.axis, upper, y.axis, lwd= 1.3)

The first line indicates the lower ends of the confidence intervals; the second line indicates the upper ends of the confidence intervals; and the segments command draws line segments from the coordinate (lower, y.axis) to the coordinate (upper, y.axis), with lwd=1.3 indicating that the line should be slightly thicker than the default.

Here's what we have so far:

ci3

---

Let's replace the x-axis and y-axis. First, change axes=T to axes=F in the plot command; then add the code axis(1, at=seq(1,4,by=1)) to tell R to draw an axis at the bottom from 1 to 4 with tick marks every 1 unit. Here's what we get:

ci4

Let's get rid of the "pe" and "y.axis" labels. Add to the plot command: xlab="", ylab="". Here's the graph now:

ci5

---

Let's work on the y-axis now:

names <- c("Baseline", "Black\nFamily", "Affirmative\nAction")
axis(2, at=y.axis, label=names)

The first line sends three phrases to the vector "names"; the \n in the phrases tells R to place "Family" and "Action" on a new line. Here's the result:

ci6

Let's make the y-axis labels perpendicular to the y-axis by adding las=2 to the axis(2 line. [las=0 would keep the labels parallel.]

ci7

Now we need to add a little more space to the left of the graph to see the y-axis labels. Add par(mar=c(4, 6, 2, 0)) above the plot command to tell R to make the margins 4, 6, 2, and 0 for the bottom, left, top, and right margins.

ci8

---

Let's say that I decided that I prefer to have the baseline on top of the graph and Affirmative Action at the bottom of the graph. I could use the rev() function to reverse the order of the points in the plot, segments, and axis functions to get:

ci9

---
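The rev() step might look something like the sketch below, which pulls together the pieces from this lesson. It is one reading of the description and is not necessarily the linked code; pdf(NULL) is added only so the sketch runs without opening a window.

```r
pdf(NULL)  # null graphics device for the sketch

pe <- c(2.48, 1.56, 2.96)
lower <- c(2.26, 1.17, 2.64)
upper <- c(2.70, 1.94, 3.28)
names <- c("Baseline", "Black\nFamily", "Affirmative\nAction")
y.axis <- 1:3

par(mar = c(4, 6, 2, 0))

# rev() flips each vector, so Baseline lands at y = 3 (top of the graph)
plot(rev(pe), y.axis, type = "p", axes = FALSE, pch = 19, cex = 1.2,
     xlim = c(1, 4), ylim = c(1, 3), xlab = "", ylab = "")
title(main = "Negative Stereotype Disagreement > 3")
segments(rev(lower), y.axis, rev(upper), y.axis, lwd = 1.3)
axis(1, at = seq(1, 4, by = 1))
axis(2, at = y.axis, labels = rev(names), las = 2)

dev.off()
```

The point estimates, interval ends, and labels all have to be reversed together; reversing only one of them would attach the wrong label to an estimate.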

Here is the whole code for the above graph. By the way, the graph above can be found in my article on social desirability in the list experiment, "You Wouldn't Like Me When I'm Angry."


This R lesson is for the plot command. See here for other lessons.

---

The start of this code is a bit complex. It's from R Commander, which is a way to use R through a graphical interface without having to write code.

library(foreign)

The library function loads the foreign package, which is used to import data from SPSS, Stata, or some other software.

DWHouse <- read.dta("C:/house_polarization46_113v9.dta", convert.dates=TRUE, convert.factors=TRUE, missing.type=TRUE, convert.underscore=TRUE, warn.missing.labels=TRUE)

The above command reads data from Stata (.dta extension) and places the data into DWHouse. The house_polarization46_113v9.dta dataset is from Voteview polarization data, located here. [The v9 on the end of the dataset indicates that I saved the dataset as Stata version 9.]

---

Here's the plot command:

plot(repmean1~year, type="p", xlim=c(1900,2012), ylim=c(-1,1), xlab="Year", ylab="Liberal - Conservative", pch=19, col="red", main="House", data=DWHouse)

Here is what the arguments mean: the tilde in repmean1~year plots repmean1 as a function of year, type="p" indicates to plot points, xlim=c(1900,2012) indicates the limits for the x-axis, ylim=c(-1,1) indicates the limits for the y-axis, xlab="Year" and ylab="Liberal - Conservative" respectively indicate labels for the x-axis and y-axis, pch=19 indicates to use the 19 plotting character [see here for a list of pchs], col="red" indicates the color for the pchs [see here for a list of colors], main="House" indicates the main title, and data=DWHouse indicates the data to plot.

Here's what the graph looks like so far:

plotgop

---

The repmean1 plotted above is the Republican Party mean for the first-dimension DW-Nominate scores among members of the House of Representatives. Let's add the Democrats. Instead of adding a new plot command, we just add points:

points(demmean1~year, type="p", pch=19, col="blue", data=DWHouse)

Now let's add some labels:

text(1960,0.4,labels="GOP mean", col="red")
text(1960,-0.4,labels="Dem mean", col="blue")

The first command adds text at the coordinate x=1960 and y=0.4; the text itself is "GOP mean," and the color of the text is red. I picked x=1960 and y=0.4 through trial and error to see where the text would look the nicest.

Here's the graph now:

plotgopdem

---

Notice that the x-axis is labeled in increments of 20 years (1900, 1920, 1940, ...). This can be changed as follows. First, add axes=F to the plot command to shut off axes (you could also write axes=FALSE); then add these axis lines below the plot command:

axis(1, at=seq(1900, 2020, 10))
axis(2, at=seq(-1, 1, 0.5))

The above lines tell R to plot axes at the indicated intervals. The first line arguments are: 1 tells R to plot an axis below [1=below, 2=left, 3=above, and 4=right], and the (1900, 2020, 10) sequence tells R to plot from 1900 to 2020 and place tick marks every 10 years. Here's the resulting graph:

plotgopdem20

---
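To see exactly where the tick marks fall, the seq calls from the axis lines can be run on their own. This is only a quick check; x_ticks and y_ticks are names made up for the sketch.

```r
# Tick positions passed to axis(1, ...) and axis(2, ...)
x_ticks <- seq(1900, 2020, 10)
y_ticks <- seq(-1, 1, 0.5)

x_ticks
y_ticks
length(x_ticks)
```

The x-axis sequence runs past the xlim upper limit of 2012, so the last tick (2020) falls outside the plotting range.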

Notice that the x-axis and y-axis do not touch in the graph above. There are a few extra points plotted that I did not intend to plot: I meant to start the graph at 1900 so that the first point was 1901 (DW-Nominate scores are provided in the dataset every two years starting with 1879). To get the x-axis and y-axis to touch, add xaxs="i", yaxs="i" to the plot command. Let's also add box() to get a box around the graph, like we had in the first two graphs above.

plotgopdem20i
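The effect of xaxs="i" and yaxs="i" can be seen in the plotting-region limits that R reports through par("usr"). The sketch below uses an empty plot with a single placeholder point (1950, 0) instead of the Voteview data, and pdf(NULL) only so that it runs without opening a window; usr_regular and usr_internal are names made up for illustration.

```r
pdf(NULL)  # null graphics device for the sketch

# Default axis style ("r"): R pads the requested limits by 4% on each side
plot(1950, 0, type = "n", xlim = c(1900, 2012), ylim = c(-1, 1))
usr_regular <- par("usr")

# Internal axis style ("i"): the plotting region ends exactly at the limits
plot(1950, 0, type = "n", xlim = c(1900, 2012), ylim = c(-1, 1),
     xaxs = "i", yaxs = "i")
usr_internal <- par("usr")

dev.off()

usr_regular   # x limits, then y limits, including the padding
usr_internal  # x limits, then y limits, with no padding
```

The padding under the default style is what left room for the pre-1900 points and kept the axes from touching.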

---

Here is the whole code for the plot above.


The first graph in this series is a barplot. This post will show how to add error bars to a barplot.

Here's the data that we want to plot, from a t-test conducted in Stata:

ttest

---

Here's the first part of the code:

library(Hmisc)

The code above loads the Hmisc package, which has the errbar function that we will use.

means <- c(2.96, 3.59)

The code above places 2.96 and 3.59 into the vector "means".

bp = barplot(means, ylim=c(0,6), names.arg=c("Black", "White"), ylab="Support for life in prison without parole", xlab="Race of the convicted teen", width=c(0.2,0.2), xlim=c(0,1), space=c(1,1), las=1, main="Black Non-Hispanic Respondents")

The code above is similar to the barplot code that we used before, but notice that in this case the output of the barplot function is stored in bp (the = works like <- here). The remainder of the arguments are: means indicates what data to plot, ylim=c(0,6) indicates that the limits of the y-axis are 0 and 6, names.arg=c("Black", "White") indicates the names for the bars, ylab="Support for life in prison without parole" indicates the label for the y-axis, xlab="Race of the convicted teen" indicates the label for the x-axis, width=c(0.2,0.2) indicates the width of the bars, xlim=c(0,1) indicates that the limits of the x-axis are 0 and 1, space=c(1,1) indicates the spacing between bars, las=1 indicates horizontal axis labels, and main="Black Non-Hispanic Respondents" indicates the main title for the graph.
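The reason for storing the barplot output is that barplot returns the x-coordinates of the bar midpoints, which later commands can use to line things up with the bars. The sketch below prints those midpoints for the settings above and, for comparison, for the defaults; bp_default is a name made up for the sketch, the labels and titles are left out for brevity, and pdf(NULL) is used only so it runs without opening a window.

```r
pdf(NULL)  # null graphics device for the sketch

means <- c(2.96, 3.59)

# Same bar geometry as in the post: narrow bars with a bar-width gap before each
bp <- barplot(means, ylim = c(0, 6), width = c(0.2, 0.2),
              xlim = c(0, 1), space = c(1, 1))

# Default geometry, for comparison
bp_default <- barplot(means)

dev.off()

as.numeric(bp)          # midpoints with width 0.2 and space 1
as.numeric(bp_default)  # midpoints with the default width and spacing
```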

Here's the graph so far:

95a

---

Here's how to add the error bars:

se <- c(0.2346, 0.2022)
lower = means-1.96*se
upper = means+1.96*se
errbar(bp, means, upper, lower, add=T)

The first line sends the values for the standard errors into the vector "se". The second and third lines are used to calculate the ends of the error bars. The fourth line tells R to plot error bars; the add=T option tells R to keep the existing graph; without add=T, the graph will show only the error bars.
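The interval arithmetic can be checked without drawing anything. This is a minimal sketch of the 1.96 calculation, using the means and standard errors from this post; the subtraction and addition are vectorized, so each result has one value per bar.

```r
means <- c(2.96, 3.59)
se <- c(0.2346, 0.2022)

# One lower and one upper limit per bar
lower <- means - 1.96 * se
upper <- means + 1.96 * se

round(rbind(lower, upper), 2)
```

These limits can differ by a few hundredths from the limits in the Stata output, presumably because Stata's t-test intervals are based on the t distribution and on unrounded means instead of the 1.96 normal multiplier.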

Finally, add the code box(bty="L") so that there is a line on the bottom of the graph. The bty="L" tells R to make the axis look like the letter L. Other options include C, O, 7, and U.

Here is the graph now:

95b

---

It's not necessary to use the 1.96 multiplier for the error bars. The following code plugged in the lower and upper limits directly from the Stata output.

library(Hmisc)

means <- c(2.96, 3.59)

bp = barplot(means, ylim=c(0,6), names.arg = c("Black", "White"), ylab="Support for life in prison without parole", xlab="Race of the convicted teen", xpd=T, width=c(0.2,0.2), xlim=c(0,1), space=c(1,1), main="Black Non-Hispanic Respondents")

se <- c(0.2346, 0.2022)
lower = c(2.48, 3.19)
upper = c(3.42, 4.00)
errbar(bp, means, upper, lower, add=T)

box(bty="O")

---

Here's what the graph looks like for the above, shortened code, with the bty="O":

95c

---

Data from this post were drawn from here, with the article here. Click here for the graph code.


If I remember correctly, my first introduction to R came when fellow Pitt graduate student Hirokazu Kikuchi requested that R be installed on the polisci lab computers. I looked into R and found this webpage based on a 2007 Perspectives on Politics article by Jonathan Kastellec and Eduardo Leoni. That link is a good place to start, but in this post I'll introduce a few lines of code to illustrate how nice and easy R can be. (Not that R is always easy.)

I'll indicate lines of R code in bold.

---

less5 <- c(40.91, 7.67, 7.11, 6.19, 15.65, 6.4, 4.57, 4.43, 2.42, 4.66)

The above command assigns the ten numbers (from 40.91 to 4.66) to a vector called "less5." c() is a concatenation function. The following command does the same thing:

c(40.91, 7.67, 7.11, 6.19, 15.65, 6.4, 4.57, 4.43, 2.42, 4.66) -> less5

---

barplot (less5, main="Countries with a mean < 5", ylab="Percent", ylim=c(0, 40), names=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))

The barplot function tells R to plot a bar chart. These are the arguments: less5 indicates the vector to plot, main="Countries with a mean < 5" indicates the main plot title, ylab="Percent" indicates the label for the y-axis, ylim=c(0, 40) indicates that the y-axis should run from 0 to 40, and names=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10") indicates the set of names that should be placed below the set of bars.

Here's what this graph looks like, based on the two lines of code:

barplot1

---

Let's plot three graphs together. Here's the code for graphs 2 and 3:

from56 <- c(18.35, 4.41, 5.68, 4.61, 22.63, 9.31, 7.63, 8.65, 4.99, 13.75)

barplot (from56, main="Countries with a mean > 5 and < 6", ylab="Percent", ylim=c(0, 40), names=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))

more6 <- c(7.99, 2.26, 3.37, 3.62, 17.29, 9.46, 8.95, 12.83, 8.93, 25.3)

barplot (more6, main="Countries with a mean > 6", ylab="Percent", ylim=c(0, 40), names=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))

Let's put the par function at the top of the code to tell R how to plot these three graphs:

par(mfrow=c(1, 3))

The above line of code tells R to plot 1 row and 3 columns of plots. Here's the output:

barplot3

This is the output for par(mfrow=c(3, 1)):

barplot3v

---
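Since the three barplot commands differ only in the data and the title, they could also be written as a loop. This is an optional sketch and not the code in the linked text file; groups, ttl, and layout_used are names made up for illustration, as.character(1:10) stands in for the typed-out bar names, and pdf(NULL) is used only so the sketch runs without opening a window.

```r
pdf(NULL)  # null graphics device for the sketch

less5  <- c(40.91, 7.67, 7.11, 6.19, 15.65, 6.4, 4.57, 4.43, 2.42, 4.66)
from56 <- c(18.35, 4.41, 5.68, 4.61, 22.63, 9.31, 7.63, 8.65, 4.99, 13.75)
more6  <- c(7.99, 2.26, 3.37, 3.62, 17.29, 9.46, 8.95, 12.83, 8.93, 25.3)

# A named list: each name doubles as the plot title
groups <- list("Countries with a mean < 5" = less5,
               "Countries with a mean > 5 and < 6" = from56,
               "Countries with a mean > 6" = more6)

par(mfrow = c(1, 3))  # 1 row and 3 columns of plots
for (ttl in names(groups)) {
  barplot(groups[[ttl]], main = ttl, ylab = "Percent", ylim = c(0, 40),
          names.arg = as.character(1:10))
}
layout_used <- par("mfrow")

dev.off()
```

Changing par(mfrow = c(1, 3)) to par(mfrow = c(3, 1)) in the sketch stacks the same three plots vertically.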

That's it for this post: here is a text file of the code. By the way, the graph above can be found in my article on midpoint misperceptions.
