Finding bias in discussions of campus sexual assault data

When science is used to support proposed changes to public policy, it isn’t uncommon for opponents of the policy changes to question the legitimacy of the studies cited. This often leads to rejection of scientific studies for completely unscientific reasons, and can even devolve into outright scientific denialism.

Earlier this year, the Obama administration proposed controversial policy changes related to sexual assault prevention on college campuses. As evidence of the need for reform, the White House Task Force to Protect Students from Sexual Assault cited the statistic that one in five women attending college are sexually assaulted at some point during their time on campus. Unsurprisingly, those opposed to the sexual assault policy changes are questioning the legitimacy of both the statistic and the study that produced it.

Recently, Emily Yoffe published an article in which she argues that the statistics on sexual assault presented by the Obama administration are misleading. Yoffe describes herself as “bringing some healthy skepticism to the hard work of putting a number on the prevalence of campus rape.” The thing is, skepticism in and of itself isn’t really that helpful unless you understand how to think critically about scientific studies. Yoffe’s article presents a good example of how misconceptions about research methodology and statistics can derail an otherwise productive conversation and steer it towards the territory of science denialism.

Before we get into the details of Yoffe’s article, I need to give you some background on current research related to campus sexual assault. Two of the most highly cited sources on rates of sexual assault on college campuses are the Campus Sexual Assault Study (CSA) and the National Crime Victimization Survey (NCVS). The statistics reported in the two studies are quite different: the CSA found an incidence of sexual assault of nearly 20% among participants (or one in five, as mentioned above), while the NCVS reports that around 0.6% of college females experience sexual assault.

The research methodologies of the two studies differ in a number of important ways. The CSA surveyed over 5,000 women from two large public universities via an anonymous email questionnaire. The NCVS is a survey of 160,000 people across the country that uses interviews conducted in person to collect data. This article gives a detailed explanation of why, based on these and other differences in methodology between the studies, we should not be surprised that the specific findings of the studies are different.

Okay, back to Yoffe’s article. In order to make the point that the new policies proposed by the Obama administration are based on research that is “problematic,” Yoffe contacted Christopher Krebs (the lead author of the CSA) for an interview. Yoffe asked Krebs whether the study “represents the experience” of the 12 million female college students currently in the US. According to Yoffe’s article, Krebs responded by saying that the study sampled two schools, which does not make their data nationally representative.

Based on Krebs’ answer that the data are not “representative,” Yoffe argues that the results of the CSA study must be totally irrelevant to the current discussion of nationwide campus sexual assault. In fact, if you listen to this podcast where Yoffe is interviewed about her article, it is clear that she is convinced she uncovered some sort of bombshell: she says she was “floored” to hear Krebs’ answer, and that it was “bold of him to admit.”

The truth is, Krebs’ answer is completely unsurprising from a research point of view, and here’s why: the idea of data being “representative” has a specific meaning in statistics that isn’t the same as its everyday colloquial usage. In statistics, a population is the set of all individuals that share a characteristic a researcher is interested in studying (e.g., all college females). A sample is the subset of individuals from a population that a researcher has measured (e.g., in the case of the CSA study, survey respondents from two universities). A representative sample is obtained by randomly sampling individuals from the population, which is the most statistically sound way to ensure that the characteristics of the sample match those of the population.
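To make these definitions concrete, here is a minimal Python sketch of a population, a simple random sample, and the estimate the sample gives. Everything in it is invented for illustration: the population size, the 30% rate for an unnamed yes/no trait, and the function name `simple_random_sample_rate` are assumptions, not figures or methods from the CSA or the NCVS.

```python
import random


def simple_random_sample_rate(population, n, rng):
    """Estimate the share of 1s in `population` from a simple random sample of size n."""
    sample = rng.sample(population, n)  # every individual is equally likely to be chosen
    return sum(sample) / n


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 individuals, 30% of whom have some trait (1 = yes).
population = [1] * 30_000 + [0] * 70_000
true_rate = sum(population) / len(population)

# Measure only 5,000 of them, chosen at random.
estimate = simple_random_sample_rate(population, 5_000, rng)

print(f"true rate in population: {true_rate:.3f}")
print(f"estimate from sample:    {estimate:.3f}")
```

Under these assumptions the sample estimate should land close to the true rate even though only 5% of the population was measured, which is the sense in which a random sample is "representative."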

Truly random samples are incredibly difficult to come by, especially in the case of survey-based studies. The CSA study is not a random sample of all college females because it was conducted at only two universities. But even if researchers had been able to send their surveys to a random sample of college women across the country (or to every single woman in college in the U.S., for that matter), they would still have almost no chance of sampling a truly random subset of the population. This is because they have very little control over who actually responds to the email and takes their survey.
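The non-response problem described above can be sketched with a small simulation. The numbers are hypothetical: a 30% trait rate, and response probabilities of 50% and 25% chosen only to show the mechanism. In a real survey the difference could be smaller, or run in the opposite direction, and the helper name `respondent_rate` is an assumption of this sketch.

```python
import random


def respondent_rate(population, p_respond_with, p_respond_without, rng):
    """Share of 1s among those who respond, when response odds depend on the trait."""
    responses = [
        has_trait
        for has_trait in population
        if rng.random() < (p_respond_with if has_trait else p_respond_without)
    ]
    return sum(responses) / len(responses)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 individuals, 30% with the trait (1 = yes).
population = [1] * 30_000 + [0] * 70_000
true_rate = sum(population) / len(population)

# The survey is sent to EVERYONE, but (by assumption) people with the trait
# are twice as likely to answer as people without it.
observed_rate = respondent_rate(population, 0.50, 0.25, rng)

print(f"true rate in population: {true_rate:.3f}")
print(f"rate among respondents:  {observed_rate:.3f}")
```

Even with an invitation sent to every member of the population, the respondents are not a random subset, so the rate among respondents drifts away from the true rate in whichever direction the response pattern pushes it.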

So, yes, the CSA sample isn’t a truly representative sample of the entire population of female college students. That doesn’t mean (as Yoffe implies) that we can’t learn anything from it. The field of statistics allows researchers to analyze data collected from a sample in order to learn something about a population. This is really good news, because if scientists had to measure every single individual in a population in order to learn something about that population, it would be completely impractical. We wouldn’t know anything about anything!

Yoffe’s mission, to introduce “some healthy skepticism” to the discussion, is one that I respect. As a scientist, I support skepticism 100%. That being said, she didn’t do a very good job of it. There are much more productive ways to exercise skepticism in order to learn as much as possible from an existing body of research.

Step one for being a well-balanced skeptic of scientific research is to think about potential sources of bias in a study. Instead of dismissing the CSA study because the researchers couldn’t ensure completely random sampling of a population of 12 million women spread across the country, it is more productive to consider whether the researchers’ sampling design was likely to introduce bias in one direction or the other. For example, would you expect the incidence of sexual assault to be predictably higher or lower at the two universities surveyed than at colleges throughout the rest of the country? If the answer is yes, it may not be appropriate to draw broad conclusions about the population from that study.
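One way to think through this question is with a toy simulation that compares a sample drawn from two campuses with a same-sized sample drawn from the whole population. This is only a sketch under made-up assumptions: 50 hypothetical campuses of 2,000 students each, with campus-level rates of an unnamed trait spread evenly from 10% to 50%. It says nothing about which direction, if any, the two CSA universities differ from the rest of the country.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical campuses: 50 of them, 2,000 students each, rates from 10% to 50%.
campus_rates = [0.10 + 0.40 * i / 49 for i in range(50)]
campuses = [
    [1 if rng.random() < rate else 0 for _ in range(2_000)]
    for rate in campus_rates
]
everyone = [student for campus in campuses for student in campus]
true_rate = sum(everyone) / len(everyone)


def two_campus_estimate(rng):
    """Survey every student at two randomly chosen campuses (4,000 people)."""
    chosen = rng.sample(campuses, 2)
    pooled = chosen[0] + chosen[1]
    return sum(pooled) / len(pooled)


def nationwide_estimate(rng, n=4_000):
    """Survey the same number of students, drawn at random from all campuses."""
    sample = rng.sample(everyone, n)
    return sum(sample) / n


# Repeat each design 200 times to see how much the estimate moves around.
two_campus = [two_campus_estimate(rng) for _ in range(200)]
nationwide = [nationwide_estimate(rng) for _ in range(200)]
spread_two = statistics.pstdev(two_campus)
spread_nat = statistics.pstdev(nationwide)

print(f"true rate:                  {true_rate:.3f}")
print(f"spread, two-campus design:  {spread_two:.3f}")
print(f"spread, nationwide design:  {spread_nat:.3f}")
```

In this toy setup both designs measure 4,000 people, yet the two-campus estimate depends heavily on which campuses happen to be chosen. Whether that matters for a real study depends on how much campuses actually differ, which is exactly the question a skeptic should be asking.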

Second, it is important to be balanced in your consideration of the strengths and limitations of different studies. One of the most perplexing things about Yoffe’s article is that her skepticism seems completely one-sided. She attacks the CSA study for being unrepresentative, without exploring the representativeness of the other studies she mentioned in the article. Despite its limited sample size, the CSA might actually be a more representative sample than the NCVS in some respects (depending on the population we are interested in learning about). It is unscientific to hold one study to a higher standard than another just because it doesn’t support your side of the story.

Articles like Yoffe’s purport to be less biased, but ultimately end up taking a flawed approach to interpreting the research in an effort to drive a point home. Science can help inform decisions about public policy, but only if approached in a scientific way. This means making an effort to understand the strengths and limitations of the studies we are considering, the reasons why we would expect studies to differ in their statistics, and the kinds of broader conclusions that are appropriate to draw from the data currently available. Otherwise, when it comes to discussing the science, we run the risk of making about as much sense as GOB Bluth.