scientific replication | Faith Seeking Understanding

A 2014 study by a well-known researcher from Columbia University indicated that “Sexual minorities living in communities with high levels of anti-gay prejudice experienced a higher hazard of mortality than those living in low-prejudice communities.” The press release for the study said it was the first study to look at the consequences of anti-gay prejudice for mortality. The study’s lead author, Mark Hatzenbuehler, said: “The results of this study suggest a broadening of the consequences of prejudice to include premature death.” The authors thought their study’s results highlighted the importance of examining structural forms of stigma and prejudice as social determinants of health and longevity among minority populations. This is a significant and potentially important finding, except it may not be true.

The original study, “Structural Stigma and All-Cause Mortality in Sexual Minority Populations” by Hatzenbuehler et al., was published in the February 2014 issue of Social Science & Medicine. Another researcher, Mark Regnerus, set out to replicate the Hatzenbuehler et al. study, but was not able to do so. Regnerus included a more refined imputation strategy in his replication, but still failed to find any significant results: “No data imputation approach yielded parameters that supported the original study’s conclusions.” Regnerus said:

Ten different approaches to multiple imputation of missing data yielded none in which the effect of structural stigma on the mortality of sexual minorities was statistically significant. Minimally, the original study’s structural stigma variable (and hence its key result) is so sensitive to subjective measurement decisions as to be rendered unreliable.

Writing for the National Review, Maggie Gallagher said that Regnerus’s failure to replicate the Hatzenbuehler et al. study amounted to a repudiation of that study. She also thought the study was faked. “When social justice displaces truth as the core value of academics, bad things happen to science.” She implied Hatzenbuehler might have slipped a bogus study into a major social-science journal, “confident that nobody would want to review and contest its findings, which so please the overwhelmingly liberal academy.”

Gallagher then referred to Mark Regnerus as an emerging scientific hero, a “modern-day Galileo standing up to the new theology of the Left.” But I think she misses the point. Both Hatzenbuehler and Regnerus are doing exactly what they are supposed to do in science: publishing their results and attempting to replicate the research of others. Henry Bauer, a professor of Chemistry & Science Studies at Virginia Polytechnic Institute and State University, describes how the “knowledge filter” in science can help uncover the real failures and confirm the true successes.

Bauer asks what would happen if most scientists rounded off or fudged their findings. What if they thought more about who wanted results and less about what an experiment actually showed? “To understand why science may be reliable or unreliable, you have to recognize that science is done by human beings, and that how they interact with one another is absolutely crucial.” He then went on to describe how frontier science leads to publication in the primary literature.

If those [findings] seem interesting enough to others, they’ll be used and thereby tested and perhaps modified or extended – or found to be untrue. Whatever survives as useful knowledge gets cited in other articles and eventually in review articles and monographs, the secondary literature, which is considerably more consensual and reliable than the primary literature.

Regnerus’s findings themselves have to be replicated, by more than one additional study, before Gallagher’s assessment that Regnerus repudiated Hatzenbuehler et al. can be considered confirmed. Concluding that the study was faked or bogus based on his findings alone is irresponsible and goes beyond what Regnerus himself said.

Regnerus said the findings of the Hatzenbuehler et al. study seemed to be very sensitive to subjective decisions made about the imputation of missing data, “decisions to which readers are not privy.” He also thought the structural stigma variable itself was questionable: “Hence the original study’s claims that such stigma stably accounts for 12 years of diminished life span among sexual minorities seems unfounded, since it is entirely mitigated in multiple attempts to replicate the imputed stigma variable.” He thought his study highlighted the importance of cooperation and transparency in science.

The unavailability of the original study’s syntax and the insufficient description of multiple imputation procedures leave unclear the reasons for the failed replication. It does, however, suggest that the results are far more contingent and tenuous than the original authors conveyed. This should not be read as a commentary on missing data or on the broader field of the study of social stigma on physical and emotional health outcomes, but rather as a call to greater transparency in science (Ioannidis, 2005). While the original study is not unique in its lack of details about multiple imputation procedures, future efforts ought to include supplementary material (online) enabling scholars elsewhere to evaluate and replicate studies’ central findings (Rezvan et al., 2015). This would enhance the educational content of studies as well as improve disciplinary rigor across research domains.

Regnerus is not a scientific hero and Hatzenbuehler is not a research villain. But two other individuals identified by Gallagher in her article may fit within those categories.

Michael LaCour co-authored a paper along with Donald Green that was published in the prestigious journal Science in December of 2014. The original article abstract said: “LaCour and Green demonstrate that simply a 20-minute conversation with a gay canvasser produced a large and sustained shift in attitudes toward same-sex marriage for Los Angeles County residents.” Green is a highly respected political science professor now at Columbia. LaCour was a political science grad student at UCLA.

Back in September of 2013, Michael LaCour met with David Broockman at the annual meeting of the American Political Science Association and showed him some of the early results of his study. Writing for NYMag.com, Jesse Singal noted how Broockman was “blown away” by some of the results LaCour shared with him. LaCour also told him he was looking to get Donald Green as a coauthor on the paper. Coincidentally, Green happened to be Broockman’s undergraduate advisor when they were both at Yale.

Singal pointed out that LaCour’s results were so noteworthy because they contradicted every established belief about political persuasion. “The sheer magnitude of effect LaCour had found in his study simply doesn’t happen — piles of previous research had shown that.” In early 2015, Broockman decided to replicate LaCour’s findings. The first clue that something was wrong came when he realized the estimated cost of a replication would be a cool million dollars. Where would a grad student like LaCour get the funding for a study like that? That first anomaly eventually led to “Irregularities in LaCour (2014),” a 27-page report he coauthored with Josh Kalla and Yale University political scientist Peter Aronow.

“Irregularities” is diplomatic phrasing; what the trio found was that there’s no evidence LaCour ever actually collaborated with uSamp, the survey firm he claimed to have worked with to produce his data, and that he most likely didn’t commission any surveys whatsoever. Instead, he took a preexisting dataset, pawned it off as his own, and faked the persuasion “effects” of the canvassing. It’s the sort of brazen data fraud you just don’t see that often, especially in a journal like Science.

Green quickly emailed the journal and asked for a retraction, which he received. When contacted about comments that he had failed in his supervisory role for the study, Green said that assessment was entirely fair: “I am deeply embarrassed that I did not suspect and discover the fabrication of the survey data and grateful to the team of researchers who brought it to my attention.”

LaCour had a job offer as an incoming assistant professor at Princeton rescinded. He also reportedly lied about several items on his curriculum vitae, including grants and a teaching award. Neuroskeptic wrote a post mortem of the LaCour controversy for Discover Magazine. He thought LaCour’s objections to the findings of Broockman et al. were weak and failed to refute their central criticism.

Cases of seeming scientific fraud, like that of LaCour, draw attention to themselves when they are discovered. Writing for STAT News, Ivan Oransky and Adam Marcus described a survey of working scientists by researchers in the Netherlands. The respondents were asked to score 60 research misbehaviors according to their impressions of how often the misbehaviors occur, their preventability, their impact on truth (validity), and their impact on trust between scientists. The respondents were more concerned with sloppy science than with scientific fraud. Fraud, when it occurs, has a significant impact on truth and public trust. But those cases are rare, and detected cases are even rarer. The researchers concluded:

Our ranking results seem to suggest that selective reporting, selective citing, and flaws in quality assurance and mentoring are the major evils of modern research. A picture emerges not of concern about wholesale fraud but of profound concerns that many scientists may be cutting corners and engage in sloppy science, possibly with a view to get more positive and more spectacular results that will be easier to publish in a high-impact journal and will attract many citations. In the fostering of responsible conduct of research, we recommend to develop interventions that actively discourage the high-ranking misbehaviors from our study.

So it would seem that the problems with the Hatzenbuehler et al. study are not fraud, but could be due to smaller, more pervasive issues in its research, such as a shoddy methodology. The LaCour case catches more attention and generates mistrust because of its apparent fraud. Oransky and Marcus are right. Although not as flashy as fraudulent research, the smaller, less outrageous research sins are a greater threat to scientific credibility. Gallagher may have let her own ideology influence how she emphasized these two cases, but she was unquestionably right in her concluding remarks: “Science is not right-wing or left-wing. But to work, it needs scientists fearlessly committed to truth over their preferred outcomes.”


Charles Sigler, D.Phil.


“Political” Science?