The Reproducibility Problem

Copyright: Fernando Gregory

In January 2014, a Japanese stem cell scientist, Dr. Haruko Obokata, published what looked like groundbreaking research in the journal Nature suggesting that stem cells could be made quickly and easily. But as James Gallagher of the BBC noted, “the findings were too good to be true.” Amid concern within the scientific community that the results had been fabricated, her work was investigated by the Riken Institute, the center where she conducted her research. In July, Riken issued a retraction of the original article, noting the presence of “multiple errors.” Obokata was later found guilty of misconduct. In December of 2014, Riken announced that its attempts to reproduce the results had failed. Dr. Obokata resigned, saying, “I even can’t find the words for an apology.”

The ability to repeat or replicate someone’s research is the way scientists weed out nonsense, stupidity and pseudo-science from legitimate science. In Scientific Literacy and the Scientific Method, Henry Bauer described a ‘knowledge filter’ that illustrates this process. The first stage is research, or frontier science. The research is then presented to the editors and referees of scientific journals for review, in hopes of being published. It may also be presented to other interested parties in seminars or at conferences. If the research successfully passes through this first filter, it is published in the primary literature of the respective scientific field and passes into the second stage of the scientific knowledge filter.

The second filter consists of others trying to replicate the initial research or apply some modification or extension of the original research. This is where the reproducibility problem occurs. The majority of these replications fail. But if the results of the initial research can be replicated, these results are also published as review articles or monographs (the third stage). After being successfully replicated, the original research is seen as “mostly reliable,” according to Bauer.

So while the stem cell research of Dr. Obokata made it through the first filter to the second stage, it seems that it shouldn’t have. The implication is that Nature didn’t do a very good job reviewing the data submitted to it for publication. However, when the second filtering process began, it detected the errors that should have been caught by the first filter and kept what was poor science from being accepted as reliable science.

A third filter occurs where the concordance of the research results with other fields of science is explored. There is also continued research by others who again confirm, modify and extend the original findings. When the original research successfully comes through this filter, it is “mostly very reliable,” and will get included into scientific textbooks.

Francis Collins and Lawrence Tabak of the National Institutes of Health (NIH) commented: “Science has long been regarded as ‘self-correcting’, given that it is founded on the replication of earlier work.” But they noted how the checks and balances built into the process of doing science, which once helped to ensure its trustworthiness, have been compromised. This has contributed to researchers’ inability to reproduce initial research findings. Think here of how Obokata’s stem cell research was approved for publication in Nature, one of the most prestigious science journals.

The reproducibility problem has become a serious concern within research on psychiatric disorders. Thomas Insel, the Director of the National Institute of Mental Health (NIMH), published a November 14, 2014 post on his blog about the “reproducibility problem” in scientific publications. He said that “as much as 80 percent of the science from academic labs, even science published in the best journals, cannot be replicated.” Insel said this failure was not always because of fraud or the fabrication of results. Perhaps his comment was made with the above discussion of Dr. Obokata’s research in mind. Then again, maybe it was made in regard to the following study.

On September 16, 2014, the journal Translational Psychiatry published a study done at Northwestern University that claimed it was the “First Blood Test Able to Diagnose Depression in Adults.” Eva Redei, the co-author of the study, said: “This clearly indicates that you can have a blood-based laboratory test for depression, providing a scientific diagnosis in the same way someone is diagnosed with high blood pressure or high cholesterol.” A surprise finding of the study was that the blood test also predicted who would benefit from cognitive behavioral therapy. The study was supported by grants from the NIMH and the NIH.

The Redei et al. study received a good bit of positive attention in the news media. It was even called a “game changing” test for depression. WebMD, Newsweek, Huffington Post, US News and World Report, Time and others published articles on the research—all on the Translational Psychiatry publication date of September 16th. Then James Coyne, PhD, published a critique of the press coverage and the study in his “Quick Thoughts” blog. Coyne systematically critiqued the claims of the Redei et al. study. Responding to Dr. Redei’s quote in the above paragraph, he said: “Maybe someday we will have a blood-based laboratory test for depression, but by themselves, these data do not increase the probability.”

He wondered why these mental health professionals would make such “misleading, premature, and potentially harmful claims.” In part, he thought it was because it was fashionable and newsworthy to claim progress in an objective blood test for depression. “Indeed, Thomas Insel, the director of NIMH is now insisting that even grant applications for psychotherapy research include examining potential biomarkers.” Coyne ended with quotes that indicated that Redei et al. were hoping to monetize their blood test. In an article for Genomeweb.com, Coyne quoted them as saying: “Now, the group is looking to develop this test into a commercial product, and seeking investment and partners.”

Coyne then posted a more thorough critique of the study, which he said would allow readers to “learn to critically examine the credibility of such claims that will inevitably arise in the future.” He noted how the small sample size contributed to its strong results—which are unlikely to be replicated in other samples. He also cited much larger studies looking for biomarkers for depression that failed to find evidence for them. His critique of the Redei et al. study was devastating. The comments from others seemed to agree. But how could these researchers be so blind?

Redei et al. apparently believed unquestioningly that there is a biological cause for depression. As a result, their commitment to this belief affected how they did their research, to the extent that they were blind to the problems Coyne pointed out. Listen to the video embedded in the link “First Blood Test Able to Diagnose Depression in Adults” to hear Dr. Redei acknowledge that she believes depression is a disease like any other disease. Otherwise, why attempt to find a blood test for depression?

Attempts to replicate the Redei et al. study, if they are ever done, will likely raise further questions and refute the findings of what Coyne called a study with a “modest sample size and voodoo statistics.” Before we go chasing down another dead end in the labyrinth of failed efforts to find a biochemical cause for depression, let’s stop and be clear about whether this “game changer” is really what it claims to be.


Psychiatry’s Mythical Phoenix

Prominent research psychiatrists are beginning to sound like their “antipsychiatric” critics. They are saying the current DSM diagnostic system isn’t valid; that something new, something scientifically sound and useful for treating patients, is needed. One of these research psychiatrists is Thomas Insel, the Director of the National Institute of Mental Health (NIMH). He dropped a bombshell last year when he announced that the NIMH would be “re-orienting its research away from DSM categories.” The New York Times quoted Insel as saying: “As long as the research community takes the D.S.M. to be a bible, we’ll never make progress. . . . People think that everything has to match D.S.M. criteria, but you know what? Biology never read that book.”

So the NIMH has developed a new research strategy to classify mental disorders based upon “dimensions of observable behavior and neurobiological measures.” This strategic plan is known as: Research Domain Criteria (RDoC). The long-term goal is for RDoC to be “a framework to guide classification of patients for research studies.” It was not meant to be a useful clinical tool. “It is hoped that by creating a framework that interfaces directly with genomics, neuroscience, and behavioral science, progress in explicating etiology and suggesting new treatments will be markedly facilitated.”

RDoC is in search of the holy grail of psychiatry: reliable biomarkers (measurable indicators of a biological state or condition) for mental disorders. This search for biomarkers has been going on for decades. David Kupfer, the chair of the DSM-5 Task Force said: “We’ve been telling patients for several decades that we are waiting for biomarkers. We’re still waiting.” Susan Kamens suggested that the imminent discovery of biomarkers has been “the driving expectation of psychiatry since its birth in the 18th century.” But there are some problems with the RDoC quest.

What RDoC proposes is to replace the DSM diagnoses used currently to frame mental health research with broad categories based upon cognitive, behavioral and neural mechanisms. This means that the NIMH will be supporting research projects that look across or sub-divide existing DSM categories. But this very same DSM is what is used to assess the potential of future NIMH-funded research under RDoC.

In an article found in Nature, “Psychiatry Framework Seeks to Reform Diagnostic Doctrine,” Nassir Ghaemi said: “It is very hard for people who have been following the DSM their entire professional lives to suddenly give it up.” Ghaemi has felt shackled by the DSM. He wanted to do some research that cut across DSM categories. But his colleagues warned him against straying too far from the DSM structure when he applied for funding from the NIMH, because peer reviewers tended to insist on research structured by the DSM. So he held off from applying.

Steven Hyman, a former NIMH director, blames the DSM for hampering research into the biological or genetic basis of psychiatric illness. He said it was “a fool’s errand” to use symptom-based DSM diagnosis with little basis in nature to try and find a biomarker. Hyman urged the NIMH to think about how biomarkers identified by RDoC would be incorporated into mental health practice with the DSM. “It would be very problematic for the research and clinical enterprises to wake up in a decade to a yawning gulf.”

But Susan Kamens sees a deeper problem with blaming the DSM for hampering the search for biomarkers—it takes for granted that the biomarkers exist. In other words, it presumes what it seeks to find. According to Kamens:

“The main difference is belief versus doubt in the hypothesis that what we call mental disorder is primarily a disorder of biology. We treat that hypothesis as unfalsifiable, as if the proof [that mental disorder is biological] arrived before the evidence. We don’t test whether the hypothesis holds; we test whether and how to make the data fit it. When critics raise doubts, they’re often accused of ignoring the very same evidence that psychiatric researchers have recently declared to be utterly insufficient.”

Kamens noted that the RDoC “blueprint” is no less theoretical than the DSM-5. While the RDoC constructs are more measurable than the categories listed in the DSM, they are “essentially no more than basic human emotions and behaviors.” She asked how RDoC would make clinically meaningful determinations within its “domains” and “constructs.” How would the research reveal anything beyond the coordinates of normal psychological processes? “In other words, how is RDoC anything beyond basic (nonclinical) neuroscience?”

RDoC is developing a new research model that will undoubtedly yield unprecedented data, but it focuses on the biogenetic correlates and normative mapping of basic psychological processes like visual perception, language, fear responses, and circadian rhythms. The idea is to create interventions for psychological and physiological processes that deviate from the norm. For this reason, RDoC is less likely to save psychiatry than it is to resurrect eugenics.

The quest for biomarkers in psychiatry can be likened to the legend of the phoenix, a mythological bird that repeatedly rises out of the ashes of its predecessor. The DSM seems to be near the end of its life cycle. Now psychiatry is building an RDoC “nest” that it will eventually ignite, reducing both the DSM and RDoC to ashes. And from these ashes, it is hoped, a new diagnostic system—a new phoenix—will arise.

Also see my blog post, “Psychiatry Has No Clothes.”


Psychiatry Has No Clothes

On April 29, 2013, there was an astounding blog post by Thomas Insel, the Director of the National Institute of Mental Health (NIMH). He said that although the DSM-5 was due to be released in a few weeks, the NIMH would be “re-orienting its research away from DSM categories.” He noted that while the DSM has been referred to as a “Bible” for the field of mental health, “It is, at best, a dictionary, creating a set of labels and defining each.” Did you get that? The Director of the NIMH said the DSM was a “dictionary” that created “labels.” It was not, then, functioning adequately, in his opinion, as its title suggests: as a Diagnostic and Statistical Manual of Mental Disorders! (emphasis added)

Insel said its strength had been “reliability”, meaning that it provided a way for clinicians to use the same terms in the same way. Its weakness was that it lacked validity. DSM diagnoses are based upon a consensus about clusters of symptoms and not any objective laboratory measure. “In the rest of medicine, that would be equivalent to creating diagnostic systems based on the nature of chest pain or the quality of fever.”

Insel was not using “reliability” in a statistical sense. In “The Myth of the Reliability of DSM,” Stuart Kirk and Herb Kutchins demonstrated conclusively that the DSM-III and DSM-IIIR were not statistically reliable. In fact, using the same statistic that Robert Spitzer used to justify the major changes to the DSM in the 1970s, they demonstrated that:

The reliability problem is much the same as it was 30 years ago [before the DSM-III]. Only now the current developers of the DSM-IV have de-emphasised the reliability problem and claim to be scientifically solving other problems.

Unfortunately, the tables in Figures 1 and 2 have been removed from the online version of their article. But the tables are still available in the original article in the Journal of Mind and Behavior, 15 (1&2), 1994, pp. 71–86. These tables plainly showed that the DSM’s statistical reliability was not what it was claimed to be. The Selling of DSM (1992) by Stuart Kirk and Herb Kutchins also contains the tables. And there is a graphic comparison of the data in Mad Science (2013) by Stuart Kirk, Tomi Gomory, and David Cohen.

Insel went on in his blog to say that the NIMH will be supporting research projects that “look across current categories” or sub-divide them in order to begin to develop a better system. “We are committed to new and better treatments, but we feel this will only happen by developing a more precise diagnostic system.” In order to work towards that goal, the NIMH launched the Research Domain Criteria (RDoC). RDoC is only a research framework for now; a decade-long project that is just beginning. You can learn more about RDoC on the NIMH website.

Robert Whitaker, author of Anatomy of an Epidemic, said in a March 2014 interview that Insel stating that the DSM lacked validity was an acknowledgement the “disease model” has failed as a basis for making psychiatric diagnoses.

When Insel states that the disorders haven’t been validated, he is stating that the entire edifice that modern psychiatry is built upon is flawed, and unsupported by science. That is like the King of Psychiatry saying that the discipline has no clothes. If the public loses faith in the DSM and comes to see it as unscientific, then psychiatry has a real credibility problem on its hands.

Two weeks later, on May 13, 2013, a joint press release was issued by Thomas Insel and Jeffrey Lieberman, the President-elect of the American Psychiatric Association (APA). They said that the NIMH and the APA had a shared interest in ensuring that patients and healthcare providers had “the best available tools and information” to identify and treat mental health issues.

Today, the American Psychiatric Association’s (APA) Diagnostic and Statistical Manual of Mental Disorders (DSM), along with the International Classification of Diseases (ICD) represents the best information currently available for clinical diagnosis of mental disorders. . . . The National Institute of Mental Health (NIMH) has not changed its position on DSM-5. As NIMH’s Research Domain Criteria (RDoC) project website states: “The diagnostic categories represented in the DSM-IV and the International Classification of Diseases-10 (ICD-10, containing virtually identical disorder codes) remain the contemporary consensus standard for how mental disorders are diagnosed and treated.”

The DSM and RDoC were said to be complementary, not competing frameworks. As research findings emerge from RDoC, they may be incorporated into future DSM revisions. “But this is a long-term undertaking. It will take years to fulfill the promise that this research effort represents for transforming the diagnosis and treatment of mental disorders.”

Saul Levin, the CEO and Medical Director of the APA, said on May 5, 2014 that the DSM and the RDoC will “begin to come together” as the research from NIMH is incorporated into the way they diagnose mental illness. They know that mental illness and substance use disorders are bio-psycho-social illnesses. “We work very well together with NIMH. And I think that the whole field is looking to the science coming out of NIMH to include it as a way to help get better treatment for patients in this country.”

So the APA and NIMH affirm they are working towards the same goals as complementary research frameworks. Someday the research findings of RDoC may even be included into the DSM. But until then, the NIMH will have to continue to “ooh and aah” at the APA’s DSM and ignore the nay-sayers crying: “Look at the DSM; look at the DSM!”

Does it seem that psychiatry is trying to promote an unreliable, invalid—perhaps invisible—system of diagnosis?

Also see my blog post, “Psychiatry’s Mythical Phoenix.”