Dysfunctional fMRIs
Neuroscientists at Dartmouth placed a subject into an fMRI machine to do an open-ended mentalizing task. The subject was shown a series of photographs depicting human social situations with a specific emotional reaction. The test subject was to determine what emotion the individual in the photo must have been experiencing. When the researchers analyzed their fMRI data, it seemed like the subject was actually thinking about the pictures. What was unusual about this particular fMRI study was that the subject was a dead Atlantic salmon.
Craig Bennett and other researchers wrote up the study to warn about the dangers of false positives in fMRI data. They wanted to call attention to the need to improve statistical methods in the field of fMRI research. Bennett’s paper was turned down by several publications, but a poster on the work found an appreciative audience at the Human Brain Mapping conference, and neuroscience researchers began forwarding it to each other. The whimsical choice of a test subject seems to have prevented publication of the study, but it effectively illustrated an important point regarding the potential for false positives in fMRI research. The discussion section of their poster said:
Can we conclude from this data that the salmon is engaging in the perspective-taking task? Certainly not. What we can determine is that random noise in the EPI time series may yield spurious results if multiple comparisons are not controlled for. Adaptive methods for controlling the FDR and FWER are excellent options and are widely available in all major fMRI analysis packages. We argue that relying on standard statistical thresholds (p < 0.001) and low minimum cluster sizes (k > 8) is an ineffective control for multiple comparisons. We further argue that the vast majority of fMRI studies should be utilizing multiple comparisons correction as standard practice in the computation of their statistics.
According to Alexis Madrigal of Wired, Bennett’s point was not to prove that fMRI research is worthless. Rather, researchers should use a set of statistical methods known as multiple comparisons correction “to maintain most of their statistical power while keeping the danger of false positives at bay.” Bennett likened the fMRI data problems to a kind of darts game and said: “In fMRI, you have 160,000 darts, and so just by random chance, by the noise that’s inherent in the fMRI data, you’re going to have some of those darts hit a bull’s-eye by accident.” So what, exactly, does fMRI measure and why is understanding this important?
The fundamental basis for neural communication in the brain is electricity. “At any moment, there are millions of tiny electrical impulses (action potentials) whizzing around your brain.” When most people talk about ‘brain activity,’ they are thinking about the activity maps generated by functional magnetic resonance imaging (fMRI). Mark Stokes, an associate professor in cognitive neuroscience at Oxford University, said fMRI does not directly measure brain activity. Rather, fMRI measures an indirect consequence of neural activity, the haemodynamic response, which permits the rapid delivery of blood to active neuronal tissues. This indirect measurement is not necessarily a bad thing, as long as the two parameters (neural activity and blood flow) are tightly coupled. The following figure from “What does fMRI Measure?” illustrates the pathway from neural activity to the fMRI.
A standard fMRI experiment generates thousands of measurements in a single scan (the 160,000 darts in Bennett’s analogy), so some false positives are to be expected by chance alone. With this much data in an fMRI dataset, it is crucial to know how to interpret it properly. There are many ways to analyze an fMRI dataset, and the wealth of options may tempt a researcher to choose the one that seems likely to give the best result. The danger is that the researcher may then see only what he or she wants to see.
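Bennett’s dart analogy can be made concrete with a little arithmetic and a small simulation. The sketch below is only an illustration, not the analysis from the poster: it assumes 160,000 independent tests on pure noise (real voxels are not independent), draws a uniform random p-value for each, and counts how many pass the uncorrected p < 0.001 threshold quoted above, a Bonferroni-corrected threshold (one way of controlling the FWER), and the Benjamini-Hochberg procedure (one way of controlling the FDR). All function and variable names are hypothetical.

```python
import random


def count_below(p_values, threshold):
    """Count p-values that fall below a fixed threshold."""
    return sum(1 for p in p_values if p < threshold)


def benjamini_hochberg_count(p_values, q=0.05):
    """Number of tests rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    n_rejected = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        # Keep the largest rank whose p-value sits under its BH threshold.
        if p <= q * rank / m:
            n_rejected = rank
    return n_rejected


N_TESTS = 160_000          # the "darts" in Bennett's analogy
ALPHA_UNCORRECTED = 0.001  # uncorrected threshold quoted in the poster
ALPHA_FAMILY = 0.05        # assumed target for the corrected analyses

# Assumption: under the null hypothesis (no real signal anywhere) every
# p-value is uniform on [0, 1] and the tests are independent.
rng = random.Random(0)
null_p_values = [rng.random() for _ in range(N_TESTS)]

expected_false_positives = N_TESTS * ALPHA_UNCORRECTED  # 160,000 * 0.001 = 160
uncorrected_hits = count_below(null_p_values, ALPHA_UNCORRECTED)
bonferroni_hits = count_below(null_p_values, ALPHA_FAMILY / N_TESTS)
bh_hits = benjamini_hochberg_count(null_p_values, ALPHA_FAMILY)

print("expected by chance (uncorrected):", expected_false_positives)
print("observed, uncorrected:", uncorrected_hits)
print("observed, Bonferroni:", bonferroni_hits)
print("observed, Benjamini-Hochberg:", bh_hits)
```

Since the simulated data contain no signal at all, every "hit" is a false positive. On average about 160 darts land on the bull’s-eye at the uncorrected threshold, while the corrected procedures are designed to let such accidents through only rarely.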
Anders Eklund, Thomas Nichols and Hans Knutsson said that while fMRI was 25 years old in 2016, its most common statistical methods have not been validated using real data. They found that the most commonly used software packages for fMRI analysis (SPM, FSL, AFNI) could result in false positive rates up to 70%, where 5% was expected. The illusion of brain activity in a dead salmon discussed above was a whimsical example of a false positive with fMRI.
A neuroscientist blogging under the pen name of Neuroskeptic pointed out that a root problem uncovered by Eklund’s research is spatial autocorrelation: “the fact that the fMRI signal tends to be similar (correlated) across nearby regions.” The difficulty is well known and there are software tools meant to deal with it, but “these fixes don’t work properly.” The issue is that the software assumes the spatial autocorrelation function has a Gaussian shape, when in fact it has long tails, with more long-range correlations than expected. “Ultimately this leads to false positives.” See Neuroskeptic’s article for a graph illustrating this phenomenon.
There is an easy fix to this problem. Eklund and colleagues suggested using non-parametric analysis of fMRI data. “Software to implement this kind of analysis has been available for a while, but to date it has not been widely adopted.” So there is still value in doing fMRI research, but a proper analysis of the dataset is crucial if the results are to be trusted.
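To give a feel for what a non-parametric analysis can look like, here is a minimal sketch of one common approach, a sign-flipping permutation test with a maximum-statistic correction. It is an illustration under stated assumptions, not the specific software or settings discussed by the authors: the toy data, the function names, and the choice of a one-sided one-sample t statistic are all hypothetical. The key idea is that the null distribution is built from the data themselves, and because the same sign flips are applied to every voxel, whatever spatial correlation the data have is preserved without assuming it has a Gaussian shape.

```python
import random
from math import sqrt


def one_sample_t(values):
    """One-sample t statistic against a mean of zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / sqrt(var / n) if var > 0 else 0.0


def max_stat_permutation_test(data, n_permutations=1000, seed=0):
    """Return a family-wise corrected, one-sided p-value for each voxel.

    data: list of voxels, each a list of per-subject contrast values.
    """
    rng = random.Random(seed)
    n_subjects = len(data[0])
    observed = [one_sample_t(voxel) for voxel in data]

    null_max = []
    for _ in range(n_permutations):
        # Flip each subject's sign at random; the same flips are used for
        # every voxel, so correlations between voxels are left intact.
        signs = [rng.choice((-1, 1)) for _ in range(n_subjects)]
        null_max.append(
            max(one_sample_t([s * v for s, v in zip(signs, voxel)])
                for voxel in data)
        )

    # Corrected p-value: how often the largest null statistic anywhere in
    # the "brain" matches or beats this voxel's observed statistic.
    return [
        (1 + sum(1 for m in null_max if m >= t)) / (1 + n_permutations)
        for t in observed
    ]


# Toy data (entirely made up): voxel 0 has a consistent positive effect,
# the remaining 49 voxels are pure Gaussian noise, with 12 subjects each.
toy_rng = random.Random(1)
effect_voxel = [1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.3, 1.0, 1.6, 0.9, 1.2]
noise_voxels = [[toy_rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(49)]
toy_data = [effect_voxel] + noise_voxels

corrected_p = max_stat_permutation_test(toy_data)
print("corrected p, effect voxel:", corrected_p[0])
print("smallest corrected p among noise voxels:", min(corrected_p[1:]))
```

Comparing each voxel against the distribution of the largest statistic across all voxels is what keeps the chance of even one false positive near the chosen level, however many voxels are tested.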
Neuroskeptic also discussed an analysis of 537 fMRI studies done by Sprooten et al. that compared task-related brain activation in people with a mental illness and healthy controls. The five diagnoses examined were schizophrenia, bipolar disorder, major depression, anxiety disorders and obsessive-compulsive disorder (OCD). The analysis showed very few differences between the disorders in terms of the distribution of the group differences across the regions of the brain. “In other words, there was little or no diagnostic specificity in the fMRI results. Differences between patients and controls were seen in the same brain regions, regardless of the patients’ diagnosis.”
Sprooten et al. speculated that the disorders examined in their study arose from largely overlapping neural network dysfunction. They cited another recent meta-analysis by Goodkind et al. that also found “shared neural substrates across psychopathology.” Sprooten et al. said: “Our findings suggest that the relationship between abnormalities in task-related networks to symptoms is both complex and unclear.”
Neuroskeptic didn’t think there was a need to assume this transdiagnostic trait was an underlying neurobiological cause of the various disorders. He wondered if something like anxiety or stress during the fMRI scan could have been captured by the scan.
It’s plausible that patients with mental illness would be more anxious, on average, than healthy controls, especially during an MRI scan which can be a claustrophobic, noisy and stressful experience. This anxiety could well manifest as an altered pattern of task-related brain activity, but this wouldn’t mean that anxiety or anxiety-related neural activity was the cause of any of the disorders. Even increased head movement in the patients could be driving some of these results, although I doubt it can account for all of them.
A paper in NeuroImage by Nord et al. noted that numerous researchers have proposed using fMRI biomarkers to predict therapeutic responses in psychiatric treatment. They had 29 volunteers do three tasks using pictures of emotional faces. Each volunteer did the tasks twice on one day and twice again about two weeks later. While the grouped activations were robust in the scanned brain areas, within-subject reliability was low. Neuroskeptic’s discussion of the study, “Unreliability of fMRI Emotional Biomarkers,” said these results could be a problem for researchers who want “to use these responses as biomarkers to help diagnose and treat disorders such as depression.”
Neuroskeptic asked one of the researchers whether it was a “real” biological fact that activity in the brain areas studied actually varied within subjects, or whether the variability was a product of the fMRI measurement. The researcher didn’t know, but thought it wasn’t simply a measurement issue. He also thought it was perfectly possible “that the underlying neuronal responses are quite variable over time.”
Grace Jackson, a board-certified psychiatrist, wrote an unpublished paper critiquing how fMRIs and other functional brain scans are being presented to the public as confirming that psychiatric disorders are real brain diseases. She noted that media discussions fail to point out that functional imaging technologies, like fMRI, “are incapable of measuring brain activity.” They assess transient changes in blood flow. She also commented on the existing controversy over using this technology for diagnosis. “Due to theoretical and practical limitations, their application in the field of psychiatry is restricted to research settings at this time.”
She said even if abnormal mental activity could be objectively defined and reliably determined, “it remains unclear how any functional imaging technology could differentiate the brain processes which reflect the cause, rather than the consequence, of an allegedly impairing trait or state.” She concluded with a quote from a position paper drafted by the American Psychiatric Association that said imaging research cannot yet be used to diagnose psychiatric illness and may not be useful in clinical practice for a number of years. “We conclude that, at the present time, the available evidence does not support the use of brain imaging for clinical diagnosis or treatment of psychiatric disorders…”
No current brain imaging technology, including fMRI, can be used to diagnose or treat psychiatric disorders. fMRI technology does not directly measure neural activity and has a demonstrated tendency to generate false positives, especially if the statistical analysis of the fMRI dataset is done incorrectly. Given the limitations of functional imaging technology, it is unclear how fMRI—or any other functional imaging technology—will be able to clearly distinguish between brain activity that causes an impaired neural trait or state, and brain activity that is a consequence of an impaired neural trait or state. And yet fMRI scans are presented to the public, by some individuals, as measuring brain activity and proving the existence of some psychiatric disorders. In their hands fMRI technology has become dysfMRI: dysfunctional MRI technology.