Saturday, December 29, 2012

The Antidepressant Wars, a Sequel: How the Media Distort Findings and Do Harm to Patients

Reposting of my December 26, 2012 PLoS guest blog on Mind the Brain.

Central to the perspective I present in this blog post is my work supervising psychiatric residents and medical students at a university-based psychiatry clinic where our patient population includes a good number of adults suffering from mild to moderate depression.

In 2010, the publication by JAMA of a single study challenged and upended a major assumption that had guided clinical work like ours for over three decades (Barrett 2001; Qaseem 2008). This was the widely covered meta-analysis of antidepressant (AD) trials conducted by Fournier and colleagues (2010), which drew the far-reaching conclusion that ADs show significant response in very severely depressed patients, but are not more effective than placebo in less severe cases.

Fournier et al. was not the first study that took aim at the foundation of treatment guidelines for depression, which in essence recommend treating depression with antidepressants. In 2008, Kirsch et al.’s meta-analysis of clinical trial data submitted to the Food and Drug Administration ended with a rather strongly worded conclusion:
“Drug–placebo differences in antidepressant efficacy increase as a function of baseline severity, but are relatively small even for severely depressed patients” (emphasis added). (Kirsch et al., 2008)

After reading their findings, a neutral conclusion for Kirsch et al. would be that:

ADs are statistically better than placebo
Response correlates with patients’ severity of symptoms.

This is not an earth-shattering conclusion by any means, as both results were already common knowledge for anyone who started prescribing ADs since 2002, when Khan et al. published their meta-analysis, based on 45 studies, of FDA-submitted AD trial data. Their conclusion?

“The magnitude of symptom reduction was significantly related to […] initial depression […] scores; the higher the […] initial […] score, the larger the change.” (Khan et al. 2002)

Therefore, one can look at the Kirsch (2008) study findings as a replication of earlier findings, a continuation of a line of knowledge that had already been established, which is usually the way scientific knowledge expands. Given this, one would be hard-pressed to understand how a study that essentially replicated prior positive findings would become the poster child for the anti-antidepressant movement that followed. But that is exactly what happened.

Interestingly, Khan et al. (2002) was not cited by Kirsch et al. (2008), in itself a remarkable oversight considering the similarities between the two studies. But I found it even more troubling that, instead of conservatively explaining their findings and providing as neutral and tentative an explanation as possible (the validated tradition of scientific communication), Kirsch et al. appeared to formulate their conclusion from a position of commitment to an anti-antidepressant view:

“The relationship between initial severity and antidepressant efficacy is attributable to decreased responsiveness to placebo [even] among very severely depressed patients, rather than to increased responsiveness to medication.” (Kirsch et al., 2008)

And that strongly worded conclusion made the Kirsch study an almost overnight media hit. Front-page newspaper and radio coverage followed, and criticism was dismissed (Horder 2011).

To date, the Kirsch study remains one of the most popular papers on the PLoS Medicine website, as reflected in the following metrics: 282,219 views, 631 citations, 300 academic bookmarks, and 404 social shares (data as of December 20th, 2012). A number of critical commentaries followed. Some directly criticized Kirsch et al. (2008) for methodology or overstated conclusions (Kelly, 2008; Khan and Khan, 2008; McAllister-Williams, 2008a, 2008b; Moller, 2008; Nutt and Malizia, 2008; Parker, 2009; Turner and Rosenthal, 2008). More interestingly, a few who decided to re-analyze Kirsch’s data found they could not replicate Kirsch et al.’s pessimistic view of the efficacy of ADs (Fountoulakis, 2011; Horder et al., 2011). For unclear reasons, these subsequent reports, aimed at reestablishing the respectability of ADs, got much less media attention than Kirsch’s 2008 original.

“Déjà vu All Over Again”

In this context, when Fournier et al. came along in 2010, I had a sense of déjà vu, not so much in terms of the study’s conclusions as in terms of the emotional intensity and dramatic flavor with which it was greeted by the mass media. I first heard about it on NPR, and this surprised me, as I usually get the studies I am interested in before the media does.

Over the next couple of days headlines such as these appeared in print and online media around the world:

The NY Times: “Popular Drugs May Help Only Severe Depression” (Carey B, 2010)

From the LA Times: “Antidepressant medications probably provide little or no benefit to people with mild or moderate depression” (Roan S, 2010)

Immediately following this media hoopla, I found that my students – a new generation who have not been part of the Kirsch antidepressants wars – began to routinely question the wisdom of continuing or starting antidepressant treatment for our patients suffering from mild or moderate depression.

And it did not take long for our patients themselves to express their doubts about the efficacy of antidepressants — even for severe depression.

I was troubled at the time by the unquestioning coverage of Fournier et al., which implied that this single study was in fact “settled science” on the subject of antidepressants when it was not. I was equally troubled by the inattention given, in both the professional literature and the popular press, to the complexities and long history of the debate (as discussed above), or at least to the serious flaws in the study’s methodology, as I’ve summarized below.

Two years later, I’m equally concerned about the lack of media coverage given to a 2012 publication, also by JAMA, of a study by Gibbons and colleagues (2012) which, history aside, refutes Fournier’s claim that antidepressants are not more effective than placebo for mild to moderate depression. Like those of Fournier et al. (2010), the findings of Gibbons et al. (2012) are based on individual patient data and include longitudinal measurement, which makes their conclusions a strong counterpoint to those of Fournier et al. (2010).

Among the points I now make to my students when questions arise about antidepressant efficacy as a result of the meta-analysis conducted by Fournier et al. are the following:

The individual patient-level data approach used by Fournier et al. represented an improvement over standard meta-analyses; however, their results were based on only 6 studies that met their criteria, out of more than 200 relevant studies. A final set of 6 studies drawn from 2164 citations is hardly representative, especially when the 6 analyzed studies represent only two medications: paroxetine and imipramine, the latter not recommended for first-line treatment of depressive disorders. Furthermore, of the 6 studies, 5 specifically excluded patients with very mild depression, which weakens the authors’ conclusions about the lack of separation of ADs from placebo for mild depression.

Exclusion Criteria Raise Major Questions

The strength of a meta-analysis is based on applying a solid statistical approach to all studies meeting a set of relevant inclusion/exclusion criteria, and in this case it appeared that the authors excluded too many relevant studies. Specifically, 228 studies were excluded because they used a “placebo washout/lead-in” (a design in which all study participants get a placebo to start with and only those who do not respond to the placebo continue in the study). The placebo washout/lead-in represents a common historical design used in antidepressant trials with the intent of excluding patients who do not demonstrate symptom stability and thus are not likely to benefit from a truly effective AD. Fournier et al. (2010) acknowledge that “it is not clear that placebo washouts actually enhance the statistical power of antidepressant medication/placebo comparisons”; nevertheless, they proposed that in order to evaluate the rates of “true placebo response” one should exclude all studies using a placebo washout/lead-in design.

While it is true that a placebo washout might limit accurate estimates of placebo response and might not improve the probability of an AD being more effective than a placebo, this design for studies of depression would not affect the validity of an active AD-placebo separation, were one to be found. The exclusion of washout studies was especially problematic precisely because this represents a common design for AD clinical trials, meaning that numerous relevant studies will be excluded. In other words, Fournier et al. imposed a seemingly arbitrary (i.e., not evidence-based) exclusionary criterion that effectively filtered out the majority of the relevant studies. This is a very bright red flag and a potential source of bias, which greatly limits the validity of the authors’ conclusions. Assuming these easily excluded studies were otherwise methodologically sound, the number of study investigators contacted would have increased from 23 to 251, and likely significantly more than 6 studies would have contributed to the final analysis.

Considering the potentially grave implications of either mental health providers or patients accepting at face value the headlines generated by widespread publication of these results, the study’s methodological weaknesses, which were not treated in any depth by comments accepted for publication by JAMA, warrant further critical review.

Overlooked and Highly Relevant Research

Likely because it received dramatically less coverage, far fewer of my students are aware of the study by Gibbons et al. (2012), who, after reviewing 43 fluoxetine and venlafaxine trials, concluded that, contrary to the Fournier et al. (2010) findings, these two antidepressants are in fact efficacious for major depressive disorder in all age groups, regardless of the depression severity at baseline.

As noted, Gibbons et al. (2012), like Fournier et al. (2010), used patient-level data, which makes their point against Fournier et al. even more significant. In addition, compare the final set of 43 studies in Gibbons et al. (2012), with a meta-analysis population of 4303 patients in the fluoxetine trials and 4882 patients in the venlafaxine trials (more than 9000 patients in total), to the final set of 6 studies in Fournier et al. (2010) (3 paroxetine and 3 imipramine trials), with a total of 718 patients. The significantly larger number of studies in Gibbons et al. (2012) makes for a more believable conclusion.

Both studies are limited in that they focused on only 2 ADs: paroxetine and imipramine for Fournier et al. (2010) versus fluoxetine and venlafaxine for Gibbons et al. (2012). At the same time, Gibbons et al. (2012) used an all-inclusive set of studies, whereas, as noted above, the Fournier et al. (2010) study used a highly selective group of studies. There are also important differences in data analytic methods that could explain the differences in results. For example, Gibbons et al. (2012) defined severity differently than Fournier et al. (2010).

To expert eyes, the main effects for the drug versus placebo differences can actually be seen as similar in the two data sets. And that is the very reason for engaging in this debate.

Which study is more convincing?

The Gibbons study reminds us that it is our duty as physicians and society at large to carefully screen and aggressively treat depression, including with medications if so recommended. The Fournier study makes us aware that there might be more to the story of AD response than a straightforward active ingredient effect.

We can all speculate about why the Gibbons study received so much less media coverage than did Fournier and colleagues.

The Sequel

In the antidepressant wars, we have seen the pendulum’s full swing from the early nineties, when Elizabeth Wurtzel’s “Prozac Nation” was thrilled to be “Listening to Prozac” with Peter Kramer, and into the early millennium years, when Healy’s tongue-in-cheek advice was to “Let Them Eat Prozac”. By the time Carl Elliott’s “Prozac as a Way of Life” hit the stands in 2003, some thought we were at the end of an era. But ADs came back strong, only to engender renewed debate and, as argued above, uneven and thus inaccurate media coverage in the current decade.

Unintended Consequences of an Unevenly Covered Debate

As my esteemed colleague Michael Thase adeptly put it to me, “There is no ‘last word’ in the science of this debate.” He is undoubtedly correct. And, as a physician, I find relief in the fact that we continue to question ingrained assumptions and are reluctant to accept there is such a thing as a last word or simple explanation when it comes to complex issues. Depression, with its multidimensional tentacles equally anchored in nature and nurture, will never be a good subject for simple explanations.

But, again, as a physician, I am very concerned about major unintended consequences of uneven coverage of the competing major findings discussed above. Specifically, I fear that clinically depressed members of the public at large will refuse a likely efficacious treatment option. And while all may be well if that depressed patient makes the informed alternative choice of starting treatment with cognitive behavioral therapy (CBT), a validated form of therapy for depression that compares well with SSRIs for mild or moderate depression, all is certainly NOT well if the patient’s decision not to accept treatment with antidepressants is based primarily on media-delivered misinformation.

Given the stigma against acknowledging or treating a mental illness with a psychotropic medication, the media saturation given to one study only worsens an already difficult situation for many patients who fear the personal and social consequences of admitting their illness and seeking treatment.

In closing: my hope is that members of the media who cover this debate will realize that “first do no harm” is not only the duty of physicians; it is also the responsibility of anyone trusted with giving health information to the public at large.

Acknowledgements: I would like to thank Lawrence Faziola and Steven Potkin for critically discussing Fournier et al. and Michael Thase for his critical read of the draft of this article.


