Gary Schwitzer is the founder and publisher of HealthNewsReview. He has covered health care news almost exclusively since 1973. Here is his online bio. He tweets as @garyschwitzer or as @HealthNewsRevu.
Yesterday on Twitter, several clinician-researchers urged me to weigh in on the New York Times’ introduction of a new Coronavirus Drug and Treatment Tracker. They were troubled by it.
I was, too, when I saw it. But I had no time to address it yesterday and – more than 24 hours later – I still don’t have time to do anything more than post the criticisms that some other smart observers have made. (For anyone who has not read the banner across the top of this or any other page on this website, I am only able to keep the lights on with occasional blog posts like this one. But when so many strong voices rise up, as they did in this case, they are taking on some of the role that HealthNewsReview.org played for so many years.)
First, let me point out that if you went to the Tracker yesterday you saw this: 20 treatments rated in 6 categories.
There is one line in the text that addresses changes: “We will update and expand the list as new evidence emerges.” But changing categories and numbers within those categories overnight seems like a pretty dramatic shift in evidence – or in editorial decision-making – within just a few hours.
Was this tracker ready for prime time? Some critics thought it wasn’t.
In response to one Tweep writing, “This is Fantastic!”, Zackary Berger, MD, of Johns Hopkins tweeted:
I don’t think this makes things clear. First, while outcomes are occasionally mentioned, they aren’t always. (Treatment *how*? Reducing what?) Disparate treatments are grouped.
I lol’d when I saw the writer for strong evidence: RCTs or…widespread use by doctors. There are lots of different kinds of evidence. An RCT hasn’t been the sole gold standard for a while in clinical epidemiology. Second, doctors don’t follow best evidence.
Note that the “Strong Evidence” category disappeared on Friday, with “Widely used” taking its place. Note also that yesterday’s “Mixed Evidence” category has morphed into a “Tentative or mixed evidence” category, and that the “Pseudoscience” category has broadened to “Pseudoscience or fraud.” And Thursday’s “Ineffective” category disappeared. There weren’t any ineffective treatments listed anyway.
Venk Murthy, MD, of the University of Michigan, tweeted questions about the “Strong Evidence” ratings for 5 therapies including remdesivir and dexamethasone. He continued:
I’m also concerned by what the real differences between promising and tentative is? When major medical societies feel it is hard to grade evidence into more than 3 categories (e.g. A, B, C in cardiology guidelines), I’m impressed that NYT can resolve 6 categories.
This needs to be retracted ASAP: this is ridiculous….I am frankly appalled, even for COVID times.
Then Dr. Murthy re-entered the tweet thread:
Isn’t this the logical conclusion of all of the cheering of data journalism? If journalists *without deep domain expertise* are cheered for making visualizations, then why shouldn’t they synthesize clinical study data and create practice guidelines?
We should spread the blame: the upstream cause of this is journals publishing flawed studies. Perhaps we expect too much from journalists? They might say: “this is a big study in a big journal, in this case with a supportive podcast from the (editor in chief).”
Dr. Murthy responded:
There is plenty of blame to go around, but I can’t recall another instance where a newspaper conducted a pseudo-systematic review of a medical topic based on their own evaluation of the evidence….
Cardiology guidelines publish “level of evidence” next to any recommendations, & hours of debate & discussion go into these because of the nuances which have major societal implications; this @ article is more glitter fest than scientific analysis:adds to the data divide
And if they are going to rate the evidence why not use a methodologically sound approach like GRADE (gradeworkinggroup.org). Decades ago we left ratings based solely on study design.
Today, Dr. Murthy and others caught the overnight changes, but still were not pleased:
NYT reporters cobbling together something overnight without process or expertise is downright dangerous and seems from the outside to be arrogant.
Synthesizing studies like this is very hard work. It takes lots of time and study to build up the required expertise. Even then experts disagree and thus we often have many experts working together with special methods to build consensus and rate evidence.
The fact that there were such massive changes within first 24 hours shows how poorly thought out this was. Rather than updating, this should have been taken down with a serious mea culpa.
The New York Times has delivered terrific journalism on most days of this pandemic. Carl Zimmer, one of three journalists with bylines on this Tracker, is one of the finest science journalists of our era.
This was a creative effort, but a puzzling one to many observers. It appears that the Times is working out the bugs as they go – which is the way much of journalism and science and medicine is working – on quick turnaround – in this pandemic.
But sometimes, as Richard Saitz and I wrote for JAMA this week,
Trust in science, medicine, public relations, and journalism may be in jeopardy in the intersection where these professions meet.
Time—even a few moments daily—can help prevent harm. Any professional communicating about this pandemic should spend such time to reflect on how the words and the data matter, and then act accordingly.
I will continue to follow the development of, and reactions to, this Tracker.
Other expert perspectives have been posted on Twitter.
Cardiologist Sanjay Kaul, MD, of Cedars-Sinai in Los Angeles tweeted:
The evidence rating appear to be well-anchored. Yes, they misclassified evidentiary strength for enoxaparin/AC, but so do journals and guidelines. But overall, a decent effort. In any case, we shouldn’t be taking medical advice from newspapers, including NYT. … Good to see NYT is corrigible!
UCLA cardiologist Gregg Fonarow, MD, wrote:
I give the @ a lot of credit for aiming to assist their readership sort through COVID-19 treatments; which therapies are efficacious, which are not, and which need more study. There are a few corrections needed, but overall helpful.
And University of Ottawa nephrologist Swapnil Hiremath, MD, after some fiery criticisms that included “It’s mostly idiotic,” later wrote:
Ok so @ reached out and after a few emails and a phone convo I see that RRT has been removed & Cytosorb downgraded. Not perfect yet but a step in the right direction