The peer review drugs don’t work

A process at the heart of science is based on faith rather than evidence, says Richard Smith, and vested interests keep it in place

May 28, 2015

It is paradoxical and ironic that peer review, a process at the heart of science, is based on faith not evidence.

There is evidence on peer review, but few scientists and scientific editors seem to know of it – and what it shows is that the process has little if any benefit and lots of flaws.

Peer review is supposed to be the quality assurance system for science, weeding out the scientifically unreliable and reassuring readers of journals that they can trust what they are reading. In reality, however, it is ineffective, largely a lottery, anti-innovatory, slow, expensive, wasteful of scientific time, inefficient, easily abused, prone to bias, unable to detect fraud and irrelevant.

As Drummond Rennie, the founder of the annual International Congress on Peer Review and Biomedical Publication, says, “If peer review was a drug it would never be allowed onto the market.”

Cochrane reviews, which gather systematically all available evidence, are the highest form of scientific evidence. A 2007 Cochrane review of peer review for journals concludes: “At present, little empirical evidence is available to support the use of editorial peer review as a mechanism to ensure quality of biomedical research.”

We can see before our eyes that peer review doesn’t work because most of what is published in scientific journals is plain wrong. The most cited paper in Plos Medicine, which was written by Stanford University’s John Ioannidis, shows that most published research findings are false. Studies by Ioannidis and others find that studies published in “top journals” are the most likely to be inaccurate. This is initially surprising, but it is to be expected as the “top journals” select studies that are new and sexy rather than reliable. A series published in The Lancet in 2014 has shown that 85 per cent of medical research is wasted because of poor methods, bias and poor quality control. A study in Nature showed that more than 85 per cent of preclinical studies could not be replicated, the acid test in science.

I used to be the editor of the BMJ, and we conducted our own research into peer review. In one study we inserted eight errors into a 600-word paper and sent it to 300 reviewers. None of them spotted more than five errors, and a fifth didn’t detect any. The median number spotted was two. These studies have been repeated many times with the same result. Other studies have shown that if reviewers are asked whether a study should be published there is little more agreement than would be expected by chance.

Peer review is anti-innovatory because it is a process that depends on approval by exponents of the current orthodoxy. Bruce Glick, Hans Krebs and the team of Solomon Berson and Rosalyn Yalow all had hugely important work – including Nobel prizewinning research – rejected by journals.

Many journals take months and even years to publish and the process wastes researchers’ time. As for the cost, the Research Information Network estimated the global cost of peer review at £1.9 billion in 2008.

Peer review is easily abused, and there are many examples of authors reviewing their own papers, stealing papers and ideas under the cloak of anonymity, deliberately rubbishing competitors’ work, and taking a long time to review competitors’ studies. Several studies have shown that peer review is biased against the provincial and those from low- and middle-income countries. Finally, it doesn’t guard against fraud because it works on trust: if a study says that there were 200 patients involved, reviewers and editors assume that there were.

There have been many attempts to improve peer review through training reviewers, blinding them to the identity of authors and opening up the whole process, but none has shown any appreciable improvement.

Perhaps the biggest argument against the peer review of completed studies is that it simply isn’t needed. With the World Wide Web everything can be published, and the world can decide what’s important and what isn’t. This proposition strikes terror into many hearts, but with so much poor-quality science published what do we have to lose?

Yet peer review persists because of vested interests. Absurdly, academic credit is measured by where people publish, holding back scientists from simply posting their studies online rather than publishing in journals. Publishers of science journals, both commercial and society, are making returns of up to 30 per cent and journals employ thousands of people. As John Maynard Keynes observed, it is impossible to convince somebody of the value of an innovation if his or her job depends on maintaining the status quo.

Scrapping peer review may sound radical, but actually by doing so we would be returning to the origins of science. Before journals existed, scientists gathered together, presented their studies and critiqued them. The web allows us to do that on a global scale.

Richard Smith was editor of the BMJ and chief executive of the BMJ Publishing Group from 1991 to 2004.

POSTSCRIPT:

Article originally published as: Ineffective at any dose? Why peer review simply doesn’t work (28 May 2015)

Reader's comments (10)

As a medical representative of a pharmaceutical company I can tell you the situation is far worse than the above article suggests. One study to prove the efficacy of a GTN spray was actually produced by the manufacturer's own sister company. Another drug, naftidrofuryl oxalate (Praxilene), was known to be a waste of time, but the marketing story was novel. Glucophage off restrictions was being sold as metformin to generic companies. The originating company told GPs that generic metformin was not produced the same way as Glucophage and could lead to gastric upset. This was unlikely to be true since the bulk of metformin was and is Glucophage in a plain packet. When will the scientific community stop lying? The problem is the lies are then called world class by the corporate university. Does that then mean they are world class liars? Discuss.
The author's argument is weakened rather than strengthened by this claim: "With the World Wide Web everything can be published, and the world can decide what’s important and what isn’t." Peer review is flawed, but the alternative, of claiming that the web alone will bring scientists together, to present and critique their studies, is naive. This is demonstrated by spending some time with the arXiv pre-print server, for physics and mathematics research. arXiv is not peer reviewed. There is good and bad there, but one must know enough about the subject matter to distinguish between the two. The root of the problem is ethical integrity, as alluded to in a prior comment. In the absence of honesty, neither peer review nor vetting by the world wide web will be effective.
Excellent article. Of course the implications of the main assertion are so profound that the HE system would not be able to change itself even if there was general agreement. What we surely need is a way to publish articles freely and yet to get peer review. We have created such a journal, www.oerj.org, and it allows reviews to be given as ratings and for authors to argue online. Papers with negative reviews stay online if the authors so wish. The journal is about to be reprogrammed and reorganised. Any advice would be welcome.
Richard Smith (THE 28/5/15, page 29) says that when he was editor of the BMJ, he deliberately inserted 8 errors into a paper and sent it to 300 reviewers. None of them spotted more than five errors. I was one of those reviewers, unaware at the time that this particular paper was part of an experiment. I sent a covering letter with my review that went something like this: “This paper is not only unpublishable in its present form; it contains a number of fatal flaws that suggest it will never be publishable. I list below three of those fatal flaws. If you would like me to take a closer look at the paper to provide a more comprehensive set of feedback for the authors, please let me know.” The BMJ’s staff – then as now – viewed peer review as a technical task (‘spotting errors’) rather than a scholarly one (interpretation and judgement). They never got back to me to ask me for a more detailed list of errors. Presumably, taking account of authors’ covering letters was not part of their study protocol. I am sure many reviewers would have done what I did – fire a quick paragraph to hole the paper below the water line and return to the things they were paid to do in their day job. Smith’s conclusion is therefore not the only one that can be drawn from his dataset. Like all experiments, artificial studies of peer review strip the process of its social context. They contain the ossified assumptions of the study designers – and for that reason they are biased in a way that will tend to confirm the authors’ initial hypothesis. Trisha Greenhalgh, Professor of Primary Care Health Sciences, University of Oxford
I believe this is the study Dr. Smith is referring to: http://jrs.sagepub.com/content/101/10/507.full It does list as a weakness the possibility, mentioned by Trisha above, that reviewers were proceeding only until they found enough critical errors to reject, and among the rejecting reviewers, they did report the highest discovery rates of the more obvious major errors. However, the accepting reviewers, who wouldn't have had the same stopping criteria, didn't find more errors. In fact, they universally found fewer. It could have been that the accepting editors felt like the study topic was important enough that they stopped quickly at "revise and resubmit" after identifying the major errors. It would be interesting to see the sentiments expressed in the actual reviews.
Prof. Trisha Greenhalgh makes an interesting comment, which seems to advocate the peer review system. But in the end, it just makes things worse. From her words it seems that the reviewers' only purpose is to get rid of unwanted papers (which are not fashionable or, as authors say, 'sexy'). Actually, the BMJ site says that 'A high proportion of submissions are rejected without being sent out for external peer review on the grounds of PRIORITY...' etc. The peer review system works only for publishers and competitive reviewers who don't want their rivals to get published. It doesn't work for science and scientists. It is no surprise that peer review has its 'social contexts' and sometimes this 'context' outweighs the real scientific value of the paper.
Although I agree that peer review is a flawed system, the proposition of doing away with it completely might not make the situation any better. If there is no filtering of scientific papers and authors are allowed to publish freely on the Internet, it would be difficult for the scientific and, in particular, the non-scientific community to tell good research from bad. It might be safe to say that bad science is a product of bad research rather than bad peer review. Can newer ways of conducting peer review, such as post-publication peer review, serve as an alternative to traditional pre-publication peer review and overcome its flaws?
Posted on behalf of Richard Smith: Trish Greenhalgh boldly states in a Tweet that she disagrees with my opinion that prepublication peer review is a waste of time and provides a link to her comment on the THES website. I eagerly sought out the comment to see her arguments and was disappointed to see that her disagreement is based on a small technical point on one paper that I mentioned in support of one of the many arguments I advanced. Even if she was right in her one point, which she isn’t, it would hardly amount to a refutation of my arguments. She quotes one of the studies we at the BMJ did that showed that peer reviewers are poor at detecting important errors in papers. She says she was a reviewer in one of the studies and saw so many errors in the paper that she described only three. She says, without any evidence, that she is “sure many reviewers would have done what I did.” As far as I can remember, she’s wrong; and she may be unaware that we did similar studies more than once. She also says, again without any evidence, that “The BMJ’s staff – then as now – viewed peer review as a technical task (‘spotting errors’) rather than a scholarly one (interpretation and judgement).” I can’t answer for now, but I doubt that is the case, and certainly it wasn’t when I was the editor. Most of the discussion when peer reviewing the papers was around interpretation, and, indeed, as far as I can remember, some of the “errors” we inserted in the papers were to do with interpretation. It remains true that most reviewers don’t spot most errors, some of them egregious. I accept that there are problems in studying peer review, although I’m not sure what Trish means by “ossified assumptions,” but the case stands that we have very substantial evidence of the downside of peer review and virtually none of the upside. Perhaps Trish might be able to design some better studies to investigate the value or otherwise of peer review.
Having looked at the paper William Gunn linked to, I see Dr. Greenhalgh's point. The paper has a heavy emphasis on the low number of errors cited, with figures like "an average of 2.58 out of 9 errors found" featured prominently in the Abstract, Results, and the Discussion. This emphasis says, "finding specific errors was the most important part of the process." The number of reviewers who rejected each paper is relegated to a figure legend. Nowhere is it mentioned that rejection rates were very high: paper 1 was rejected by a 2.1 to 1 margin, with paper 2 at nearly 5 to 1, and paper 3 at 4.4 to 1. Overall, there were 1,000 decisions to reject vs. 300 to accept. I don't know about others, but when I'm reading something really bad, I reach a point where I decide that enough is enough. How many times do I need to whip a dead horse? I have more important things to do. That the authors ignored the simple statistics above strikes me as an over-focus on the wrong thing. Personally, I would have focused on the ones who wanted to accept the papers and how they viewed or missed all the problems. Moreover, the Methods say nothing about the details of the training and what was expected of the reviewers. Were they told that the purpose of the study was to examine error detection rates, or just to accept/reject and explain why? If sparse information was provided and given the above criticisms, as well as Dr. G.'s letter, it seems that the study ended up with a bias toward reducing the number of errors found. This is a major (fatal) flaw of the study. I could go on, but I'm done whipping this dead horse. Peer review has huge problems, but this study muddies the waters and makes informed debate even more difficult.
How can we know whether peer review is beneficial without a comparison to relevant alternatives? One alternative is suggested in the article: self-publishing on the web and voluntary ratings. Would a greater percentage of such highly rated publications be replicable or make true claims? Would competitors in the field rate in a more objective and timely manner? Peer review may be beneficial relative to alternatives, even if the ratio of published findings that survive further scientific scrutiny is low. Finally, it is hard for me to see how, given the number of researchers today and the level of specialization, self-publication on the web would be similar to how science was conducted before peer-reviewed journals.
