In developing a national research assessment process such as the research excellence framework (REF), a key issue is the potential for unintended negative consequences.
This is why the four UK higher education funding bodies develop the REF on the basis of careful evaluation and extensive consultation. So my attention was grabbed by a piece in Times Higher Education reporting that there was evidence of researchers "rushing out...poor quality research" in response to deadlines imposed by the REF.
The report is based on a new preprint that was posted last month. Looking at the transition point between the publication windows for the research assessment exercise (RAE) 2008 and REF 2014, the paper claims to provide evidence that more articles are published in the year before the transition, and that those papers are cited less than those published in the following year.
This sounds like compelling evidence that the assessment is encouraging undesirable behaviour, but a closer look at the data suggests there are problems with the interpretation.
The first claim is that more articles are published in the year before the deadline. The conclusion is based on the date of publication of articles submitted to RAE 2008 compared to REF 2014, and it is indeed the case that articles published closer to the end of the RAE period are more likely to be submitted. This is not a new observation, and was reported in 2016 in work that Hefce commissioned from Digital Science.
The pattern is discernible across a number of assessment cycles, although there are interesting disciplinary differences: the tendency to submit more recent material has disappeared over time in the science and engineering disciplines, but remains in the social sciences and humanities. Of course, this type of analysis comes with a caveat: it is limited to journal articles, and so considers only a minority of the submitted outputs outside the sciences.
The reasons for this pattern are not clear, but it is important to remember that the data cover only submitted articles and do not reflect total publication volumes. Elsewhere in the paper, the authors do investigate total volumes, but the evidence is much less compelling.
First, they express the results as the UK share of global output, not as absolute numbers. There is some fluctuation in this share that links to assessment cycles, but it is apparent only for articles published in journals with a low impact factor. Looking elsewhere, there is no evidence of significant shifts in total volume in the various reports that Elsevier have produced for the UK government. There have been changes in the share, but these are largely attributed to increases in production in other countries, notably China and India.
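To see why a share is not the same thing as a volume, here is a minimal arithmetic sketch. The figures are entirely hypothetical, chosen only to illustrate the point, and are not taken from the preprint or from the Elsevier reports.

```python
def share(national_output: int, global_output: int) -> float:
    """Return national output as a fraction of global output."""
    return national_output / global_output

# Hypothetical article counts (illustrative assumptions, not real data).
uk_before, world_before = 100_000, 1_000_000
uk_after, world_after = 110_000, 1_300_000  # other countries grow faster

share_before = share(uk_before, world_before)
share_after = share(uk_after, world_after)

print(f"UK output change: {uk_after - uk_before:+d} articles")
print(f"UK share before: {share_before:.1%}, after: {share_after:.1%}")
```

In this invented scenario the UK publishes more articles than before, yet its share of global output falls, simply because output elsewhere has grown faster.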
The second claim is that articles published at the end of the assessment window have a lower citation score than those published at the beginning. Again, this conclusion is based on differences in the articles submitted for assessment rather than the total pool. The Elsevier reports referenced suggest a steady increase in the field-weighted citation impact (FWCI) of the total UK output, with no evidence of discontinuities around assessment cycles.
So how to explain the fact that articles from 2007 submitted to RAE 2008 have lower citation scores than those from 2008 submitted to REF 2014?
We know that submission choices are made at the end of the cycle. Decisions on whether or not to include the 2007 articles in RAE 2008 were made in 2007. Because those articles were so new, very limited information about their citation rate was available to influence the submission decision.
In contrast, the decisions on whether to include the 2008 articles in the REF 2014 submission were made in 2013, and, although article-level citation scores are a poor proxy for quality, it may be that citation information influenced the submission decisions. As a result, choices about old articles (2008 published, with decision in 2013) are likely to be biased towards those with high citation counts, whereas choices about new articles (2007 published, with decision in 2007) are not.
In the latter case, acknowledging the criticism that citations are a weak proxy for quality when applied to individual publications, you could draw the conclusion that different, and possibly better, judgements about quality are being used. In any event, if citations are used to inform selection it is not surprising that the selected articles have more citations – this is more an artefact than "an unintended negative consequence". Just perhaps it is a positive consequence that articles published late in the assessment cycle are selected using informed academic judgements on quality, rather than citation rates.
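The selection effect described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of how institutions actually chose outputs: the citation counts are synthetic draws from a log-normal distribution, both cohorts come from the same distribution (so there is no underlying difference between them), and citation-informed selection is represented by the deliberately extreme rule of taking the most-cited articles. The function name and parameters are hypothetical.

```python
import random
import statistics

def simulate_selection(n_articles=10_000, n_selected=2_000, seed=0):
    """Compare mean citations of submitted articles under two selection rules."""
    rng = random.Random(seed)

    # Both cohorts are drawn from the SAME skewed distribution (an assumption):
    # any difference in the selected sets comes from selection alone.
    late_cohort = [rng.lognormvariate(1.0, 1.0) for _ in range(n_articles)]
    early_cohort = [rng.lognormvariate(1.0, 1.0) for _ in range(n_articles)]

    # Articles published just before the deadline: no citation data exists yet,
    # so selection is independent of eventual citations (modelled as random).
    blind_selected = rng.sample(late_cohort, n_selected)

    # Articles published early in the window: citation data is available,
    # so selection favours highly cited work (modelled as the top n_selected).
    informed_selected = sorted(early_cohort, reverse=True)[:n_selected]

    return statistics.mean(blind_selected), statistics.mean(informed_selected)

blind_mean, informed_mean = simulate_selection()
print(f"Mean citations, selected without citation data: {blind_mean:.2f}")
print(f"Mean citations, selected with citation data:    {informed_mean:.2f}")
```

Even though the two pools are statistically identical, the citation-informed selection has a higher mean citation count. Real selection decisions would weigh citations far less heavily than this, but the direction of the bias is the same.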
So, overall, I don't think the conclusions are supported by the data in the paper. While there is evidence that, for some disciplines, recently published articles are more likely to be submitted for assessment, there is no evidence that this equates to increased publication volumes.
The reported citation differences can be explained by factors other than differences in quality. We always need to be vigilant about the potential unintended effects of assessment, but this work doesn't provide cause for concern.
Steven Hill is head of policy (research) at the Higher Education Funding Council for England (Hefce).