Now that the REF submission window has closed, a small panel of academics are tasked with rating thousands of academic submissions, which will result in university departments being ranked and public money being distributed. Given the enormity of the task and the scarcity of the resources devoted to it, Daniel Sgroi discusses a straightforward procedure that might help, based on the Bayesian methods that academic economists study and teach when considering the problem of decision-making under uncertainty.
The UK is about to enter into one of the most important academic ranking exercises in its history. The Research Excellence Framework (or REF), starting in 2014, will determine how money is divided between departments and how the UK perceives the quality of its own universities and departments. As part of this process, university-based research-active academics throughout the UK have submitted work to the REF. There will be one panel for each discipline, each made up of a number of highly esteemed academics who will review the submissions. This will be a difficult process for the panel to manage, with many thousands of submissions which are by their nature relatively new (published between 2008 and 2013).
Of course, economists are experts at decision-making under uncertainty, so we are uniquely well-placed to handle this. However, a roadblock has been thrown up that makes the task a bit harder – the REF guidelines insist that the panel cannot make use of journal impact factors or any hierarchy of journals as part of the assessment process. It seems perplexing that any information should be ignored in this process, especially when it seems so pertinent. Here I will argue that journal quality is important and should be used, but only in combination with other relevant data. Since we teach our own students a particular method (courtesy of the Reverend Thomas Bayes) for making such decisions, why not practise what we preach?
To cut to the punch-line, experts on REF panels can take a weighted average of the journal’s impact factor (representing ‘prior information’) and the accrued citations to the article (representing ‘news’). As the years pass, more and more weight can be placed on the accrued citations. In the early years, almost all the weight can be placed on the journal’s impact factor. Bayesian methods will tell us how to adjust the weights as time passes. There is plenty of room for expert opinion and sensible judgement to play a role, so this is not purely mechanistic.
Decision-making under uncertainty
Any modern Economics course includes decision-making under uncertainty, and the key approach taught in universities throughout the world is based on the methods developed 300 years ago by the Reverend Thomas Bayes. This is how economists believe that economic agents merge any prior information they may have with any data they can acquire when trying to solve complex problems in an uncertain world. Any mainstream textbook will cover such a method, and any undergraduate student should be able to explain it. Moreover, a host of academic papers on how to assess journal quality have been published within the discipline of Economics over the last decade, including Kalaitzidakis et al. (2003), Palacios-Huerta and Volij (2004), and Ritzberger (2009), which incorporate some serious academic scholarship. There are also a host of easily accessible sources of citations data. So there is data out there and an established method of how to use such information.
The current issue of the Economic Journal is partly devoted to the REF, with various papers offering advice to the panel members who will be undertaking the assessment. In that issue, an article by Sgroi and Oswald details an approach to rating submissions based on Bayesian methodology which makes use of journal quality and citations data (Sgroi and Oswald 2013). We argue that despite the guidelines, panel members will most likely be aware of journal rankings, and find it hard not to use them – at least to form some initial beliefs about the quality of a submission – and indeed using a mix of journal quality and citations data makes perfect sense.
How the REF works
The 2014 REF allows universities to nominate four outputs per person, which will usually be journal articles. These nominations will be examined by peer reviewers, and will be graded by the reviewers into categories from a low of 1* up to a high of 4*. The REF guidelines present a clear distinction between the different levels of submission from unclassified through to 4*. Table 1 outlines the precise wording.
Table 1. Descriptions from the REF guidelines
The assigned star ratings will contribute 65% of the panel’s final assessment of the research quality of that department in that university. The remaining 35% are divided between ‘impact’ (20%) and ‘environment’ (15%). Impact here means the research work’s non-academic, practical applicability. We will focus throughout this article on the 65% that link to academic submissions, and the question of how to ascertain the value of any given submission.
To simplify the Bayesian calculations, one approach is to draw a line between 4* and the rest. The question then becomes: Is a submission ‘world-leading’ or not? The idea is to give each submission a probability of being world-leading. The average per department will be a simple number between 0 and 1 that reveals the probability that any random submission from that department is indeed world-leading. Departments can then be ranked according to that number.
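The ranking step just described can be sketched in a few lines of Python. This is only an illustration: the department names and the per-submission probabilities are invented, and the function names are hypothetical rather than anything taken from the REF or from Sgroi and Oswald (2013).

```python
from statistics import mean

# Hypothetical probabilities that each submission is world-leading (4*).
# Department names and numbers are invented for illustration only.
submissions = {
    "Dept A": [0.80, 0.55, 0.30, 0.65],
    "Dept B": [0.40, 0.45, 0.70],
    "Dept C": [0.90, 0.20, 0.25, 0.35, 0.50],
}


def department_scores(probabilities_by_dept):
    """Average the per-submission probabilities within each department."""
    return {dept: mean(probs) for dept, probs in probabilities_by_dept.items()}


def rank_departments(probabilities_by_dept):
    """Return (department, score) pairs sorted from highest to lowest score."""
    scores = department_scores(probabilities_by_dept)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_departments(submissions)
for position, (dept, score) in enumerate(ranking, start=1):
    print(f"{position}. {dept}: {score:.3f}")
```

Each department's score is a number between 0 and 1, so departments of different sizes can be compared directly on the same scale.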
A system of weights
The system we recommend is to construct a weighted average of journal quality and citations data. The intuition behind this is simple. We have an initial estimate (in Bayesian language, the ‘prior probability’) of the quality of a piece of work which comes from where it was published. The number we use for that prior needs to be built from as objective a measure of journal quality as we can find (and we will come to that soon). Next we can use the growing information from citations to update that initial estimate to form instead a considered, more informed estimate (in Bayesian language, the ‘posterior probability’) of its quality. The more an article is cited by others the more, in general, it can be said to be having important influence – and in turn the greater is the probability that it is world-leading and hitting the requirements for a 4* submission. The system suggested is no different from the standard process of ‘Bayesian updating’ taught in universities and used in a host of published articles within Economics. In particular, work exists indicating how firms should (or do, if they are assumed to behave optimally) use such rules to make complex decisions under uncertainty (Gill and Sgroi 2012).
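One textbook setting in which Bayesian updating is literally a weighted average is the normal model with known precisions. The sketch below is an illustrative assumption, not necessarily the specification used in Sgroi and Oswald (2013). Here $q$ is the unknown quality of an article, $m$ is the prior estimate taken from journal quality, $\tau_0$ is the precision of that prior, and $x_t$ is a noisy citation-based signal observed in year $t$ with precision $\tau$. All of these symbols are introduced here for illustration.

```latex
\[
\begin{aligned}
q &\sim \mathcal{N}\!\left(m, \tfrac{1}{\tau_0}\right), \qquad
x_t \mid q \sim \mathcal{N}\!\left(q, \tfrac{1}{\tau}\right), \quad t = 1, \dots, n, \\[4pt]
\mathbb{E}\left[q \mid x_1, \dots, x_n\right] &= w_n \, m + (1 - w_n)\, \bar{x}_n,
\qquad w_n = \frac{\tau_0}{\tau_0 + n\tau},
\qquad \bar{x}_n = \frac{1}{n} \sum_{t=1}^{n} x_t .
\end{aligned}
\]
```

The weight $w_n$ on the journal-based prior falls as the number of years of citation data $n$ grows. If the prior is precise relative to a single year of citations ($\tau_0$ large relative to $\tau$), almost all the weight sits on the journal in the early years, as the argument above suggests.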
In any problem under uncertainty, you can think of yourself as having some prior beliefs and then adding whatever data you can find to generate a better estimate. Consider, for instance, deciding which mobile phone to purchase. The best choice depends upon lots of unknowns (which new apps will appear, how much will you use it in your free time and how much for work, is screen size going to matter or battery life?). You have some prior idea based on your phone usage over the last few years, but then you will start to gain more information about your future use – perhaps as your boss reveals the sort of work you are going to undertake, or perhaps because more of your friends are using Facebook so a good mobile app seems more important than it did. So you update, and as you do your chances of getting that decision right will increase. You may not use Bayes’ rule consciously, but (if Economics is even close to correct) the rules you use internally are not far off, and are based on a lifetime of trying to get such decisions right. Moreover, if this were a multimillion-pound decision by a firm, it would likely hire people who can incorporate all the available information and use optimal (Bayesian) rules to get the decision right, since error is likely to cost money. The REF panel’s verdict will likewise be worth many millions of pounds to the university sector, so getting it right matters.
Bayes’ rule itself is simple but powerful. It spells out the precise weights you need based on how you see the importance of the prior (journal quality) and the news (citations). Our published paper (Sgroi and Oswald 2013) demonstrates exactly how Bayes’ rule is used in this context, but first there is the matter of how to find the right data.
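For readers who want the rule written out, here is its standard form for a yes/no signal, together with a worked example. The notation and the numbers are assumptions chosen for illustration and are not taken from the published paper. Let $p$ be the prior probability that a submission is world-leading, $a$ the probability of seeing good citation news if it is world-leading, and $b$ the probability of seeing good news if it is not.

```latex
\[
\Pr(\text{world-leading} \mid \text{good news})
  = \frac{p \, a}{p \, a + (1 - p)\, b}
\]

% Worked example with assumed values p = 0.6, a = 0.8, b = 0.4:
\[
\frac{0.6 \times 0.8}{0.6 \times 0.8 + 0.4 \times 0.4}
  = \frac{0.48}{0.64} = 0.75
\]
```

In this example, good news raises the estimate from 0.6 to 0.75. The more sharply the news discriminates between the two cases (the further $a$ is from $b$), the larger the move away from the prior.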
Journal quality and citations: Finding the right data
The practical issue is then how to find the right prior (the journal quality) and how to judge citations data. There are a growing number of sources of good citations data such as Google Scholar, the Web of Science (published by Thomson Reuters), and Scopus. The journal quality is more contentious, but probably the best approach is to use an existing measure. To that end, Table 2 presents some published measures from Kalaitzidakis et al. (2003), Palacios-Huerta and Volij (2004), and Ritzberger (2009) respectively for some example journals beginning with the letter ‘A’, and column (4) presents some numbers from the new Eigenfactors method from the Thomson Reuters ISI Web of Knowledge 2010 Social Sciences Database (subcategory Economics). A fuller list (going beyond the letter ‘A’!) and much more detail is available from the paper (Sgroi and Oswald 2013).
Table 2. Journals beginning with the letter ‘A’
These need to be converted into Bayesian priors, which requires a simple transformation to be applied. Table 3 gives some possible ways forward with columns (1)–(4) directly transforming the equivalent columns in Table 2 following a simple rule designed to generate numbers between 0.05 and 0.95 – with column (5) providing a simple average.
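The exact transformation rule is not spelled out here, so the Python sketch below assumes one simple possibility: a linear rescaling of each measure so that the lowest-scoring journal maps to 0.05 and the highest to 0.95, followed by a simple average across measures. The journal names, measure names, and scores are all hypothetical.

```python
def to_prior(score, lowest, highest, floor=0.05, ceiling=0.95):
    """Linearly rescale a journal score into the interval [floor, ceiling]."""
    if highest == lowest:
        # All journals score the same on this measure: use the midpoint.
        return (floor + ceiling) / 2
    fraction = (score - lowest) / (highest - lowest)
    return floor + fraction * (ceiling - floor)


# Hypothetical journal scores under two made-up quality measures.
scores = {
    "Journal X": {"measure_1": 100.0, "measure_2": 40.0},
    "Journal Y": {"measure_1": 50.0, "measure_2": 10.0},
    "Journal Z": {"measure_1": 0.0, "measure_2": 25.0},
}

measures = ["measure_1", "measure_2"]
priors = {journal: {} for journal in scores}

# Rescale each measure separately, since the measures use different scales.
for measure in measures:
    values = [scores[journal][measure] for journal in scores]
    lowest, highest = min(values), max(values)
    for journal in scores:
        priors[journal][measure] = to_prior(scores[journal][measure], lowest, highest)

# Final column: a simple average of the per-measure priors.
for journal, by_measure in priors.items():
    by_measure["average"] = sum(by_measure[m] for m in measures) / len(measures)

for journal, by_measure in priors.items():
    print(journal, {name: round(value, 3) for name, value in by_measure.items()})
```

Keeping the priors strictly inside 0 and 1 matters for what follows: a prior of exactly 0 or 1 could never be revised by any amount of citation news.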
Table 3. Possible Bayesian priors for journals beginning with the letter ‘A’
Once we have our prior we need to update it to get to our more informed belief. Again, keeping things simple, we could take as ‘news’ how a particular submission is doing relative to some average. So if an article is getting more cites on Google Scholar or the Web of Science than average this is good news, otherwise bad news. Of course, more complex measures exist incorporating the precise number of cites and various controls for the field in question or the origin of the citation. However, even just the number relative to the average (a pretty coarse measure) should move us in the right direction – and over time get us to something close to the truth. There are limitations – sometimes a great work goes uncited for long periods of time (take Bachelier 1900 for example), and sometimes papers are cited merely to point out errors – so it may take a while before the truth is revealed.
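A minimal Python sketch of this coarse updating scheme follows. It treats each year's news as a yes/no signal (cited more than average, or not) and applies Bayes' rule once per year. The two likelihood values are assumptions picked for illustration, not estimates from the paper, and the function names are hypothetical.

```python
def update(prior, good_news, p_good_if_leading=0.7, p_good_if_not=0.4):
    """Apply Bayes' rule once for a binary citation signal.

    The two likelihoods are assumed values: the chance of above-average
    citations in a year if the paper is world-leading, and if it is not.
    """
    if good_news:
        like_leading, like_not = p_good_if_leading, p_good_if_not
    else:
        like_leading, like_not = 1 - p_good_if_leading, 1 - p_good_if_not
    numerator = prior * like_leading
    return numerator / (numerator + (1 - prior) * like_not)


def posterior_path(prior, yearly_news):
    """Return the belief after each year, starting from the journal-based prior."""
    path = [prior]
    for good_news in yearly_news:
        path.append(update(path[-1], good_news))
    return path


# Hypothetical paper: prior of 0.5, then five years of news (True = above average).
path = posterior_path(0.5, [True, True, False, True, True])
for year, belief in enumerate(path):
    print(f"after {year} year(s) of citations: {belief:.3f}")
```

Each additional year moves the belief further from the journal-based starting point, which is the sense in which citations gradually take over from journal quality. A single bad year pulls the estimate down without undoing the earlier good news.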
The REF panel will have to review thousands of submissions in a short space of time, and rightly or wrongly will end up relying on rules. The best rule for the job is provided by the Reverend Thomas Bayes. Bayesian methods are ubiquitous within Economics as a discipline, and we assume that (subconsciously at least) the use of Bayes’ rule is widespread. To make sense of this, we need some prior beliefs about the quality of an academic submission, and some data to add to that prior to give us a sensible posterior belief. The obvious prior is where that submission was published. If we do not believe that is important, then why are some journals classified as better than others, and why do we try so hard to publish in an elite group of top-rated journals? Why do publications in some make careers while others go unnoticed? Clearly over time – as real data on the importance of a paper accrues – the location of the paper becomes less important, so we would expect to see the journal take a backseat to citations data as an indicator of quality.
The implications for the REF are clear – rather than pretending to ignore information, it would be better to embrace the data that is out there and do so in a measured, consistent, and sensible way. Beyond the REF panel into the broader world – where candidates for jobs, for promotions, or for prestigious awards need to be selected – the exact same methods make sense (just as economists assume they are being used by sensible optimising agents in the first place).
It remains to say that of course there is a central role for deliberation and for judgment, which can sit on top of any Bayesian approach – but that does not detract from the fact that with thousands of submissions and limited time and manpower, rules will inevitably be used, subconsciously if not consciously – so why not use the rules based on the methods we teach as optimal to our own students?
This originally appeared on Vox and is reposted with permission.
Note: This article gives the views of the author, and not the position of the Impact of Social Science blog, nor of the London School of Economics.
Daniel Sgroi is Associate Professor of Economics at the University of Warwick.