There are many ways for research to be influential, not just citations
Mike Calver
Environmental and Conservation Sciences, Murdoch University, Murdoch, WA 6150, Australia.
Pacific Conservation Biology 28(6) 459-461 https://doi.org/10.1071/PC22041
Submitted: 25 October 2022 Accepted: 28 October 2022 Published: 29 November 2022
© 2022 The Author(s) (or their employer(s)). Published by CSIRO Publishing
Abstract
Research may be influential without stimulating researchers to cite it in a manuscript.
Recently, Paul Boon contacted me regarding an email he had received from a US colleague about his recent paper in Pacific Conservation Biology on the mental health of conservation biologists (Boon 2022). Paul’s correspondent thanked him for the paper, explaining that it had been the basis of useful discussions at a lab meeting. It is great to see work used in such ways, although sadly such valuable contributions are invisible in a research evaluation environment geared to metrics. It set me thinking about why metricated evaluations are attractive, what can go wrong with them, and how feedback such as Paul received might be documented and fed into research evaluations, especially those used for appointments and career progression.
Why metricated evaluations are attractive
Metricated evaluations, such as citation counts or the Journal Impact Factor (JIF), are one means to evaluate the quality of research or the merits of researchers. Superficially, they have several seductive properties. According to their advocates, citations show networks of significant/influential thinkers and ideas (Davies and Calma 2019), ‘the number of citations reflects an article’s influence and therefore quality’ (Wade 1975, p. 429), citations are objective measures (Bavelas 1978), they ‘… reflect the dynamic interplay of interests of both scholars and their institutions’ (van Wesel 2016, p. 199) and allow researchers to screen papers for importance before reading them (Sud and Thelwall 2014). With regard to journal ranking, the JIF is held to quantify research trends and provide a valuable means of prioritising funds for library subscriptions (Jacso 2012), as well as having a simple, clear definition (Bollen et al. 2009). In sum, these measures answer regulatory demand for ensuring that public research money is spent responsibly, often more quickly and cost-effectively than could be achieved by peer review (Hodge and Lacasse 2011; Buela-Casal and Zych 2012), while providing the seeming objectivity and reliable quantification of a numerical system (Adler et al. 2008).
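For readers unfamiliar with the metric, the standard two-year JIF amounts to a mean citation rate over a journal’s recent output; a minimal sketch of the usual definition (the precise rules for what counts as a ‘citable item’ are Clarivate’s, and the symbols below are introduced here only for illustration) is

\[ \mathrm{JIF}_{y} = \frac{C_{y}(y-1) + C_{y}(y-2)}{N_{y-1} + N_{y-2}} \]

where \(C_{y}(t)\) is the number of citations received in year \(y\) by items the journal published in year \(t\), and \(N_{t}\) is the number of citable items the journal published in year \(t\).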
Nevertheless, even advocates acknowledge that assessments based on metrics are highly sensitive to the methods used in individual studies: ‘An analyst (sic) of the results should keep in mind that the identification of landmark papers depends on the used methods and data. Small differences in methods and or data may lead to other results’ (Thor et al. 2021, p. 419). As an example, Miccoli and Rumiati (2019) used metrics to claim a significant increase in Italian scientific productivity, whereas Baccini et al. (2019a, 2019b) and Abramo et al. (2021) argued that the results reflected manipulation of the metrics through self-citation or citation clubs. In another case, Butler (2003, 2017) and Martin (2017) claimed that output-based research funding in Australia led to more papers of lower quality, whereas van den Besselaar et al. (2017) disagreed, claiming that the data showed increases in both quantity and quality of outputs.
Problems with metricated evaluations
‘When a measure becomes a target, it ceases to be a good measure’ (Goodhart’s Law, as quoted by Crawford 2017). Whether or not this statement was actually made by Charles Goodhart (a British economist active in the late 20th century), it goes to the core of the problem: academics anxious to maximise their prospects may concentrate on good scores, not on good research (Fire and Guestrin 2019). This shift in focus could change what topics they research, what they publish and where they publish. Topic choice could narrow to subjects likely to garner citations or to appeal to perceived top-quality journals, with some authors already offering guides to publishing highly cited papers (Pyke 2013, 2014). Thus, according to some critics: ‘The race for higher rankings has resulted in blind pursuit of scientific publications, with the sole focus being the publication itself, with little or no significance given to the scientific part of it’ (Lamba 2021, p. 176), with ‘The consequence... that many papers have become competitive tokens for insertion into grant-dispensing gambling machines rather than bricks in the edifice of science’ (Lawrence and Locke 1997, p. 758).
Instead of positive outcomes in research outputs, the consequences can be a narrowing of research scope (Martin 2011, 2012), a preference for productive ‘safe research’ over genuine innovation (Martin 2000; Charlton and Andras 2008), an obsession with personal metrics with characteristics of a psychological disorder (Buela-Casal 2014), encouragement of aggressive, acquisitive and exploitative behaviour (Lawrence 2002; Fong and Wilhite 2017) and possibly an incentive to fraud (Chevassus-au-Louis 2019). Anyone who doubts these points need look no further than the attempted manipulation of statistics in journal evaluation (Falagas and Alexiou 2008), Fire and Guestrin’s (2019) critique of publication metrics as an example of Goodhart’s Law in action, or what Biagioli (2016, p. 201) called ‘metrics-enabled fraud’ or ‘post-production misconduct’ in which authors ‘use fraudulent means to secure their publication, enhance their impact and inflate … importance.’
There may also be personal costs for academics struggling to meet performance guidelines based on citation metrics (Parr 2014). Some critics have suggested that metrication is encouraging overproduction of papers at the expense of genuine quality and that there should be a move away from simple productivity metrics to close evaluation of subsets of total output (e.g. Pacchioni 2018; Chevassus-au-Louis 2019). Alan Finkel, Australia’s Chief Scientist in 2019, exhorted us ‘… to heed growing calls to abandon paper counting and similar metrics for evaluating researchers’ (Finkel 2019). The challenge is to create a framework that encourages genuine quality and productivity while minimising risks of misrepresentation (Fire and Guestrin 2019). We have good international guidelines in the Leiden Manifesto (Hicks et al. 2015) and the 2012 San Francisco Declaration on Research Assessment (DORA) (https://sfdora.org), which at least some Australian universities are embracing (https://www.unimelb.edu.au/newsroom/news/2020/july/university-signs-up-to-international-agreement-for-best-practice-in-research-assessment).
Where to next?
Irrespective of criticism and debate surrounding their use, traditional metrics such as the JIF remain the dominant criterion in academic review, promotion and tenure in at least North America (McKiernan et al. 2019) and possibly the UK (Else 2021), although very different assessments of academic work are emerging outside the English-speaking world. For example, the ‘Room for everyone’s talent’ initiative of several Dutch universities (https://www.universiteitenvannederland.nl/recognitionandrewards/wp-content/uploads/2019/11/Position-paper-Room-for-everyone’s-talent.pdf) ‘… calls for a system of recognition and rewards of academics and research that:
- Enables the diversification and vitalisation of career paths, thereby promoting excellence in each of the key areas [education, research, impact, leadership and (for University Medical Centres) patient care];
- Acknowledges the independence and the individual qualities and ambitions of academics as well as recognising team performances;
- Emphasises quality of work over quantitative results (such as number of publications);
- Encourages all aspects of open science; and
- Encourages high quality academic leadership.’
Under such a system, measures of research success other than highly cited papers in traditional journals could be advanced.
This brings us back to the comment Paul received on his paper: it set people reflecting, even though they may never write a paper that cites Paul’s work. Such feedback is well worth noting in a detailed assessment of a subset of papers submitted for review, although it would be missed in a larger, metricated analysis. Perhaps it is time to take such alternative approaches more seriously.
Conflicts of interest
The author is the Editor-in-Chief of Pacific Conservation Biology.
Declaration of funding
This research did not receive any specific funding.
References
Abramo, G, D’Angelo, CA, and Grilli, L (2021). The effects of citation-based research evaluation schemes on self-citation behavior. Journal of Informetrics 15, 101204.
Adler R, Ewing J, Taylor P (2008) Citation statistics. A report from the International Mathematical Union (IMU) in cooperation with the International Council of Industrial and Applied Mathematics (ICIAM) and the Institute of Mathematical Statistics (IMS). (International Mathematical Union)
Baccini, A, De Nicolao, G, and Petrovich, E (2019a). Citation gaming induced by bibliometric evaluation: a country-level comparative analysis. PLoS ONE 14, e0221212.
Baccini, A, Petrovich, E, and De Nicolao, G (2019b). Evaluating Italy’s ranking boom. Nature 576, 213.
Bavelas, JB (1978). The social psychology of citations. Canadian Psychological Review/Psychologie Canadienne 19, 158–163.
Biagioli, M (2016). Watch out for cheats in citation game. Nature 535, 201.
Bollen, J, Van de Sompel, H, Hagberg, A, and Chute, R (2009). A principal component analysis of 39 scientific impact measures. PLoS ONE 4, e6022.
Boon, PI (2022). Is poor mental health an unrecognised occupational health and safety hazard for conservation biologists and ecologists? Reported incidences, likely causes and possible solutions. Pacific Conservation Biology.
Buela-Casal, G (2014). Pathological publishing: a new psychological disorder with legal consequences? The European Journal of Psychology Applied to Legal Context 6, 91–97.
Buela-Casal, G, and Zych, I (2012). What do the scientists think about the impact factor? Scientometrics 92, 281–292.
Butler, L (2003). Explaining Australia’s increased share of ISI publications – the effects of a funding formula based on publication counts. Research Policy 32, 143–155.
Butler, L (2017). Response to van den Besselaar et al.: what happens when the Australian context is misunderstood. Journal of Informetrics 11, 919–922.
Charlton, BG, and Andras, P (2008). ‘Down-shifting’ among top UK scientists? – The decline of ‘revolutionary science’ and the rise of ‘normal science’ in the UK compared with the USA. Medical Hypotheses 70, 465–472.
Chevassus-au-Louis N (2019) ‘Fraud in the lab: the high stakes of scientific research.’ (Harvard University Press: Cambridge, MA)
Crawford, SM (2017). Goodhart’s law: when waiting times became a target, they stopped being a good measure. BMJ 359, j5425.
Davies, M, and Calma, A (2019). Australasian Journal of Philosophy 1947–2016: a retrospective using citation and social network analyses. Global Intellectual History 4, 181–203.
Else, H (2021). Row erupts over university’s use of research metrics in job-cut decisions. Nature 592, 19.
Falagas, ME, and Alexiou, VG (2008). The top-ten in journal impact factor manipulation. Archivum Immunologiae et Therapiae Experimentalis 56, 223–226.
Finkel, A (2019). To move research from quantity to quality, go beyond good intentions. Nature 566, 297.
Fire, M, and Guestrin, C (2019). Over-optimization of academic publishing metrics: observing Goodhart’s Law in action. GigaScience 8, giz053.
Fong, EA, and Wilhite, AW (2017). Authorship and citation manipulation in academic research. PLoS ONE 12, e0187394.
Hicks, D, Wouters, P, Waltman, L, de Rijcke, S, and Rafols, I (2015). Bibliometrics: the Leiden Manifesto for research metrics. Nature 520, 429–431.
Hodge, DR, and Lacasse, JR (2011). Ranking disciplinary journals with the Google Scholar h-index: a new tool for constructing cases for tenure, promotion, and other professional decisions. Journal of Social Work Education 47, 579–596.
Jacso, P (2012). Grim tales about the impact factor and the h-index in the Web of Science and the Journal Citation Reports databases: reflections on Vanclay’s criticism. Scientometrics 92, 325–354.
Lamba, I (2021). Losing the numbers game: revisiting quality metrics through the spectrum of Goodhart’s law. European Journal of Emergency Medicine 28, 176–177.
Lawrence, PA (2002). Rank injustice. Nature 415, 835–836.
Lawrence, PA, and Locke, M (1997). A man for our season. Nature 386, 757–758.
Martin, B (2000). Research grants: problems and options. Australian Universities’ Review 43, 17–22.
Martin, B (2011). ERA: adverse consequences. Australian Universities’ Review 53, 99–102.
Martin B (2012) Breaking the siege: guidelines for struggle in science. In ‘Science under siege: zoology under threat’. (Eds P Banks, D Lunney, C Dickman) pp. 164–170. (Royal Zoological Society of NSW: Mosman, NSW)
Martin, BR (2017). When social scientists disagree: comments on the Butler-van den Besselaar debate. Journal of Informetrics 11, 937–940.
McKiernan, EC, Schimanski, LA, Nieves, CM, Matthias, L, Niles, MT, and Alperin, JP (2019). Meta-research: Use of the Journal Impact Factor in academic review, promotion, and tenure evaluations. eLife 8, e47338.
Miccoli, P, and Rumiati, RI (2019). Italy’s evaluators: rankings boom is real. Nature 574, 486.
Pacchioni G (2018) ‘The overproduction of truth: passion, competition and integrity in modern science.’ (Oxford University Press: Oxford)
Parr C (2014) ‘Imperial College London to ‘review procedures’ after death of academic.’ (Times Higher Education)
Pyke, GH (2013). Struggling scientists: please cite our papers! Current Science 105, 1061–1066.
Pyke, GH (2014). Achieving research excellence and citation success: what’s the point and how do you do it? BioScience 64, 90–91.
Sud, P, and Thelwall, M (2014). Evaluating altmetrics. Scientometrics 98, 1131–1143.
Thor, A, Bornmann, L, Haunschild, R, and Leydesdorff, L (2021). Which are the influential publications in the Web of Science subject categories over a long period of time? CRExplorer software used for big-data analyses in bibliometrics. Journal of Information Science 47, 419–428.
van den Besselaar, P, Heyman, U, and Sandström, U (2017). Perverse effects of output-based research funding? Butler’s Australian case revisited. Journal of Informetrics 11, 905–918.
van Wesel, M (2016). Evaluation by citation: trends in publication behavior, evaluation criteria, and the strive for high impact publications. Science and Engineering Ethics 22, 199–225.
Wade, N (1975). Citation analysis: a new tool for science administrators. Science 188, 429–432.