Metric analysis

Reliability of the h-index, a widely used researcher metric, is declining

Evolving authorship patterns mean the h-index is no longer an effective way to gauge a scientist’s impact, according to a new study by data scientists at tech giant Intel.

First proposed in 2005 by the American physicist Jorge Hirsch, the h-index is a measure based on a researcher’s most-cited articles. A scientist with an h-index of 30 has published 30 papers that have each been cited at least 30 times.
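For illustration, here is a minimal Python sketch of that definition; the function name and the example citation counts are hypothetical and not taken from the study.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has at least `rank` citations
        else:
            break
    return h

# Five papers cited 10, 8, 5, 2 and 1 times give an h-index of 3:
print(h_index([10, 8, 5, 2, 1]))  # -> 3
```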

Due to its relative simplicity, the h-index has become a widely used tool to quantify the impact of scientists in their fields. But its use has always been controversial. “Since its introduction, it has been heavily criticized by professional bibliometricians,” says Lutz Bornmann, a research evaluation expert based at the Max Planck Society in Munich, Germany.

Critics of the h-index argue that it unfairly penalizes early-career researchers, who have had less time than their older colleagues to publish papers and accumulate citations. The metric also ignores differences in publication rates across academic fields and may even encourage poor publishing practices, such as excessive self-citation and the addition of authors who have contributed little to a paper. The h-index also ignores important aspects of academic life beyond publication, such as leadership roles, teaching and outreach. “Nonetheless, it has become a popular indicator, especially among amateur bibliometricians,” says Bornmann.

Probing the h-index

Despite these problems, the h-index still appears in popular scholarly databases and, in some cases, can influence important recruitment and funding decisions that affect researchers’ careers. Vladlen Koltun, chief scientist at Intel’s Intelligent Systems Lab, says he and his colleagues noticed inconsistencies when examining the h-indices of researchers in various fields.

“We set out to probe the h-index and ask whether this is really the best metric we can come up with, because it is being used whether we like it or not,” says Koltun. “It’s used for educational purposes in the same way that we used it, but also, perhaps more importantly, it’s used by various committees that assess scientists for awards, promotions and so on.”

Koltun and his colleague David Hafner used computational tools to analyze citation data from millions of articles in four different scientific fields. “We collected data with temporal annotations, so that we could follow the evolution of a researcher’s h-index over time: we know what the researcher’s h-index was in 2010, 2019, 1998,” explains Koltun. “And we did it at the scale of thousands of researchers.”

They then cross-referenced the data with lists of winners of various science awards and members of national academies, which Koltun says serve as a proxy for scientists’ reputations within their communities.

“So we can look at the correlation in real time: does the h-index correlate with reputation right now?” asks Koltun. “But even more interesting to me, we could ask questions like, ‘Does the h-index predict reputation in the future?’ Because that’s actually how it’s used… the most important use of these metrics is to make decisions like who should we hire?”
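To make that kind of question concrete, here is a toy sketch, not the study’s actual method or data: it checks whether researchers’ h-indices in a given year line up with later recognition, using a simple rank correlation on entirely made-up numbers.

```python
import numpy as np
from scipy import stats

# Hypothetical data: each researcher's h-index in a given year, and whether
# they later received a major award or academy membership (1) or not (0).
h_in_2010 = np.array([45, 12, 33, 8, 27, 51, 19, 24])
awarded_later = np.array([1, 0, 0, 0, 1, 1, 0, 1])

# Spearman rank correlation: does a higher h-index ranking go with later recognition?
rho, p_value = stats.spearmanr(h_in_2010, awarded_later)
print(f"rank correlation = {rho:.2f} (p = {p_value:.2f})")
```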

Predictive power drops

According to Koltun’s analysis, when the h-index was first created it was a reasonably good indicator of who might win future awards. But this “predictive power” has waned over the years. “So much so that now the correlation between the rankings induced by the h-index in physics, for example, and the rankings induced by the awards and recognition of that academic community is zero; there is simply no correlation,” Koltun explains.

One of the reasons for this is the growing number of large scientific collaborations, says Koltun. He points out that hyperauthorship, a growing phenomenon in which global research consortia produce papers with thousands of co-authors, allows people to build up huge h-indices very quickly.

“What our data also show is that hyperauthorship is simply an extreme manifestation of a larger shift in patterns of authorship and publication. In general, people publish more, people co-write more, author lists are growing,” says Koltun. “And if you don’t take that into account, you get metric inflation and h-index inflation across the board.”

Koltun and Hafner propose a new metric, the “h-frac”, to address this problem. The h-frac allocates a fraction of each paper’s citations to each author, depending on the number of co-authors on the article. “It’s more reliable than the h-index… Even when we go back to 2005, when the h-index was introduced, h-frac was already more reliable, but the gap has widened considerably because the reliability of the h-index has fallen off a cliff,” says Koltun.
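The article does not spell out the exact formula, but a plausible reading is that h-frac is computed like the h-index after each paper’s citations have been divided equally among its co-authors. Here is a minimal Python sketch under that assumption; all numbers are illustrative and the exact definition in the paper may differ.

```python
def h_frac(papers):
    """papers: list of (citations, n_authors) pairs.
    Split each paper's citations equally among its co-authors, then apply
    the usual h-index rule to the fractional counts. This is an assumed
    reading of the metric, not necessarily the paper's exact definition."""
    fractional = sorted((c / n for c, n in papers), reverse=True)
    h = 0
    for rank, cites in enumerate(fractional, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# A paper with 300 citations but 100 co-authors contributes only 3 fractional
# citations, so it barely moves h-frac even though it would boost the h-index:
print(h_frac([(300, 100), (40, 2), (30, 3), (9, 1)]))  # fractional counts 3, 20, 10, 9 -> 3
```

Under this reading, hyperauthored papers are heavily discounted, which matches the behaviour the article attributes to the metric.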

Both the h-index and the h-frac seek to determine which researchers have made the greatest cumulative contribution to their field over their lifetime. But the Intel team is also keen to see whether similar metrics can provide insight into which groups are currently doing the most innovative work or consistently delivering groundbreaking results. In their latest study, currently available as a preprint ahead of peer review, Koltun and Hafner suggest another metric to address this problem, CAP, which assesses the impact of a researcher’s work against their publication volume.

Since 2005, more than 50 alternatives to the h-index have been proposed, but none has gained much practical traction, says Bornmann, who is not convinced that new variants will become important indicators. He points out that the Web of Science database recently adopted beamplots, a data-visualization tool that Bornmann’s team helped develop and that illustrates a researcher’s publication history over time. Clarivate, which runs Web of Science, hopes such tools “will take us away from narrowing down to a single point metric and force us to think about why citation performance is as it is.”

Koltun and Hafner acknowledge calls to abandon simplified citation-based metrics and agree that, ideally, researchers’ work would be evaluated in depth. But with the use of such metrics “more prevalent than ever”, they argue that better metrics are needed. They hope their findings “can inform the science of science and support further quantitative analysis of research, publication and scientific achievement.”