Tuesday, April 15, 2014

Bibliometrics and Science - Failure to Understand the Basics of the Discovery Process

So I've read yet another well-meaning article on bibliometrics (like the h-index) and why they might be OK for evaluating groups or sub-disciplines (disagree) but are definitely not OK for evaluating individuals (agree, but for a different reason).

This time, the paper "Bibliometric Indicators of Young Authors in Astrophysics: Can Later Stars be Predicted?" hit the Twittersphere, and so caught my attention.

Look, all these papers treat the research publications world like some high school statistics project. OK, why not raise the game a bit?

Let's suppose that scientific discovery is a complex natural phenomenon. Let's suppose there is such a thing as progress :-)

OK, so what would the time series of discovery look like? My simple-minded hypothesis is that it is (like many other natural processes in a complex world) a self-similar arrival process.
So how do we characterise such a time series? Well, it isn't captured in a single statistic like "mean", or even two (mean + variance). The point of such essentially fractal structures in time is that they are characterised by very complex descriptors and, crucially, that prediction is hard. That is exactly why the weather, and associated phenomena like flooding, and volcanic eruptions, are hard to predict on an individual basis, although, collectively, we can model broad trends. Surprise surprise (literally and figuratively :)
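
To make the "mean + variance isn't enough" point a bit more concrete, here is a rough sketch, not a model of real publication data. It compares a plain Poisson arrival process with a bursty one that has the same mean gap between arrivals. The bursty process uses Pareto-distributed gaps as a crude stand-in for self-similar arrivals; that choice, the tail index `alpha = 1.5`, the horizon, the window sizes, and the function names are all my assumptions for illustration. The quantity printed is the variance-to-mean ratio of counts per window, which should sit near 1 at every timescale for Poisson arrivals and would be expected to grow with the window size for the heavy-tailed case.

```python
import random
import statistics


def arrival_times(draw_gap, horizon):
    """Accumulate gaps from `draw_gap()` into arrival times up to `horizon`."""
    t, times = 0.0, []
    while True:
        t += draw_gap()
        if t > horizon:
            return times
        times.append(t)


def dispersion(times, horizon, window):
    """Variance-to-mean ratio of the number of arrivals per window."""
    n_bins = int(horizon // window)
    counts = [0] * n_bins
    for t in times:
        i = int(t // window)
        if i < n_bins:
            counts[i] += 1
    mean = statistics.fmean(counts)
    return statistics.pvariance(counts) / mean if mean > 0 else float("nan")


rng = random.Random(0)           # fixed seed so the run is repeatable
horizon = 200_000.0              # assumed length of the simulated timeline
alpha = 1.5                      # assumed Pareto tail index (infinite variance)
mean_gap = alpha / (alpha - 1)   # mean of a Pareto(alpha) gap with minimum 1

# Same mean gap, very different structure in time.
poisson = arrival_times(lambda: rng.expovariate(1 / mean_gap), horizon)
bursty = arrival_times(lambda: rng.paretovariate(alpha), horizon)

results = {}
for window in (10, 100, 1000):
    results[window] = (
        dispersion(poisson, horizon, window),
        dispersion(bursty, horizon, window),
    )
    print(window, results[window])
```

The two processes have the same long-run arrival rate, so a summary based on the mean alone cannot tell them apart; it is only by looking across timescales that the burstiness shows up.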

So science doesn't depend on a random walk in a well-structured but sparse, or even Poisson point, random space, where walking faster gets you more results. Nor does success depend on hard work (more sweat, more kudos). While a slightly more random walk might get you an inherently more surprising result, it isn't necessarily going to yield more results. And more work only pays off after the discovery, when you want to present it properly (I am sure history is littered with holes left by discoveries that were cool, but so badly reported they were ignored and lost).

So predicting the next big discovery by a specific scientist is a bit like saying that a raindrop is going to fall on a particular rain gauge at a particular minute of a specific hour on a specific day. OK if you are the bookie setting the odds, but I wouldn't bet on it.
