I’ve been thinking a lot about my h-index lately. This is the sort of thing that passes for vanity and ego-surfing in the academic community. For those not in the know, your h-index is computed by ranking all of your papers by how often they’ve been cited. In practice this is fairly easy to do with the ISI Web of Knowledge (a campus subscription is typically required). Your h-index is the largest number h such that your h-th most highly cited paper has been cited at least h times.

Example: Bob has written 4 papers. One was cited 12 times, one cited 3 times, one cited 2 times, and one cited once:

  1. 12
  2. 3
  3. 2
  4. 1

Since his second most cited paper was cited 3 times (3 ≥ 2), but his third most cited paper was only cited 2 times (2 < 3), his h-index is 2. This number has started to take on an almost magical significance in the scientific community, because it measures both the productivity and the influence of a scientist at once. And as ubiquitous as it's become, I was kind of surprised to learn that it's only been on the scene since 2005, when Jorge Hirsch proposed it as a metric for measuring the impact of a scientist. His perspective was perhaps somewhat skewed by considering his own field (and mine), theoretical physics. For example, he showed that in general, the number will increase linearly with time, an argument that seems to be true. For someone in the field who’s been publishing for n years, he suggested:

$$h \approx m\,n$$

where m is some multiplier. His own perspective was that a productive scientist should have m=1, and that a truly unique scientist might have m=3. The metric is meant, at least in principle, to help decide whom to hire, promote to tenure, or invite as your colloquium speaker. For what it’s worth, my first paper of any consequence was published in 1998 (11 years ago), and my h-index is 12.
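For anyone who wants to check their own numbers, here is a minimal Python sketch of the ranking rule described above, applied to Bob's four papers, along with the multiplier m taken simply as h divided by years of publishing. The function names `h_index` and `hirsch_m` are made up for illustration and do not come from any citation database or library.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the paper at this rank still has enough citations
        else:
            break      # every later paper has even fewer citations
    return h


def hirsch_m(h, years_publishing):
    """Multiplier m, assuming h grows roughly linearly with years of publishing."""
    return h / years_publishing


bob = [12, 3, 2, 1]
print(h_index(bob))                # 2
print(round(hirsch_m(12, 11), 2))  # 1.09, for h = 12 after 11 years
```

By this rough measure, an h-index of 12 after 11 years corresponds to m just above 1.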

It’s not without a number of problems, however. For one, a down-list author on a paper with a cast of thousands gets the same credit for a paper that the first author does, even though his/her contribution might have been minimal (or non-existent, in some collaborations).

Secondly, there’s nothing to prevent you from writing 1000 papers, each of which references nothing but all of your previous papers. I’ll leave it as a high-school level mathematical exercise to figure out what the resulting h-index would be. The point is that there are ways to game the system. Hirsch was adamant that self-citations should be excised, but in practice, nobody does this. It would just be too time-consuming. Therefore, everybody’s h gets inflated just a bit.
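If you would rather check your answer to that exercise numerically, the scenario can be sketched in a few lines of Python. This assumes the simplest reading of the setup: papers are numbered in order of publication, paper k cites every one of papers 1 through k-1, and nobody else cites anything. The helper names are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def self_citation_counts(n_papers):
    """Citation counts when each paper cites all of the author's earlier papers.

    Paper k (1-indexed) is cited once by every later paper,
    so it collects n_papers - k citations.
    """
    return [n_papers - k for k in range(1, n_papers + 1)]


# Run this to see the h-index obtained purely from self-citation.
print(h_index(self_citation_counts(1000)))
```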

Furthermore, a paper only gets to count once, no matter how often it’s cited, or how important it is. In principle, someone could publish a handful of groundbreaking papers and nothing more, and the h-index would indicate nothing more than that the researcher had phenomenally high signal-to-noise in his/her productivity.

Others (including Hirsch, actually) have noted that the h-index is almost redundant with the total number of citations, T. To a very good approximation:

$$h \approx \frac{\sqrt{T}}{2}$$

This happens to work pretty well for me. I have a total (according to the Astrophysics Data System, a pretty accurate reference for astronomers) of 456 citations, and sqrt(456)/2=10.7, a decent approximation to my true h-value.
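The arithmetic is easy to reproduce. The short Python sketch below applies the same square-root rule of thumb to the 456 citations quoted above and, for comparison, to Bob's four papers from the earlier example (12 + 3 + 2 + 1 = 18 citations, true h-index of 2). The function name is an illustrative choice, not an established one.

```python
import math


def h_from_total_citations(total_citations):
    """Rule-of-thumb estimate: h is roughly sqrt(total citations) / 2."""
    return math.sqrt(total_citations) / 2


print(round(h_from_total_citations(456), 1))  # 10.7
print(round(h_from_total_citations(18), 1))   # 2.1 (Bob's true h-index is 2)
```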

I bring up these objections, but at the end of the day, I actually think it’s a pretty good metric of the influence of a scientist. I’ve looked up a bunch of people I respect, and lo and behold, many of them have h-indices in the 20s or 30s (Ed Witten, the inventor of “M Theory”, has a phenomenal 112!).

On the other end, there are a number of famous scientists who have surprisingly low h-indices. Some have been active for 30 years or more, and yet have an h-index of less than 10 — some less than 5. I got to thinking about this as I foray into the world of popular science and realize that there is a definite disconnect between the scientists recognized by the world at large, and those recognized by their own field. There are some (e.g., Stephen Hawking, h-index=43), who clearly have enormous influence over both, but it’s something of a shame that for the most part it’s one or the other.

No conclusion, just a little something to think about.

