A provocative piece was recently published in Science concerning the relationship between paylines (i.e., priority scores) for grant submissions to NIH for bread-and-butter R01s and the resulting impact of publications arising from those grants. Now, I’m no NIH peer review wonk*, but I found some of the results a bit surprising.
In the research discussed, grants funded at NHLBI (National Heart, Lung, and Blood Institute) between 2001 and 2008 were divided into three tiers: better than 10th percentile (best), 10th to 20th percentile, and 20th to 42nd percentile. And, well, I think the figure in the article says it all:
The data suggest that a grant’s percentile score does not correlate at all with publications per grant or citations per million dollars spent. However, a closer look at some of the original research on which this Science article is based does actually suggest that there is a weak relationship between grants scored in the top 10th percentile and citations in the first 2 years post-publication (oddly, this fact was left out of the Science article). Nonetheless, this appears to confirm much of what was discussed in the comments section of a recent post by FunkyDrugMonkey, where priority scores seemed to be all over the place for most investigators, with no obvious relationship between self-perceived best and worst ideas. Taken together, it looks like the priority scores are an extremely poor predictor of “impact”, at least as judged by citation counts**.
So what does this all mean? Well, to me it suggests that either A) the peer review grant system at NIH is really rubbish or B) the system is wasting a lot of time and money in trying to discern the undiscernible. I suspect the answer is B (as pointed out in the comments section of the article by a Scott Nelson): these committees at NIH are getting a lot of really good quality proposals, and differentiating between them is an inherently impossible task, resulting in a lot of noise in the system. If all the proposals are roughly equal with respect to the quality of the proposed science, then what gets higher/lower scores is going to depend more on the idiosyncrasies of the reviewers/review panel than anything particularly objective or obvious (or to take a more cynical tone, the name of the PI on the grant application).
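To make the "noise" argument a bit more concrete, here is a toy simulation. It is only a sketch under assumptions of my own, not anything taken from the Science article: each grant has a hidden quality, the review score is that quality plus reviewer noise, and the eventual impact is that quality plus outcome noise. The function names and all of the standard deviations are made up for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_grants=2000, quality_sd=0.1, review_sd=1.0, outcome_sd=1.0, seed=0):
    """Correlation between review score and impact in a toy model.

    Hypothetical model: score = quality + reviewer noise,
    impact = quality + outcome noise. All parameters are invented.
    """
    rng = random.Random(seed)
    scores, impacts = [], []
    for _ in range(n_grants):
        quality = rng.gauss(0.0, quality_sd)  # hidden "true" quality
        scores.append(quality + rng.gauss(0.0, review_sd))
        impacts.append(quality + rng.gauss(0.0, outcome_sd))
    return pearson(scores, impacts)


if __name__ == "__main__":
    # Proposals nearly equal in quality: the score says little about impact.
    print("similar quality:", round(simulate(quality_sd=0.1), 3))
    # Proposals that differ a lot in quality: the score becomes informative.
    print("varied quality: ", round(simulate(quality_sd=3.0), 3))
```

In this toy model, when the spread in quality is small relative to the noise, the correlation between score and impact sits close to zero, and it only climbs when the proposals genuinely differ a lot. That is scenario B in miniature, though of course it says nothing about which scenario actually holds at NIH.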
If it is the case that NIH study sections are largely trying to discern the undiscernible, then it suggests that there would be straightforward ways in which to streamline the entire process. Perhaps after proposals are deemed to meet some acceptable scientific threshold, a subset of them is chosen randomly to be funded and another subset is chosen by program officers or others at the NIH based on certain strategic priorities, something like that. Seems like it could be a fairer, less expensive, and less time-intensive system that would result in similar outcomes.
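For the curious, the threshold-plus-lottery idea could be sketched roughly as follows. This is purely illustrative: the `allocate` function, the merit scores (here higher is better, unlike NIH percentiles), the threshold, and the proposal IDs are all hypothetical, and a real scheme would need to handle budgets, resubmissions, and much else.

```python
import random


def allocate(proposals, threshold, n_lottery, strategic_ids=(), seed=None):
    """Pick proposals to fund via a merit threshold plus a lottery.

    proposals: dict mapping a proposal ID to a hypothetical merit score
               (higher is better).
    threshold: minimum score needed to be considered at all.
    n_lottery: number of proposals to draw at random from the eligible pool.
    strategic_ids: IDs hand-picked for strategic priorities; they are only
                   funded if they also clear the threshold.
    """
    rng = random.Random(seed)
    wanted = set(strategic_ids)
    # Step 1: keep only proposals that meet the scientific threshold.
    eligible = sorted(pid for pid, score in proposals.items() if score >= threshold)
    # Step 2: strategic picks come out of the eligible pool first.
    strategic = [pid for pid in eligible if pid in wanted]
    # Step 3: draw the rest at random from the remaining eligible proposals.
    pool = [pid for pid in eligible if pid not in wanted]
    lottery = rng.sample(pool, min(n_lottery, len(pool)))
    return {"strategic": strategic, "lottery": sorted(lottery)}


if __name__ == "__main__":
    example = {"A": 0.90, "B": 0.80, "C": 0.40, "D": 0.85, "E": 0.95}
    print(allocate(example, threshold=0.7, n_lottery=2, strategic_ids=["B", "C"], seed=1))
```

Note that in this sketch a strategic pick that fails the threshold (proposal "C" in the example) is not funded, which matches the idea that everything funded should first clear the bar for acceptable science.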
Even if my ideas here are a bit off, findings such as these at the very least suggest that we need to check our assumptions with respect to the best and most efficient ways in which to assess grant applications. A priori, I certainly would have expected a reasonably strong positive correlation between priority scores and citations. I would love to hear how the findings from this work jibe with anecdotes of any readers out there.
*I’m very much a tyro when it comes to NIH grants, so feel free to take me to task in the comments section if I’m dead wrong on my understanding of the NIH grant peer review game.
**I’m not sold on the idea that citations are a good marker for impact. I think impact is a much more ephemeral concept than can be captured in any single metric or suite of metrics. Given the inherent unpredictability of science, true impact is not apparent when squinting into the bright light of the future, but only when taking stock of the past.