There should not be any mystery: a comment on sampling issues in bibliometrics
Subject: Sampling issues in bibliometrics
A research unit wants to assess whether its publications tend to be more cited than academic publications in general. This research unit could be anything from a lone researcher to a national research council sponsoring thousands of researchers. The unit has access to the (inverted) percentile ranks of n of its publications: each publication has an associated real number between 0 and 100, which measures the percentage of publications in its reference set that receive at least as many citations as it does. For instance, if the percentile rank of one publication is 10%, it means that 90% of publications in its reference set are less cited and 10% are cited at least as much. Now, say that the unit takes the mean of its n percentile ranks and reaches a value below 50%. Can the unit confidently conclude that it produces research that tends to be more cited than most publications? The article by Richard Williams and Lutz Bornmann (in press) proposes to answer this kind of question by relying on standard statistical procedures of significance testing and power analysis. I am deeply sympathetic to this proposal. I find, however, that their exposition is sometimes clouded in mystery, and I suspect that many readers will therefore remain unconvinced. In this comment, I endeavor to make a clearer case for their general strategy. It should not be mysterious why this strategy is sound. By clarifying the case, I show that some technical decisions of Williams and Bornmann are mistaken (choosing a two-tailed instead of a one-tailed test), lead to less efficient estimates (choosing the t-statistic instead of simply relying on the mean), or are not as prudent as they should be (presuming a particular standard deviation). Before making these technical points, I start with a more conceptual issue: in the next section, I dispel some confusion regarding the notion of randomness in the authors' presentation.
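To make the definition of the inverted percentile rank concrete, here is a minimal Python sketch. It is an illustration only, not the procedure of Williams and Bornmann. The function names are hypothetical. The one-tailed test rests on two assumptions that are not stated above: under the null hypothesis the percentile ranks are treated as uniform on [0, 100] (mean 50, standard deviation 100/√12), and the sample mean is treated as approximately normal.

```python
import math
from statistics import mean


def inverted_percentile_rank(citations, reference_counts):
    """Percentage of reference-set publications cited at least as often
    as a publication with `citations` citations."""
    at_least = sum(1 for c in reference_counts if c >= citations)
    return 100.0 * at_least / len(reference_counts)


def one_tailed_p_value(ranks, null_mean=50.0, null_sd=100.0 / math.sqrt(12)):
    """Lower-tail p-value for the mean percentile rank.

    Assumption (not from the source): ranks are uniform on [0, 100]
    under the null, and the sample mean is approximately normal.
    """
    n = len(ranks)
    z = (mean(ranks) - null_mean) / (null_sd / math.sqrt(n))
    # Standard normal CDF evaluated at z, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    # Toy reference set: ten publications with 0..9 citations.
    reference = list(range(10))
    # Only one of the ten has at least 9 citations, so the rank is 10%.
    print(inverted_percentile_rank(9, reference))
    print(one_tailed_p_value([10.0, 20.0, 30.0, 40.0]))
```

The test is one-tailed because the question is directional: the unit only asks whether its mean percentile rank falls below 50%.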