Are there too many uncited articles? Zero inflated variants of the discretised lognormal and hooked power law distributions
Authors
Thelwall, M
Issue Date
2016-05-06
Abstract
© 2016 Elsevier Ltd. Although statistical models fit many citation data sets reasonably well with the best fitting models being the hooked power law and discretised lognormal distribution, the fits are rarely close. One possible reason is that there might be more uncited articles than would be predicted by any model if some articles are inherently uncitable. Using data from 23 different Scopus categories, this article tests the assumption that removing a proportion of uncited articles from a citation dataset allows statistical distributions to have much closer fits. It also introduces two new models, zero inflated discretised lognormal distribution and the zero inflated hooked power law distribution and algorithms to fit them. In all 23 cases, the zero inflated version of the discretised lognormal distribution was an improvement on the standard version and in 16 out of 23 cases the zero inflated version of the hooked power law was an improvement on the standard version. Without zero inflation the discretised lognormal models fit the data better than the hooked power law distribution 6 out of 23 times and with it, the discretised lognormal models fit the data better than the hooked power law distribution 9 out of 23 times. Apparently uncitable articles seem to occur due to the presence of academic-related magazines in Scopus categories. In conclusion, future citation analysis and research indicators should take into account uncitable articles, and the best fitting distribution for sets of citation counts from a single subject and year is either the zero inflated discretised lognormal or zero inflated hooked power law.
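The zero inflation idea named in the abstract can be illustrated with a minimal Python sketch. This is not the article's fitting algorithm or its published software. It assumes the standard zero inflated mixture (a point mass at zero with weight `w`, standing in for the proportion of uncitable articles, mixed with a base distribution), a discretised lognormal evaluated at citations + 1, a hooked power law proportional to `1 / (b + x) ** alpha`, and truncation at a hypothetical `max_count` so the probabilities can be normalised. All function and parameter names are illustrative.

```python
import math


def discretised_lognormal_pmf(mu, sigma, max_count):
    """Lognormal density at x + 1 for x = 0..max_count, normalised to sum to 1.

    Assumption: the +1 offset lets zero citations be logged, and the
    support is truncated at max_count for normalisation.
    """
    weights = []
    for x in range(max_count + 1):
        y = x + 1
        density = math.exp(-((math.log(y) - mu) ** 2) / (2 * sigma ** 2))
        density /= y * sigma * math.sqrt(2 * math.pi)
        weights.append(density)
    total = sum(weights)
    return [wgt / total for wgt in weights]


def hooked_power_law_pmf(alpha, b, max_count):
    """Probabilities proportional to 1 / (b + x) ** alpha for x = 0..max_count."""
    weights = [(b + x) ** -alpha for x in range(max_count + 1)]
    total = sum(weights)
    return [wgt / total for wgt in weights]


def zero_inflate(pmf, w):
    """Mix a point mass at zero (weight w) with the base pmf (weight 1 - w)."""
    if not 0 <= w < 1:
        raise ValueError("w must satisfy 0 <= w < 1")
    inflated = [(1 - w) * p for p in pmf]
    inflated[0] += w  # extra probability for uncited (uncitable) articles
    return inflated


def log_likelihood(counts, pmf):
    """Log-likelihood of observed citation counts under a pmf."""
    return sum(math.log(pmf[c]) for c in counts)


# Illustrative (made-up) citation counts and parameter values.
counts = [0, 0, 0, 0, 1, 2, 2, 5, 9]
base = discretised_lognormal_pmf(mu=1.0, sigma=1.0, max_count=1000)
inflated = zero_inflate(base, w=0.2)
print(log_likelihood(counts, base), log_likelihood(counts, inflated))
```

Comparing the two log-likelihoods for fitted parameters is one way to judge whether zero inflation improves a fit. The parameter values here are arbitrary rather than fitted.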
Citation
Thelwall, M. (2016) Are there too many uncited articles? Zero inflated variants of the discretised lognormal and hooked power law distributions, Journal of Informetrics, 10(2), pp. 622-633.
Publisher
Elsevier
Journal
Journal of Informetrics
Type
Journal article
Language
en
Description
Thelwall, M. (in press) Journal of Informetrics. Software and data available here: https://dx.doi.org/10.6084/m9.figshare.3186997.v1
ISSN
1751-1577
EISSN
1875-5879
DOI
10.1016/j.joi.2016.04.014
Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/