Can the quality of published academic journal articles be assessed with machine learning?
dc.contributor.author | Thelwall, Mike | |
dc.date.accessioned | 2022-02-14T09:16:57Z | |
dc.date.available | 2022-02-14T09:16:57Z | |
dc.date.issued | 2022-02-22 | |
dc.identifier.citation | Thelwall, M. (2022) Can the quality of published academic journal articles be assessed with machine learning? Quantitative Science Studies, 3 (1), pp. 208–226. | en |
dc.identifier.issn | 2641-3337 | en |
dc.identifier.doi | 10.1162/qss_a_00185 | |
dc.identifier.uri | http://hdl.handle.net/2436/624598 | |
dc.description | © 2022 The Author. Published by MIT Press. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1162/qss_a_00185 | en |
dc.description.abstract | Formal assessments of the quality of the research produced by departments and universities are now conducted by many countries to monitor achievements and allocate performance-related funding. These evaluations are hugely time consuming if conducted by postpublication peer review and are simplistic if based on citations or journal impact factors. This article investigates whether machine learning could help reduce the burden of peer review by using citations and metadata to learn how to score articles from a sample assessed by peer review. An experiment is used to underpin the discussion, attempting to predict journal citation thirds, as a proxy for article quality scores, for all Scopus narrow fields from 2014 to 2020. The results show that these proxy quality thirds can be predicted with above baseline accuracy in all 326 narrow fields, with Gradient Boosting Classifier, Random Forest Classifier, or Multinomial Naïve Bayes being the most accurate in nearly all cases. Nevertheless, the results partly leverage journal writing styles and topics, which are unwanted for some practical applications and cause substantial shifts in average scores between countries and between institutions within a country. There may be scope for predicting article scores when the predictions have the highest probability. | en |
dc.format | application/pdf | en |
dc.language.iso | en | en |
dc.publisher | MIT Press | en |
dc.relation.url | https://direct.mit.edu/qss/article/doi/10.1162/qss_a_00185/109627/Can-the-quality-of-published-academic-journal | en |
dc.subject | research evaluation | en |
dc.subject | machine learning | en |
dc.subject | citation analysis | en |
dc.subject | text mining | en |
dc.title | Can the quality of published academic journal articles be assessed with machine learning? | en |
dc.type | Journal article | en |
dc.identifier.journal | Quantitative Science Studies | en |
dc.date.updated | 2022-02-10T15:24:18Z | |
dc.date.accepted | 2022-02-08 | |
rioxxterms.funder | UK Research and Innovation | en |
rioxxterms.identifier.project | UOW14022022MT | en |
rioxxterms.version | AM | en |
rioxxterms.licenseref.uri | https://creativecommons.org/licenses/by/4.0/ | en |
rioxxterms.licenseref.startdate | 2022-02-14 | en |
dc.source.volume | 3 | |
dc.source.issue | 1 | |
dc.source.beginpage | 208 | |
dc.source.endpage | 226 | |
refterms.dateFCD | 2022-02-14T09:16:40Z | |
refterms.versionFCD | AM | |
refterms.dateFOA | 2022-02-14T00:00:00Z |