CAG : stylometric authorship attribution of multi-author documents using a co-authorship graph
Issue Date
2020-01-17
Metadata
Abstract
Stylometry has been successfully applied to perform authorship identification of single-author documents (AISD). The AISD task is concerned with identifying the original author of an anonymous document from a group of candidate authors. However, AISD techniques are not applicable to the authorship identification of multi-author documents (AIMD). Unlike AISD, where each document is written by one single author, AIMD focuses on handling multi-author documents. Due to the combinatoric nature of documents, AIMD lacks the ground truth information - that is, information on writing and non-writing authors in a multi-author document - which makes this problem more challenging to solve. Previous AIMD solutions have a number of limitations: (i) the best stylometry-based AIMD solution has a low accuracy, less than 30%; (ii) increasing the number of co-authors of papers adversely affects the performance of AIMD solutions; and (iii) AIMD solutions were not designed to handle the non-writing authors (NWAs). However, NWAs exist in real-world cases - that is, there are papers for which not every co-author listed has contributed as a writer. This paper proposes an AIMD framework called the Co-Authorship Graph that can be used to (i) capture the stylistic information of each author in a corpus of multi-author documents and (ii) make a multi-label prediction for a multi-author query document. We conducted extensive experimental studies on one synthetic and three real-world corpora. Experimental results show that our proposed framework (i) significantly outperformed competitive techniques; (ii) can effectively handle a larger number of co-authors in comparison with competitive techniques; and (iii) can effectively handle NWAs in multi-author documents.
Citation
Sarwar, R., Urailertprasert, N., Vannaboot, N. et al. (2020) CAG : stylometric authorship attribution of multi-author documents using a co-authorship graph, IEEE Access, 8, pp. 18374-18393. doi: 10.1109/ACCESS.2020.2967449
Journal
IEEE Access
Additional Links
https://doi.org/10.1109/ACCESS.2020.2967449
Type
Journal article
Language
en
Description
© 2020 The Authors. Published by IEEE. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://ieeexplore.ieee.org/document/8962080
ISSN
2169-3536
EISSN
2169-3536
Sponsors
This work was supported in part by the Digital Economy Promotion Agency under Project MP-62-0003, and in part by the Thailand Research Fund and Office of the Higher Education Commission under Grant MRG6180266.
DOI
10.1109/ACCESS.2020.2967449
Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by/4.0/