A scalable framework for stylometric analysis of multi-author documents
Authors
Sarwar, Raheem
Yu, Chenyun
Nutanong, Sarana
Urailertprasert, Norawit
Vannaboot, Nattapol
Rakthanmanon, Thanawin
Editors
Pei, Jian
Manolopoulos, Yannis
Sadiq, Shazia W
Li, Jianxin
Issue Date
2018-05-13
Abstract
Stylometry is a statistical technique for analyzing variations in authors' writing styles and is typically applied to authorship attribution problems. In this investigation, we apply stylometry to the task of authorship identification of multi-author documents (AIMD). We propose an AIMD technique called Co-Authorship Graph (CAG), which can be used to collaboratively attribute different portions of documents to different authors belonging to the same community. Based on CAG, we propose a novel AIMD solution which (i) significantly outperforms the existing state-of-the-art solution; (ii) can effectively handle a larger number of co-authors; and (iii) is capable of handling the case when some of the listed co-authors have not contributed to the document as writers. We conducted an extensive experimental study comparing the proposed solution with the best existing AIMD method on real and synthetic datasets. We show that the proposed solution significantly outperforms the existing state-of-the-art method.
Citation
Sarwar R., Yu C., Nutanong S., Urailertprasert N., Vannaboot N., Rakthanmanon T. (2018) A Scalable Framework for Stylometric Analysis of Multi-author Documents. In: Pei J., Manolopoulos Y., Sadiq S., Li J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10827. Springer, Cham. https://doi.org/10.1007/978-3-319-91452-7_52
Publisher
Springer
Journal
Database Systems for Advanced Applications - 23rd International Conference, DASFAA 2018, Gold Coast, QLD, Australia, May 21-24, 2018, Proceedings, Part I
Additional Links
https://link.springer.com/chapter/10.1007%2F978-3-319-91452-7_52
Type
Conference contribution
Language
en
Description
This is an accepted manuscript of a chapter published by Springer in Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10827 on 13/05/2018, available online: https://doi.org/10.1007/978-3-319-91452-7_52 The accepted version of the publication may differ from the final published version.
Series/Report no.
Lecture Notes in Computer Science
ISSN
0302-9743
ISBN
9783319914510
DOI
10.1007/978-3-319-91452-7_52
Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/