A scalable framework for stylometric analysis of multi-author documents
Sarwar, Raheem ; Yu, Chenyun ; Nutanong, Sarana ; Urailertprasert, Norawit ; Vannaboot, Nattapol ; Rakthanmanon, Thanawin
Issue Date
2018-05-13
Abstract
Stylometry is a statistical technique for analyzing variations in authors’ writing styles and is typically applied to authorship attribution problems. In this investigation, we apply stylometry to the task of authorship identification of multi-author documents (AIMD). We propose an AIMD technique called Co-Authorship Graph (CAG), which can be used to collaboratively attribute different portions of documents to different authors belonging to the same community. Based on CAG, we propose a novel AIMD solution which (i) significantly outperforms the existing state-of-the-art solution; (ii) can effectively handle a larger number of co-authors; and (iii) is capable of handling the case in which some of the listed co-authors have not contributed to the document as writers. We conducted an extensive experimental study to compare the proposed solution with the best existing AIMD method using real and synthetic datasets. We show that the proposed solution significantly outperforms the existing state-of-the-art method.
Citation
Sarwar R., Yu C., Nutanong S., Urailertprasert N., Vannaboot N., Rakthanmanon T. (2018) A Scalable Framework for Stylometric Analysis of Multi-author Documents. In: Pei J., Manolopoulos Y., Sadiq S., Li J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10827. Springer, Cham. https://doi.org/10.1007/978-3-319-91452-7_52
Type
Conference contribution
Language
en
Description
This is an accepted manuscript of a chapter published by Springer in Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10827 on 13/05/2018, available online: https://doi.org/10.1007/978-3-319-91452-7_52
The accepted version of the publication may differ from the final published version.
Series/Report no.
Lecture Notes in Computer Science
ISSN
0302-9743
ISBN
9783319914510