dc.contributor.author: Nutanong, Sarana
dc.contributor.author: Yu, Chenyun
dc.contributor.author: Sarwar, Raheem
dc.contributor.author: Xu, Peter
dc.contributor.author: Chow, Dickson
dc.date.accessioned: 2020-10-12T09:38:13Z
dc.date.available: 2020-10-12T09:38:13Z
dc.date.issued: 2017-02-02
dc.identifier.citation: Nutanong, S., Yu, C., Sarwar, R., Xu, P. and Chow, D. (2016) A scalable framework for stylometric analysis query processing, 2016 IEEE 16th International Conference on Data Mining (ICDM). 10.1109/ICDM.2016.0147 [en]
dc.identifier.issn: 2374-8486 [en]
dc.identifier.doi: 10.1109/icdm.2016.0147 [en]
dc.identifier.uri: http://hdl.handle.net/2436/623704
dc.description: This is an accepted manuscript of an article published by IEEE in 2016 IEEE 16th International Conference on Data Mining (ICDM) on 02/02/2017, available online: https://ieeexplore.ieee.org/document/7837960 The accepted version of the publication may differ from the final published version. [en]
dc.description.abstract: Stylometry is the statistical analyses of variations in the author's literary style. The technique has been used in many linguistic analysis applications, such as, author profiling, authorship identification, and authorship verification. Over the past two decades, authorship identification has been extensively studied by researchers in the area of natural language processing. However, these studies are generally limited to (i) a small number of candidate authors, and (ii) documents with similar lengths. In this paper, we propose a novel solution by modeling authorship attribution as a set similarity problem to overcome the two stated limitations. We conducted extensive experimental studies on a real dataset collected from an online book archive, Project Gutenberg. Experimental results show that in comparison to existing stylometry studies, our proposed solution can handle a larger number of documents of different lengths written by a larger pool of candidate authors with a high accuracy. [en]
dc.format: application/pdf [en]
dc.language.iso: en [en]
dc.publisher: IEEE [en]
dc.relation.url: https://ieeexplore.ieee.org/document/7837960 [en]
dc.subject: stylometry [en]
dc.title: A scalable framework for stylometric analysis query processing [en]
dc.type: Conference contribution [en]
dc.identifier.journal: 2016 IEEE 16th International Conference on Data Mining (ICDM) [en]
dc.date.updated: 2020-10-07T19:39:16Z
dc.conference.name: 2016 IEEE 16th International Conference on Data Mining (ICDM)
pubs.finish-date: 2016-12-15
pubs.start-date: 2016-12-12
dc.date.accepted: 2016-10-09
rioxxterms.funder: City University of Hong Kong [en]
rioxxterms.identifier.project: 7200387 [en]
rioxxterms.identifier.project: 6000511 [en]
rioxxterms.version: AM [en]
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/ [en]
rioxxterms.licenseref.startdate: 2020-10-12 [en]
dc.description.version: Published version
refterms.dateFCD: 2020-10-12T09:35:14Z
refterms.versionFCD: AM
refterms.dateFOA: 2020-10-12T09:38:13Z


Files in this item

Name: Nutanong_et_al_A_scalable_fram ...
Size: 610.7Kb
Format: PDF

Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/