
dc.contributor.author: Oakes, Michael
dc.contributor.author: Wermter, Stefan
dc.contributor.author: Tripathi, Nandita
dc.date.accessioned: 2017-11-23T14:47:20Z
dc.date.available: 2017-11-23T14:47:20Z
dc.date.issued: 2015-12
dc.identifier.citation: Tripathi, N., Oakes, M.P., & Wermter, S. (2015). A Scalable Meta-Classifier Combining Search and Classification Techniques for Multi-Level Text Categorization. International Journal of Computational Intelligence and Applications, 14 (4), pp 1550020:1-1550020:16.
dc.identifier.issn: 1757-5885
dc.identifier.doi: 10.1142/S1469026815500200
dc.identifier.uri: http://hdl.handle.net/2436/620894
dc.description.abstract: Nowadays, documents are increasingly associated with multi-level category hierarchies rather than a flat category scheme. As the volume and diversity of documents grow, so do the size and complexity of the corresponding category hierarchies. To be able to access such hierarchically classified documents in real-time, we need fast automatic methods to navigate these hierarchies. Today’s data domains are also very different from each other, such as medicine and politics. These distinct domains can be handled by different classifiers. A document representation system which incorporates the inherent category structure of the data should also add useful semantic content to the data vectors and thus lead to better separability of classes. In this paper, we present a scalable meta-classifier to tackle today’s problem of multi-level data classification in the presence of large datasets. To speed up the classification process, we use a search-based method to detect the level-1 category of a test document. For this purpose, we use a category–hierarchy-based vector representation. We evaluate the meta-classifier by scaling to both longer documents as well as to a larger category set and show it to be robust in both cases. We test the architecture of our meta-classifier using six different base classifiers (Random forest, C4.5, multilayer perceptron, naïve Bayes, BayesNet (BN) and PART). We observe that even though there is a very small variation in the performance of different architectures, all of them perform much better than the corresponding single baseline classifiers. We conclude that there is substantial potential in this meta-classifier architecture, rather than the classifiers themselves, which successfully improves classification performance.
dc.format: application/pdf
dc.language.iso: en
dc.publisher: World Scientific Publishing Company
dc.relation.url: https://www.worldscientific.com/doi/abs/10.1142/S1469026815500200
dc.subject: Text
dc.subject: Classification
dc.title: A scalable meta-classifier for combining search and classification techniques for multi-level text categorization
dc.type: Journal article
dc.identifier.journal: International Journal of Computational Intelligence and Applications
dc.source.volume: 14
dc.source.issue: 4
dc.source.beginpage: 1550020:1
dc.source.endpage: 1550020:16
refterms.dateFOA: 2018-07-18T14:17:59Z


Files in this item

Name: Oakes_M_et_al_A scalable meta- ...
Size: 545.1 KB
Format: PDF

