
Stem-based PoS tagging for agglutinative languages

Bölücü, Necva
Can, Burcu
Issue Date
2017-06-29
Alternative
Sondan Eklemeli Dillerde Gövde Tabanlı Sözcük Türü İşaretleme
Abstract
In agglutinative languages, words are formed by gluing morphemes together. The resulting data sparsity makes part-of-speech (PoS) tagging difficult for these languages. In this paper, we present two Bayesian PoS tagging models based on Hidden Markov Models for agglutinative languages. The first model is word-based; the second is stem-based, with the stems obtained from two unsupervised stemmers: the HPS stemmer and Morfessor FlatCat. The results show that stemming improves PoS tagging accuracy. We report results for Turkish, as an agglutinative language, and for English, as a morphologically poor language.
Citation
Bölücü, N. and Can, B. (2017) Stem-based PoS tagging for agglutinative languages, 2017 25th Signal Processing and Communications Applications Conference (SIU), 15-18 May 2017, Antalya, Turkey.
Type
Conference contribution
Language
other
Description
This is an accepted manuscript of an article published by IEEE in 2017 25th Signal Processing and Communications Applications Conference (SIU) on 29/06/2017, available online: https://ieeexplore.ieee.org/document/7960386 The accepted version of the publication may differ from the final published version.
ISSN
2165-0608
ISBN
9781509064946