Browsing Research Institute in Information and Language Processing by Subjects
Now showing items 1-1 of 1
Stem-based PoS tagging for agglutinative languagesWords are made up of morphemes being glued together in agglutinative languages. This makes it difficult to perform part-of-speech tagging for these languages due to sparsity. In this paper, we present two Hidden Markov Model based Bayesian PoS tagging models for agglutinative languages. Our first model is word-based and the second model is stem-based where the stems of the words are obtained from other two unsupervised stemmers: HPS stemmer and Morfessor FlatCat. The results show that stemming improves the accuracy in PoS tagging. We present the results for Turkish as an agglutinative language and English as a morphologically poor language.