Unsupervised learning of allomorphs in Turkish

Can, Burcu
Abstract
A single morpheme may have several surface forms, called allomorphs. In English, ed and d are surface forms of the past tense morpheme, and s, es, and ies are surface forms of the plural or present tense morpheme. Turkish has a large number of allomorphs because of its morphophonemic processes: one morpheme can have tens of different surface forms. This leads to a sparsity problem in natural language processing tasks for Turkish. The detection of allomorphs has received little study because of its difficulty. For example, tü and di are Turkish allomorphs of the past tense morpheme, yet they share no letters. This paper presents an unsupervised model to extract the allomorphs in Turkish. The model obtains an F-measure of 73.71% in the detection of allomorphs and outperforms previous unsupervised models on morpheme clustering.
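The difficulty described in the abstract can be illustrated with a small sketch. It is not the model from the paper; it only shows, under the assumption that one tried to group surface forms by plain letter overlap, why such a measure fails for a pair like tü and di. The function name `letter_overlap` is hypothetical and chosen for this illustration only.

```python
def letter_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the letter sets of two surface forms.

    Illustrative only: a naive string-similarity baseline, not the
    unsupervised model presented in the paper.
    """
    letters_a, letters_b = set(a), set(b)
    union = letters_a | letters_b
    if not union:
        # Two empty strings are treated as identical.
        return 1.0
    return len(letters_a & letters_b) / len(union)


# Surface-form pairs mentioned in the abstract.
pairs = [("ed", "d"), ("s", "es"), ("tü", "di")]

for first, second in pairs:
    score = letter_overlap(first, second)
    print(f"{first!r} vs {second!r}: overlap = {score:.2f}")
```

The English pairs share letters, so a letter-based measure gives them a nonzero score, while tü and di score zero even though both realize the same morpheme. This is why surface similarity alone is not enough to detect allomorphs.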
Citation
Can, B. (2017) Unsupervised learning of allomorphs in Turkish, Turkish Journal of Electrical Engineering & Computer Sciences, 25, pp. 3253–3260.
Type
Journal article
Language
en
Description
© 2017 The Author. Published by The Scientific and Technological Research Council of Turkey. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://journals.tubitak.gov.tr/elektrik/issues/elk-17-25-4/elk-25-4-57-1605-216.pdf
ISSN
1300-0632
EISSN
1303-6203