Unsupervised morphological segmentation using neural word embeddings
Authors
Can, Burcu ; Üstün, Ahmet
Issue Date
2016-09-21
Abstract
We present a fully unsupervised method for morphological segmentation. Unlike many morphological segmentation systems, our method is based on semantic features rather than orthographic features. To capture word meanings, word embeddings are obtained from a two-level neural network [11]. We compute the semantic similarity between words from these embeddings, and this similarity forms the basis of our baseline segmentation model. We then model morphotactics with a bigram language model whose maximum likelihood estimates are computed from the initial segmentations produced by the baseline. Results show that using semantic features improves morphological segmentation, especially in agglutinating languages such as Turkish. Our method shows competitive performance compared to other unsupervised morphological segmentation systems.
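The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the 0.5 similarity threshold, the right-to-left suffix-stripping loop, and the names `segment` and `train_bigram` are all assumptions made for illustration. In practice the vectors would come from a trained embedding model.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; real ones would come from a trained model.
VECTORS = {
    "ev": [1.0, 0.1], "evler": [0.9, 0.2], "evlerde": [0.8, 0.3],
    "kitap": [0.1, 1.0], "kitaplar": [0.2, 0.9],
    "evet": [-1.0, 0.5],  # "yes": shares letters with "ev" but not meaning
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def segment(word, vectors, threshold=0.5):
    """Strip suffixes while the shorter form stays semantically similar.

    The threshold and the stripping order are assumptions of this sketch.
    """
    suffixes = []
    current = word
    while current in vectors:
        # Try the longest remaining prefix first (the shortest suffix).
        for cut in range(len(current) - 1, 0, -1):
            candidate = current[:cut]
            if (candidate in vectors
                    and cosine(vectors[current], vectors[candidate]) >= threshold):
                suffixes.insert(0, current[cut:])
                current = candidate
                break
        else:
            break  # no semantically similar shorter form was found
    return [current] + suffixes

def train_bigram(segmentations):
    """Maximum likelihood bigram model over morpheme sequences."""
    contexts, bigrams = Counter(), Counter()
    for morphemes in segmentations:
        seq = ["<s>"] + morphemes + ["</s>"]
        contexts.update(seq[:-1])
        bigrams.update(zip(seq, seq[1:]))

    def prob(prev, nxt):
        # P(nxt | prev) = count(prev, nxt) / count(prev)
        return bigrams[(prev, nxt)] / contexts[prev] if contexts[prev] else 0.0

    return prob

words = ["evlerde", "evler", "kitaplar", "evet"]
initial = [segment(w, VECTORS) for w in words]
prob = train_bigram(initial)
print(initial)
print(prob("ler", "de"))
```

In this toy setup "evet" is left unsegmented even though it begins with "ev", because the two vectors point in different directions. That is the kind of case where a semantic criterion and a purely orthographic one would disagree.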
Citation
Üstün A., Can B. (2016) Unsupervised Morphological Segmentation Using Neural Word Embeddings. In: Král P., Martín-Vide C. (eds) Statistical Language and Speech Processing. SLSP 2016. Lecture Notes in Computer Science, vol 9918. Springer, Cham. https://doi.org/10.1007/978-3-319-45925-7_4
Type
Conference contribution
Language
en
Description
This is an accepted manuscript of an article published by Springer in Král P., Martín-Vide C. (eds) Statistical Language and Speech Processing. SLSP 2016. Lecture Notes in Computer Science, vol 9918 on 21/09/2016, available online: https://doi.org/10.1007/978-3-319-45925-7_4
The accepted version of the publication may differ from the final published version.
Series/Report no.
Lecture Notes in Computer Science, vol 9918
ISSN
0302-9743
EISSN
1611-3349
ISBN
9783319459240