Cross-lingual Dependency Parsing of Related Languages with Rich Morphosyntactic Tagsets
Abstract
This paper addresses cross-lingual dependency parsing using rich morphosyntactic tagsets. In our case study, we experiment with three related Slavic languages: Croatian, Serbian and Slovene. Four different dependency treebanks are used for monolingual parsing, direct cross-lingual parsing, and a recently introduced cross-lingual parsing approach that utilizes statistical machine translation and annotation projection. We argue for the benefits of using rich morphosyntactic tagsets in cross-lingual parsing and empirically support the claim by showing large improvements over an impoverished common feature representation in the form of a reduced part-of-speech tagset. In the process, we improve over the previous state-of-the-art scores in dependency parsing for all three languages.
Citation
Language Technology for Closely Related Languages and Language Variants (LT4CloseLang), pages 13–24, October 29, 2014, Doha, Qatar
Journal
Proceedings of the EMNLP'2014 Workshop on Language Technology for Closely Related Languages and Language Variants
Additional Links
https://www.aclweb.org/anthology/W14-42
Type
Conference contribution
Language
en
ISBN
9781937284961
DOI
10.3115/v1/w14-4203
Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/