• Cross-lingual Dependency Parsing of Related Languages with Rich Morphosyntactic Tagsets

      Agić, Željko; Tiedemann, Jörg; Merkler, Danijela; Krek, Simon; Dobrovoljc, Kaja; Moze, Sara (Association for Computational Linguistics, 2014)
      This paper addresses cross-lingual dependency parsing using rich morphosyntactic tagsets. In our case study, we experiment with three related Slavic languages: Croatian, Serbian and Slovene. Four different dependency treebanks are used for monolingual parsing, direct cross-lingual parsing, and a recently introduced cross-lingual parsing approach that utilizes statistical machine translation and annotation projection. We argue for the benefits of using rich morphosyntactic tagsets in cross-lingual parsing and empirically support the claim by showing large improvements over an impoverished common feature representation in the form of a reduced part-of-speech tagset. In the process, we improve over the previous state-of-the-art scores in dependency parsing for all three languages.