Phrase level segmentation and labelling of machine translation errors
Editors
Calzolari, Nicoletta (Conference Chair)
Choukri, Khalid
Declerck, Thierry
Grobelnik, Marko
Maegaard, Bente
Mariani, Joseph
Moreno, Asuncion
Odijk, Jan
Piperidis, Stelios
Issue Date
2016-05
Abstract
This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases. This new level of QE aims to provide a natural balance between QE at the word level and at the sentence level, which are respectively too fine-grained and too coarse for some applications. However, phrase-level QE poses an intrinsic challenge: how to segment a machine translation into sequences of words (contiguous or not) that each represent an error. We discuss three possible segmentation strategies to automatically extract erroneous phrases. We evaluate these strategies against phrase-level annotations produced by humans, using a new dataset collected for this purpose.
Citation
Blain, F., Logacheva, V. and Specia, L. (2016) Phrase level segmentation and labelling of machine translation errors. In: Calzolari, N., Choukri, K., Declerck, T., Goggi, S. et al. (eds.) Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). Stroudsburg, PA: European Language Resources Association (ELRA), pp. 2240-2245.
Additional Links
https://www.aclweb.org/anthology/L16-1356/
Type
Conference contribution
Language
en
Description
© 2016 The Authors. Published by European Language Resources Association (ELRA). This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://www.aclweb.org/anthology/L16-1356/
Sponsors
The authors would like to thank all the annotators who helped to create the first version of gold-standard annotations at phrase-level. This work was supported by the QT21 (H2020 No. 645452, Lucia Specia, Frédéric Blain) and EXPERT (EU FP7 Marie Curie ITN No. 317471, Varvara Logacheva) projects.
Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by/4.0/