Bridging the “gApp”: improving neural machine translation systems for multiword expression detection
Hidalgo-Ternero, Carlos Manuel; Corpas Pastor, Gloria
Issue Date
2020-11-25
Abstract
The present research introduces the tool gApp, a Python-based text preprocessing system for the automatic identification and conversion of discontinuous multiword expressions (MWEs) into their continuous form in order to enhance neural machine translation (NMT). To this end, an experiment with semi-fixed verb–noun idiomatic combinations (VNICs) will be carried out in order to evaluate to what extent gApp can optimise the performance of the two main free open-source NMT systems —Google Translate and DeepL— under the challenge of MWE discontinuity in the Spanish into English directionality. In the light of our promising results, the study concludes with suggestions on how to further optimise MWE-aware NMT systems.
Citation
Hidalgo-Ternero, C. and Corpas Pastor, G. (2020) Bridging the “gApp”: improving neural machine translation systems for multiword expression detection, Yearbook of Phraseology, 11(1), pp. 61–80. DOI: https://doi.org/10.1515/phras-2020-0005
Publisher
De Gruyter
Journal
Yearbook of Phraseology
Type
Journal article
Language
en
Description
This is the published version of an article published by De Gruyter in Yearbook of Phraseology on 25/11/2020, available online: https://doi.org/10.1515/phras-2020-0005
ISSN
1868-632X
EISSN
1868-6338