Bridging the “gApp”: improving neural machine translation systems for multiword expression detection

Hidalgo-Ternero, Carlos Manuel
Corpas Pastor, Gloria
Abstract
This research introduces gApp, a Python-based text preprocessing system that automatically identifies discontinuous multiword expressions (MWEs) and converts them into their continuous form in order to enhance neural machine translation (NMT). To this end, an experiment with semi-fixed verb–noun idiomatic combinations (VNICs) is carried out to evaluate to what extent gApp can optimise the performance of the two main free open-source NMT systems, Google Translate and DeepL, under the challenge of MWE discontinuity in the Spanish-to-English direction. In light of the promising results, the study concludes with suggestions on how to further optimise MWE-aware NMT systems.
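To make the preprocessing idea concrete, the following is a minimal sketch of what converting a discontinuous VNIC into its continuous form could look like. It is not gApp's actual implementation, which this record does not describe. The lexicon entry (tomar el pelo, "to pull someone's leg"), the names `VNIC_LEXICON`, `MAX_GAP` and `make_continuous`, and the regex-based approach are all assumptions chosen for illustration.

```python
import re

# Hypothetical mini-lexicon (an assumption, not taken from gApp):
# regex for inflected verb forms -> nominal part of the idiom.
VNIC_LEXICON = {
    r"tom\w+": "el pelo",  # tomar el pelo, "to pull someone's leg"
}

# Assumed upper bound on the number of words allowed between verb and noun.
MAX_GAP = 3


def make_continuous(sentence: str) -> str:
    """Move words that split a known VNIC to just after the idiom."""
    for verb_pattern, noun_part in VNIC_LEXICON.items():
        pattern = re.compile(
            rf"\b(?P<verb>{verb_pattern})\s+"
            rf"(?P<gap>(?:\w+\s+){{1,{MAX_GAP}}}?)"
            rf"(?P<noun>{re.escape(noun_part)})\b",
            flags=re.IGNORECASE,
        )
        # Rebuild the match as: verb + noun part + displaced words.
        sentence = pattern.sub(
            lambda m: f"{m['verb']} {m['noun']} {m['gap'].strip()}",
            sentence,
        )
    return sentence


print(make_continuous("Me tomó ayer el pelo."))  # Me tomó el pelo ayer.
```

In this toy setup the rewritten sentence would then be passed to the NMT system in place of the original. A pattern this simple also matches unrelated words that begin with the same letters (for example "tomate"), so a realistic system would need proper lemmatisation and a curated idiom inventory.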
Citation
Hidalgo-Ternero, C. and Corpas Pastor, G. (2020) Bridging the “gApp”: improving neural machine translation systems for multiword expression detection, Yearbook of Phraseology, 11(1), pp. 61–80. DOI: https://doi.org/10.1515/phras-2020-0005
Type
Journal article
Language
en
Description
This is the published version of an article published by De Gruyter in Yearbook of Phraseology on 25/11/2020, available online: https://doi.org/10.1515/phras-2020-0005
ISSN
1868-632X
EISSN
1868-6338