Abstract
This paper presents a machine learning approach to the study of translationese. The goal is to train a computer system to distinguish between translated and non-translated text, in order to determine the characteristic features that influence the classifiers. Several algorithms reach a success rate of up to 97.62% on a technical dataset. Moreover, the SVM classifier consistently reports a statistically significant improvement in accuracy when simplification features are added to the basic translational classifier system. These findings may therefore be considered an argument for the existence of the Simplification Universal.
Citation
Ilisei I., Inkpen D., Corpas Pastor G., Mitkov R. (2010) Identification of Translationese: A Machine Learning Approach. In: Gelbukh A. (Ed.) Computational Linguistics and Intelligent Text Processing: 11th International Conference, CICLing 2010, Iasi, Romania, March 21-27, 2010, Proceedings. Berlin, Heidelberg: Springer Verlag, pp. 503-511.
Publisher
Springer
Additional Links
https://link.springer.com/chapter/10.1007%2F978-3-642-12116-6_43
Type
Conference contribution
Language
en
Series/Report no.
Lecture Notes in Computer Science, vol. 6008
ISSN
0302-9743
DOI
10.1007/978-3-642-12116-6_43
Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/