Deep learning based semantic textual similarity for applications in translation technology
MetadataShow full item record
AbstractSemantic Textual Similarity (STS) measures the equivalence of meanings between two textual segments. It is a fundamental task for many natural language processing applications. In this study, we focus on employing STS in the context of translation technology. We start by developing models to estimate STS. We propose a new unsupervised vector aggregation-based STS method which relies on contextual word embeddings. We also propose a novel Siamese neural network based on efficient recurrent neural network units. We empirically evaluate various unsupervised and supervised STS methods, including these newly proposed methods in three different English STS datasets, two non- English datasets and a bio-medical STS dataset to list the best supervised and unsupervised STS methods. We then embed these STS methods in translation technology applications. Firstly we experiment with Translation Memory (TM) systems. We propose a novel TM matching and retrieval method based on STS methods that outperform current TM systems. We then utilise the developed STS architectures in translation Quality Estimation (QE). We show that the proposed methods are simple but outperform complex QE architectures and improve the state-of-theart results. The implementations of these methods have been released as open source.
PublisherUniversity of Wolverhampton
TypeThesis or dissertation
DescriptionA thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.
The following licence applies to the copyright and re-use of this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International