• Combining quality estimation and automatic post-editing to enhance machine translation output

      Chatterjee, Rajen; Negri, Matteo; Turchi, Marco; Blain, Frédéric; Specia, Lucia (Association for Machine Translation in the Americas, 2018-03)
    • Combining text and images for film age appropriateness classification

      Ha, Le; Mohamed, Emad (Elsevier, 2021-07-14)
      We combine textual information from a corpus of film scripts with images of important scenes from IMDB that correspond to these films to create a bimodal dataset (the dataset and scripts can be obtained from https://tinyurl.com/se9tlmr) for film age appropriateness classification, with the objective of improving the prediction of age appropriateness for parents and children. We use state-of-the-art deep learning image feature extraction, including DenseNet, ResNet, Inception, and NASNet. We have tested several machine learning algorithms and found XGBoost to yield the best results. Previously reported classification accuracies, using only textual features, were 79.1% and 65.3% for the American MPAA and British BBFC classifications respectively. Using images alone, we achieve 64.8% and 56.7% classification accuracy. The most consistent combination of textual and image features achieves 81.1% and 66.8%, both statistically significant improvements over the use of text only.
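      A minimal sketch of the fusion approach described above: pre-extracted image features are concatenated with textual features and fed to an XGBoost classifier. The feature files and hyperparameters below are illustrative assumptions, not the authors' exact pipeline.

```python
# Illustrative sketch (not the authors' exact pipeline): early fusion of
# precomputed text and image features, classified with XGBoost.
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical precomputed features: e.g. TF-IDF vectors for scripts and
# pooled DenseNet/ResNet activations for scene images, one row per film.
text_feats = np.load("text_features.npy")    # shape: (n_films, d_text)
image_feats = np.load("image_features.npy")  # shape: (n_films, d_image)
labels = np.load("mpaa_labels.npy")          # integer age-rating classes

X = np.hstack([text_feats, image_feats])     # simple early fusion
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.2, random_state=42, stratify=labels)

clf = xgb.XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```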
    • Communication-based influence components model

      Cugelman, Brian; Thelwall, Mike; Dawes, Philip L. (New York: ACM, 2009)
      This paper discusses problems faced by planners of real-world online behavioural change interventions who must select behavioural change frameworks from a variety of competing theories and taxonomies. As a solution, this paper examines approaches that isolate the components of behavioural influence and shows how these components can be placed within an adapted communication framework to aid the design and analysis of online behavioural change interventions. Finally, using this framework, a summary of behavioural change factors is presented from an analysis of 32 online interventions.
    • Context based automatic spelling correction for Turkish

      Bolucu, Necva; Can, Burcu (IEEE, 2019-06-20)
      Spelling errors are among the crucial problems to be addressed in Natural Language Processing tasks. In this study, a context-based automatic spelling correction method for Turkish texts is presented. The method combines the Noisy Channel Model with Hidden Markov Models to correct a given word. It differs from previous studies in also considering the contextual information of the word within the sentence. The proposed method is designed to be integrated into other word-based spelling correction models.
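      A rough sketch of the noisy-channel idea with left-context scoring. Note that the paper combines the Noisy Channel Model with HMMs; the bigram language model below is a simplification, and all scores and tables are illustrative placeholders.

```python
# Simplified noisy-channel spelling correction with left-context scoring.
# The paper uses HMMs for context; a bigram language model stands in here.
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used as a crude stand-in for the channel model."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

def correct(word, prev_word, vocab, bigram_logprob):
    """Pick the candidate maximising channel score + context score."""
    best, best_score = word, float("-inf")
    for cand in vocab:
        if edit_distance(word, cand) > 2:                 # prune candidates
            continue
        channel = -edit_distance(word, cand)              # ~ log P(word|cand)
        context = bigram_logprob.get((prev_word, cand), -10.0)  # log P(cand|prev)
        if channel + context > best_score:
            best, best_score = cand, channel + context
    return best
```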
    • Continuous adaptation to user feedback for statistical machine translation

      Blain, Frédéric; Bougares, Fethi; Hazem, Amir; Barrault, Loïc; Schwenk, Holger (Association for Computational Linguistics, 2015-06-30)
      This paper gives detailed experimental feedback on different approaches to adapting a statistical machine translation system to a targeted translation project, using only small amounts of parallel in-domain data. The experiments were performed by professional translators under realistic working conditions using a computer-assisted translation tool. We analyze the influence of these adaptations on translator productivity and on the overall post-editing effort. We show that significant improvements can be obtained by using the presented adaptation techniques.
    • Cross-lingual Dependency Parsing of Related Languages with Rich Morphosyntactic Tagsets

      Agić, Željko; Tiedemann, Jörg; Merkler, Danijela; Krek, Simon; Dobrovoljc, Kaja; Moze, Sara (Association for Computational Linguistics, 2014)
      This paper addresses cross-lingual dependency parsing using rich morphosyntactic tagsets. In our case study, we experiment with three related Slavic languages: Croatian, Serbian and Slovene. Four different dependency treebanks are used for monolingual parsing, direct cross-lingual parsing, and a recently introduced cross-lingual parsing approach that utilizes statistical machine translation and annotation projection. We argue for the benefits of using rich morphosyntactic tagsets in cross-lingual parsing and empirically support the claim by showing large improvements over an impoverished common feature representation in the form of a reduced part-of-speech tagset. In the process, we improve over the previous state-of-the-art scores in dependency parsing for all three languages.
    • Cross-lingual transfer learning and multitask learning for capturing multiword expressions

      Taslimipoor, Shiva; Rohanian, Omid; Ha, Le An (Association for Computational Linguistics, 2019-08-31)
      Recent developments in deep learning have prompted a surge of interest in the application of multitask and transfer learning to NLP problems. In this study, we explore, for the first time, the application of transfer learning (TRL) and multitask learning (MTL) to the identification of Multiword Expressions (MWEs). For MTL, we exploit the shared syntactic information between MWE and dependency parsing models to jointly train a single model on both tasks. We specifically predict two types of labels: MWE and dependency parse. Our neural MTL architecture utilises the supervision of dependency parsing in lower layers and predicts MWE tags in upper layers. In the TRL scenario, we overcome the scarcity of data by learning a model on a larger MWE dataset and transferring the knowledge to a resource-poor setting in another language. In both scenarios, the resulting models achieved higher performance compared to standard neural approaches.
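      The architecture lends itself to a compact sketch: a shared encoder with a dependency-label head supervised at a lower layer and an MWE tagging head on top. The plain BiLSTM encoder and layer sizes below are assumptions; the paper's actual model differs in detail.

```python
# Hypothetical MTL tagger: dependency labels supervised on the lower
# BiLSTM layer, MWE tags predicted from the upper layer.
import torch.nn as nn

class MTLTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=200,
                 n_dep_labels=40, n_mwe_tags=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lower = nn.LSTM(emb_dim, hidden, batch_first=True,
                             bidirectional=True)
        self.upper = nn.LSTM(2 * hidden, hidden, batch_first=True,
                             bidirectional=True)
        self.dep_head = nn.Linear(2 * hidden, n_dep_labels)  # lower-layer task
        self.mwe_head = nn.Linear(2 * hidden, n_mwe_tags)    # upper-layer task

    def forward(self, tokens):
        x = self.embed(tokens)
        low, _ = self.lower(x)
        high, _ = self.upper(low)
        return self.dep_head(low), self.mwe_head(high)

# Joint training sums the two token-level cross-entropy losses:
#   loss = ce(dep_logits, dep_gold) + ce(mwe_logits, mwe_gold)
```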
    • deepQuest-py: large and distilled models for quality estimation

      Alva-Manchego, Fernando; Obamuyide, Abiola; Gajbhiye, Amit; Blain, Frederic; Fomicheva, Marina; Specia, Lucia; Adel, Heike; Shi, Shuming (Association for Computational Linguistics, 2021-11-01)
      We introduce deepQuest-py, a framework for training and evaluation of large and lightweight models for Quality Estimation (QE). deepQuest-py provides access to (1) state-of-the-art models based on pre-trained Transformers for sentence-level and word-level QE; (2) lightweight and efficient sentence-level models implemented via knowledge distillation; and (3) a web interface for testing models and visualising their predictions. deepQuest-py is available at https://github.com/sheffieldnlp/deepQuest-py under a CC BY-NC-SA licence.
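      As a hedged illustration of the distillation component, a small student model can be trained to regress on a large teacher's sentence-level quality scores. The model and MSE objective below are assumptions for the sketch, not deepQuest-py's actual API.

```python
# Hypothetical distillation step: the student regresses on teacher scores.
import torch
import torch.nn as nn

student = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(student.parameters(), lr=1e-4)
mse = nn.MSELoss()

def distill_step(sent_embeddings, teacher_scores):
    """One step: fit the student to the teacher's sentence-level QE scores."""
    opt.zero_grad()
    pred = student(sent_embeddings).squeeze(-1)
    loss = mse(pred, teacher_scores)
    loss.backward()
    opt.step()
    return loss.item()
```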
    • deepQuest: a framework for neural-based quality estimation

      Ive, Julia; Blain, Frederic; Specia, Lucia (Association for Computational Linguistics, 2018-08)
      Predicting Machine Translation (MT) quality can help in many practical tasks such as MT post-editing. The performance of Quality Estimation (QE) methods has drastically improved recently with the introduction of neural approaches to the problem. However, thus far neural approaches have only been designed for word- and sentence-level prediction. We present a neural framework that is able to accommodate neural QE approaches at these fine-grained levels and generalize them to the level of documents. We test the framework with two sentence-level neural QE approaches: a state-of-the-art approach that requires extensive pre-training, and a new lightweight approach that we propose, which employs basic encoders. Our approach is significantly faster and yields performance improvements for a range of document-level quality estimation tasks. To our knowledge, this is the first neural architecture for document-level QE. In addition, for the first time we apply QE models to the output of both statistical and neural MT systems for a series of European languages and highlight the new challenges resulting from the use of neural MT.
    • Do online resources give satisfactory answers to questions about meaning and phraseology?

      Hanks, Patrick; Franklin, Emma (Springer, 2019-09-18)
      In this paper we explore some aspects of the differences between printed paper dictionaries and online dictionaries in the ways in which they explain meaning and phraseology. After noting the importance of the lexicon as an inventory of linguistic items and the neglect in both linguistics and lexicography of phraseological aspects of that inventory, we investigate the treatment in online resources of phraseology – in particular, the phrasal verbs wipe out and put down – and we go on to investigate a word, dope, that has undergone some dramatic meaning changes during the 20th century. In the course of discussion, we mention the new availability of corpus evidence and the technique of Corpus Pattern Analysis, which is important for linking phraseology and meaning and distinguishing normal phraseology from rare and unusual phraseology. The online resources that we discuss include Google, the Urban Dictionary (UD), and Wiktionary.
    • Domain adaptation of Thai word segmentation models using stacked ensemble

      Limkonchotiwat, Peerat; Phatthiyaphaibun, Wannaphong; Sarwar, Raheem; Chuangsuwanich, Ekapol; Nutanong, Sarana (Association for Computational Linguistics, 2020-11-12)
      Like many Natural Language Processing tasks, Thai word segmentation is domain-dependent. Researchers have been relying on transfer learning to adapt an existing model to a new domain. However, this approach is inapplicable to cases where we can interact with only the input and output layers of the models, also known as “black boxes”. We propose a filter-and-refine solution based on the stacked-ensemble learning paradigm to address this black-box limitation. We conducted extensive experimental studies comparing our method against state-of-the-art models and transfer learning. Experimental results show that our proposed solution is an effective domain adaptation method and performs comparably to the transfer learning method.
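      A minimal sketch of the stacking idea for black-box adaptation: per-character boundary predictions from existing segmenters become features for a meta-classifier trained on a small amount of target-domain data. The base predictors and the logistic-regression meta-learner are illustrative choices, and the filter-and-refine step is omitted.

```python
# Illustrative stacking for black-box domain adaptation: base segmenters'
# per-character boundary votes feed a meta-classifier fit on target data.
import numpy as np
from sklearn.linear_model import LogisticRegression

def stack_predictions(base_models, text):
    """Each base model maps text -> one 0/1 boundary decision per character."""
    return np.column_stack([m(text) for m in base_models])

def fit_meta(base_models, texts, gold_boundaries):
    """Train the meta-learner on target-domain text with gold boundaries."""
    X = np.vstack([stack_predictions(base_models, t) for t in texts])
    y = np.concatenate(gold_boundaries)
    return LogisticRegression(max_iter=1000).fit(X, y)

def segment(meta, base_models, text):
    return meta.predict(stack_predictions(base_models, text))
```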
    • Effects of lexical properties on viewing time per word in autistic and neurotypical readers

      Štajner, Sanja; Yaneva, Victoria; Mitkov, Ruslan; Ponzetto, Simone Paolo (Association for Computational Linguistics, 2017-09-08)
      Eye tracking studies from the past few decades have shaped the way we think of word complexity and cognitive load: words that are long, rare and ambiguous are more difficult to read. However, online processing techniques have been scarcely applied to investigating the reading difficulties of people with autism and what vocabulary is challenging for them. We present parallel gaze data obtained from adult readers with autism and a control group of neurotypical readers and show that the former required higher cognitive effort to comprehend the texts as evidenced by three gaze-based measures. We divide all words into four classes based on their viewing times for both groups and investigate the relationship between longer viewing times and word length, word frequency, and four cognitively-based measures (word concreteness, familiarity, age of acquisition and imageability).
    • Enhancing unsupervised sentence similarity methods with deep contextualised word representations

      Ranasinghe, Tharindu; Orasan, Constantin; Mitkov, Ruslan (RANLP, 2019-09-02)
      Calculating Semantic Textual Similarity (STS) plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction. All modern state-of-the-art STS methods rely on word embeddings in one way or another. The recently introduced contextualised word embeddings have proved more effective than standard word embeddings in many natural language processing tasks. This paper evaluates the impact of several contextualised word embeddings on unsupervised STS methods and compares them with the existing supervised/unsupervised STS methods for different datasets in different languages and different domains.
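      A minimal sketch of one such unsupervised STS method: mean-pool token vectors from a pre-trained contextual model and compare sentences by cosine similarity. The model name is an illustrative choice, not necessarily one evaluated in the paper.

```python
# Unsupervised STS via mean-pooled contextual embeddings + cosine similarity.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "bert-base-multilingual-cased"   # illustrative model choice
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)

def embed(sentence: str) -> torch.Tensor:
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)             # mean pooling

def similarity(s1: str, s2: str) -> float:
    return torch.cosine_similarity(embed(s1), embed(s2), dim=0).item()
```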
    • Evaluation of a cross-lingual Romanian-English multi-document summariser

      Orăsan, C; Chiorean, OA (European Language Resources Association, 2008-01-01)
      The rapid growth of the Internet means that more information is available than ever before. Multilingual multi-document summarisation offers a way to access this information even when it is not in a language spoken by the reader, by extracting the gist from related documents and translating it automatically. This paper presents an experiment in which Maximal Marginal Relevance (MMR), a well-known multi-document summarisation method, is used to produce summaries from Romanian news articles. A task-based evaluation performed on both the original summaries and on their automatically translated versions reveals that they still contain a significant portion of the important information from the original texts. However, direct evaluation of the automatically translated summaries shows that they are not very legible, which can put off some readers who want to find out more about a topic.
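      For reference, MMR greedily selects sentences that are relevant to a query (or document centroid) while penalising redundancy with sentences already chosen. A compact sketch, assuming a vector-space similarity function `sim` is given:

```python
# Greedy MMR selection: relevance to the query minus redundancy with
# already-selected sentences, weighted by lambda.
def mmr_select(candidates, query_vec, sim, k=5, lam=0.7):
    """candidates: list of (sentence, vector) pairs; returns k sentences."""
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            _, vec = item
            relevance = sim(vec, query_vec)
            redundancy = max((sim(vec, v) for _, v in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best_idx = max(range(len(remaining)), key=lambda i: score(remaining[i]))
        selected.append(remaining.pop(best_idx))
    return [sent for sent, _ in selected]
```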
    • An exploratory analysis of multilingual word-level quality estimation with cross-lingual transformers

      Ranasinghe, Tharindu; Orasan, Constantin; Mitkov, Ruslan (Association for Computational Linguistics, 2021-08-31)
      Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to maintain several language-specific models. To overcome these problems, we explore different approaches to multilingual, word-level QE. We show that these QE models perform on par with the current language-specific models. In the cases of zero-shot and few-shot QE, we demonstrate that it is possible to accurately predict word-level quality for any given new language pair from models trained on other language pairs. Our findings suggest that the word-level QE models based on powerful pre-trained transformers that we propose in this paper generalise well across languages, making them more useful in real-world scenarios.
    • An exploratory study on multilingual quality estimation

      Sun, Shuo; Fomicheva, Marina; Blain, Frederic; Chaudhary, Vishrav; El-Kishky, Ahmed; Renduchintala, Adithya; Guzman, Francisco; Specia, Lucia (Association for Computational Linguistics, 2020-12-31)
      Predicting the quality of machine translation has traditionally been addressed with language-specific models, under the assumption that the quality label distribution or linguistic features exhibit traits that are not shared across languages. An obvious disadvantage of this approach is the need for labelled data for each given language pair. We challenge this assumption by exploring different approaches to multilingual Quality Estimation (QE), including using scores from translation models. We show that these outperform single-language models, particularly in less balanced quality label distributions and low-resource settings. In the extreme case of zero-shot QE, we show that it is possible to accurately predict quality for any given new language from models trained on other languages. Our findings indicate that state-of-the-art neural QE models based on powerful pre-trained representations generalise well across languages, making them more applicable in real-world settings.
    • Findings of the WMT 2018 shared task on quality estimation

      Specia, Lucia; Blain, Frederic; Logacheva, Varvara; Astudillo, Ramón; Martins, André (Association for Computational Linguistics, 2018-11)
      We report the results of the WMT18 shared task on Quality Estimation, i.e. the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence and document. This year we include four language pairs, three text domains, and translations produced by both statistical and neural machine translation systems. Participating teams from ten institutions submitted a variety of systems to different task variants and language pairs.
    • Findings of the WMT 2020 shared task on quality estimation

      Specia, Lucia; Blain, Frédéric; Fomicheva, Marina; Fonseca, Erick; Chaudhary, Vishrav; Guzmán, Francisco; Martins, André FT (Association for Computational Linguistics, 2020-11-30)
      We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels. This edition included new data with open domain texts, direct assessment annotations, and multiple language pairs: English-German, English-Chinese, Russian-English, Romanian-English, Estonian-English, Sinhala-English and Nepali-English data for the sentence-level subtasks, English-German and English-Chinese for the word-level subtask, and English-French data for the document-level subtask. In addition, we made neural machine translation models available to participants. 19 participating teams from 27 institutions submitted altogether 1374 systems to different task variants and language pairs.
    • Findings of the WMT 2021 shared task on quality estimation

      Specia, Lucia; Blain, Frederic; Fomicheva, Marina; Zerva, Chrysoula; Li, Zhenhao; Chaudhary, Vishrav; Martins, André (Association for Computational Linguistics, 2021-11-10)
      We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions: (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. In addition, new data was released for a number of languages, especially post-edited data. Participating teams from 19 institutions submitted altogether 1263 systems to different task variants and language pairs.
    • A first dataset for film age appropriateness investigation

      Mohamed, Emad; Ha, Le An (LREC, 2020-05-13)