
Using gaze data to predict multiword expressions

Rohanian, Omid
Taslimipoor, Shiva
Yaneva, Victoria
Ha, Le An
Abstract
In recent years, gaze data has been increasingly used to improve and evaluate NLP models because it carries information about the cognitive processing of linguistic phenomena. In this paper we conduct a preliminary study towards the automatic identification of multiword expressions based on gaze features from native and non-native speakers of English. We compare a part-of-speech (POS) and frequency baseline with: i) a prediction model based solely on gaze data and ii) a combined model of gaze data, POS and frequency. In spite of the challenging nature of the task, the best performance was achieved by the latter. Furthermore, we explore how the type of gaze data (from native versus non-native speakers) affects the prediction, showing that data from the two groups is discriminative to an equal degree. Finally, we show that late processing measures are more predictive than early ones, which is in line with previous research on idioms and other formulaic structures.
Citation
Rohanian, O., Taslimipoor, S., Yaneva, V. and Ha, L. A. (2017) Using gaze data to predict multiword expressions, in Mitkov, R. and Angelova, G. (Eds.) Proceedings of the International Conference Recent Advances in Natural Language Processing. Stroudsburg, PA: The Association for Computational Linguistics, pp. 601-609.
Type
Conference contribution
Language
en
ISBN
9789544520489
Rights
Attribution-NonCommercial-NoDerivs 3.0 United States