• Adults with High-functioning Autism Process Web Pages With Similar Accuracy but Higher Cognitive Effort Compared to Controls

      Yaneva, Victoria; Ha, Le; Eraslan, Sukru; Yesilada, Yeliz (ACM, 2019-05-31)
      To accommodate the needs of web users with high-functioning autism, a designer's only option at present is to rely on guidelines that: i) have not been empirically evaluated and ii) do not account for the di erent levels of autism severity. Before designing effective interventions, we need to obtain an empirical understanding of the aspects that speci c user groups need support with. This has not yet been done for web users at the high ends of the autism spectrum, as often they appear to execute tasks effortlessly, without facing barriers related to their neurodiverse processing style. This paper investigates the accuracy and efficiency with which high-functioning web users with autism and a control group of neurotypical participants obtain information from web pages. Measures include answer correctness and a number of eye-tracking features. The results indicate similar levels of accuracy for the two groups at the expense of efficiency for the autism group, showing that the autism group invests more cognitive effort in order to achieve the same results as their neurotypical counterparts.
    • Autism and the web: using web-searching tasks to detect autism and improve web accessibility

      Yaneva, Victoria (Association for Computing Machinery (ACM), 2018-08-02)
      People with autism consistently exhibit different attention-shifting patterns compared to neurotypical people. Research has shown that these differences can be successfully captured using eye tracking. In this paper, we summarise our recent research on using gaze data from web-related tasks to address two problems: improving web accessibility for people with autism and detecting autism automatically. We first examine the way a group of participants with autism and a control group process the visual information from web pages and provide empirical evidence of different visual searching strategies. We then use these differences in visual attention, to train a machine learning classifier which can successfully use the gaze data to distinguish between the two groups with an accuracy of 0.75. At the end of this paper we review the way forward to improving web accessibility and automatic autism detection, as well as the practical implications and alternatives for using eye tracking in these research areas.
    • Autism detection based on eye movement sequences on the web: a scanpath trend analysis approach

      Eraslan, Sukru; Yesilada, Yeliz; Yaneva, Victoria; Harper, Simon; Duarte, Carlos; Drake, Ted; Hwang, Faustina; Lewis, Clayton (ACM, 2020-04-20)
      Autism diagnostic procedure is a subjective, challenging and expensive procedure and relies on behavioral, historical and parental report information. In our previous, we proposed a machine learning classifier to be used as a potential screening tool or used in conjunction with other diagnostic methods, thus aiding established diagnostic methods. The classifier uses eye movements of people on web pages but it only considers non-sequential data. It achieves the best accuracy by combining data from several web pages and it has varying levels of accuracy on different web pages. In this present paper, we investigate whether it is possible to detect autism based on eye-movement sequences and achieve stable accuracy across different web pages to be not dependent on specific web pages. We used Scanpath Trend Analysis (STA) which is designed for identifying a trending path of a group of users on a web page based on their eye movements. We first identify trending paths of people with autism and neurotypical people. To detect whether or not a person has autism, we calculate the similarity of his/her path to the trending paths of people with autism and neurotypical people. If the path is more similar to the trending path of neurotypical people, we classify the person as a neurotypical person. Otherwise, we classify her/him as a person with autism. We systematically evaluate our approach with an eye-tracking dataset of 15 verbal and highly-independent people with autism and 15 neurotypical people on six web pages. Our evaluation shows that the STA approach performs better on individual web pages and provides more stable accuracy across different pages.
    • Classifying referential and non-referential it using gaze

      Yaneva, Victoria; Ha, Le An; Evans, Richard; Mitkov, Ruslan (Association for Computational Linguistics (ACL), 2018-10-31)
      When processing a text, humans and machines must disambiguate between different uses of the pronoun it, including non-referential, nominal anaphoric or clause anaphoric ones. In this paper, we use eye-tracking data to learn how humans perform this disambiguation. We use this knowledge to improve the automatic classification of it. We show that by using gaze data and a POS-tagger we are able to significantly outperform a common baseline and classify between three categories of it with an accuracy comparable to that of linguisticbased approaches. In addition, the discriminatory power of specific gaze features informs the way humans process the pronoun, which, to the best of our knowledge, has not been explored using data from a natural reading task.
    • Combining Multiple Corpora for Readability Assessment for People with Cognitive Disabilities

      Yaneva, Victoria; Orăsan, Constantin; Evans, Richard; Rohanian, Omid (Association for Computational Linguistics, 2017-09-08)
      Given the lack of large user-evaluated corpora in disability-related NLP research (e.g. text simplification or readability assessment for people with cognitive disabilities), the question of choosing suitable training data for NLP models is not straightforward. The use of large generic corpora may be problematic because such data may not reflect the needs of the target population. At the same time, the available user-evaluated corpora are not large enough to be used as training data. In this paper we explore a third approach, in which a large generic corpus is combined with a smaller population-specific corpus to train a classifier which is evaluated using two sets of unseen user-evaluated data. One of these sets, the ASD Comprehension corpus, is developed for the purposes of this study and made freely available. We explore the effects of the size and type of the training data used on the performance of the classifiers, and the effects of the type of the unseen test datasets on the classification performance.
    • Detecting high-functioning autism in adults using eye tracking and machine learning

      Yaneva, Victoria; Ha, Le An; Eraslan, Sukru; Yesilada, Yeliz; Mitkov, Ruslan (Institute of Electrical and Electronics Engineers (IEEE), 2020-04-30)
      The purpose of this study is to test whether visual processing differences between adults with and without highfunctioning autism captured through eye tracking can be used to detect autism. We record the eye movements of adult participants with and without autism while they look for information within web pages. We then use the recorded eye-tracking data to train machine learning classifiers to detect the condition. The data was collected as part of two separate studies involving a total of 71 unique participants (31 with autism and 40 control), which enabled the evaluation of the approach on two separate groups of participants, using different stimuli and tasks. We explore the effects of a number of gaze-based and other variables, showing that autism can be detected automatically with around 74% accuracy. These results confirm that eye-tracking data can be used for the automatic detection of high-functioning autism in adults and that visual processing differences between the two groups exist when processing web pages.
    • Effects of lexical properties on viewing time per word in autistic and neurotypical readers

      Štajner, Sanja; Yaneva, Victoria; Mitkov, Ruslan; Ponzetto, Simone Paolo (Association of Computational Linguistics, 2017-09-08)
      Eye tracking studies from the past few decades have shaped the way we think of word complexity and cognitive load: words that are long, rare and ambiguous are more difficult to read. However, online processing techniques have been scarcely applied to investigating the reading difficulties of people with autism and what vocabulary is challenging for them. We present parallel gaze data obtained from adult readers with autism and a control group of neurotypical readers and show that the former required higher cognitive effort to comprehend the texts as evidenced by three gaze-based measures. We divide all words into four classes based on their viewing times for both groups and investigate the relationship between longer viewing times and word length, word frequency, and four cognitively-based measures (word concreteness, familiarity, age of acquisition and imagability).
    • “Keep it simple!”: an eye-tracking study for exploring complexity and distinguishability of web pages for people with autism

      Eraslan, Sukru; Yesilada, Yeliz; Yaneva, Victoria; Ha, Le An (Springer Science and Business Media LLC, 2020-02-03)
      A major limitation of the international well-known standard web accessibility guidelines for people with cognitive disabilities is that they have not been empirically evaluated by using relevant user groups. Instead, they aim to anticipate issues that may arise following the diagnostic criteria. In this paper, we address this problem by empirically evaluating two of the most popular guidelines related to the visual complexity of web pages and the distinguishability of web-page elements. We conducted a comparative eye-tracking study with 19 verbal and highly independent people with autism and 19 neurotypical people on eight web pages with varying levels of visual complexity and distinguishability, with synthesis and browsing tasks. Our results show that people with autism have a higher number of fixations and make more transitions with synthesis tasks. When we consider the number of elements which are not related to given tasks, our analysis shows that they look at more irrelevant elements while completing the synthesis task on visually complex pages or on pages whose elements are not easily distinguishable. To the best of our knowledge, this is the first empirical behavioural study which evaluates these guidelines by showing that the high visual complexity of pages or the low distinguishability of page elements causes non-equivalent experience for people with autism.
    • Predicting reading difficulty for readers with autism spectrum disorder

      Evans, Richard; Yaneva, Victoria; Temnikova, Irina (European Language Resources Association, 2016-05-23)
      People with autism experience various reading comprehension difficulties, which is one explanation for the early school dropout, reduced academic achievement and lower levels of employment in this population. To overcome this issue, content developers who want to make their textbooks, websites or social media accessible to people with autism (and thus for every other user) but who are not necessarily experts in autism, can benefit from tools which are easy to use, which can assess the accessibility of their content, and which are sensitive to the difficulties that autistic people might have when processing texts/websites. In this paper we present a preliminary machine learning readability model for English developed specifically for the needs of adults with autism. We evaluate the model on the ASD corpus, which has been developed specifically for this task and is, so far, the only corpus for which readability for people with autism has been evaluated. The results show that out model outperforms the baseline, which is the widely-used Flesch-Kincaid Grade Level formula.
    • Predicting the difficulty of multiple choice questions in a high-stakes medical exam

      Ha, Le; Yaneva, Victoria; Balwin, Peter; Mee, Janet (Association for Computational Linguistics, 2019-08-02)
      Predicting the construct-relevant difficulty of Multiple-Choice Questions (MCQs) has the potential to reduce cost while maintaining the quality of high-stakes exams. In this paper, we propose a method for estimating the difficulty of MCQs from a high-stakes medical exam, where all questions were deliberately written to a common reading level. To accomplish this, we extract a large number of linguistic features and embedding types, as well as features quantifying the difficulty of the items for an automatic question-answering system. The results show that the proposed approach outperforms various baselines with a statistically significant difference. Best results were achieved when using the full feature set, where embeddings had the highest predictive power, followed by linguistic features. An ablation study of the various types of linguistic features suggested that information from all levels of linguistic processing contributes to predicting item difficulty, with features related to semantic ambiguity and the psycholinguistic properties of words having a slightly higher importance. Owing to its generic nature, the presented approach has the potential to generalize over other exams containing MCQs.
    • Six good predictors of autistic reading comprehension

      Yaneva, Victoria; Evans, Richard (INCOMA Ltd, 2015-09-07)
      This paper presents our investigation of the ability of 33 readability indices to account for the reading comprehension difficulty posed by texts for people with autism. The evaluation by autistic readers of 16 text passages is described, a process which led to the production of the first text collection for which readability has been evaluated by people with autism. We present the findings of a study to determine which of the 33 indices can successfully discriminate between the difficulty levels of the text passages, as determined by our reading experiment involving autistic participants. The discriminatory power of the indices is further assessed through their application to the FIRST corpus which consists of 25 texts presented in their original form and in a manually simplified form (50 texts in total), produced specifically for readers with autism.
    • Using gaze data to predict multiword expressions

      Rohanian, Omid; Taslimipoor, Shiva; Yaneva, Victoria; Ha, Le An (INCOMA Ltd, 2017-09-01)
      In recent years gaze data has been increasingly used to improve and evaluate NLP models due to the fact that it carries information about the cognitive processing of linguistic phenomena. In this paper we conduct a preliminary study towards the automatic identification of multiword expressions based on gaze features from native and non-native speakers of English. We report comparisons between a part-ofspeech (POS) and frequency baseline to: i) a prediction model based solely on gaze data and ii) a combined model of gaze data, POS and frequency. In spite of the challenging nature of the task, best performance was achieved by the latter. Furthermore, we explore how the type of gaze data (from native versus non-native speakers) affects the prediction, showing that data from the two groups is discriminative to an equal degree. Finally, we show that late processing measures are more predictive than early ones, which is in line with previous research on idioms and other formulaic structures.
    • Using linguistic features to predict the response process complexity associated with answering clinical MCQs

      Yaneva, Victoria; Jurich, Daniel; Ha, Le An; Baldwin, Peter (Association for Computational Linguistics, 2021-04-30)
      This study examines the relationship between the linguistic characteristics of a test item and the complexity of the response process required to answer it correctly. Using data from a large-scale medical licensing exam, clustering methods identified items that were similar with respect to their relative difficulty and relative response-time intensiveness to create low response process complexity and high response process complexity item classes. Interpretable models were used to investigate the linguistic features that best differentiated between these classes from a descriptive and predictive framework. Results suggest that nuanced features such as the number of ambiguous medical terms help explain response process complexity beyond superficial item characteristics such as word count. Yet, although linguistic features carry signal relevant to response process complexity, the classification of individual items remains challenging.
    • Using natural language processing to predict item response times and improve test construction

      Baldwin, Peter; Yaneva, Victoria; Mee, Janet; Clauser, Brian E; Ha, Le An (Wiley, 2020-02-24)
      In this article, it is shown how item text can be represented by (a) 113 features quantifying the text's linguistic characteristics, (b) 16 measures of the extent to which an information‐retrieval‐based automatic question‐answering system finds an item challenging, and (c) through dense word representations (word embeddings). Using a random forests algorithm, these data then are used to train a prediction model for item response times and predicted response times then are used to assemble test forms. Using empirical data from the United States Medical Licensing Examination, we show that timing demands are more consistent across these specially assembled forms than across forms comprising randomly‐selected items. Because an exam's timing conditions affect examinee performance, this result has implications for exam fairness whenever examinees are compared with each other or against a common standard.
    • Web users with autism: eye tracking evidence for differences

      Eraslan, Sukru; Yaneva, Victoria; Yesilada, Yeliz; Harper, Simon (Taylor and Francis, 2018-12-11)
      Anecdotal evidence suggests that people with autism may have different processing strategies when accessing the web. However, limited empirical evidence is available to support this. This paper presents an eye tracking study with 18 participants with high-functioning autism and 18 neurotypical participants to investigate the similarities and differences between these two groups in terms of how they search for information within web pages. According to our analysis, people with autism are likely to be less successful in completing their searching tasks. They also have a tendency to look at more elements on web pages and make more transitions between the elements in comparison to neurotypical people. In addition, they tend to make shorter but more frequent fixations on elements which are not directly related to a given search task. Therefore, this paper presents the first empirical study to investigate how people with autism differ from neurotypical people when they search for information within web pages based on an in-depth statistical analysis of their gaze patterns.