• Early Mendeley readers correlate with later citation counts

      Thelwall, Mike (Springer, 2018-03-26)
      Counts of the number of readers registered in the social reference manager Mendeley have been proposed as an early impact indicator for journal articles. Although previous research has shown that Mendeley reader counts for articles tend to have a strong positive correlation with synchronous citation counts after a few years, no previous studies have compared early Mendeley reader counts with later citation counts. In response, this first diachronic analysis compares reader counts within a month of publication with citation counts after 20 months for ten fields. There were moderate or strong correlations in eight out of ten fields, with the two exceptions being the smallest categories (n=18, 36) with wide confidence intervals. The correlations are higher than the correlations between later citations and early citations, showing that Mendeley reader counts are more useful early impact indicators than citation counts.
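      The kind of comparison described above can be sketched as a rank correlation between early reader counts and later citation counts. This is a minimal illustration only: the abstract does not say which correlation coefficient was used, so Spearman's rho is an assumption here, and the counts below are hypothetical toy values rather than data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy counts for six articles (not taken from the paper).
early_readers = [0, 2, 5, 1, 9, 3]      # readers within a month of publication
later_citations = [1, 3, 4, 0, 12, 2]   # citations after 20 months
rho = spearman(early_readers, later_citations)
print(f"Spearman rho = {rho:.3f}")
```

      A rank-based coefficient is a common choice for citation-style counts because they are typically highly skewed.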
    • Effective websites for small and medium-sized enterprises

      Thelwall, Mike (MCB UP Ltd, 2000)
      In the UK, millions are now online and many are prepared to use the Internet to make and influence purchasing decisions. Businesses should, therefore, consider whether the Internet could provide them with a new marketing opportunity. Although increasing numbers of businesses now have a website, there seems to be a quality problem that is leading to missed opportunities, particularly for smaller enterprises. This belief is backed up by an automated survey of 3,802 predominantly small UK business sites, believed to be by far the largest of its kind to date. Analysis of the results reveals widespread problems in relation to search engines. Most Internet users find new sites through search engines, yet over half of the sites checked were not registered in the largest one, Yahoo!, and could therefore be missing a sizeable percentage of potential customers. The underlying problem with business sites is the lack of maturity of the medium as evidenced by the focus on technological issues amongst designers and the inevitable lack of Web-business experience of managers. Designers need to take seriously the usability of the site, its design and its ability to meet the business goals of the client. These issues are perhaps being taken up less than in the related discipline of software engineering, probably owing to the relative ease of website creation. Managers need to dictate the objectives of their site, but also, in the current climate, cannot rely even on professional website design companies and must be capable of evaluating the quality of their site themselves. Finally, educators need to ensure that these issues are emphasised to the next generation of designers and managers in order that the full potential of the Internet for business can be realised.
    • Effects of lexical properties on viewing time per word in autistic and neurotypical readers

      Štajner, Sanja; Yaneva, Victoria; Mitkov, Ruslan; Ponzetto, Simone Paolo (Association for Computational Linguistics, 2017-09-08)
      Eye tracking studies from the past few decades have shaped the way we think of word complexity and cognitive load: words that are long, rare and ambiguous are more difficult to read. However, online processing techniques have been scarcely applied to investigating the reading difficulties of people with autism and what vocabulary is challenging for them. We present parallel gaze data obtained from adult readers with autism and a control group of neurotypical readers and show that the former required higher cognitive effort to comprehend the texts, as evidenced by three gaze-based measures. We divide all words into four classes based on their viewing times for both groups and investigate the relationship between longer viewing times and word length, word frequency, and four cognitively-based measures (word concreteness, familiarity, age of acquisition and imageability).
    • El EEES y la competencia tecnológica: los nuevos grados en Traducción

      Corpas Pastor, Gloria; Muñoz, María (Universidad de Las Palmas de Gran Canaria, Servicio de Publicaciones y Difusión Científica, 2015-04-23)
      This paper takes as its starting point the research described in Muñoz Ramos (2012). We briefly summarise the origin and evolution of the European Higher Education Area (EHEA; EEES in Spanish) up to the present day and its impact on Translation studies. We describe how the founding principles of the Bologna Process are interwoven with Information and Communication Technologies (ICT), which are positioned as the ideal companions for achieving the objectives of the Bologna Declaration. Finally, we show how these two strands converge in the new Spanish degrees in Translation, which conform to the EHEA and find in translation technology subjects the cornerstone of their raison d'être.
    • Enhancing unsupervised sentence similarity methods with deep contextualised word representations

      Ranasinghe, Tharindu; Orasan, Constantin; Mitkov, Ruslan (RANLP, 2019-09-02)
      Calculating Semantic Textual Similarity (STS) plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction. All modern state-of-the-art STS methods rely on word embeddings in one way or another. The recently introduced contextualised word embeddings have proved more effective than standard word embeddings in many natural language processing tasks. This paper evaluates the impact of several contextualised word embeddings on unsupervised STS methods and compares the results with existing supervised and unsupervised STS methods across different datasets, languages and domains.
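      As a rough illustration of what an unsupervised, embedding-based STS method looks like, the sketch below averages word vectors for each sentence and scores the pair by cosine similarity. It is a generic baseline and not the method evaluated in the paper: the three-dimensional vectors in TOY_VECTORS are hypothetical, and a contextualised model would instead produce a different vector for each token depending on its sentence.

```python
import math

# Hypothetical 3-dimensional toy vectors. A real system would load pretrained
# embeddings (static or contextualised) instead of this hand-written table.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "sits": [0.1, 0.9, 0.0],
    "rests": [0.2, 0.8, 0.1],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8],
}


def sentence_vector(tokens, vectors):
    """Average the vectors of all tokens that have an entry in `vectors`."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token in the sentence has a vector")
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sts_score(sentence_a, sentence_b, vectors=TOY_VECTORS):
    """Unsupervised similarity: cosine of the two averaged sentence vectors."""
    vec_a = sentence_vector(sentence_a.lower().split(), vectors)
    vec_b = sentence_vector(sentence_b.lower().split(), vectors)
    return cosine(vec_a, vec_b)


similar = sts_score("cat sits", "kitten rests")
different = sts_score("cat sits", "stocks fell")
print(f"similar pair: {similar:.2f}, different pair: {different:.2f}")
```

      No labelled training pairs are needed for this kind of scoring, which is what makes it unsupervised.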
    • Evaluation of a cross-lingual Romanian-English multi-document summariser

      Orăsan, C; Chiorean, OA (European Language Resources Association, 2008-01-01)
      The rapid growth of the Internet means that more information is available than ever before. Multilingual multi-document summarisation offers a way to access this information even when it is not in a language spoken by the reader, by extracting the gist from related documents and translating it automatically. This paper presents an experiment in which Maximal Marginal Relevance (MMR), a well-known multi-document summarisation method, is used to produce summaries from Romanian news articles. A task-based evaluation performed on both the original summaries and their automatically translated versions reveals that they still contain a significant portion of the important information from the original texts. However, direct evaluation of the automatically translated summaries shows that they are not very legible, and this can put off some readers who want to find out more about a topic.
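      For readers unfamiliar with MMR, a minimal sketch of the selection step follows. It greedily picks the sentence that maximises lam * relevance - (1 - lam) * redundancy, where redundancy is the highest similarity to any sentence already chosen. The bag-of-words cosine similarity, the whitespace tokenisation, the example sentences and the names mmr_select and lam are assumptions made for illustration, not details of the system described in the paper.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts using naive whitespace tokenisation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count dictionaries."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def mmr_select(sentences, query, k, lam=0.7):
    """Greedily select up to k sentences by Maximal Marginal Relevance."""
    query_vec = bow(query)
    vecs = [bow(s) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], query_vec)
            # Redundancy is the closest match among already selected sentences.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]


# Hypothetical example sentences (not from the Romanian news corpus).
sentences = [
    "floods hit the city centre",
    "floods hit the city centre again",
    "the mayor promised flood relief funds",
]
summary = mmr_select(sentences, "floods city relief", k=2, lam=0.5)
print(summary)
```

      With lam=1.0 the procedure reduces to ranking by relevance alone, so near-duplicate sentences can both be selected; lowering lam trades relevance for diversity.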
    • Evidence for the existence of geographic trends in university web site interlinking

      Thelwall, Mike (MCB UP Ltd, 2002)
      The Web is an important medium for scholarly communication of various types, perhaps eventually to replace entirely some traditional mechanisms such as print journals. Yet hyperlinks, the Web analogue of citations, are much more varied in use, and existing citation techniques are difficult to generalise to the new medium. In this context, one new challenging object of study is the modern multi-faceted, multi-genre, partly unregulated university Web site. This paper develops a methodology to analyse the patterns of interlinking between university Web sites and uses it to indicate that the degree of interlinking decreases with distance, at least in the UK. This is perhaps not in itself a surprising result, despite claims of a paradigm shift from the traditional virtual college towards collaboratories, but the methodology developed can also be used to refine existing Web link metrics to produce more powerful tools for comparing groups of sites.
    • Exploiting Data-Driven Hybrid Approaches to Translation in the EXPERT Project

      Orăsan, Constantin; Escartín, Carla Parra; Torres, Lianet Sepúlveda; Barbu, Eduard; Ji, Meng; Oakes, Michael (Cambridge University Press, 2019-06-13)
      Technologies have transformed the way we work, and this is also applicable to the translation industry. In the past thirty to thirty-five years, professional translators have experienced an increased technification of their work. Barely thirty years ago, a professional translator would not have received a translation assignment attached to an e-mail or via FTP and yet, for the younger generation of professional translators, receiving an assignment by electronic means is the only reality they know. In addition, as pointed out in several works such as Folaron (2010) and Kenny (2011), professional translators now have a myriad of tools available to use in the translation process.
    • FGFR1 expression and role in migration in low and high grade pediatric gliomas

      Egbivwie, Naomi; Cockle, Julia V.; Humphries, Matthew; Ismail, Azzam; Esteves, Filomena; Taylor, Claire; Karakoula, Katherine; Morton, Ruth; Warr, Tracy; Short, Susan C.; et al. (Frontiers Media, 2019-03-13)
      The heterogeneous and invasive nature of pediatric gliomas poses significant treatment challenges, highlighting the importance of identifying novel chemotherapeutic targets. Recently, recurrent Fibroblast growth factor receptor 1 (FGFR1) mutations in pediatric gliomas have been reported. Here, we explored the clinical relevance of FGFR1 expression, cell migration in low and high grade pediatric gliomas and the role of FGFR1 in cell migration/invasion as a potential chemotherapeutic target. A high density tissue microarray (TMA) was used to investigate associations between FGFR1 and activated phosphorylated FGFR1 (pFGFR1) expression and various clinicopathologic parameters. Expression of FGFR1 and pFGFR1 was measured by immunofluorescence and by immunohistochemistry (IHC) in 3D spheroids in five rare patient-derived pediatric low-grade glioma (pLGG) and two established high-grade glioma (pHGG) cell lines. Two-dimensional (2D) and three-dimensional (3D) migration assays were performed for migration and inhibitor studies with three FGFR1 inhibitors. High FGFR1 expression was associated with age, malignancy, tumor location and tumor grade among astrocytomas. Membranous pFGFR1 was associated with malignancy and tumor grade. All glioma cell lines exhibited varying levels of FGFR1 and pFGFR1 expression and migratory phenotypes. There were significant anti-migratory effects on the pHGG cell lines with inhibitor treatment and anti-migratory or pro-migratory responses to FGFR1 inhibition in the pLGGs. Our findings support further research to target FGFR1 signaling in pediatric gliomas.
    • Figshare: A universal repository for academic resource sharing?

      Thelwall, Mike; Kousha, Kayvan (Emerald Group Publishing Limited, 2015-12-18)
      Purpose: A number of subject-orientated and general websites have emerged to host academic resources. It is important to evaluate the uptake of such services in order to decide which depositing strategies are effective and should be encouraged. Design/methodology/approach: This article evaluates the views and shares of resources in the generic repository Figshare by subject category and resource type. Findings: Figshare use and common resource types vary substantially by subject category but resources can be highly viewed even in subjects with few members. Subject areas with more resources deposited do not tend to have higher viewing or sharing statistics. Practical implications: Limited uptake of Figshare within a subject area should not be a barrier to its use. Several highly successful innovative uses for Figshare show that it can reach beyond a purely academic audience. Originality/value: This is the first analysis of the uptake and use of a generic academic resource sharing repository.
    • GCN-Sem at SemEval-2019 Task 1: Semantic Parsing using Graph Convolutional and Recurrent Neural Networks

      Taslimipoor, Shiva; Rohanian, Omid; Može, Sara (Association for Computational Linguistics, 2019-06-06)
      This paper describes the system submitted to the SemEval 2019 shared task 1 ‘Cross-lingual Semantic Parsing with UCCA’. We rely on the semantic dependency parse trees provided in the shared task, which are converted from the original UCCA files, and model the task as tagging. The aim is to predict the graph structure of the output along with the types of relations among the nodes. Our proposed neural architecture is composed of Graph Convolution and BiLSTM components. The layers of the system share their weights while predicting dependency links and semantic labels. The system is applied to the CoNLL-U format of the input data and is best suited for semantic dependency parsing.
    • Gender and image sharing on Facebook, Twitter, Instagram, Snapchat and WhatsApp in the UK: Hobbying alone or filtering for friends?

      Thelwall, Mike; Vis, Farida (Emerald, 2017-10-01)
      Purpose: Despite the ongoing shift from text-based to image-based communication in the social web, supported by the affordances of smartphones, little is known about the new image sharing practices. Both gender and platform type seem likely to be important, but it is unclear how. Design/methodology/approach: This article surveys an age-balanced sample of UK Facebook, Twitter, Instagram, Snapchat and WhatsApp image sharers with a range of exploratory questions about platform use, privacy, interactions, technology use and profile pictures. Findings: Females shared photos more often overall and shared images more frequently on Snapchat, but males shared more images on Twitter, particularly for hobbies. Females also tended to have more privacy-related concerns but were more willing, in principle, to share pictures of their children. Females also interacted more through others’ images by liking and commenting on them. Both genders used supporting apps but in different ways: females applied filters and posted to albums whereas males retouched photos and used photo organising apps. Finally, males were more likely to be alone in their profile pictures. Practical implications: Those designing visual social web communication strategies to reach out to users should consider the different ways in which platforms are used by males and females to optimise their message for their target audience. Social implications: There are clear gender and platform differences in visual communication strategies. Overall, males may tend to have more informational, and females more relationship-based, skills or needs. Originality/value: This is the first detailed survey of electronic image sharing practices and the first to systematically compare the current generation of platforms.
    • Gender and research Publishing in India: Uniformly high inequality?

      Thelwall, Mike; Bailey, Carol; Makita, Meiko; Sud, Pardeep; Madalli, Devika P. (Elsevier, 2018-12-10)
      Gender inequalities have been a persistent feature of all modern societies. Although employment-related gender discrimination in various forms is legally prohibited, prejudice and violence against females have not been eradicated. Moreover, gendered social expectations can constrain the career choices of both males and females. Within academia, continuing gender imbalances have been found in many countries (Larivière, Ni, Gingras, Cronin, & Sugimoto, 2013), and particularly at senior levels (e.g., Ucal, O'Neil, & Toktas, 2015; Weisshaar, 2017; Winchester & Browning, 2015). India was the fifth largest research producer in 2017, according to Scopus, but has the highest United Nations Development Programme (UNDP) gender inequality index of the 30 largest research producers in Scopus (hdr.undp.org/en/data) and so is an important case for global science. Moreover, the complex web of influences that have led to women being underrepresented in science in India is not well understood (Gupta, 2015). The absence of basic information about gender inequalities is a serious limitation because gender issues in India differ from the better researched case of the USA, due to economic conditions, probably stronger family influences (Vindhya, 2007), greater female safety concerns (Vindhya, 2007), and differing cultural expectations (Chandrakar, 2014).
    • Gender bias in machine learning for sentiment analysis

      Thelwall, Mike (Emerald, 2018-01-01)
      Purpose: This paper investigates whether machine learning induces gender biases in the sense of results that are more accurate for male authors than for female authors. It also investigates whether training separate male and female variants could improve the accuracy of machine learning for sentiment analysis. Design/methodology/approach: This article uses ratings-balanced sets of reviews of restaurants and hotels (3 sets) to train algorithms with and without gender selection. Findings: Accuracy is higher on female-authored reviews than on male-authored reviews for all data sets, so applications of sentiment analysis using mixed-gender datasets will overrepresent the opinions of women. Training on same-gender data improves performance less than having additional data from both genders. Practical implications: End users of sentiment analysis should be aware that its small gender biases can affect the conclusions drawn from it and apply correction factors when necessary. Users of systems that incorporate sentiment analysis should be aware that performance will vary by author gender. Developers do not need to create gender-specific algorithms unless they have more training data than their system can cope with. Originality/value: This is the first demonstration of gender bias in machine learning sentiment analysis.
    • Gender bias in sentiment analysis

      Thelwall, Mike (Emerald, 2018-02-14)
      Purpose: To test if there are biases in lexical sentiment analysis accuracy between reviews authored by males and females. Design: This paper uses datasets of TripAdvisor reviews of hotels and restaurants in the UK written by UK residents to contrast the accuracy of lexical sentiment analysis for males and females. Findings: Male sentiment is harder to detect because it is less explicit. There was no evidence that this problem could be solved by gender-specific lexical sentiment analysis. Research limitations: Only one lexical sentiment analysis algorithm was used. Practical implications: Care should be taken when drawing conclusions about gender differences from automatic sentiment analysis results. When comparing opinions for product aspects that appeal differently to men and women, female sentiments are likely to be overrepresented, biasing the results. Originality/value: This is the first evidence that lexical sentiment analysis is less able to detect the opinions of one gender than another.
    • Gender differences in research areas, methods and topics: Can people and thing orientations explain the results?

      Thelwall, Mike; Bailey, Carol; Tobin, Catherine; Bradshaw, Noel-Ann (Elsevier, 2018-12-26)
      Although the gender gap in academia has narrowed, females are underrepresented within some fields in the USA. Prior research suggests that the imbalances between science, technology, engineering and mathematics fields may be partly due to greater male interest in things and greater female interest in people, or to off-putting masculine cultures in some disciplines. To seek more detailed insights across all subjects, this article compares practising US male and female researchers between and within 285 narrow Scopus fields inside 26 broad fields from their first-authored articles published in 2017. The comparison is based on publishing fields and the words used in article titles, abstracts, and keywords. The results cannot be fully explained by the people/thing dimensions. Exceptions include greater female interest in veterinary science and cell biology and greater male interest in abstraction, patients, and power/control fields, such as politics and law. These may be due to other factors, such as the ability of a career to provide status or social impact or the availability of alternative careers. As a possible side effect of the partial people/thing relationship, females are more likely to use exploratory and qualitative methods and males are more likely to use quantitative methods. The results suggest that the necessary steps of eliminating explicit and implicit gender bias in academia are insufficient and might be complemented by measures to make fields more attractive to minority genders.
    • Goodreads Reviews to Assess the Wider Impacts of Books

      Kousha, Kayvan; Thelwall, Mike; Abdoli, Mahshid (John Wiley & Sons, 2017-07-17)
      Although peer-review and citation counts are commonly used to help assess the scholarly impact of published research, informal reader feedback might also be exploited to help assess the wider impacts of books, such as their educational or cultural value. The social website Goodreads seems to be a reasonable source for this purpose because it includes a large number of book reviews and ratings by many users inside and outside of academia. To check this, Goodreads book metrics were compared with different book-based impact indicators for 15,928 academic books across broad fields. Goodreads engagements were numerous enough in the Arts (85% of books had at least one), Humanities (80%) and Social Sciences (67%) for use as a source of impact evidence. Low and moderate correlations between Goodreads book metrics and scholarly or non-scholarly indicators suggest that reader feedback in Goodreads reflects the many purposes of books rather than a single type of impact. Although Goodreads book metrics can be manipulated they could be used guardedly by academics, authors, and publishers in evaluations.
    • Goodreads: A social network site for book readers

      Thelwall, Mike; Kousha, Kayvan (John Wiley & Sons, Inc., 2016-12-21)
      Goodreads is an Amazon‐owned book‐based social web site for members to share books, read, review books, rate books, and connect with other readers. Goodreads has tens of millions of book reviews, recommendations, and ratings that may help librarians and readers to select relevant books. This article describes a first investigation of the properties of Goodreads users, using a random sample of 50,000 members. The results suggest that about three quarters of members with a public profile are female, and that there is little difference between male and female users in patterns of behavior, except for females registering more books and rating them less positively. Goodreads librarians and super‐users engage extensively with most features of the site. The absence of strong correlations between book‐based and social usage statistics (e.g., numbers of friends, followers, books, reviews, and ratings) suggests that members choose their own individual balance of social and book activities and rarely ignore one at the expense of the other. Goodreads is therefore neither primarily a book‐based website nor primarily a social network site but is a genuine hybrid, social navigation site.
    • Google Scholar, Web of Science, and Scopus: a systematic comparison of citations in 252 subject categories

      Martín-Martín, Alberto; Orduna-Malea, Enrique; Thelwall, Mike; Delgado López-Cózar, Emilio (Elsevier, 2018-10-05)
      Despite citation counts from Google Scholar (GS), Web of Science (WoS), and Scopus being widely consulted by researchers and sometimes used in research evaluations, there is no recent or systematic evidence about the differences between them. In response, this paper investigates 2,448,055 citations to 2,299 English-language highly-cited documents from 252 GS subject categories published in 2006, comparing GS, the WoS Core Collection, and Scopus. GS consistently found the largest percentage of citations across all areas (93%–96%), far ahead of Scopus (35%–77%) and WoS (27%–73%). GS found nearly all the WoS (95%) and Scopus (92%) citations. Most citations found only by GS were from non-journal sources (48%–65%), including theses, books, conference papers, and unpublished materials. Many were non-English (19%–38%), and they tended to be much less cited than citing sources that were also in Scopus or WoS. Despite the many unique GS citing sources, Spearman correlations between citation counts in GS and WoS or Scopus are high (0.78–0.99). They are lower in the Humanities, and lower between GS and WoS than between GS and Scopus. The results suggest that in all areas GS citation data is essentially a superset of WoS and Scopus, with substantial extra coverage.
    • Grammatical annotation of historical Portuguese: Generating a corpus-based diachronic dictionary

      Bick, Eckhard; Zampieri, Marcos (Springer, 2016-09-03)
      In this paper, we present an automatic system for the morphosyntactic annotation and lexicographical evaluation of historical Portuguese corpora. Using rule-based orthographical normalization, we were able to apply a standard parser (PALAVRAS) to historical data (Colonia corpus) and to achieve accurate annotation for both POS and syntax. By aligning original and standardized word forms, our method allows the creation of tailor-made standardization dictionaries for historical Portuguese with optional period or author frequencies.