Recent Submissions

  • A flexible framework for collocation retrieval and translation from parallel and comparable corpora

    Rivera, Oscar Mendoza; Mitkov, Ruslan; Corpas Pastor, Gloria (John Benjamins, 2018)
    This paper outlines a methodology and a system for collocation retrieval and translation from parallel and comparable corpora. The methodology was developed with translators and language learners in mind. It is based on a phraseology framework, applies statistical techniques, and employs source tools and online resources. The collocation retrieval and translation has proved successful for English and Spanish and can be easily adapted to other languages. The evaluation results are promising and future goals are proposed. Furthermore, conclusions are drawn on the nature of comparable corpora and how they can be better exploited to suit particular needs of target users.
  • Dissecting tweets in search of irony

    Rohanian, Omid; Taslimipoor, Shiva; Evans, Richard; Mitkov, Ruslan (Association for Computational Linguistics, 2018-06-05)
    This paper describes the systems submitted to SemEval 2018 Task 3 “Irony detection in English tweets” for both subtasks A and B. The first system, which leverages a combination of sentiment, distributional semantic, and text surface features, ranked third among 44 teams on the official leaderboard for subtask A. The second system, which uses a slightly different representation of the features, ranked ninth in subtask B. We present a method that entails decomposing tweets into separate parts. Searching for contrast within the constituents of a tweet is an integral part of our system. We adopt a broad definition of contrast, which leads to wide coverage in detecting ironic content.
  • Does female-authored research have more educational impact than male-authored research?

    Thelwall, Mike (Levy Library Press, 2018-10-04)
    Female academics are more likely to be in teaching-related roles in some countries, including the USA. As a side effect of this, female-authored journal articles may tend to be more useful for students. This study assesses this hypothesis by investigating whether female first-authored research has more uptake in education than male first-authored research. Based on an analysis of Mendeley readers of articles from 2014 in five countries and 100 narrow Scopus subject categories, the results show that female-authored articles attract more student readers than male-authored articles in Spain, Turkey, the UK and USA but not India. They also attract fewer professorial readers in Spain, the UK and the USA, but not India and Turkey, and tend to be less popular with senior academics. Because the results are based on analysis of differences within narrow fields they cannot be accounted for by females working in more education-related disciplines. The apparent additional educational impact for female-authored research could be due to selecting more accessible micro-specialisms, however, such as health-related instruments within the instrumentation narrow field. Whatever the cause, the results suggest that citation-based research evaluations may undervalue the wider impact of female researchers.
  • Semantic discrimination based on knowledge and association

    Taslimipoor, Shiva; Rohanian, Omid; Ha, Le An; Corpas Pastor, Gloria; Mitkov, Ruslan (Association for Computational Linguistics, 2018-06)
    This paper describes the system submitted to SemEval 2018 shared task 10 ‘Capturing Discriminative Attributes’. We use a combination of knowledge-based and co-occurrence features to capture the semantic difference between two words in relation to an attribute. We define scores based on association measures, ngram counts, word similarity, and ConceptNet relations. The system is ranked 4th (joint) on the official leaderboard of the task.
  • Identifying Signs of Syntactic Complexity for Rule-Based Sentence Simplification

    Evans, Richard; Orasan, Constantin (Cambridge University Press, 2018)
  • Can museums find male or female audiences online with YouTube?

    Thelwall, Michael (Emerald, 2018)
    Purpose: This article investigates if and why audience gender ratios vary between museum YouTube channels, including for museums of the same type. Design/methodology/approach: Gender ratios were examined for public comments on YouTube videos from 50 popular museums in English-speaking nations. Terms that were more frequently used by males or females in comments were also examined for gender differences. Findings: The ratio of female to male YouTube commenters varies almost a hundredfold between museums. Some of the difference could be explained by gendered interests in museum themes (e.g., military, art) but others were due to the topics chosen for online content and could address a gender minority audience. Practical implications: Museums can attract new audiences online with YouTube videos that target outside their expected demographics. Originality/value: This is the first analysis of YouTube audience gender for museums.
  • Aggressive language identification using word embeddings and sentiment features

    Orasan, Constantin (Association for Computational Linguistics, 2018-06-25)
    This paper describes our participation in the First Shared Task on Aggression Identification. The proposed method relies on machine learning to identify social media texts which contain aggression. The main features employed by our method are information extracted from word embeddings and the output of a sentiment analyser. Several machine learning methods and different combinations of features were tried. The official submissions used Support Vector Machines (SVMs) and Random Forests. The official evaluation showed that Random Forests work best for texts similar to those in the training dataset, whilst SVMs are a better choice for texts which are different. The evaluation also showed that, despite its simplicity, the method performs well when compared with more elaborate methods.
  • Do females create higher impact research? Scopus citations and Mendeley readers for articles from five countries

    Thelwall, Mike (Elsevier, 2018-09-01)
    There are known gender imbalances in participation in scientific fields, from female dominance of nursing to male dominance of mathematics. It is not clear whether there is also a citation imbalance, with some claiming that male-authored research tends to be more cited. No previous study has assessed gender differences in the readers of academic research on a large scale, however. In response, this article assesses whether there are gender differences in the average citations and/or Mendeley readers of academic publications. Field normalised logged Scopus citations and Mendeley readers from mid-2018 for articles published in 2014 were investigated for articles with first authors from India, Spain, Turkey, the UK and the USA in up to 251 fields with at least 50 male and female authors. Although female-authored research is less cited in Turkey (−4.0%) and India (−3.6%), it is marginally more cited in Spain (0.4%), the UK (0.4%), and the USA (0.2%). Female-authored research has fewer Mendeley readers in India (−1.1%) but more in Spain (1.4%), Turkey (1.1%), the UK (2.7%) and the USA (3.0%). Thus, whilst there may be little practical gender difference in citation impact in countries with mature science systems, the higher female readership impact suggests a wider audience for female-authored research. The results also show that the conclusions from a gender analysis depend on the field normalisation method. A theoretically informed decision must therefore be made about which normalisation to use. The results also suggest that arithmetic mean-based field normalisation is favourable to males.
  • Which US and European Higher Education Institutions are visible in ResearchGate and what affects their RG Score?

    Lepori, Benedetto; Thelwall, Michael; Hoorani, Bareerah Hafeez (Elsevier, 2018-07-19)
    While ResearchGate has become the most popular academic social networking site in terms of regular users, not all institutions have joined and the scores it assigns to academics and institutions are controversial. This paper assesses the presence in ResearchGate of higher education institutions in Europe and the US in 2017, and the extent to which institutional ResearchGate Scores reflect institutional academic impact. Most of the 2258 European and 4355 US higher educational institutions included in the sample had an institutional ResearchGate profile, with near universal coverage for PhD-awarding institutions found in the Web of Science (WoS). For non-PhD awarding institutions that did not publish, size (number of staff members) was most associated with presence in ResearchGate. For PhD-awarding institutions in WoS, presence in RG was strongly related to the number of WoS publications. In conclusion, a) institutional RG scores reflect research volume more than visibility and b) this indicator is highly correlated to the number of WoS publications. Hence, the value of RG Scores for institutional comparisons is limited.
  • Combining Multiple Corpora for Readability Assessment for People with Cognitive Disabilities

    Yaneva, Victoria; Orăsan, Constantin; Evans, Richard; Rohanian, Omid (Association for Computational Linguistics, 2017-09-08)
    Given the lack of large user-evaluated corpora in disability-related NLP research (e.g. text simplification or readability assessment for people with cognitive disabilities), the question of choosing suitable training data for NLP models is not straightforward. The use of large generic corpora may be problematic because such data may not reflect the needs of the target population. At the same time, the available user-evaluated corpora are not large enough to be used as training data. In this paper we explore a third approach, in which a large generic corpus is combined with a smaller population-specific corpus to train a classifier which is evaluated using two sets of unseen user-evaluated data. One of these sets, the ASD Comprehension corpus, is developed for the purposes of this study and made freely available. We explore the effects of the size and type of the training data used on the performance of the classifiers, and the effects of the type of the unseen test datasets on the classification performance.
  • 10 Marked differences in the pharmacokinetic and pharmacodynamic profiles of ticagrelor in patients undergoing treatment for ST elevation and non ST elevation myocardial infarction (STEMI and NSTEMI)

    Khan, Nazish; Amoah, Vincent; Cornes, Mike; Martins, Joe; Wrigley, Ben; Khogali, Saib; Nevill, Alan M.; Cotton, James (BMJ Publishing Group, 2018-06-01)
    Introduction Ticagrelor, an orally administered, direct acting, reversible P2Y12 receptor inhibitor, provides faster onset and greater levels of platelet inhibition when compared to clopidogrel. Current data indicates a reduced antiplatelet effect in STEMI. We sought to determine the early pharmacokinetic (PK) and pharmacodynamic (PD) effect of ticagrelor loading doses administered to patients undergoing PCI for STEMI and NSTEMI. Methods This is a single centre non-randomised study. P2Y12 naive patients presenting with STEMI or NSTEMI were considered for inclusion. All patients gave informed consent. Enrolled patients were administered a loading dose of aspirin 300 mg and ticagrelor 180 mg prior to PCI. Blood was sampled at 20 min, coronary balloon time, 1 hour and 4 hours after loading. PD results are expressed as P2Y12 reaction units (PRU) and were assessed using VerifyNow. A PRU>208 indicates a sub-optimal antiplatelet response. PK properties were assessed by measuring plasma concentration of ticagrelor parent compound (T-PC) and active metabolite (T-AM) using liquid chromatography in tandem with mass spectrometry. The lower limits of quantification of T-PC and its active metabolite, AR-C124910XX (T-AM) are 1 ng/ml and 2.5 ng/ml respectively. PRU and plasma concentrations over time were tested between the two groups using 2-way ANOVA. p<0.05 was considered significant. Results 30 patients (15 STEMI/15 NSTEMI) were recruited. Baseline characteristics are described in Table 1.
  • Bilingual contexts from comparable corpora to mine for translations of collocations

    Taslimipoor, Shiva (Springer, 2018-03-21)
    Due to the limited availability of parallel data in many languages, we propose a methodology that benefits from comparable corpora to find translation equivalents for collocations (as a specific type of difficult-to-translate multi-word expressions). Finding translations is known to be more difficult for collocations than for words. We propose a method based on bilingual context extraction and build a word (distributional) representation model drawing on these bilingual contexts (bilingual English-Spanish contexts in our case). We show that the bilingual context construction is effective for the task of translation equivalent learning and that our method outperforms a simplified distributional similarity baseline in finding translation equivalents.
  • A new waist-to-height ratio predicts abdominal adiposity in adults.

    Nevill, Alan M.; Stewart, Arthur D; Olds, Tim; Duncan, Michael J (Taylor & Francis, 2018-07-25)
    Our aim was to identify the best anthropometric index associated with waist adiposity. The six weight-status indices included body mass index (BMI), waist-to-hip ratio (WHR), waist-to-height ratio (WHTR), and a new waist-by-height
  • Academic information on Twitter: A user survey

    Mohammadi, Ehsan; Thelwall, Mike; Kwasny, Mary; Holmes, Kristi L. (PLOS, 2018-05-17)
    Although counts of tweets citing academic papers are used as an informal indicator of interest, little is known about who tweets academic papers and who uses Twitter to find scholarly information. Without knowing this, it is difficult to draw useful conclusions from a publication being frequently tweeted. This study surveyed 1,912 users who had tweeted journal articles to ask about their scholarly-related Twitter uses. Almost half of the respondents (45%) did not work in academia, despite the sample probably being biased towards academics. Twitter was used most by people with a social science or humanities background. People tend to leverage social ties on Twitter to find information rather than searching for relevant tweets. Twitter is used in academia to acquire and share real-time information and to develop connections with others. Motivations for using Twitter vary by discipline, occupation, and employment sector, but not much by gender. These factors also influence the sharing of different types of academic information. This study provides evidence that Twitter plays a significant role in the discovery of scholarly information and cross-disciplinary knowledge spreading. Most importantly, the large numbers of non-academic users support the claims of those using tweet counts as evidence for the non-academic impacts of scholarly research.
  • Assessing the teaching value of non-English academic books: The case of Spain

    Mas Bleda, Amalia; Thelwall, Mike (Consejo Superior de Investigaciones Científicas, 2018)
  • Leveraging large corpora for translation using the Sketch Engine

    Moze, Sarah; Krek, Simon (Cambridge University Press, 2018)
  • Co-saved, co-tweeted, and co-cited networks

    Didegah, Fereshteh (Danish Centre for Studies in Research & Research Policy, Department of Political Science & Government, Aarhus University, Aarhus, Denmark); Thelwall, Mike (Statistical Cybermetrics Research Group, University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1LY, UK) (Wiley-Blackwell, 2018-05-14)
    Counts of tweets and Mendeley user libraries have been proposed as altmetric alternatives to citation counts for the impact assessment of articles. Although both have been investigated to discover whether they correlate with article citations, it is not known whether users tend to tweet or save (in Mendeley) the same kinds of articles that they cite. In response, this article compares pairs of articles that are tweeted, saved to a Mendeley library, or cited by the same user, but possibly a different user for each source. The study analyzes 1,131,318 articles published in 2012, with minimum tweeted (10), saved to Mendeley (100), and cited (10) thresholds. The results show surprisingly minor overall overlaps between the three phenomena. The importance of journals for Twitter and the presence of many bots at different levels of activity suggest that this site has little value for impact altmetrics. The moderate differences between patterns of saving and citation suggest that Mendeley can be used for some types of impact assessments, but sensitivity is needed for underlying differences.
  • The effect of walking on risk factors for cardiovascular disease: an updated systematic review and meta-analysis of randomised control trials.

    Murtagh, Elaine M; Nichols, Linda; Mohammed, Mohammed A; Holder, Roger; Nevill, Alan M.; Murphy, Marie H (Elsevier, 2015-03)
    Objective To conduct a systematic review and meta-analysis of randomised control trials that examined the effect of walking on risk factors for cardiovascular disease. Methods Four electronic databases and reference lists were searched (Jan 1971–June 2012). Two authors identified randomised control trials of interventions ≥ 4 weeks in duration that included at least one group with walking as the only treatment and a no-exercise comparator group. Participants were inactive at baseline. Pooled results were reported as weighted mean treatment effects and 95% confidence intervals using a random effects model. Results 32 articles reported the effects of walking interventions on cardiovascular disease risk factors. Walking increased aerobic capacity (3.04 mL/kg/min, 95% CI 2.48 to 3.60) and reduced systolic (−3.58 mm Hg, 95% CI −5.19 to −1.97) and diastolic (−1.54 mm Hg, 95% CI −2.83 to −0.26) blood pressure, waist circumference (−1.51 cm, 95% CI −2.34 to −0.68), weight (−1.37 kg, 95% CI −1.75 to −1.00), percentage body fat (−1.22%, 95% CI −1.70 to −0.73) and body mass index (−0.53 kg/m², 95% CI −0.72 to −0.35) but failed to alter blood lipids. Conclusions Walking interventions improve many risk factors for cardiovascular disease. This underscores the central role of walking in physical activity for health promotion.
  • Faster, higher, stronger, older: Relative age effects are most influential during the youngest age grade of track and field athletics in the United Kingdom.

    Kearney, Philip E; Hayes, Philip R; Nevill, Alan M. (Taylor & Francis, 2018-03-07)
    The relative age effect (RAE) is a common phenomenon in youth sport, whereby children born early in the selection year are more likely to experience success and to sustain participation. There is a lack of research investigating variables which influence RAEs within track and field athletics. Such information is vital to guide policies in relation to competition structure, youth development squads and coach education. A database of competition results was analysed to determine the extent to which RAEs were present in track and field athletics in the United Kingdom. Subsequent analyses examined whether age, sex, event and skill level influenced the RAE. Examination of 77,571 records revealed that RAEs were widespread, but most pronounced during Under 13 (U13) competitions; that is, during athletes' first exposure to formal track and field competition. Sex, event and skill level further influenced the existence and magnitude of RAEs at different age grades. Relative age is a key influencing factor within track and field athletics, especially at the youngest age category. Consequently, national governing bodies need to consider what administrative and stakeholder initiatives are necessary to minimise the effects of RAEs on young athletes' early experiences of competition.
  • How Does a Photocatalytic Antimicrobial Coating Affect Environmental Bioburden in Hospitals?

    Reid, Matthew; Whatley, Vanessa; Spooner, Emma; Nevill, Alan M.; Cooper, Michael; Ramsden, Jeremy J; Dancer, Stephanie J (Cambridge University Press, 2018-04)
    BACKGROUND The healthcare environment is recognized as a source for healthcare-acquired infection. Because cleaning practices are often erratic and always intermittent, we hypothesize that continuously antimicrobial surfaces offer superior control of surface bioburden. OBJECTIVE To evaluate the impact of a photocatalytic antimicrobial coating at near-patient, high-touch sites in a hospital ward. SETTING The study took place in 2 acute-care wards in a large acute-care hospital. METHODS A titanium dioxide-based photocatalytic coating was sprayed onto 6 surfaces in a 4-bed bay in a ward and compared under normal illumination against the same surfaces in an untreated ward: right and left bed rails, bed control, bedside locker, overbed table, and bed footboard. Using standardized methods, the overall microbial burden and presence of an indicator pathogen (Staphylococcus aureus) were assessed biweekly for 12 weeks. RESULTS Treated surfaces demonstrated significantly lower microbial burden than control sites, and the difference increased between treated and untreated surfaces during the study. Hygiene failures (>2.5 colony-forming units [CFU]/cm2) increased 2.6% per day for control surfaces (odds ratio [OR], 1.026; 95% confidence interval [CI], 1.009-1.043; P=.003) but declined 2.5% per day for treated surfaces (OR, 0.95; 95% CI, 0.925-0.977; P<.001). We detected no significant difference between coated and control surfaces regarding S. aureus contamination. CONCLUSION Photocatalytic coatings reduced the bioburden of high-risk surfaces in the healthcare environment. Treated surfaces became steadily cleaner, while untreated surfaces accumulated bioburden. This evaluation encourages a larger-scale investigation to ascertain whether the observed environmental amelioration has an effect on healthcare-acquired infection. Infect Control Hosp Epidemiol 2018;39:398-404.
