Show simple item record

dc.contributor.author: Thelwall, Mike
dc.date.accessioned: 2017-09-22T14:17:57Z
dc.date.available: 2017-09-22T14:17:57Z
dc.date.issued: 2018-01-01
dc.identifier.citation: Thelwall, M. (2018), "Gender bias in machine learning for sentiment analysis", Online Information Review, Vol. 42 No. 3, pp. 343-354. https://doi.org/10.1108/OIR-05-2017-0153
dc.identifier.issn: 1468-4527
dc.identifier.doi: 10.1108/OIR-05-2017-0153
dc.identifier.uri: http://hdl.handle.net/2436/620690
dc.description: This is an accepted manuscript of an article published by Emerald Publishing Limited in Online Information Review on 01/01/2018, available online at https://doi.org/10.1108/OIR-05-2017-0153. The accepted version of the publication may differ from the final published version.
dc.description.abstract: Purpose: This paper investigates whether machine learning induces gender biases in the sense of results that are more accurate for male authors than for female authors. It also investigates whether training separate male and female variants could improve the accuracy of machine learning for sentiment analysis. Design/methodology/approach: This article uses ratings-balanced sets of reviews of restaurants and hotels (3 sets) to train algorithms with and without gender selection. Findings: Accuracy is higher on female-authored reviews than on male-authored reviews for all data sets, so applications of sentiment analysis using mixed-gender datasets will over-represent the opinions of women. Training on same-gender data improves performance less than having additional data from both genders. Practical implications: End users of sentiment analysis should be aware that its small gender biases can affect the conclusions drawn from it, and apply correction factors when necessary. Users of systems that incorporate sentiment analysis should be aware that performance will vary by author gender. Developers do not need to create gender-specific algorithms unless they have more training data than their system can cope with. Originality/value: This is the first demonstration of gender bias in machine learning sentiment analysis.
dc.format: application/pdf
dc.language.iso: en
dc.publisher: Emerald Publishing Limited
dc.relation.url: https://www.emerald.com/insight/content/doi/10.1108/OIR-05-2017-0153/full/html
dc.subject: Sentiment analysis
dc.subject: opinion mining
dc.subject: social media
dc.subject: online customer relations management
dc.title: Gender bias in machine learning for sentiment analysis
dc.type: Journal article
dc.identifier.journal: Online Information Review
dc.date.accepted: 2017-09-21
rioxxterms.funder: The University of Wolverhampton
rioxxterms.identifier.project: UoW220917MT
rioxxterms.version: AM
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
rioxxterms.licenseref.startdate: 2018-06-01
dc.source.volume: 42
dc.source.issue: 3
dc.source.beginpage: 343
dc.source.endpage: 354
refterms.dateFCD: 2018-10-19T08:43:46Z
refterms.versionFCD: AM
refterms.dateFOA: 2018-06-01T00:00:00Z
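The design described in the abstract — training sentiment classifiers with and without gender selection and measuring accuracy separately on male- and female-authored reviews — can be illustrated with a minimal sketch. The snippet below is not the paper's code: it assumes a generic scikit-learn bag-of-words logistic-regression classifier as a stand-in for the paper's own algorithms, and the data variables (mixed_texts, female_texts, and so on) are hypothetical placeholders for ratings-balanced review sets.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def train_and_score(train_texts, train_labels, test_sets):
    """Fit a bag-of-words sentiment classifier on the training reviews,
    then report accuracy separately on each named test set."""
    vec = CountVectorizer()
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(train_texts), train_labels)
    return {
        name: accuracy_score(labels, clf.predict(vec.transform(texts)))
        for name, (texts, labels) in test_sets.items()
    }

# Without gender selection: train on a mixed-gender set, then compare
# accuracy on held-out female- vs. male-authored reviews (all variables
# are hypothetical placeholders).
# scores = train_and_score(
#     mixed_texts, mixed_labels,
#     {"female": (female_texts, female_labels),
#      "male": (male_texts, male_labels)},
# )
# With gender selection: train on female_texts only, test on the female
# set, and compare against scores["female"] above.

Under this setup, the paper's findings correspond to the mixed-gender model scoring higher on the female test set, and to extra mixed-gender training data helping more than restricting training to a single gender.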


Files in this item

Name: Gendered machine learning sentiment ...
Size: 796.1 KB
Format: PDF



Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/ (Creative Commons Attribution-NonCommercial-NoDerivatives 4.0).