
dc.contributor.author: Mohamed, Emad
dc.contributor.author: Mostafa, Safa
dc.date.accessioned: 2019-07-09T14:04:13Z
dc.date.available: 2019-07-09T14:04:13Z
dc.date.issued: 2019-07-03
dc.identifier.citation: Mohamed, E.; Mostafa, S.A. Computing Happiness from Textual Data. Stats 2019, 2, 347-370. (en)
dc.identifier.issn: 2571-905X (en)
dc.identifier.doi: 10.3390/stats2030025 (en)
dc.identifier.uri: http://hdl.handle.net/2436/622529
dc.description.abstract: In this paper, we use a corpus of about 100,000 happy moments written by people of different genders, marital statuses, parenthood statuses, and ages to explore the following questions: Are there differences between men and women, married and unmarried individuals, parents and non-parents, and people of different age groups in terms of their causes of happiness and how they express happiness? Can gender, marital status, parenthood status and/or age be predicted from textual data expressing happiness? The first question is tackled in two steps: first, we transform the happy moments into a set of topics, lemmas, part-of-speech sequences, and dependency relations; then, we use each set as predictors in multi-variable binary and multinomial logistic regressions to rank these predictors in terms of their influence on each outcome variable (gender, marital status, parenthood status and age). For the prediction task, we use character, lexical, grammatical, semantic, and syntactic features in a machine learning document classification approach. The classification algorithms used include logistic regression, gradient boosting, and fastText. Our results show that textual data expressing moments of happiness can be quite beneficial in understanding the “causes of happiness” for different social groups, and that social characteristics like gender, marital status, parenthood status, and, to some extent, age, can be successfully predicted from such textual data. This research aims to bring together elements from philosophy and psychology to be examined by computational corpus linguistics methods in a way that promotes the use of Natural Language Processing for the Humanities.
dc.format: application/pdf (en)
dc.language.iso: en (en)
dc.publisher: MDPI (en)
dc.relation.url: https://www.mdpi.com/2571-905X/2/3/25 (en)
dc.subject: fastText (en)
dc.subject: gradient boosting (en)
dc.subject: happiness (en)
dc.subject: lemmatization (en)
dc.subject: Lexical analysis (en)
dc.subject: Logistic regression (en)
dc.subject: topic modeling (en)
dc.title: Computing Happiness from Textual Data (en)
dc.type: Journal article (en)
dc.identifier.journal: Stats (en)
dc.date.updated: 2019-07-02T12:07:25Z
dc.date.accepted: 2019-07-01
rioxxterms.funder: Jisc (en)
rioxxterms.identifier.project: 090719EM (en)
rioxxterms.version: AM (en)
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0/ (en)
rioxxterms.licenseref.startdate: 2019-07-09 (en)
dc.source.volume: 2
dc.source.issue: 3
dc.source.beginpage: 347
dc.source.endpage: 370
refterms.dateFCD: 2019-07-09T14:03:50Z
refterms.versionFCD: AM
refterms.dateFOA: 2019-07-09T14:04:14Z


Files in this item

Name: Mohamed_Computing_happiness_20 ...
Size: 11.14Mb
Format: PDF



Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by/4.0/