dc.contributor.author: Alabbas, Waleed
dc.contributor.author: al-Khateeb, Haider
dc.contributor.author: Mansour, Ali
dc.contributor.author: Epiphaniou, Gregory
dc.contributor.author: Frommholz, Ingo
dc.date.accessioned: 2019-03-11T09:58:39Z
dc.date.available: 2019-03-11T09:58:39Z
dc.date.issued: 2017-10-09
dc.identifier.citation: W. Alabbas, H. M. al-Khateeb, A. Mansour, G. Epiphaniou, I. Frommholz, "Classification of Colloquial Arabic Tweets in real-time to detect high-risk floods", 2017 International Conference on Social Media, Wearable and Web Analytics (Social Media), June 19-20, 2017, IEEE, London, UK. doi: 10.1109/SOCIALMEDIA.2017.8057358 [en]
dc.identifier.isbn: 9781509050574
dc.identifier.doi: 10.1109/SOCIALMEDIA.2017.8057358
dc.identifier.uri: http://hdl.handle.net/2436/622179
dc.description.abstract: Twitter has eased real-time information flow for decision makers; it is also one of the key enablers for Open-source Intelligence (OSINT). Tweet mining has recently been used in the context of incident response to estimate the location and damage caused by hurricanes and earthquakes. We aim to research the detection of a specific type of high-risk natural disaster that frequently occurs and causes casualties in the Arabian Peninsula, namely 'floods'. We research how to achieve accurate classification suitable for short informal (colloquial) Arabic text (usually used on Twitter), which is highly inconsistent and has received very little attention in this field. First, we provide a thorough technical demonstration consisting of the following stages: data collection (Twitter REST API), labelling, text pre-processing, data division and representation, and training models. This has been deployed using 'R' in our experiment. We then evaluate classifiers' performance via four experiments conducted to measure the impact of different stemming techniques on the following classifiers: SVM, J48, C5.0, NNET, NB and k-NN. The dataset used consisted of 1434 tweets in total. Our findings show that Support Vector Machine (SVM) was prominent in terms of accuracy (F1=0.933). Furthermore, applying McNemar's test shows that using SVM without stemming on Colloquial Arabic is significantly better than using stemming techniques. [en]
dc.format: application/PDF [en]
dc.language.iso: en [en]
dc.publisher: IEEE [en]
dc.relation.url: https://ieeexplore.ieee.org/document/8057358 [en]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/ [*]
dc.subject: Arabic text classification [en]
dc.subject: big data [en]
dc.subject: colloquialism [en]
dc.subject: event detection [en]
dc.subject: Twitter [en]
dc.subject: real-time [en]
dc.subject: stemming [en]
dc.subject: SVM [en]
dc.title: Classification of colloquial Arabic tweets in real-time to detect high-risk floods [en]
dc.type: Conference contribution [en]
dc.identifier.journal: 2017 International Conference On Social Media, Wearable And Web Analytics (Social Media) [en]
dc.conference.name: 2017 International Conference On Social Media, Wearable And Web Analytics (Social Media)
pubs.place-of-publication: London, UK
pubs.start-date: 2017-06-19
pubs.start-date: 2017-06-20
refterms.dateFOA: 2019-03-11T09:58:39Z
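
The abstract reports that McNemar's test was used to compare SVM with and without stemming. As a rough illustration of how such a paired comparison can be computed, here is a minimal sketch in Python (the paper itself used R). The function name `mcnemar_test` and the toy labels are hypothetical and are not taken from the paper or its dataset. The sketch assumes the continuity-corrected chi-square form of the test with one degree of freedom.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers on one test set.

    Returns (a_only, b_only, statistic, p_value), where a_only counts items
    that only classifier A labelled correctly and b_only counts items that
    only classifier B labelled correctly.
    """
    a_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    b_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    discordant = a_only + b_only
    if discordant == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return a_only, b_only, 0.0, 1.0
    statistic = (abs(a_only - b_only) - 1) ** 2 / discordant
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return a_only, b_only, statistic, p_value


# Toy data (made up): 1 = flood-related tweet, 0 = unrelated tweet.
y_true = [1] * 20
pred_no_stem = [1] * 20              # hypothetical classifier A: all correct
pred_stem = [0] * 10 + [1] * 10      # hypothetical classifier B: 10 errors

a_only, b_only, statistic, p_value = mcnemar_test(y_true, pred_no_stem, pred_stem)
print(a_only, b_only, statistic, p_value)
```

Only the discordant pairs (tweets that exactly one of the two classifiers gets right) enter the statistic, which is why the test suits comparing two models evaluated on the same tweets. For small discordant counts an exact binomial version is usually preferred over this chi-square approximation.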


Files in this item

Name: Classification of Colloquial ...
Size: 1023 KB
Format: PDF

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States