
dc.contributor.author: Safder, Iqra
dc.contributor.author: Batool, Hafsa
dc.contributor.author: Sarwar, Raheem
dc.contributor.author: Zaman, Farooq
dc.contributor.author: Aljohani, Naif Radi
dc.contributor.author: Nawaz, Raheel
dc.contributor.author: Gaber, Mohamed
dc.contributor.author: Hassan, Saeed-Ul
dc.date.accessioned: 2021-11-05T14:15:09Z
dc.date.available: 2021-11-05T14:15:09Z
dc.date.issued: 2021-11-14
dc.identifier.citation: Safder, I., Batool, H., Sarwar, R., Zaman, F., Aljohani, N.R., Nawaz, R., Gaber, M. and Hassan, S. (2021) Parsing AUC Result-Figures in Machine Learning Specific Scholarly Documents for Semantically-enriched Summarization, Applied Artificial Intelligence, DOI: 10.1080/08839514.2021.2004347 [en]
dc.identifier.issn: 0883-9514 [en]
dc.identifier.doi: 10.1080/08839514.2021.2004347
dc.identifier.uri: http://hdl.handle.net/2436/624436
dc.description: © 2021 The Authors. Published by Taylor & Francis. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1080/08839514.2021.2004347 [en]
dc.description.abstract: Machine learning specific scholarly full-text documents contain a number of result-figures expressing valuable data, including experimental results, evaluations, and cross-model comparisons. Scholarly search systems often overlook this vital information while indexing important terms using conventional text-based content extraction approaches. In this paper, we propose creating semantically enriched document summaries by extracting meaningful data from result-figures specific to the area under the curve (AUC) evaluation metric, together with their associated captions, from full-text documents. First, we classify the extracted figures and analyze them by parsing the figure text, legends, and data plots, using a convolutional neural network classification model with a ResNet-50 pre-trained on 1.2 million images from ImageNet. Next, we extract information from the result-figures specific to AUC by approximating the region under the function's graph as a trapezoid and calculating its area, i.e., the trapezoidal rule. Using over 12,000 figures extracted from 1000 scholarly documents, we show that figure-specialized summaries contain more enriched terms about figure semantics. Furthermore, we empirically show that the trapezoidal rule can calculate the area under the curve by dividing the curve into multiple intervals. Finally, we measure the quality of the specialized summaries using the ROUGE, edit distance, and Jaccard similarity metrics. Overall, we observed that figure-specialized summaries are more comprehensive and semantically enriched. Potential applications of our research include improved document search, figure search, and figure-focused plagiarism detection. The data and code used in this paper can be accessed at the following URL: https://github.com/slab-itu/fig-ir/ [en]
dc.format: application/pdf [en]
dc.language.iso: en [en]
dc.publisher: Taylor & Francis [en]
dc.relation.url: https://www.tandfonline.com/doi/full/10.1080/08839514.2021.2004347?src=en
dc.subject: information retrieval [en]
dc.subject: scientific data management [en]
dc.subject: knowledge discovery [en]
dc.subject: figure parsing [en]
dc.subject: text summarisation [en]
dc.subject: full-text [en]
dc.title: Parsing AUC result-figures in machine learning specific scholarly documents for semantically-enriched summarization [en]
dc.type: Journal article [en]
dc.identifier.journal: Applied Artificial Intelligence [en]
dc.date.updated: 2021-11-05T06:13:27Z
dc.date.accepted: 2021-11-04
rioxxterms.funder: University of Wolverhampton
rioxxterms.identifier.project: UOW05112021RS [en]
rioxxterms.version: VoR [en]
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0/ [en]
rioxxterms.licenseref.startdate: 2022-11-14 [en]
refterms.dateFCD: 2021-11-05T14:14:32Z
refterms.versionFCD: VoR
refterms.dateFOA: 2021-11-28T04:18:53Z
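
The trapezoidal rule named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code (that is available at the GitHub URL given in the abstract). The function name `trapezoid_auc` and the sample points are hypothetical, and the sketch assumes the curve points have already been recovered from a figure as (x, y) pairs sorted by increasing x.

```python
from typing import Sequence


def trapezoid_auc(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Approximate the area under a piecewise-linear curve.

    Hypothetical helper: each pair of neighbouring points defines one
    trapezoid, and the areas of all trapezoids are summed.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(xs) < 2:
        raise ValueError("at least two points are required")

    area = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        if width < 0:
            raise ValueError("xs must be sorted in increasing order")
        # Area of one trapezoid: interval width times the mean of its two heights.
        area += width * (ys[i] + ys[i - 1]) / 2.0
    return area


# Illustrative points only (not data from the paper): a ROC-like curve
# given as false-positive rate (x) against true-positive rate (y).
fpr = [0.0, 0.5, 1.0]
tpr = [0.0, 1.0, 1.0]
auc = trapezoid_auc(fpr, tpr)
```

Dividing the x-axis into more intervals, as the abstract describes, simply means passing more points; the approximation error shrinks as the intervals get narrower.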


Files in this item

Name: 08839514.2021.pdf
Size: 7.879Mb
Format: PDF

This item appears in the following Collection(s)

Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by/4.0/