dc.contributor.author: Üstün, Ahmet
dc.contributor.author: Kurfalı, Murathan
dc.contributor.author: Can, Burcu
dc.date.accessioned: 2020-09-03T10:18:12Z
dc.date.available: 2020-09-03T10:18:12Z
dc.date.issued: 2018
dc.identifier.citation: Üstün, A., Kurfalı, M. and Can, B. (2018) Characters or morphemes: how to represent words? In, Proceedings of The Third Workshop on Representation Learning for NLP, Augenstein, I., Cao, K., He, H., Hill, F. et al. Stroudsburg, PA: Association for Computational Linguistics, pp. 144-153. [en]
dc.identifier.isbn: 9781948087438 [en]
dc.identifier.doi: 10.18653/v1/w18-3019 [en]
dc.identifier.uri: http://hdl.handle.net/2436/623576
dc.description: © 2018 The Authors. Published by Association for Computational Linguistics. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: http://dx.doi.org/10.18653/v1/W18-3019 [en]
dc.description.abstract: In this paper, we investigate the effects of using subword information in representation learning. We argue that using syntactic subword units affects the quality of the word representations positively. We introduce a morpheme-based model and compare it against word-based, character-based, and character n-gram level models. Our model takes a list of candidate segmentations of a word and learns the representation of the word based on different segmentations that are weighted by an attention mechanism. We performed experiments on Turkish as a morphologically rich language and English with a comparatively poorer morphology. The results show that morpheme-based models are better at learning word representations of morphologically complex languages compared to character-based and character n-gram level models, since the morphemes help to incorporate more syntactic knowledge in learning, which makes morpheme-based models better at syntactic tasks. [en]
dc.description.sponsorship: This research was supported by TUBITAK (The Scientific and Technological Research Council of Turkey) grant number 115E464. [en]
dc.format: application/pdf [en]
dc.language.iso: en [en]
dc.publisher: Association for Computational Linguistics [en]
dc.relation.url: https://www.aclweb.org/anthology/W18-3019/ [en]
dc.title: Characters or morphemes: how to represent words? [en]
dc.type: Conference contribution [en]
dc.date.updated: 2020-08-26T08:20:54Z
dc.conference.name: Proceedings of The Third Workshop on Representation Learning for NLP
pubs.finish-date: 2018-07
pubs.start-date: 2018-07
rioxxterms.funder: TUBITAK [en]
rioxxterms.identifier.project: 115E464 [en]
rioxxterms.version: VoR [en]
rioxxterms.licenseref.uri: http://creativecommons.org/licenses/by/4.0/ [en]
rioxxterms.licenseref.startdate: 2020-09-03 [en]
dc.description.version: Published version
refterms.dateFCD: 2020-09-03T10:16:49Z
refterms.versionFCD: VoR
refterms.dateFOA: 2020-09-03T00:00:00Z


Files in this item

Name: Buglalilar_Characters_Or_Morph ...
Size: 692.6Kb
Format: PDF

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by/4.0/