Language resources for Italian: Towards the development of a corpus of annotated Italian multiword expressions
Abstract
This paper describes the first resource annotated for multiword expressions (MWEs) in Italian. Two versions of this dataset have been prepared: the first with a fast markup list of out-of-context MWEs, and the second with an in-context annotation, where the MWEs are entered with their contexts. The paper also discusses annotation issues and reports the inter-annotator agreement for both types of annotations. Finally, the results of the first exploitation of the new resource, namely the automatic extraction of Italian MWEs, are presented.
Citation
Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016)
Publisher
ceur-ws
Additional Links
http://ceur-ws.org/Vol-1749/
Type
Conference contribution
Language
en
Description
Napoli, Italy, December 5-7, 2016
ISSN
1613-0073
The following licence applies to the copyright and re-use of this item:
- Creative Commons
Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc-nd/4.0/