Grammatical annotation of historical Portuguese: Generating a corpus-based diachronic dictionary
Abstract
In this paper, we present an automatic system for the morphosyntactic annotation and lexicographical evaluation of historical Portuguese corpora. Using rule-based orthographical normalization, we were able to apply a standard parser (PALAVRAS) to historical data (Colonia corpus) and achieve accurate annotation for both POS and syntax. By aligning original and standardized word forms, our method makes it possible to create tailor-made standardization dictionaries for historical Portuguese, optionally with period or author frequencies.
Citation
Bick E., Zampieri M. (2016) Grammatical Annotation of Historical Portuguese: Generating a Corpus-Based Diachronic Dictionary. In: Sojka P., Horák A., Kopeček I., Pala K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham
Publisher
Springer
Additional Links
https://link.springer.com/chapter/10.1007%2F978-3-319-45510-5_1
Type
Chapter in book
Language
en
Description
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9924)
ISBN
9783319455099
The following licence applies to the copyright and re-use of this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States