Grammatical annotation of historical Portuguese: Generating a corpus-based diachronic dictionary
Abstract: In this paper, we present an automatic system for the morphosyntactic annotation and lexicographical evaluation of historical Portuguese corpora. Using rule-based orthographical normalization, we were able to apply a standard parser (PALAVRAS) to historical data (Colonia corpus) and to achieve accurate annotation for both POS and syntax. By aligning original and standardized word forms, our method makes it possible to create tailor-made standardization dictionaries for historical Portuguese, optionally with frequencies by period or author.
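The normalize-then-align idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rewrite rules, the function names `normalize` and `build_dictionary`, the period labels, and the sample tokens are all assumptions made for illustration. Accents and the parsing step with PALAVRAS are left out entirely.

```python
import re
from collections import Counter, defaultdict

# Toy rewrite rules (hypothetical; NOT the rule set used in the paper).
# They mimic a few well-known pre-reform spellings: ph, th, ll, y.
RULES = [
    (re.compile(r"ph"), "f"),
    (re.compile(r"th"), "t"),
    (re.compile(r"ll"), "l"),
    (re.compile(r"y"), "i"),
]


def normalize(token):
    """Apply every rewrite rule in order to a lowercased token."""
    form = token.lower()
    for pattern, replacement in RULES:
        form = pattern.sub(replacement, form)
    return form


def build_dictionary(tokens_by_period):
    """Map standardized form -> original form -> Counter of periods.

    Only tokens whose spelling changes under normalization are recorded,
    so the result lists historical variants with per-period frequencies.
    """
    entries = defaultdict(lambda: defaultdict(Counter))
    for period, tokens in tokens_by_period.items():
        for token in tokens:
            original = token.lower()
            standard = normalize(original)
            if standard != original:
                entries[standard][original][period] += 1
    return entries


# Hypothetical sample data; period labels are illustrative only.
sample = {
    "16th century": ["rey", "elle", "rey"],
    "19th century": ["theatro", "elle"],
}
dictionary = build_dictionary(sample)
```

Keeping the original form aligned with its standardized counterpart is what lets the frequencies be broken down afterwards; an author key could be added to the counter in the same way as the period key.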
Citation: Bick E., Zampieri M. (2016) Grammatical Annotation of Historical Portuguese: Generating a Corpus-Based Diachronic Dictionary. In: Sojka P., Horák A., Kopeček I., Pala K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham
Type: Chapter in book
Description: Part of the Lecture Notes in Computer Science book series (LNCS, volume 9924)
The following licence applies to the copyright and re-use of this item:
- Creative Commons
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States