Lemmatization and grammatical annotation of the Corpus Histórico Judeoespañol (CORHIJE): problems, solutions, and resolutions

Aitor García Moreno
Francisco Javier Pueyo Mena
After a brief review of the most salient features of the Corpus Histórico Judeoespañol (CORHIJE), which was already presented at the 3rd edition of the Congreso de Corpus Diacrónicos en lenguas Iberorrománicas (CODILI, Zurich 2014), this paper describes the ongoing process of lemmatizing and grammatically annotating the corpus. We focus on the challenges encountered during annotation and the solutions applied to them, which in some cases have led us to adopt relatively arbitrary resolutions in keeping with our descriptive and analytical goals: the problems, solutions, and resolutions that expand on the title of our presentation.
Keywords
Linguistic Corpora, Digital Corpus Design, Judeo-Spanish, Diachrony

How to Cite
García Moreno, Aitor, and Pueyo Mena, Francisco Javier. “Lemmatization and grammatical annotation of the Corpus Histórico Judeoespañol (CORHIJE): problems, solutions, and resolutions”. Scriptum digital. Revista de corpus diacrònics i edició digital en Llengües iberoromàniques, no. 6, pp. 69-82, https://raco.cat/index.php/scriptumdigital/article/view/329260.