Parsing of Kumauni language sentences after modifying Earley's algorithm

Main Article Content

Rakesh Pandey
Hoshiyar S. Dhami
Kumauni is one of the regional languages of India, spoken in Kumaun, a region of the Himalayas. Since the language is relatively understudied, this study attempts to develop a parsing tool for use in Kumauni language studies. The eventual aim is to help develop a technique for checking the grammatical structure of Kumauni sentences. For this purpose, we took a set of pre-existing Kumauni sentences and derived rules of grammar from them. In selecting this set, we made an effort to choose sentences that are representative of the various possible part-of-speech tags of the language as currently used, so that the sentences together cover all possible tags. The derived rules of Kumauni grammar were converted to a mathematical model using Earley’s algorithm, suitably modified by us. The model was then tested and verified on a separate set of pre-existing Kumauni sentences. It can be used to parse new Kumauni sentences, thus providing researchers with a new parsing tool.
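As an illustration of the kind of parsing the abstract refers to, the following is a minimal sketch of the standard textbook Earley recognizer, not the authors' modified version, which the abstract does not describe. The toy grammar, the tag names (N, ADJ, V) and the function name `earley_recognize` are hypothetical and are not taken from the paper. The input is assumed to be a sequence of part-of-speech tags, and the grammar is assumed to have no empty productions.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """An Earley item: lhs -> rhs with a dot position and an origin index."""
    lhs: str
    rhs: tuple
    dot: int
    origin: int


# Hypothetical toy grammar over part-of-speech tags (illustration only,
# not the Kumauni rules derived in the paper). Keys are nonterminals;
# any symbol that is not a key is treated as a terminal tag.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("N",), ("ADJ", "N")],
    "VP": [("NP", "V"), ("V",)],
}


def earley_recognize(tags, grammar=GRAMMAR, start="S"):
    """Return True if the tag sequence is derivable from `start`."""
    n = len(tags)
    chart = [[] for _ in range(n + 1)]

    def add(k, item):
        if item not in chart[k]:
            chart[k].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):  # chart[k] may grow while we process it
            item = chart[k][i]
            i += 1
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for rhs in grammar[sym]:
                        add(k, Item(sym, rhs, 0, k))
                elif k < n and tags[k] == sym:
                    # Scanner: the next input tag matches the terminal.
                    add(k + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(k, parent._replace(dot=parent.dot + 1))

    return any(
        it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
        for it in chart[n]
    )
```

Under this toy grammar, `earley_recognize(["N", "N", "V"])` accepts a noun, noun, verb sequence, while `earley_recognize(["V", "N"])` rejects a verb-initial one. A full parser would additionally store back-pointers in each item so that parse trees can be recovered.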
Keywords
Kumauni Language, Context-free Grammar, Earley’s Algorithm, Natural Language Processing, Parsing

Article Details

How to cite
Pandey, Rakesh; and Dhami, Hoshiyar S. “Parsing of Kumauni language sentences after modifying Earley’s algorithm”. Dialectologia: revista electrònica, no. 7, pp. 75-92, https://raco.cat/index.php/Dialectologia/article/view/247909.