
Valodas korpusu izmantošana latviešu valodas uzdevumu automātiskā ģenerēšanā

  • Ilze Auziņa
  • Roberts Darģis
  • Inga Kaija
  • Kristīne Levāne-Petrova
  • Kristīne Pokratniece

Research output: Contribution to journal, scientific article (in a journal), peer-reviewed

Abstract

Today, language corpora are not only an empirical basis for research but can also be used to develop a variety of data-driven teaching materials and tools. The experience of other countries shows that the development of self-assessment exercises for language learning can be partially or fully automated using language corpora and natural language processing (NLP) tools, providing both a variety of exercises and support for teachers in implementing the curriculum. The Latvian Language Learners Corpus (LaVA), developed at the Institute of Mathematics and Computer Science, University of Latvia, includes more than 1000 texts written by foreign learners of Latvian who are studying at Latvian higher education institutions in their first or second semester and are reaching the A1 (possibly A2) level of Latvian language proficiency. The corpus contains more than 180 000 words. Exercises and tests are generated on the basis of the LaVA data analysis, including learner error analysis. This analysis makes it possible to identify problematic spelling, grammar, and vocabulary issues. The exercises are intended to help language learners strengthen their linguistic competence in Latvian, for example, the use of verb forms in the indicative mood, in both indefinite and perfect tense forms. The article discusses the methodology by which, based on statistical and quantitative analysis of the LaVA corpus data, sample sentences are selected from different corpora of Latvian, for example, the Balanced Corpus of Modern Latvian (LVK2018) and the Corpus of Students’ Essays (SPK). It also describes the task-development algorithms and the development of an online site for self-assessment exercises.

Translated title of the contribution: Use of the Language Corpora in Automatic Generation of Latvian Language Exercises
Original language: Latvian
Pages (from-to): 264-283
Number of pages: 20
Journal: Letonica
Volume: 2022
Issue number: 47
DOIs
Publication status: Published - 2022

Keywords

  • Computational linguistics
  • Exercises
  • Language corpora
  • Latvian language acquisition
  • Sentence selection
