New Possibilities for Exploring Early Latvian Texts: Switching to the NoSketchEngine

  • University of Latvia

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

The variable writing system of early Latvian texts is a bottleneck for non-linguists wishing to explore SENIE, the Corpus of early written Latvian texts, and it also poses many challenges for linguists. The Unicode version of SENIE, launched on the NoSketchEngine platform (https://nosketch.korpuss.lv/#dashboard?corpname=senie_unicode) in 2022, offers significant new possibilities. Now that the historical spelling has been normalized, access to the Corpus is more user-friendly. Queries made in the Latvian National Corpus Collection (LNCC) (https://korpuss.lv/search) also return results from the early texts.

Original language: English
Pages (from-to): 548-559
Number of pages: 12
Journal: Baltic Journal of Modern Computing
Volume: 12
Issue number: 4
Publication status: Published - 2024

Keywords

  • diachronic corpora
  • digitization
  • historical writing
  • Latvian
  • normalization
  • NoSketchEngine
  • the Corpus of early written Latvian

OECD Field of Science

  • 6.2 Languages and Literature
