TY - GEN
T1 - Vairākvārdu leksēmu klasificēšana elektroniskajā vārdnīcā „Tēzaurs”
AU - Nešpore-Bērzkalne, Gunta
AU - Lokmane, Ilze
N1 - Publisher Copyright: © 2025 University of Latvia. All rights reserved.
PY - 2025
Y1 - 2025
AB - „Tēzaurs” is the largest Latvian electronic dictionary, with more than 400,000 entries, including more than 71,000 multi-word expressions (MWEs) drawn from a wide range of sources; these expressions vary widely in form and content. In recent years, the MWEs have been progressively sorted into several groups: phrasemes, idioms, collocations, complex terms, taxonomic group names and multi-word nouns. This article describes the current results of this sorting and the challenges posed by borderline cases, overlapping categories and unclear criteria for MWE division. It concludes that additional criteria are needed to distinguish phrasemes from idioms and phrasemes from collocations. Dividing MWEs into smaller groups has several advantages. For dictionary users, the additional information has improved browsing of the dictionary contents. For the developers of „Tēzaurs”, the process has given more insight into the MWE data, enabling them to analyse an entire group of MWEs at once, prevent duplicates and discrepancies, amend the positioning of MWEs within dictionary entries, and improve the overall quality of the dictionary. For linguists, the newly assembled and structured language material allows more in-depth study of each MWE group and highlights new directions of research. The current system of MWE classification is a first step towards an organised system of MWE description and classification, and it will require further revision and improvement.
KW - collocations
KW - electronic dictionary „Tēzaurs”
KW - multi-word expressions
KW - multi-word terms
KW - phraseology
UR - https://dspace.lu.lv/items/623e8537-fb7b-4896-998e-e68150565522
UR - https://www.scopus.com/pages/publications/105031934728
U2 - 10.22364/vnf.16.10
DO - 10.22364/vnf.16.10
M3 - Conference paper
AN - SCOPUS:105031934728
VL - 16
T3 - Valoda: Nozime un Forma
SP - 119
EP - 132
BT - Valoda: nozime un forma 16. Gramatika un valodas elektroniskie resursi = Language: Meaning and Form 16. Grammar and Electronic Resources of Language
A2 - Kalnača, Andra
A2 - Lokmane, Ilze
A2 - Horiguchi, Daiki
PB - LU Akadēmiskais apgāds
CY - Rīga
T2 - VALODA: Nozime un Forma 16. Gramatika un Valodas Elektroniskie Resursi - LANGUAGE: Meaning and Form 16. Grammar and Electronic Resources of Language
Y2 - 26 December 2025 through 26 December 2025
ER -