
Review of non-English corpora annotated for emotion classification in text

  • Viktorija Ļeonova

Research output: Chapter in book/encyclopedia/conference proceedings › Conference paper › Research › peer-reviewed

6 Citations (Scopus)

Abstract

In this paper we systematize the information about the available corpora for emotion classification in text for languages other than English, with the goal of finding approaches that could be used for low-resource languages with close to no existing work in the field. For each corpus we analyze its volume, emotion classification schema, and language, as well as the methods employed for data preparation and annotation automation. We systematized twenty-four papers presenting such corpora and found that the corpora were mostly for the most widely spoken world languages: Hindi, Chinese, Turkish, Arabic, Japanese, etc. A typical corpus contained several thousand manually annotated entries collected from a social network, each annotated by three annotators, and was processed with a few machine learning methods, such as linear SVM and Naïve Bayes, and (in more recent works) a couple of neural network methods, such as CNN.

Original language: English
Title of host publication: Databases and Information Systems - 14th International Baltic Conference, DB and IS 2020, Proceedings
Editors: Tarmo Robal, Hele-Mai Haav, Jaan Penjam, Raimundas Matulevicius
Pages: 96-108
Volume: 1243 CCIS
Publication status: Published - 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1243 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937
