ENGLISH-LATVIAN SMT: THE CHALLENGE OF TRANSLATING INTO A FREE WORD ORDER LANGUAGE

  • Maxim Khalilov*
  • José A.R. Fonollosa
  • Inguna Skadiņa
  • Edgars Brālītis
  • Lauma Pretkalniņa

*Corresponding author for this work

  • University of Amsterdam
  • Polytechnic University of Catalonia
  • University of Latvia

Research output: Contribution to conference, Paper, peer-review

2 Citations (Scopus)

Abstract

This paper presents a comparative study of two approaches to statistical machine translation (SMT) and their application to English-to-Latvian translation, which remains an open research line in automatic translation. We consider a state-of-the-art phrase-based SMT system and an alternative N-gram-based SMT system. The major differences between the two approaches lie in their distinct representations of bilingual units, the components of the bilingual model that drives the translation process, and in their statistical modeling of the translation context. Because Latvian is a rather free word order language, it poses additional difficulties for translation. We contrast different reordering models and investigate how well they deal with the word ordering issue. Moving beyond the automatic translation quality scores classically presented in MT research papers, we present a manual error analysis of the systems' output that sheds light on the advantages and disadvantages of the SMT systems under consideration and identifies the most prominent source of errors typical of both systems.

Original language: English
Pages: 87-94
Number of pages: 8
Publication status: Published - 2010
Externally published: Yes
Event: 2nd Workshop on Spoken Language Technologies for Under-Resourced Languages, SLTU 2010 - Penang, Malaysia
Duration: 3 May 2010 to 5 May 2010

Keywords

  • finite state machines
  • language processing
  • Natural languages
  • statistical machine translation
