
Tracing Mistakes and Finding Gaps in Automatic Word Alignments for Latvian-English Translation

  • Valdis Girgzdis
  • Maija Kale*
  • Martins Vaicekauskis
  • Ieva Zarina
  • Inguna Skadiņa
  • *Corresponding author for this work
  • University of Latvia
  • Tilde Company

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › Research › peer-review

5 Citations (Scopus)

Abstract

This paper aims to contribute to an in-depth understanding of computer-based word alignment processes in machine translation (MT). The performance of word alignment, based on IBM models and incorporated in GIZA++, has been widely discussed in machine translation literature. The debate has led to a general consensus that GIZA++ does not provide sufficiently good results for word alignments. In this paper, we analyse the performance of GIZA++ and Fast Align for the Latvian-English pair against a manually aligned Gold Standard. Experiments showed that Fast Align was approximately 2-3% more accurate and three times faster than GIZA++ in the alignment task. As for pre-processing, the removal of articles has a small but positive influence on alignment quality and machine translation output. We also present a Word Alignment Visualisation tool for analysis and editing of word alignments.
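The abstract does not say which measure underlies the accuracy comparison against the Gold Standard. As an illustration only, the sketch below scores one sentence pair with precision, recall and alignment error rate (AER), a measure commonly used for this kind of evaluation. It assumes alignments are written as space-separated `i-j` index pairs (the format Fast Align emits), and that the gold standard may optionally distinguish "sure" from "possible" links. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def parse_links(line: str) -> Set[Link]:
    """Parse one sentence pair's alignment written as "i-j" pairs, e.g. "0-0 1-2"."""
    links = set()
    for token in line.split():
        src, tgt = token.split("-")
        links.add((int(src), int(tgt)))
    return links


def alignment_scores(
    predicted: Set[Link],
    sure: Set[Link],
    possible: Optional[Set[Link]] = None,
) -> Dict[str, float]:
    """Return precision, recall and AER for one sentence pair.

    If the gold standard has no separate "possible" links, every gold
    link is treated as sure (an assumption of this sketch).
    """
    if possible is None:
        possible = sure
    possible = possible | sure  # sure links always count as possible too
    if not predicted or not sure:
        raise ValueError("both predicted and sure link sets must be non-empty")

    hits_sure = len(predicted & sure)
    hits_possible = len(predicted & possible)
    return {
        "precision": hits_possible / len(predicted),
        "recall": hits_sure / len(sure),
        "aer": 1.0 - (hits_sure + hits_possible) / (len(predicted) + len(sure)),
    }


if __name__ == "__main__":
    # Invented toy alignments, for demonstration only.
    gold = parse_links("0-0 1-1 2-2 3-3")
    auto = parse_links("0-0 1-1 2-3")
    print(alignment_scores(auto, gold))
```

A lower AER means closer agreement with the manual alignment. For a corpus-level figure, the intersection and set sizes would normally be summed over all sentence pairs before the ratios are taken, rather than averaging per-sentence scores.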

Original language: English
Title of host publication: Human Language Technologies - The Baltic Perspective
Subtitle of host publication: Proceedings of the 6th International Conference Baltic HLT 2014
Editors: Andrius Utka, Gintare Grigonyte, Jurgita Kapociute-Dzikiene, Jurgita Vaicenoniene
Publisher: IOS Press BV
Pages: 87-94
Number of pages: 8
ISBN (Electronic): 9781614994411
Publication status: Published - 2014
Event: 6th International Conference on Human Language Technologies - The Baltic Perspective, Baltic HLT 2014 - Kaunas, Lithuania
Duration: 26 Sept 2014 to 27 Sept 2014

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 268
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 6th International Conference on Human Language Technologies - The Baltic Perspective, Baltic HLT 2014
Country/Territory: Lithuania
City: Kaunas
Period: 26/09/14 to 27/09/14

Keywords

  • alignment guidelines
  • Latvian language
  • pre-processing
  • word alignment
  • Word Alignment Visualisation tool
