TY - GEN
T1 - Tracing Mistakes and Finding Gaps in Automatic Word Alignments for Latvian-English Translation
AU - Girgzdis, Valdis
AU - Kale, Maija
AU - Vaicekauskis, Martins
AU - Zarina, Ieva
AU - Skadiņa, Inguna
N1 - Publisher Copyright: © 2014 The Authors and IOS Press.
PY - 2014
Y1 - 2014
AB - This paper aims to contribute to an in-depth understanding of computer-based word alignment processes in machine translation (MT). The performance of word alignment, based on IBM models and incorporated in GIZA++, has been widely discussed in machine translation literature. The debate has led to a general consensus that GIZA++ does not provide sufficiently good results for word alignments. In this paper, we analyse the performance of GIZA++ and Fast Align for the Latvian-English pair against the manually aligned Gold Standard. Experiments showed that Fast Align was approximately 2-3% more accurate and three times faster than GIZA++ in the alignment task. Regarding pre-processing, the removal of articles has a small but positive influence on alignment quality and machine translation output. We also present a Word Alignment Visualisation tool for analysis and editing of word alignments.
KW - alignment guidelines
KW - Latvian language
KW - pre-processing
KW - word alignment
KW - Word Alignment Visualisation tool
UR - https://www.scopus.com/pages/publications/84948686016
U2 - 10.3233/978-1-61499-442-8-87
DO - 10.3233/978-1-61499-442-8-87
M3 - Conference paper
AN - SCOPUS:84948686016
T3 - Frontiers in Artificial Intelligence and Applications
SP - 87
EP - 94
BT - Human Language Technologies - The Baltic Perspective
A2 - Utka, Andrius
A2 - Grigonyte, Gintare
A2 - Kapociute-Dzikiene, Jurgita
A2 - Vaicenoniene, Jurgita
PB - IOS Press BV
T2 - 6th International Conference on Human Language Technologies - The Baltic Perspective, Baltic HLT 2014
Y2 - 26 September 2014 through 27 September 2014
ER -