



Files in This Item:

File: Birch2011.pdf
Size: 3.53 MB
Format: Adobe PDF
Title: Reordering metrics for statistical machine translation
Authors: Birch, Alexandra
Supervisor(s): Osborne, Miles
Koehn, Philipp
Issue Date: 30-Jun-2011
Publisher: The University of Edinburgh
Abstract: Natural languages display a great variety of different word orders, and one of the major challenges facing statistical machine translation is in modelling these differences. This thesis is motivated by a survey of 110 different language pairs drawn from the Europarl project, which shows that word order differences account for more variation in translation performance than any other factor. This wide-ranging analysis provides compelling evidence for the importance of research into reordering. There has already been a great deal of research into improving the quality of the word order in machine translation output. However, there has been very little analysis of how best to evaluate this research. Current machine translation metrics are largely focused on evaluating the words used in translations, and their ability to measure the quality of word order has not been demonstrated. In this thesis we introduce novel metrics for quantitatively evaluating reordering. Our approach isolates the word order in translations by using word alignments. We reduce alignment information to permutations and apply standard distance metrics to compare the word order in the reference to that of the translation. We show that our metrics correlate more strongly with human judgements of word order quality than current machine translation metrics. We also show that a combined lexical and reordering metric, the LRscore, is useful for training translation model parameters. Humans prefer the output of models trained using the LRscore as the objective function over the output of models trained with the de facto standard translation metric, the BLEU score. The LRscore thus provides researchers with a reliable metric for evaluating the impact of their research on the quality of word order.
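
The idea of reducing alignments to permutations and comparing them with a distance metric can be illustrated with a minimal sketch. The abstract does not name the distance metrics used, so the two below (Hamming distance and Kendall's tau distance) are assumptions chosen because they are common permutation distances. The function names and the normalisation to the range 0 to 1 are hypothetical, and this is not the thesis's implementation.

```python
from itertools import combinations


def _check_same_items(perm, ref):
    # Both sequences must be orderings of the same set of positions.
    if sorted(perm) != sorted(ref):
        raise ValueError("perm and ref must be permutations of the same items")


def hamming_distance(perm, ref):
    """Fraction of positions at which the two orderings disagree (assumed metric)."""
    _check_same_items(perm, ref)
    if not perm:
        return 0.0
    return sum(p != r for p, r in zip(perm, ref)) / len(perm)


def kendall_tau_distance(perm, ref):
    """Fraction of item pairs whose relative order differs (assumed metric)."""
    _check_same_items(perm, ref)
    n = len(perm)
    if n < 2:
        return 0.0
    # Rank of each item in the reference ordering.
    rank = {item: i for i, item in enumerate(ref)}
    ranks = [rank[p] for p in perm]
    discordant = sum(
        1 for i, j in combinations(range(n), 2) if ranks[i] > ranks[j]
    )
    return discordant / (n * (n - 1) / 2)


# Hypothetical example: the reference keeps source order, the translation
# swaps the first two aligned words.
reference_order = [0, 1, 2, 3]
translation_order = [1, 0, 2, 3]
print(hamming_distance(translation_order, reference_order))
print(kendall_tau_distance(translation_order, reference_order))
```

In this sketch, 0 means the two word orders are identical and 1 means they are maximally different under the given metric. Hamming distance penalises every displaced word equally, while Kendall's tau distance grows with the number of word pairs that appear in the wrong relative order.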
Sponsor(s): Economic and Social Research Council (ESRC)
Keywords: statistical machine translation
word order
standard distance metrics
Appears in Collections: Informatics thesis and dissertation collection

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.


Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh 2013, and/or the original authors.