Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/3618
| Title: | Towards Statistical Machine Translation with Unification Grammars |
| Authors: | Williams, Philip |
| Supervisor(s): | Koehn, Philipp |
| Issue Date: | 26-Nov-2009 |
| Abstract: | Traditional Statistical Machine Translation (SMT) models account poorly for many linguistic phenomena, such as subject-verb agreement and differences in word-order between languages. Recent work, such as that in factored phrase-based models, has shown promising improvements in translation quality through the use of linguistically-richer models. Unification-based approaches to grammar offer a framework for modelling agreement, a particular problem in generating morphologically-rich languages, and so in order to gauge the potential gains available from their application to SMT we first consider how to automatically recognise and measure agreement failure. We focus upon the specific issue of declension in German noun phrases and propose a simple unification-based approach to the problem. We develop an agreement checker based on this approach and use it to assess the agreement failure rate of a hierarchical phrase-based translation system trained on the small News Commentary corpus. Initially we find that our checker reports unreasonably high failure rates on the fluent training data, and through an incremental process of failure analysis and lexicon refinement we significantly reduce the number of spurious failures. We then apply the agreement checker directly to machine translation by incorporating it as a feature function of the log-linear model. We train our baseline system on the larger Europarl corpus and again measure failure rates before applying the agreement check as both a hard and soft constraint. The effects on translation are not large enough to reliably measure using standard automatic evaluation techniques and so we perform a manual analysis of the types of change introduced. |
| Keywords: | translation unification |
| URI: | http://hdl.handle.net/1842/3618 |
| Appears in Collections: | Linguistics and English Language Masters thesis collection |
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.