|
Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/4815
|
| Title: | Probabilistic Inference for Phrase-based Machine Translation: A Sampling Approach |
| Authors: | Arun, Abhishek |
| Supervisor(s): | Koehn, Philipp |
| Issue Date: | 30-Jun-2011 |
| Publisher: | The University of Edinburgh |
| Abstract: | Recent advances in statistical machine translation (SMT) have used dynamic programming
(DP) based beam search methods for approximate inference within probabilistic
translation models. Despite their success, these methods compromise the probabilistic
interpretation of the underlying model, thus limiting the application of probabilistically
defined decision rules during training and decoding.
As an alternative, in this thesis, we propose a novel Monte Carlo sampling approach
for theoretically sound approximate probabilistic inference within these models. The
distribution we are interested in is the conditional distribution of a log-linear translation
model; however, there is often no tractable way of computing the normalisation term
of the model. Instead, a Gibbs sampling approach for phrase-based machine translation
models is developed which obviates the need to compute this term yet produces
samples from the required distribution.
We establish that the sampler effectively explores the distribution defined by a
phrase-based model by showing that it converges in a reasonable amount of time to
the desired distribution, irrespective of initialisation. Empirical evidence is provided to
confirm that the sampler can provide accurate estimates of expectations of functions of
interest. The mix of high probability and low probability derivations obtained through
sampling is shown to provide a more accurate estimate of expectations than merely
using the n most probable derivations.
Subsequently, we show that the sampler provides a tractable solution for finding the
maximum probability translation in the model. We also present a unified approach to
approximating two additional intractable problems: minimum risk training and minimum
Bayes risk decoding. Key to our approach is the use of the sampler which
allows us to explore the entire probability distribution and maintain a strict probabilistic
formulation through the translation pipeline. For these tasks, sampling combines
the simplicity of n-best list approaches with the extended view of the distribution that
lattice-based approaches benefit from, while avoiding the biases associated with beam
search. Our approach is theoretically well-motivated and can give better and more
stable results than current state-of-the-art methods. |
| Keywords: | Statistical Machine Translation; Machine Learning; Discriminative Training; Markov Chain Monte Carlo; Gibbs Sampling; Minimum Risk Training; Minimum Bayes Risk Decoding |
| URI: | http://hdl.handle.net/1842/4815 |
| Appears in Collections: | Informatics thesis and dissertation collection
|
This item is licensed under a Creative Commons License
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.
|