Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/2136
| Title: | Modeling prosodic features in language models for meetings. |
| Authors: | Huang, Songfang; Renals, Steve |
| Issue Date: | 2007 |
| Citation: | Songfang Huang and Steve Renals. Modeling prosodic features in language models for meetings. In A. Popescu-Belis, S. Renals, and H. Bourlard, editors, Machine Learning for Multimodal Interaction IV, volume 4892 of Lecture Notes in Computer Science, pages 191-202. Springer, 2007. |
| Publisher: | Springer-Verlag Berlin Heidelberg |
| Abstract: | In this paper we investigate the application of a novel language modeling technique, a hierarchical Bayesian language model (LM) based on the Pitman-Yor process, to automatic speech recognition (ASR) for multiparty meetings. The hierarchical Pitman-Yor language model (HPYLM), which was originally proposed in the machine learning field, provides a Bayesian interpretation of language modeling. An approximation to the HPYLM recovers the exact formulation of the interpolated Kneser-Ney smoothing method in n-gram models. This paper focuses on the application and scalability of the HPYLM in a practical large vocabulary ASR system. Experimental results on NIST RT06s evaluation meeting data verify that the HPYLM is a competitive and promising language modeling technique, which consistently performs better than interpolated Kneser-Ney and modified Kneser-Ney n-gram LMs in terms of both perplexity (PPL) and word error rate (WER). |
| Keywords: | speech technology; Bayesian language model |
| URI: | http://hdl.handle.net/1842/2136 |
| Appears in Collections: | CSTR publications |
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.