
Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/979


Files in This Item:

File: Gotoh ICASSP.pdf
Size: 391.09 kB
Format: Adobe PDF
Title: Variable word rate N-grams
Authors: Gotoh, Yoshihiko; Renals, Steve
Issue Date: Jun-2000
Citation: Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), Volume 3, 5-9 June 2000, pp. 1591-1594
Publisher: IEEE
Abstract: The rate of occurrence of words is not uniform but varies from document to document. Despite this observation, parameters for conventional N-gram language models are usually derived under the assumption of a constant word rate. In this paper we investigate the use of a variable word rate assumption, modelled by a Poisson distribution or a continuous mixture of Poissons. We present an approach to estimating the relative frequencies of words or N-grams that takes prior information about their occurrences into account. Discounting and smoothing schemes are also considered. On the Broadcast News task, the approach demonstrates a perplexity reduction of up to 10%.
URI: http://ieeexplore.ieee.org/
http://hdl.handle.net/1842/979
Appears in Collections: CSTR publications
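
Illustrative note: the abstract contrasts a single Poisson word rate with a continuous mixture of Poissons. The sketch below is not taken from the paper. It assumes the mixing density is a gamma distribution (the abstract does not say which one is used), in which case the mixture has a closed form, the negative binomial. The function names and the parameters `shape` and `scale` are hypothetical, chosen only for this example.

```python
import math


def poisson_pmf(k, rate):
    """P(K = k) for a word whose count follows a Poisson with mean `rate` (> 0)."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def gamma_poisson_pmf(k, shape, scale):
    """P(K = k) when the Poisson rate itself varies across documents.

    Assumption (not from the abstract): the rate is Gamma(shape, scale)
    distributed. Integrating the rate out gives a negative binomial
    with mean shape * scale.
    """
    p = 1.0 / (1.0 + scale)  # negative binomial "success" probability
    log_coef = math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
    return math.exp(log_coef + shape * math.log(p) + k * math.log(1.0 - p))


def variance(pmf, upper=400):
    """Variance of a count distribution, summing the pmf over 0..upper-1."""
    mean = sum(k * pmf(k) for k in range(upper))
    return sum((k - mean) ** 2 * pmf(k) for k in range(upper))


if __name__ == "__main__":
    # Both models have the same mean count (3.0); the mixture spreads
    # probability more widely around it.
    print("Poisson variance:", variance(lambda k: poisson_pmf(k, 3.0)))
    print("Mixture variance:", variance(lambda k: gamma_poisson_pmf(k, 2.0, 1.5)))
```

With matched means, the mixture places more probability on documents where the word never occurs and on documents where it occurs many times, which is one way to express a word rate that varies from document to document.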

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
