

Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/1007


Files in This Item:

File: Pagel_1998_a.pdf
Size: 54.94 kB
Format: Adobe PDF
Title: Letter to Sound Rules for Accented Lexicon Compression
Authors: Pagel, Vincent
Lenzo, Kevin
Black, Alan W
Issue Date: Dec-1998
Citation: In ICSLP-1998, paper 0561.
Publisher: International Speech Communication Association
Abstract: This paper presents trainable methods for generating letter to sound rules from a given lexicon, for use in pronouncing out-of-vocabulary words and as a method for lexicon compression. Because the relationship between a string of letters and the string of phonemes representing its pronunciation is not trivial for many languages, we discuss two alignment procedures, one fully automatic and one hand-seeded, both of which produce reasonable alignments of letters to phones. Top Down Induction Tree models are trained on the aligned entries. We show that combined phoneme/stress prediction is better than separate prediction processes, and better still when the model also includes the last phonemes transcribed and part-of-speech information. For the lexicons we have tested, our models have a word accuracy (including stress) of 78% for OALD, 62% for CMU and 94% for BRULEX. The extremely high scores on the training sets allow substantial size reductions (more than 1/20). WWW site: http://tcts.fpms.ac.be/synthesis/mbrdico
URI: http://www.isca-speech.org/archive/icslp_1998/index.html
http://hdl.handle.net/1842/1007
Appears in Collections:CSTR publications
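
Note on lexicon compression: the abstract states that high accuracy on the training set allows the lexicon to be reduced in size. One plausible reading is that only the entries the rules mispredict need to be stored, with the rules regenerating everything else. The sketch below illustrates that idea under loud assumptions: the names `compress_lexicon`, `lookup`, `word_accuracy`, and `toy_predict` are hypothetical, the phone symbols and three-word lexicon are invented for illustration, and a one-letter-to-one-phone table stands in for the paper's trained tree models (no stress or part-of-speech features are modelled here).

```python
from typing import Callable, Dict, Tuple

Pron = Tuple[str, ...]                 # a pronunciation as a tuple of phone symbols
Predictor = Callable[[str], Pron]      # any letter to sound model: word -> phones

# Hypothetical stand-in for a trained model: one fixed phone per letter.
LETTER_TO_PHONE = {"c": "k", "a": "a", "t": "t", "b": "b", "e": "e", "n": "n"}


def toy_predict(word: str) -> Pron:
    """Predict phones letter by letter (illustrative only, not the paper's trees)."""
    return tuple(LETTER_TO_PHONE.get(ch, ch) for ch in word)


def word_accuracy(lexicon: Dict[str, Pron], predict: Predictor) -> float:
    """Fraction of words whose whole pronunciation is predicted exactly."""
    correct = sum(1 for word, pron in lexicon.items() if predict(word) == pron)
    return correct / len(lexicon)


def compress_lexicon(lexicon: Dict[str, Pron], predict: Predictor) -> Dict[str, Pron]:
    """Keep only the entries the model gets wrong (the exception list)."""
    return {word: pron for word, pron in lexicon.items() if predict(word) != pron}


def lookup(word: str, exceptions: Dict[str, Pron], predict: Predictor) -> Pron:
    """Use the stored exception if there is one, otherwise fall back to the model."""
    if word in exceptions:
        return exceptions[word]
    return predict(word)


# Invented toy lexicon; "cent" is the only word the letter table mispredicts.
LEXICON: Dict[str, Pron] = {
    "cat": ("k", "a", "t"),
    "bat": ("b", "a", "t"),
    "cent": ("s", "e", "n", "t"),
}

exceptions = compress_lexicon(LEXICON, toy_predict)
print(f"word accuracy on the lexicon: {word_accuracy(LEXICON, toy_predict):.2f}")
print(f"entries stored after compression: {len(exceptions)} of {len(LEXICON)}")
```

In this arrangement every word in the original lexicon is still recovered exactly by `lookup`, and unseen words fall through to the model, which matches the two uses named in the abstract: pronouncing out-of-vocabulary words and compressing the lexicon. The achievable reduction depends entirely on how accurate the model is on the lexicon it was trained on.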

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
