|Title: ||Issues in Building General Letter to Sound Rules|
|Authors: ||Black, Alan W|
|Issue Date: ||Nov-1998|
|Citation: ||Third ESCA/COCOSDA Workshop on Speech Synthesis, Jenolan Caves House, Blue Mountains, Australia, November 26-29, 1998. pp.77-80.|
|Publisher: ||International Speech Communication Association|
|Abstract: ||In general text-to-speech systems, it is not possible to guarantee that a lexicon will contain all words found in a text; therefore, some system for predicting pronunciation from the word itself is necessary.
Here we present a general framework for building letter to sound (LTS) rules from a word list in a language. The technique can be fully automatic, though a small amount of hand seeding can give better results. We have applied this technique to English (UK and US), French and German. The generated models achieve 75%, 58%, 93% and 89% words correct, respectively, on held-out data from the word lists.
To test our models on more typical data, we also analyzed general text to find which words do not appear in our lexicon. These unknown words were used as a more realistic test corpus for our models. We also discuss the distribution and type of such unknown words.|
|Appears in Collections:||CSTR publications|