Edinburgh Research Archive, The University of Edinburgh

Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/1047

Files in This Item:

File: Syrdal_1998_a.pdf
Size: 133.07 kB
Format: Adobe PDF
Title: Three Methods of Intonation Modeling
Authors: Syrdal, Ann
Moehler, Greg
Dusterhoff, Kurt E
Conkie, Alistair
Black, Alan W
Issue Date: Nov-1998
Citation: Third ESCA/COCOSDA Workshop on Speech Synthesis, Jenolan Caves House, Blue Mountains, Australia, November 26-29, 1998. pp. 305-310.
Publisher: International Speech Communication Association
Abstract: This paper compares different methods of generating intonation for an American English Text-to-Speech synthesis system. We look at a primarily rule-based approach and two data-driven approaches. For data-driven modeling we used two separate data sets, each representing a somewhat different prosodic style. One database was recordings of a portion of 1989 Wall Street Journal text from the Penn Treebank Project. The second database was recordings of interactive prompts used in telephone network services. Both were read by the same female speaker. Approximately two and one-half hours of speech was phonetically and prosodically segmented and labeled (first automatically, and subsequently verified manually). The prosodic labeling used ToBI [1] tones and breaks. Three different intonation models were compared: (1) a predominantly rule-based model based on ToBI labels [3]; (2) a parametric model using the Tilt approach [2]; and (3) a Vector Quantized model based on an underlying parametric representation [4]. Sentences representative of both prosodic styles were synthesized with each of these models, and were presented to listeners for subjective ratings in a formal listening test. The results of the evaluation are reported.
URI: http://www.isca_speech.org/archive/ssw3
http://hdl.handle.net/1842/1047
Appears in Collections: CSTR publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
