
Edinburgh Research Archive



Files in This Item:

File: yamagishi-taslp09.pdf (2.61 MB, Adobe PDF)
Title: Robust Speaker-Adaptive HMM-based Text-to-Speech Synthesis
Authors: Yamagishi, Junichi
Nose, Takashi
Zen, Heiga
Ling, Zhenhua
Toda, Tomoki
Tokuda, Keiichi
King, Simon
Renals, Steve
Issue Date: 2009
Journal Title: IEEE Transactions on Audio, Speech and Language Processing
Volume: 17
Issue: 6
Page Numbers: 1208–1230
Abstract: This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called "HTS-2007," employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: it is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences.
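The abstract mentions CSMAPLR+MAP speaker adaptation: linear-regression transforms of the average-voice model followed by a MAP update toward the target speaker's data. As a minimal illustration of the MAP step alone (the function name and numbers below are hypothetical, not from the paper), a Gaussian mean can be interpolated between the speaker-independent prior and the adaptation-data mean, weighted by a prior strength tau and the observed frame count:

```python
import numpy as np

def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update of a Gaussian mean: interpolate between the
    speaker-independent prior mean and the adaptation-data mean,
    weighted by prior strength tau and the number of frames n."""
    n = len(frames)
    sample_mean = np.mean(frames, axis=0)
    return (tau * prior_mean + n * sample_mean) / (tau + n)

# Toy example: an average-voice mean pulled toward a target
# speaker's statistics as adaptation frames accumulate.
prior = np.array([0.0, 0.0])
frames = np.array([[1.0, 2.0]] * 10)   # 10 identical adaptation frames
adapted = map_adapt_mean(prior, frames, tau=10.0)
print(adapted)  # with tau == n, halfway to the data mean: [0.5 1. ]
```

With little data the estimate stays near the prior; with abundant data it converges to the speaker's own statistics, which is why the combination remains robust for small adaptation sets.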
Appears in Collections: CSTR publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

