
Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/4544


Files in This Item:

File: JunichiICASSP10.pdf
Size: 312.33 kB
Format: Adobe PDF
Title: Simple methods for improving speaker-similarity of HMM-based speech synthesis
Authors: Yamagishi, Junichi; King, Simon
Issue Date: 2010
Journal Title: Proc. ICASSP 2010
Abstract: In this paper we revisit some basic configuration choices of HMM-based speech synthesis, such as waveform sampling rate, auditory frequency warping scale and the logarithmic scaling of F0, with the aim of improving speaker similarity, which is an acknowledged weakness of current HMM-based speech synthesisers. All of the techniques investigated are simple but, as we demonstrate using perceptual tests, can make substantial differences to the quality of the synthetic speech. Contrary to common practice in automatic speech recognition, higher waveform sampling rates can offer enhanced feature extraction and improved speaker similarity for speech synthesis. In addition, a generalized logarithmic transform of F0 results in larger intra-utterance variance of F0 trajectories and hence more dynamic and natural-sounding prosody.
URI: http://hdl.handle.net/1842/4544
Appears in Collections: CSTR publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 

Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh 2013, and/or the original authors.