
Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/4560

Files in This Item:

File | Description | Size | Format
IS100361.pdf | | 476.12 kB | Adobe PDF
Title: Roles of the Average Voice in Speaker-adaptive HMM-based Speech Synthesis
Authors: Yamagishi, Junichi
Watts, Oliver
King, Simon
Usabaev, Bela
Issue Date: 2010
Journal Title: Proc. Interspeech 2010
Abstract: In speaker-adaptive HMM-based speech synthesis, there are typically a few speakers for which the output synthetic speech sounds worse than that of other speakers, despite having the same amount of adaptation data from within the same corpus. This paper investigates these fluctuations in quality and concludes that as mel-cepstral distance from the average voice becomes larger, the MOS naturalness scores generally become worse. Although this negative correlation is not strong, it suggests a way to improve the training and adaptation strategies. We also draw comparisons between our findings and the work of other researchers regarding "vocal attractiveness."
URI: http://hdl.handle.net/1842/4560
Appears in Collections: CSTR publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
