Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/4863
| Title: | Vocal Attractiveness Of Statistical Speech Synthesisers |
| Authors: | Andraszewicz, Sandra; Yamagishi, Junichi; King, Simon |
| Issue Date: | 2011 |
| Citation: | Proc. ICASSP 2011 (Prague, Czech Republic) |
| Publisher: | IEEE |
| Abstract: | Our previous analysis of speaker-adaptive HMM-based speech synthesis methods suggested two possible reasons why average voices can obtain higher subjective scores than any individual adapted voice: 1) model adaptation degrades speech quality in proportion to the distance ‘moved’ by the transforms, and 2) psychoacoustic effects relating to the attractiveness of the voice. This paper follows on from that analysis and aims to separate these effects. Our latest perceptual experiments focus on attractiveness, using average voices and speaker-dependent voices without model transformation. They show that using several speakers to create a voice improves smoothness (measured by Harmonics-to-Noise Ratio) and reduces the distance from the average voice in the log F0-F1 space of the final voice, and hence makes it more attractive at the segmental level. However, this effect is weakened or overridden at the supra-segmental or sentence level. |
| Sponsor(s): | The European Community’s Seventh Framework Programme (FP7/2007-2013) under Grant agreement 213845 (the EMIME project) |
| Keywords: | average voice; attractiveness; speaker adaptation; speech synthesis; HMM |
| URI: | http://hdl.handle.net/1842/4863 |
| Appears in Collections: | Informatics Publications |