|
Edinburgh Research Archive >
Centre for Speech Technology Research >
CSTR publications >
Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/4662
|
| Title: | Transforming Voice Source Parameters in a HMM-based Speech Synthesiser with Glottal Post-Filtering |
| Authors: | Cabral, Joao P Renals, Steve Richmond, Korin Yamagishi, Junichi |
| Issue Date: | 2010 |
| Journal Title: | Proc. 7th ISCA Speech Synthesis Workshop (SSW7) |
| Abstract: | Control over voice quality, e.g. breathy and tense voice, is important for speech synthesis applications. For example, transformations can be used to modify aspects of the voice re- lated to speaker's identity and to improve expressiveness. How- ever, it is hard to modify voice characteristics of the synthetic speech, without degrading speech quality. State-of-the-art sta- tistical speech synthesisers, in particular, do not typically al- low control over parameters of the glottal source, which are strongly correlated with voice quality. Consequently, the con- trol of voice characteristics in these systems is limited. In con- trast, the HMM-based speech synthesiser proposed in this paper uses an acoustic glottal source model. The system passes the glottal signal through a whitening filter to obtain the excitation of voiced sounds. This technique, called glottal post-filtering, allows to transform voice characteristics of the synthetic speech by modifying the source model parameters. We evaluated the proposed synthesiser in a perceptual ex- periment, in terms of speech naturalness, intelligibility, and similarity to the original speaker's voice. The results show that it performed as well as a HMM-based synthesiser, which generates the speech signal with a commonly used high-quality speech vocoder. |
| URI: | http://hdl.handle.net/1842/4662 |
| Appears in Collections: | CSTR publications
|
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.
|