|Title: ||HMM-based Speech Synthesis from Audio Book Data|
|Authors: ||Haag, Kathrin|
|Supervisor(s): ||King, Simon|
|Issue Date: ||Oct-2011|
|Publisher: ||The University of Edinburgh|
|Abstract: ||In contrast to hand-crafted speech databases, which contain short, out-of-context sentences in a fairly unemphatic speech style, audio books contain rich prosody, including intonation contours, pitch accents and phrasing patterns, which is a good prerequisite for building a natural-sounding synthetic voice. This paper gives an overview of the steps involved in building a synthetic voice from audio book data.
After an introduction to the theory of HMM-based speech synthesis, the properties of the speech database are described in detail. It is argued that specific properties of the database, such as higher-pitched speech or questions, must be modelled to achieve a better-quality synthetic voice. The acoustic modelling of these properties is then explained in detail. Finally, the synthetic voice is evaluated by means of an online listening test.|
|Keywords: ||speech synthesis|
|Appears in Collections:||Linguistics and English Language Masters thesis collection|