Edinburgh Research Archive, The University of Edinburgh


Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/2056

Files in This Item:

File: Joanna Keating.pdf
Size: 1.07 MB
Format: Adobe PDF
Title: Parameter tuning for unit selection speech synthesis
Authors: Keating, Joanna
Supervisor(s): Clark, Robert A J; Mayo, Cassie
Issue Date: 2005
Abstract: This project contributes to current research on the quality of speech synthesis by conducting a perceptual experiment to find a better set of target cost weights for the Festival speech synthesis system. The experiment also clarifies which acoustic parameters listeners use when judging synthetic speech, and how important each parameter is. The project uses unit selection synthesis, which chooses units for concatenation using a series of target and join costs. Each cost is assigned a weight that indicates its importance in the overall cost. This project manipulates the target cost weights in order to find a set of values that better represents listeners' perception of the quality of the synthetic speech. Previous research shows that perceptual experiments are a common way of evaluating the quality of speech synthesis, and this project uses a listening experiment consisting of paired comparisons to reveal how listeners judge synthetic speech. The results were analysed using multidimensional scaling to show the structure of the data and provide insight into the processes involved in speech perception. They showed that, when judging synthetic speech, participants pay attention to the position in phrase, position in syllable, and stress parameters. Participants also grouped the stimuli according to which of these parameters was given a weight of 1. In addition, a lack of weight on these parameters had more effect on the selection of units from the database than a large amount of weight. Analysis of the results showed that position in syllable was the most important parameter for high quality speech.
Keywords: speech synthesis; selection synthesis; linguistics
URI: http://hdl.handle.net/1842/2056
Appears in Collections: Linguistics and English Language Masters thesis collection
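
As an illustration of the weighted target cost described in the abstract, the following is a minimal sketch and not Festival's actual implementation. It assumes each feature contributes a simple match/mismatch penalty and that the cost is normalised by the total weight; the feature names, the function names `target_cost` and `best_candidate`, and the example values are all hypothetical. Join costs are left out entirely.

```python
# Minimal sketch of a weighted target cost (illustrative assumption,
# not the Festival code). Each feature contributes its weight when the
# candidate unit differs from the target specification.

def target_cost(target, candidate, weights):
    """Return the normalised weighted sum of feature mismatches (0.0 to 1.0)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0  # no feature carries any weight, so nothing is penalised
    mismatch = sum(w for feature, w in weights.items()
                   if target[feature] != candidate[feature])
    return mismatch / total_weight


def best_candidate(target, candidates, weights):
    """Pick the candidate unit with the lowest target cost (join cost ignored)."""
    return min(candidates, key=lambda unit: target_cost(target, unit, weights))


# Hypothetical target specification and two candidate units.
target = {"position_in_phrase": "final", "position_in_syllable": "onset", "stress": True}
unit_a = {"position_in_phrase": "final", "position_in_syllable": "coda", "stress": True}
unit_b = {"position_in_phrase": "medial", "position_in_syllable": "onset", "stress": True}

# Hypothetical weight sets: one parameter is given weight 1, the others less.
syllable_heavy = {"position_in_phrase": 0.2, "position_in_syllable": 1.0, "stress": 0.2}
phrase_heavy = {"position_in_phrase": 1.0, "position_in_syllable": 0.2, "stress": 0.2}
```

With `syllable_heavy`, the unit that mismatches only on position in phrase is preferred; with `phrase_heavy`, the preference flips. This is the kind of effect that changing a weight set has on which units are selected.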

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh 2013, and/or the original authors.