Edinburgh Research Archive, The University of Edinburgh


Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/1041


Files in This Item:

File: vepa_king_ieee2005.pdf
Size: 158.85 kB
Format: Adobe PDF
Title: Subjective Evaluation of Join Cost and Smoothing Methods for Unit Selection Speech Synthesis
Authors: Vepa, Jithendra
King, Simon
Issue Date: 2005
Citation: IEEE Transactions on Speech and Audio Processing (accepted for future publication)
Publisher: IEEE Signal Processing Society Press
Abstract: In unit selection-based concatenative speech synthesis, join cost (also known as concatenation cost), which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. Usually, some form of local parameter smoothing is also needed to disguise the remaining discontinuities. This paper presents a subjective evaluation of three join cost functions and three smoothing methods. We also describe the design and performance of a listening test. The three join cost functions were taken from our previous study, where we proposed join cost functions derived from spectral distances, which have good correlations with perceptual scores obtained for a range of concatenation discontinuities. This evaluation allows us to further validate their ability to predict concatenation discontinuities. The units for synthesis stimuli are obtained from a state-of-the-art unit selection text-to-speech system: rVoice from Rhetorical Systems Ltd. In this paper, we report listeners' preferences for each join cost in combination with each smoothing method.
Keywords: linear dynamic models (LDM)
perceptual listening tests
join cost
smoothing
speech synthesis
unit selection
DOI: 10.1109/TSA.2005.858548
URI: http://hdl.handle.net/1842/1041
ISSN: 1063-6676
Appears in Collections: CSTR publications
Linguistics and English Language publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 

Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh 2013, and/or the original authors.