Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/983
| Title: | Concatenative Text-to-Speech Synthesis Based on Prototype Waveform Interpolation (A Time Frequency Approach) |
| Authors: | Morais, Edmilson; Taylor, Paul; Violaro, Fabio |
| Issue Date: | Oct-2000 |
| Citation: | In ICSLP-2000, vol.2, 387-390. |
| Publisher: | International Speech Communication Association |
| Abstract: | This paper presents some preliminary methods for applying the Time-Frequency Interpolation (TFI) technique [3] to concatenative text-to-speech synthesis. The TFI technique described here is a pitch-synchronous time-frequency variant of the well-known Prototype-Waveform Interpolation (PWI) technique [2]. The basic concepts of representing the speech signal in the time-frequency domain are described, as well as techniques for performing time-scale and pitch-scale modifications. Using the flexibility of the TFI technique to perform spectral smoothing, a method was developed to minimize the spectral mismatch at the boundaries of the Synthesis-Units (SUs). The proposed system was evaluated using SUs (diphones) and prosodic modifications generated by the Festival system [1]. An informal subjective test comparing the proposed TFI system with the standard TD-PSOLA system indicated that the proposed system gives superior quality. |
| URI: | http://www.isca-speech.org/archive/icslp_2000/ http://hdl.handle.net/1842/983 |
| Appears in Collections: | CSTR publications |