Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/920
| Title: | Informed Blending of Databases for Emotional Speech Synthesis |
| Authors: | Hofer, Gregor O; Richmond, Korin; Clark, Robert A J |
| Issue Date: | 2005 |
| Citation: | In Proceedings, Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005 |
| Publisher: | International Speech Communication Association |
| Abstract: | The goal of this project was to build a unit selection voice
that could portray emotions with varying intensities. A suitable
definition of an emotion was developed along with a descriptive
framework that supported the work carried out. A single
speaker was recorded portraying happy and angry speaking
styles. A neutral database was also recorded. A
target cost function was implemented that chose units according
to emotion mark-up in the database. The Dictionary of Affect
supported the emotional target cost function by providing an
emotion rating for words in the target utterance. If a word was
particularly ‘emotional’, units from that emotion were favoured.
In addition, intensity could be varied, which resulted in a bias to
select a greater number of emotional units. A perceptual evaluation
was carried out, and subjects were able to reliably recognise
emotions with varying amounts of emotional units present in the
target utterance. |
| Keywords: | speech synthesis; emotion |
| URI: | http://www.isca-speech.org/archive/interspeech_2005; http://hdl.handle.net/1842/920 |
| Appears in Collections: | CSTR publications |
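
The emotional target cost described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the rating table below is an invented stand-in for the Dictionary of Affect, and the names `AFFECT_RATINGS`, `Unit`, `preferred_emotion`, `emotion_target_cost` and `cheapest_unit`, the threshold rule `1.0 - intensity`, and the flat mismatch penalty are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for Dictionary of Affect ratings:
# word -> {emotion: strength in [0, 1]}. Values are invented for illustration.
AFFECT_RATINGS = {
    "wonderful": {"happy": 0.9},
    "nice": {"happy": 0.4},
    "furious": {"angry": 0.95},
}


@dataclass(frozen=True)
class Unit:
    unit_id: str
    emotion: str  # emotion mark-up of the database the unit was recorded in


def preferred_emotion(word, utterance_emotion, intensity):
    """Return the emotion whose units should be favoured for this word."""
    strength = AFFECT_RATINGS.get(word.lower(), {}).get(utterance_emotion, 0.0)
    # Assumed rule: the rating a word needs falls as intensity rises,
    # so higher intensity biases selection towards more emotional units.
    threshold = 1.0 - intensity
    return utterance_emotion if strength >= threshold else "neutral"


def emotion_target_cost(unit, word, utterance_emotion, intensity, mismatch_penalty=1.0):
    """Zero cost when the unit's mark-up matches the favoured emotion, else a flat penalty."""
    wanted = preferred_emotion(word, utterance_emotion, intensity)
    return 0.0 if unit.emotion == wanted else mismatch_penalty


def cheapest_unit(candidates, word, utterance_emotion, intensity):
    """Pick the candidate with the lowest emotional target cost."""
    return min(
        candidates,
        key=lambda u: emotion_target_cost(u, word, utterance_emotion, intensity),
    )


candidates = [Unit("n1", "neutral"), Unit("h1", "happy"), Unit("a1", "angry")]
```

In a full unit selection system this cost would be one weighted term among the other target and join costs; here it is shown in isolation, so `cheapest_unit` only reflects the emotion mark-up.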
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.