
Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/3786

Files in This Item:

File                        Description                       Size       Format
Hofer2009.pdf               PhD thesis                        9.56 MB    Adobe PDF
HoferSupplemental2009.zip   File not available for download   36.09 MB   Unknown
Title: Speech-driven animation using multi-modal hidden Markov models
Authors: Hofer, Gregor Otto
Supervisor(s): Shimodaira, Hiroshi
Renals, Steve
Issue Date: 2010
Publisher: The University of Edinburgh
Abstract: The main objective of this thesis was the synthesis of speech-synchronised motion, in particular head motion. The hypothesis that head motion can be estimated from the speech signal was confirmed. In order to achieve satisfactory results, a motion capture database was recorded, a definition of head motion in terms of articulation was formulated, a continuous stream mapping procedure was developed, and finally the synthesis was evaluated. Based on previous research into non-verbal behaviour, basic types of head motion were defined that could function as modelling units. The stream mapping method investigated in this thesis is based on Hidden Markov Models (HMMs), which employ modelling units to map between continuous signals. The objective evaluation of the modelling parameters confirmed that head motion types could be predicted from the speech signal with an accuracy above chance, close to 70%. Furthermore, a special type of HMM called the trajectory HMM was used because it enables synthesis of continuous output. However, head motion is a stochastic process; the trajectory HMM was therefore further extended to allow for non-deterministic output. Finally, the resulting head motion synthesis was evaluated perceptually. The effects of the “uncanny valley” were also considered in the evaluation, confirming that rendering quality influences our judgement of the movement of virtual characters. In conclusion, a general method for synthesising speech-synchronised behaviour was developed that can be applied to a whole range of behaviours.
Keywords: synthesis of speech synchronised motion
head motion
Hidden Markov Models
computer animation
URI: http://hdl.handle.net/1842/3786
Appears in Collections: CSTR thesis and dissertation collection
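
The abstract above describes a mapping in which HMMs, trained as modelling units, link speech features to head-motion types, with the classification step reported at close to 70% accuracy. The sketch below illustrates that classification idea only: it trains one Gaussian HMM per head-motion class on speech feature sequences and labels a new utterance by maximum likelihood. It is a minimal sketch, not the thesis's implementation; the class names, feature dimensions, and the use of the hmmlearn library are assumptions made for illustration.

```python
# Minimal sketch: label head-motion type from speech features with one HMM
# per class (maximum-likelihood decision). Illustrative only; class inventory,
# feature set and hmmlearn usage are assumptions, not the thesis's code.
import numpy as np
from hmmlearn import hmm

MOTION_CLASSES = ["nod", "shake", "still"]  # hypothetical head-motion units


def train_class_models(training_data, n_states=3, seed=0):
    """Fit one Gaussian HMM per head-motion class.

    training_data: dict mapping class name -> list of (frames, features)
    arrays, e.g. per-utterance MFCC sequences aligned with that motion type.
    """
    models = {}
    for label, sequences in training_data.items():
        X = np.vstack(sequences)                   # concatenate all sequences
        lengths = [len(seq) for seq in sequences]  # per-sequence frame counts
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag",
                                n_iter=50,
                                random_state=seed)
        model.fit(X, lengths)
        models[label] = model
    return models


def classify(models, features):
    """Return the class whose HMM assigns the highest log-likelihood."""
    return max(models, key=lambda label: models[label].score(features))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for speech features (e.g. 13-dim MFCC frames).
    toy_data = {label: [rng.normal(loc=i, size=(40, 13)) for _ in range(5)]
                for i, label in enumerate(MOTION_CLASSES)}
    models = train_class_models(toy_data)
    test_utterance = rng.normal(loc=1, size=(40, 13))
    print("Predicted head-motion type:", classify(models, test_utterance))
```

The continuous synthesis stage described in the abstract (trajectory HMMs extended to non-deterministic output) is not covered by this sketch; it only shows the discrete mapping from speech features to head-motion units.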

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
