Please use this identifier to cite or link to this item:
http://hdl.handle.net/1842/918
| Title: | Applying Vocal Tract Length Normalization to Meeting Recordings |
| Authors: | Garau, Giulia; Renals, Steve; Hain, Thomas |
| Issue Date: | 2005 |
| Citation: | In Proceedings, Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005 |
| Publisher: | International Speech Communication Association |
| Abstract: | Vocal Tract Length Normalisation (VTLN) is a commonly used technique to normalise for inter-speaker variability. It is based on the speaker-specific warping of the frequency axis, parameterised by a scalar warp factor. This factor is typically estimated using maximum likelihood. We discuss how VTLN may be applied to multiparty conversations, reporting a substantial decrease in word error rate in experiments using the ICSI meetings corpus. We investigate the behaviour of the VTLN warping factor and show that a stable estimate is not obtained. Instead it appears to be influenced by the context of the meeting, in particular the current conversational partner. These results are consistent with predictions made by the psycholinguistic interactive alignment account of dialogue, when applied at the acoustic and phonological levels. |
| Keywords: | Vocal Tract Length Normalisation; inter-speaker variability |
| URI: | http://www.isca-speech.org/archive/interspeech_2005 http://hdl.handle.net/1842/918 |
| Appears in Collections: | CSTR publications |
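
The abstract describes VTLN as a speaker-specific warping of the frequency axis, controlled by a scalar warp factor that is typically estimated by maximum likelihood. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a piecewise-linear warping function (one common choice; the record does not say which form the paper uses), a grid search over candidate warp factors, and a toy diagonal-Gaussian model over three formant-like frequencies. All function names, the knee fraction, the candidate grid, and the example frequencies are hypothetical.

```python
import math

def warp_frequency(freq_hz, alpha, nyquist_hz=8000.0, knee_fraction=0.85):
    """Piecewise-linear warp (assumed form): scale by alpha below a knee,
    then follow a straight line so that the Nyquist frequency maps to itself."""
    knee = knee_fraction * nyquist_hz * min(1.0, 1.0 / alpha)
    if freq_hz <= knee:
        return alpha * freq_hz
    # Linear segment from (knee, alpha * knee) to (nyquist, nyquist).
    slope = (nyquist_hz - alpha * knee) / (nyquist_hz - knee)
    return alpha * knee + slope * (freq_hz - knee)

def log_likelihood(observed_hz, model_means_hz, alpha, sigma_hz=50.0):
    """Log-likelihood of the warped observations under a toy diagonal
    Gaussian model (hypothetical stand-in for a real acoustic model)."""
    total = 0.0
    for freq, mean in zip(observed_hz, model_means_hz):
        warped = warp_frequency(freq, alpha)
        total += -0.5 * ((warped - mean) / sigma_hz) ** 2
        total -= math.log(sigma_hz * math.sqrt(2.0 * math.pi))
    return total

def estimate_warp_factor(observed_hz, model_means_hz, candidates=None):
    """Grid-search maximum-likelihood estimate of the scalar warp factor."""
    if candidates is None:
        # Assumed search range: 0.80 to 1.20 in steps of 0.02.
        candidates = [round(0.80 + 0.02 * i, 2) for i in range(21)]
    return max(candidates,
               key=lambda a: log_likelihood(observed_hz, model_means_hz, a))

# Toy example: a speaker whose resonances sit 10% below the model's means.
reference = [500.0, 1500.0, 2500.0]
observed = [f / 1.1 for f in reference]
best = estimate_warp_factor(observed, reference)
print(best)
```

A real system would apply the warp inside the filterbank of the feature extraction and score the warped features against a full acoustic model. The grid search stands in for that scoring step only; the paper's observation that the estimate varies with meeting context concerns how such a per-speaker estimate behaves over time.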
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.