dc.contributor.author: Faria, Arlo
dc.contributor.author: Gelbart, David
dc.date.accessioned: 2006-05-12T13:11:09Z
dc.date.available: 2006-05-12T13:11:09Z
dc.date.issued: 2005
dc.identifier.citation: In Proceedings, Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005 [en]
dc.identifier.uri: http://hdl.handle.net/1842/1042
dc.description.abstract: To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to transform acoustic features for automatic speech recognition (ASR). The warp factors used in this process are usually derived by maximum likelihood (ML) estimation, involving an exhaustive search over possible values. We describe an alternative approach: exploit the correlation between a speaker's average pitch and vocal tract length, and model the probability distribution of warp factors conditioned on pitch observations. This can be used directly for warp factor estimation, or as a smoothing prior in combination with ML estimates. Pitch-based warp factor estimation for VTLN is effective and requires relatively little memory and computation. Such an approach is well-suited for environments with constrained resources, or where pitch is already being computed for other purposes. [en]
dc.format.extent: 294254 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: vocal tract length normalization [en]
dc.subject: automatic speech recognition [en]
dc.subject: speech [en]
dc.subject: pitch [en]
dc.title: Efficient Pitch-based Estimation of VTLN Warp Factors [en]
dc.type: Conference Paper [en]

