

Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/1998


Files in This Item:

File: livescu_icassp07_sum.pdf
Size: 140.17 kB
Format: Adobe PDF
Title: Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop.
Authors: Livescu, Karen
Çetin, Ozgur
Hasegawa-Johnson, Mark
King, Simon
Bartels, Chris
Borges, Nash
Kantor, Arthur
Lal, Partha
Yung, Lisa
Bezman, Ari
Dawson-Haggerty, Stephen
Woods, Bronwyn
Frankel, Joe
Magimai-Doss, Mathew
Saenko, Kate
Issue Date: 2007
Citation: K. Livescu, O. Çetin, M. Hasegawa-Johnson, S. King, C. Bartels, N. Borges, A. Kantor, P. Lal, L. Yung, A. Bezman, S. Dawson-Haggerty, B. Woods, J. Frankel, M. Magimai-Doss, and K. Saenko. Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary from the 2006 JHU Summer Workshop. In Proc. ICASSP, Honolulu, April 2007.
Abstract: We report on investigations, conducted at the 2006 Johns Hopkins Workshop, into the use of articulatory features (AFs) for observation and pronunciation models in speech recognition. In the area of observation modeling, we use the outputs of AF classifiers both directly, in an extension of hybrid HMM/neural network models, and as part of the observation vector, in an extension of the tandem approach. In the area of pronunciation modeling, we investigate a model having multiple streams of AF states with soft synchrony constraints, for both audio-only and audio-visual recognition. The models are implemented as dynamic Bayesian networks and tested on tasks from the Small-Vocabulary Switchboard (SVitchboard) corpus and the CUAVE audio-visual digits corpus. Finally, we analyze AF classification and forced alignment using a newly collected set of feature-level manual transcriptions.
Keywords: speech technology
URI: http://hdl.handle.net/1842/1998
Appears in Collections: CSTR publications
Linguistics and English Language publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
