Proc. ICASSP

dc.contributor.author: Wang, Dong
dc.contributor.author: Frankel, Joe
dc.contributor.author: Tejedor, Javier
dc.contributor.author: King, Simon
dc.date.accessioned: 2010-10-05T11:11:01Z
dc.date.available: 2010-10-05T11:11:01Z
dc.date.issued: 2008 [en]
dc.identifier.isbn: 978-1-4244-1483-3 [en]
dc.identifier.uri: http://hdl.handle.net/1842/3837
dc.description.abstract: We propose grapheme-based sub-word units for spoken term detection (STD). Compared to phones, graphemes have a number of potential advantages. For out-of-vocabulary search terms, phone-based approaches must generate a pronunciation using letter-to-sound rules. Using graphemes obviates this potentially error-prone hard decision, shifting pronunciation modelling into the statistical models describing the observation space. In addition, long-span grapheme language models can be trained directly from large text corpora. We present experiments on Spanish and English data, comparing phone and grapheme-based STD. For Spanish, where phone and grapheme-based systems give similar transcription word error rates (WERs), grapheme-based STD significantly outperforms a phone-based approach. The converse is found for English, where the phone-based system outperforms a grapheme approach. However, we present additional analysis which suggests that phone-based STD performance levels may be achieved by a grapheme-based approach despite lower transcription accuracy, and that the two approaches may usefully be combined. We propose a number of directions for future development of these ideas, and suggest that if grapheme-based STD can match phone-based performance, the inherent flexibility in dealing with out-of-vocabulary terms makes this a desirable approach. [en]
dc.title: A comparison of phone and grapheme-based spoken term detection [en]
dc.type: Conference Paper [en]
dc.identifier.doi: 10.1109/ICASSP.2008.4518773 [en]
rps.title: Proc. ICASSP [en]
dc.extent.noOfPages: 4969 - 4972 [en]
dc.date.updated: 2010-10-05T11:11:02Z
dc.identifier.eIssn: 1520-6149 [en]
dc.date.openingDate: 2008-03-31
dc.date.closingDate: 2008-04-04

