  • The 1994 Abbot hybrid connectionist-HMM large vocabulary recognition system. 

    Hochberg, Mike; Cook, Gary; Renals, Steve; Robinson, Tony; Schechtman, R (1995)
  • The 1995 ABBOT LVCSR system for multiple unknown microphones 

    Kershaw, Dan; Robinson, Tony; Renals, Steve (IEEE, 1996-10)
    ABBOT is a hybrid (connectionist-hidden Markov model) large-vocabulary speech recognition (LVCSR) system, developed at Cambridge University. In this system, a recurrent network maps each acoustic vector to an estimate of ...
  • Accent Phrase Segmentation by Finding N-Best Sequences of Pitch Pattern Templates 

    Nakai, Mitsuru; Shimodaira, Hiroshi (International Speech Communication Association, 1994-09)
    This paper describes a prosodic method for segmenting continuous speech into accent phrases. Optimum sequences are obtained on the basis of least squared error criterion by using dynamic time warping between F0 contours ...
  • Accent phrase segmentation using transition probabilities between pitch pattern templates. 

    Shimodaira, Hiroshi; Nakai, Mitsuru (International Speech Communication Association, 1993)
    This paper proposes a novel method for segmenting continuous speech into accent phrases by using a prosodic feature 'pitch pattern'. The pitch pattern extracted from input speech signals is divided into the accent segments ...
  • An accent-independent lexicon for automatic speech recognition. 

    Van Bael, Christophe; King, Simon (International Congress of Phonetic Sciences, 2003)
    Recent work at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh has developed an accent-independent lexicon for speech synthesis (the Unisyn project). The main purpose of this lexicon is ...
  • Accessing the spoken word 

    Goldman, Jerry; Renals, Steve; Bird, Steven; de Jong, Franciska; Federico, Marcello; Fleischhauer, Carl; Kornbluh, Mark; Lamel, Lori; Oard, Douglas W; Stewart, Claire; Wright, Richard (Springer Berlin / Heidelberg, 2005-08)
    Spoken-word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access, and preservation of ...
  • Accurate Spectral Envelope Estimation for Articulation-to-Speech Synthesis 

    Shiga, Yoshinori; King, Simon (2004-06)
    This paper introduces a novel articulatory-acoustic mapping in which detailed spectral envelopes are estimated based on the cepstrum, inclusive of the high-quefrency elements which are discarded in conventional speech ...
  • Acoustic Confidence Measures for Segmenting Broadcast News 

    Barker, Jon; Williams, Gethin; Renals, Steve (International Speech Communication Association, 1998-12)
    In this paper we define an acoustic confidence measure based on the estimates of local posterior probabilities produced by a HMM/ANN large vocabulary continuous speech recognition system. We use this measure to segment ...
  • Acoustic Space Dimensionality Selection and Combination using the Maximum Entropy Principle 

    Abdel-Haleem, Yasser H; Renals, Steve; Lawrence, Neil D (IEEE Signal Processing Society, 2004)
    In this paper we propose a discriminative approach to acoustic space dimensionality selection based on maximum entropy modelling. We form a set of constraints by composing the acoustic space with the space of phone classes, ...
  • Acoustic-Articulatory Modelling with the Trajectory HMM 

    Zhang, Le; Renals, Steve (2008)
    In this letter, we introduce a hidden Markov model (HMM)-based inversion system to recover articulatory movements from speech acoustics. Trajectory HMMs are used as generative models for modelling articulatory data. ...
  • Adult–child differences in acoustic cue weighting are influenced by segmental context: Children are not always perceptually biased towards transitions 

    Mayo, Catherine; Turk, Alice (Acoustical Society of America, 2004-06)
    It has been proposed that young children may have a perceptual preference for transitional cues [Nittrouer, S. (2002). J. Acoust. Soc. Am. 112, 711–719]. According to this proposal, this preference can manifest itself either ...
  • An advanced integrated architecture for wireless voicemail retrieval. 

    Koumpis, Konstantinos; Ladas, C; Renals, Steve (IEEE Signal Processing Society, 2001)
    This paper describes an alternative architecture for voicemail data retrieval on the move. It is comprised of three distinct components: a speech recognizer, a text summarizer and a WAP push service initiator, enabling ...
  • Age Recognition for Spoken Dialogue Systems: Do We Need It? 

    Wolters, Maria; Vipperla, Ravichander; Renals, Steve (2009)
    When deciding whether to adapt relevant aspects of the system to the particular needs of older users, spoken dialogue systems often rely on automatic detection of chronological age. In this paper, we show that vocal ageing ...
  • Ageing voices: The effect of changes in voice parameters on ASR performance 

    Vipperla, Ravi Chander; Renals, Steve; Frankel, Joe (2010)
    With ageing, human voices undergo several changes which are typically characterized by increased hoarseness and changes in articulation patterns. In this study, we have examined the effect on Automatic Speech Recognition ...
  • Algorithms for analysing the temporal structure of discourse. 

    Hitzeman, Janet; Moens, Marc; Grover, Claire (1995)
    We describe a method for analysing the temporal structure of a discourse which takes into account the effects of tense, aspect, temporal adverbials and rhetorical structure and which minimises unnecessary ambiguity in the ...
  • The AMI System for the Transcription of Speech in Meetings 

    Hain, Thomas; Burget, Lukas; Dines, John; Garau, Giulia; Wan, Vincent; Karafiat, Martin; Vepa, Jithendra; Lincoln, Michael (2007)
    This paper describes the AMI transcription system for speech in meetings developed in collaboration by five research groups. The system includes generic techniques such as discriminative and speaker adaptive training, ...
  • Analysis and Synthesis of Head Motion for Lifelike Conversational Agents 

    Shimodaira, Hiroshi; Uematsu, Keisuke; Kawamoto, Shin ichi; Hofer, Gregor O; Nakai, Mitsuru (2005)
    This study aims to investigate which motions of lifelike conversational agents play an essential role in making the agents natural. Some preliminary experimental results and future plans are shown. Embodying ...
  • Analysis and synthesis of intonation using the tilt model 

    Taylor, Paul (Acoustical Society of America, 2000-03)
    This paper introduces the Tilt intonational model and describes how this model can be used to automatically analyze and synthesize intonation. In the model, intonation is represented as a linear sequence of events, which ...
  • Analysis of Speaker Adaptation Algorithms for HMM-based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm 

    Yamagishi, Junichi; Kobayashi, Takao; Nakano, Yuji; Ogata, Katsumi; Isogai, Juri (IEEE Signal Processing Society, 2009)
    In this paper we analyze the effects of several factors and configuration choices encountered during training and model construction when we want to obtain better and more stable adaptation in HMM-based speech synthesis. ...
  • Analysis of unknown words through morphological decomposition. 

    van de Plassche, J; Black, Alan W (1991)
    This paper describes a method of analysing words through morphological decomposition when the lexicon is incomplete. The method is used within a text-to-speech system to help generate pronunciations of unknown words. The ...