Auditory speaker recognition: a theoretical and experimental study
Brown, Roger S.
Speaker recognition is defined as the ability to recognise a speaker's identity on the basis of hearing a sample of his speech. Previous approaches to the subject have concentrated on the experimental manipulation, in isolation, of acoustic features of the speech signal. The theoretical approach adopted here attempts to provide a conceptual framework for speaker recognition, in which emphasis is laid on auditory speaker recognition (as opposed to speaker recognition by machine or by the visual examination of spectrograms ("voiceprints")). The everyday use of speaker recognition is discussed in contrast to the possible artificialities of experimental formats. The nature and utilisation of phonetic speaker-characterising features of voice are examined within the context of (i) other levels of features (syntactic, semantic, lexical, etc.), (ii) other forms of indexical information (sex, age, regional origin, social status, etc.) and (iii) other identity characteristics (names, physical appearance, etc.). Attention is also focussed on two variables in the speaker recognition process which have been relatively neglected by previous writers and researchers: (i) the nature and implications of differences in the tasks which listeners perform, a discussion which culminates in a model in Boolean logic of the decision processes involved in speaker recognition; and (ii) the possible effects caused by differences in the number, background and training of listeners. The experimental approach adopted exploits the simultaneous manipulation of parameters made possible by the use of synthetic speech. The relative weighting, rather than the absolute potentiality, of parameters as speaker-characterising features can thus be examined.
Results from voice similarity judgment experiments employing a factorial design indicate that: (i) the parameters of mean pitch, mean formant position and formant bandwidth are important for speaker recognition, and (ii) despite overall performance differences in judgments of similarity and difference, the responses of the individual listeners show comparable reactions to factorial changes.
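As an illustration of what a factorial design over the three parameters named above involves, the following minimal sketch enumerates every combination of parameter levels and the stimulus pairs that could be presented for similarity judgment. The two-level structure, the numeric values and all names (LEVELS, factorial_conditions, stimulus_pairs, n_differing) are hypothetical assumptions made for illustration; the abstract does not state the levels or the pairing scheme actually used.

```python
from itertools import combinations, product

# Hypothetical two-level settings for each synthesis parameter.
# These values are illustrative assumptions, not taken from the thesis.
LEVELS = {
    "mean_pitch_hz": (100.0, 130.0),
    "formant_shift_pct": (0.0, 10.0),
    "formant_bandwidth_hz": (50.0, 100.0),
}


def factorial_conditions(levels):
    """Return every combination of parameter levels (a full factorial design)."""
    names = list(levels)
    return [
        dict(zip(names, combo))
        for combo in product(*(levels[name] for name in names))
    ]


def stimulus_pairs(conditions):
    """Return all unordered pairs of distinct conditions for similarity judgment."""
    return list(combinations(conditions, 2))


def n_differing(a, b):
    """Count the parameters on which two synthetic voices differ."""
    return sum(a[key] != b[key] for key in a)


conditions = factorial_conditions(LEVELS)
pairs = stimulus_pairs(conditions)

print(len(conditions), "synthetic voices,", len(pairs), "pairs")
for a, b in pairs[:3]:
    print(n_differing(a, b), "parameter(s) differ:", a, "vs", b)
```

Because every parameter is crossed with every other, listeners' similarity responses can be compared across pairs that differ in one, two or all three parameters, which is what allows the relative weighting of the parameters to be estimated.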