|Title: ||The removal of environmental noise in cellular communications by perceptual techniques|
|Authors: ||Tuffy, Mark|
|Supervisor(s): ||Laurenson, David|
|Issue Date: ||Jun-2000|
|Publisher: ||University of Edinburgh. College of Science and Engineering. School of Engineering and Electronics|
|Abstract: ||This thesis describes the application of a perceptually based spectral subtraction algorithm to
the enhancement of speech corrupted by non-stationary noise. Through examination of speech enhancement
techniques, explanations are given for the choice of magnitude spectral subtraction
and how the human auditory system can be modelled for frequency domain speech enhancement.
It is found that the cochlea provides the mechanical speech enhancement in the
auditory system through masking. Frequency masking is used in spectral subtraction
both to reduce the algorithm's execution time and to shape the enhancement process so that
the result sounds natural to the ear.
A new technique for estimation of background noise is presented, which operates during speech
sections as well as pauses. It uses two microphones placed at opposite ends of the cellular
handset. Using these, the algorithm determines whether the signal is speech or noise by
examining the current and next frames presented to each microphone. This allows operation in
non-stationary conditions, as the estimate is calculated for each frame and a speech pause is
not required for updating. A voting process decides whether speech or noise is present,
which in turn determines the microphone from which the estimate is calculated.
The importance of an accurate noise estimate is highlighted with a new technique to reduce
the effect of musical noise artifacts in the processed speech. Musical noise is a classic drawback of
spectral subtraction techniques, and it is shown that the trade-off between noise reduction and
speech distortion can be extended by this process. A new method for dealing with musical
noise is described, which uses a combination of energy and variance examination of the spectrogram
to separate potential musical noise from desired speech sections. By examination of
the spectrogram points surrounding musical noise sections, perceptually relevant values replace
the corruption, leading to cleaner enhanced speech.
Any perceptual speech system requires accurate estimates of the clean speech masking thresholds
to prevent noisy sections being passed through the enhancement untouched. In this thesis, a
method for the calculation of the estimated clean speech masking thresholds is derived. Classically,
this requires an estimation of the clean speech before the thresholds can be derived,
but this results in inaccuracy due to the presence of musical noise and spectral nulls. The
new algorithm examines the thresholds produced by the corrupted speech, and the background
noise, and from these determines the relationship between the two, to produce an estimate of
the clean thresholds, with no operation performed on the actual speech signal. A discrepancy is
found between the results for male and female speech, which, by examination of the perceptual
process, is shown to be due to the different formant positions in male and female speech.
Following the development of these parts, the entire enhancement algorithm is tested on a range
of noise scenarios, using male and female speech. The results show that the proposed algorithm
provides adequate performance in terms of noise reduction and speech quality.|
|Appears in Collections:||Engineering thesis and dissertation collection|
Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.
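
The abstract names magnitude spectral subtraction as the basis of the enhancement algorithm. As a rough illustration only, a textbook form of that technique might look like the sketch below. It is not the perceptual, two-microphone algorithm developed in the thesis: it assumes a fixed noise magnitude estimate, uses no masking model, and does nothing about musical noise beyond a simple spectral floor. The function names, frame length, hop size and floor value are all assumptions chosen for illustration; none of them come from the thesis.

```python
import numpy as np


def estimate_noise_mag(noise, frame_len=256, hop=128):
    """Average magnitude spectrum of a noise-only segment (hypothetical helper).

    Assumes the segment contains at least one full frame.
    """
    window = np.hanning(frame_len + 1)[:-1]  # periodic Hann window
    frames = [noise[s:s + frame_len] * window
              for s in range(0, len(noise) - frame_len + 1, hop)]
    return np.mean([np.abs(np.fft.rfft(f)) for f in frames], axis=0)


def spectral_subtract(noisy, noise_mag, frame_len=256, hop=128, floor=0.02):
    """Basic magnitude spectral subtraction with overlap-add resynthesis.

    noise_mag must have frame_len // 2 + 1 entries (one per rfft bin).
    """
    # A periodic Hann window sums to 1 across frames at 50% overlap,
    # so no extra normalisation is needed after overlap-add.
    window = np.hanning(frame_len + 1)[:-1]
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        frame = noisy[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        mag, phase = np.abs(spec), np.angle(spec)
        # Subtract the noise magnitude, but never go below a small
        # fraction of the noisy magnitude (the spectral floor).
        clean_mag = np.maximum(mag - noise_mag, floor * mag)
        # Reuse the noisy phase, as is usual in spectral subtraction.
        clean_spec = clean_mag * np.exp(1j * phase)
        out[start:start + frame_len] += np.fft.irfft(clean_spec, n=frame_len)
    return out
```

In use, `estimate_noise_mag` would be run on a stretch of signal known to contain only noise, and its result passed to `spectral_subtract` together with the noisy speech. The fixed estimate is the main limitation of this sketch: it only suits stationary noise, which is precisely the restriction the per-frame, two-microphone estimator described in the abstract is meant to remove.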