System Identification with Applications in Speech Enhancement
With the increasing popularity of integrating hands-free telephony on mobile portable devices
and the rapid development of voice over internet protocol, identification of acoustic
systems has become desirable for compensating distortions introduced to speech signals
during transmission, and hence enhancing the speech quality. The objective of this research
is to develop system identification algorithms for speech enhancement applications
including network echo cancellation and speech dereverberation.
A supervised adaptive algorithm for sparse system identification is developed for
network echo cancellation. Based on the selective-tap updating framework
for the normalized least mean squares algorithm, the MMax and sparse partial update
tap-selection strategies are exploited in the frequency domain to achieve fast convergence
performance with low computational complexity. By demonstrating how
the sparseness of the network impulse response varies in the transformed domain, the
multidelay filtering structure is incorporated to reduce the algorithmic delay.
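The MMax tap-selection idea mentioned above can be illustrated with a minimal time-domain sketch. This is not the frequency-domain multidelay algorithm developed in the thesis; it only assumes the standard MMax rule of updating, at each sample, the M taps whose input samples have the largest magnitude. The function name `mmax_nlms` and all parameter values are chosen for illustration.

```python
import numpy as np

def mmax_nlms(x, d, L, M, mu=0.5, delta=1e-6):
    """Time-domain MMax-NLMS sketch: update only M of the L taps per sample."""
    h = np.zeros(L)
    for n in range(L - 1, len(x)):
        x_vec = x[n - L + 1:n + 1][::-1]           # [x[n], x[n-1], ..., x[n-L+1]]
        e = d[n] - h @ x_vec                       # a priori error
        sel = np.argsort(np.abs(x_vec))[-M:]       # indices of the M largest |x| entries
        # NLMS step, normalized by the full input energy, applied to selected taps only
        h[sel] += mu * e * x_vec[sel] / (x_vec @ x_vec + delta)
    return h

# Hypothetical sparse echo path and white-noise excitation (illustrative values)
rng = np.random.default_rng(0)
L = 64
h_true = np.zeros(L)
h_true[[5, 6, 20]] = [1.0, -0.5, 0.25]
x = rng.standard_normal(20000)
d = np.convolve(x, h_true)[:len(x)]

h_est = mmax_nlms(x, d, L=L, M=32)
misalignment = np.linalg.norm(h_est - h_true) / np.linalg.norm(h_true)
```

Updating half of the taps saves roughly half of the coefficient-update multiplications per sample; the full sort used here for clarity would be replaced by a cheaper running selection in a practical implementation.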
Blind identification of SIMO acoustic systems for speech dereverberation in the
presence of common zeros is then investigated. First, the problem of common zeros is
defined and extended to include the presence of near-common zeros. Two clustering algorithms
are developed to quantify the number of these zeros so as to facilitate the study
of their effect on blind system identification and speech dereverberation. To mitigate this
effect, two algorithms are developed: the two-stage algorithm based on channel
decomposition identifies common and non-common zeros sequentially, and the forced
spectral diversity approach combines spectral shaping filters and channel undermodelling
for deriving a modified system that leads to an improved dereverberation performance.
Additionally, a solution to the scale factor ambiguity problem in subband-based blind system identification is developed, which motivates further research on subband-based
dereverberation techniques. Comprehensive simulations and discussions demonstrate
the effectiveness of the aforementioned algorithms. A discussion on possible directions
of prospective research on system identification techniques concludes this thesis.
Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function
This paper addresses the problems of blind channel identification and
multichannel equalization for speech dereverberation and noise reduction. The
time-domain cross-relation method is not suitable for blind room impulse
response identification, due to the near-common zeros of the long impulse
responses. We extend the cross-relation method to the short-time Fourier
transform (STFT) domain, in which the time-domain impulse responses are
approximately represented by the convolutive transfer functions (CTFs) with
far fewer coefficients. The CTFs suffer from the common zeros caused by the
oversampled STFT. We propose to identify CTFs based on the STFT with the
oversampled signals and the critically sampled CTFs, which is a good compromise
between the frequency aliasing of the signals and the common zeros problem of
CTFs. In addition, a normalization of the CTFs is proposed to remove the gain
ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for
multichannel equalization, in which the sparsity of speech signals is
exploited. We propose to perform inverse filtering by minimizing the
$\ell_1$-norm of the source signal with the relaxed $\ell_2$-norm fitting error
between the microphone signals and the convolution of the estimated source
signal and the CTFs used as a constraint. This method is advantageous in that
the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance
corresponding to the noise power, and the tolerance can be automatically set.
The experiments confirm the efficiency of the proposed method even under
conditions with high reverberation levels and intense noise.
Comment: 13 pages, 5 figures, 5 tables
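The inverse-filtering step described in this abstract can be written schematically as a constrained sparse-recovery problem. The notation below is assumed for illustration and is not taken from the paper: $x_i$ is the STFT-domain signal of microphone $i$ in one frequency band, $h_i$ the corresponding identified CTF, $s$ the source coefficients in that band, $\star$ convolution along the frame index, and $\epsilon$ a tolerance tied to the noise power.

```latex
\hat{s} \;=\; \arg\min_{s} \; \| s \|_1
\quad \text{subject to} \quad
\sum_{i=1}^{I} \left\| x_i - h_i \star s \right\|_2^2 \;\le\; \epsilon
```

Setting $\epsilon = 0$ would force an exact fit to the noisy microphone signals; allowing $\epsilon > 0$ is what lets the estimate leave part of the noise unexplained.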
Sparseness-controlled adaptive algorithms for supervised and unsupervised system identification
In single-channel hands-free telephony, the acoustic coupling between the loudspeaker and
the microphone can be strong and this generates echoes that can degrade user experience.
Therefore, effective acoustic echo cancellation (AEC) is necessary to maintain a stable
system and hence improve the perceived voice quality of a call. Traditionally, adaptive
filters have been deployed in acoustic echo cancellers to estimate the acoustic impulse
responses (AIRs) using adaptive algorithms. The performances of a range of well-known
algorithms are studied in the context of both AEC and network echo cancellation (NEC).
The study provides insights into their tracking performance under both time-invariant and time-varying
system conditions.
In the context of AEC, the level of sparseness in AIRs can vary greatly in a mobile
environment. When the response is strongly sparse, convergence of conventional
approaches is poor. Drawing on techniques originally developed for NEC, a class of time-domain
and frequency-domain AEC algorithms is proposed that can not only work
well in both sparse and dispersive circumstances, but also adapt dynamically to the level
of sparseness using a new sparseness-controlled approach.
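To make "level of sparseness" concrete, the sketch below computes one commonly used sparseness measure based on the ratio of the $\ell_1$ and $\ell_2$ norms of the impulse response. The abstract does not state which measure the thesis uses, so this choice is an assumption; the function name `sparseness` is likewise illustrative.

```python
import numpy as np

def sparseness(h):
    """Norm-ratio sparseness measure in [0, 1]: 1 for a single nonzero tap,
    0 for a response whose taps all have equal magnitude (assumed definition)."""
    h = np.asarray(h, dtype=float)
    L = h.size
    l2 = np.linalg.norm(h, 2)
    if L < 2 or l2 == 0.0:
        return 0.0                                  # measure undefined; return 0 by convention
    l1 = np.linalg.norm(h, 1)
    return L / (L - np.sqrt(L)) * (1.0 - l1 / (np.sqrt(L) * l2))

impulse = np.zeros(128)
impulse[10] = 1.0                                   # strongly sparse response
flat = np.ones(128)                                 # fully dispersive response
xi_impulse = sparseness(impulse)
xi_flat = sparseness(flat)
```

A sparseness-controlled algorithm would evaluate such a measure on the current filter estimate and use it to steer its proportionate step-size control between sparse and dispersive behavior.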
Since the early part of the acoustic echo path is shown to be sparse
while the late reverberant part of the acoustic path is dispersive, a novel approach to
an adaptive filter structure that consists of two time-domain partition blocks is proposed
such that different adaptive algorithms can be used for each part. By properly controlling
the mixing parameter for the partitioned blocks separately, where the block lengths are
controlled adaptively, the proposed partitioned block algorithm works well in both sparse
and dispersive time-varying circumstances.
New insight into the tracking performance of improved proportionate
NLMS (IPNLMS) is presented by deriving the expression for the mean-square error.
By employing the framework for both sparse and dispersive time-varying echo paths, this
work validates the analytic results in practical simulations for AEC.
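For reference, a minimal sketch of the IPNLMS update in its widely cited form is given below. It is a plain stationary-system example, not the tracking analysis of the thesis; the function name `ipnlms`, the test system, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def ipnlms(x, d, L, mu=0.5, alpha=0.0, delta=1e-6, eps=1e-8):
    """IPNLMS sketch: each tap gets a gain mixing a uniform (NLMS-like) term
    and a term proportional to the magnitude of its current estimate."""
    h = np.zeros(L)
    for n in range(L - 1, len(x)):
        x_vec = x[n - L + 1:n + 1][::-1]            # [x[n], ..., x[n-L+1]]
        e = d[n] - h @ x_vec                        # a priori error
        # alpha = -1 gives NLMS behavior; alpha near 1 is strongly proportionate
        k = (1 - alpha) / (2 * L) + (1 + alpha) * np.abs(h) / (2 * np.sum(np.abs(h)) + eps)
        h += mu * e * k * x_vec / (x_vec @ (k * x_vec) + delta)
    return h

# Hypothetical sparse echo path and white-noise excitation (illustrative values)
rng = np.random.default_rng(0)
L = 64
h_true = np.zeros(L)
h_true[[5, 6, 20]] = [1.0, -0.5, 0.25]
x = rng.standard_normal(20000)
d = np.convolve(x, h_true)[:len(x)]

h_ip = ipnlms(x, d, L)
misalignment_ip = np.linalg.norm(h_ip - h_true) / np.linalg.norm(h_true)
```

To probe tracking rather than initial convergence, one would change `h_true` partway through the data and observe how quickly the misalignment recovers.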
The time-domain blind SIMO identification algorithms based on second-order statistics,
which exploit the cross-relation method, are investigated, and a technique with proportionate
step-size control for both sparse and dispersive system identification is then
developed.
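The cross-relation principle referred to above can be sketched for two channels in batch form. Because both outputs are driven by the same source, x1 * h2 = x2 * h1 (with * denoting convolution), so the stacked channel vector lies in the null space of a data matrix. This is a noiseless least-squares illustration, not the adaptive proportionate algorithm of the thesis, and the names `conv_matrix` and `cross_relation` are hypothetical.

```python
import numpy as np

def conv_matrix(x, L):
    """Rows [x[n], x[n-1], ..., x[n-L+1]] for n = L-1 .. len(x)-1."""
    return np.array([x[n - L + 1:n + 1][::-1] for n in range(L - 1, len(x))])

def cross_relation(x1, x2, L):
    """Estimate two length-L channels, up to a common scale factor, from outputs only."""
    # [X1, -X2] @ [h2; h1] = 0 holds exactly in the noiseless case
    A = np.hstack([conv_matrix(x1, L), -conv_matrix(x2, L)])
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    v = Vt[-1]                                      # right singular vector of the smallest singular value
    h2, h1 = v[:L], v[L:]
    return h1, h2

# Hypothetical random channels and source (illustrative values)
rng = np.random.default_rng(1)
L = 8
h1_true, h2_true = rng.standard_normal(L), rng.standard_normal(L)
s = rng.standard_normal(400)
x1 = np.convolve(s, h1_true)[:len(s)]
x2 = np.convolve(s, h2_true)[:len(s)]

h1_est, h2_est = cross_relation(x1, x2, L)
v_true = np.concatenate([h1_true, h2_true])
v_est = np.concatenate([h1_est, h2_est])
# |cosine| between true and estimated stacked channels; 1 means equal up to scale
alignment = abs(v_true @ v_est) / (np.linalg.norm(v_true) * np.linalg.norm(v_est))
```

The estimate is unique only up to scale, and only when the channels share no common zeros; with common or near-common zeros the null space is no longer well separated, which is the difficulty the abstracts above discuss.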
Canonical correlation analysis based on sparse penalty and through rank-1 matrix approximation
Canonical correlation analysis (CCA) is a well-known technique used to characterize the relationship between two sets of multidimensional variables by finding linear combinations of variables with maximal correlation. Sparse CCA and smooth or regularized CCA are two widely used variants of CCA because of the improved interpretability of the former and the better performance of the latter. So far the cross-matrix product of the two sets of multidimensional variables has been widely used for the derivation of these variants. In this paper two new algorithms for sparse CCA and smooth CCA are proposed. These algorithms differ from the existing ones in their derivation, which is based on penalized rank-one matrix approximation and the orthogonal projectors onto the spaces spanned by the columns of the two sets of multidimensional variables instead of the simple cross-matrix product. The performance and effectiveness of the proposed algorithms are tested on simulated experiments. From these results it can be observed that they outperform the state-of-the-art sparse CCA algorithms.
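As background for the projector-based view, the sketch below computes classical (unpenalized) CCA from orthonormal bases of the two column spaces, using QR factorizations followed by an SVD. It is not the sparse or smooth algorithm proposed in the paper, it assumes both data matrices have full column rank, and the function name `first_canonical_pair` is hypothetical.

```python
import numpy as np

def first_canonical_pair(X, Y):
    """Classical CCA via QR + SVD. Returns all canonical correlations and the
    first pair of weight vectors. Assumes X and Y have full column rank."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)                       # Qx @ Qx.T projects onto span(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)             # singular values = canonical correlations
    a = np.linalg.solve(Rx, U[:, 0])                # weights such that Xc @ a = Qx @ U[:, 0]
    b = np.linalg.solve(Ry, Vt[0])
    return s, a, b

# Hypothetical data: Y's first column is a noisy copy of X's first column
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 3))
Y = np.column_stack([X[:, 0] + 0.1 * rng.standard_normal(200),
                     rng.standard_normal(200)])

corrs, a, b = first_canonical_pair(X, Y)
u = (X - X.mean(axis=0)) @ a                        # first canonical variate of X
v = (Y - Y.mean(axis=0)) @ b                        # first canonical variate of Y
sample_corr = np.corrcoef(u, v)[0, 1]
```

A sparse variant would replace the plain SVD step with a penalized rank-one approximation so that many entries of the weight vectors are driven to zero.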
Sparse Nonlinear MIMO Filtering and Identification
In this chapter, system identification algorithms for sparse nonlinear multi-input multi-output (MIMO) systems are developed. These algorithms are potentially useful in a variety of application areas including digital transmission systems incorporating power amplifier(s) along with multiple antennas, cognitive processing, adaptive control of nonlinear multivariable systems, and multivariable biological systems. Sparsity is a key constraint imposed on the model. The presence of sparsity is often dictated by physical considerations, as in wireless fading channel estimation. In other cases it appears as a pragmatic modelling approach that seeks to cope with the curse of dimensionality, which is particularly acute in nonlinear systems such as Volterra-type series. Three identification approaches are discussed: conventional identification based on both input and output samples, semi-blind identification placing emphasis on minimal input resources, and blind identification whereby only output samples are available plus a priori information on input characteristics. Based on this taxonomy, a variety of algorithms, existing and new, are studied and evaluated by simulation.
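The conventional (input and output available) case can be illustrated with a small sparse Volterra fit. The sketch below is a single-input single-output simplification, not the MIMO algorithms of the chapter: it builds a second-order Volterra regressor and estimates a sparse kernel with an $\ell_1$ penalty using plain iterative soft-thresholding (ISTA). The function names, memory length, and penalty weight are assumptions.

```python
import numpy as np

def volterra_features(x, M):
    """Rows hold M linear terms x[n-i] followed by quadratic terms x[n-i]*x[n-j], i <= j."""
    rows = []
    for n in range(M - 1, len(x)):
        w = x[n - M + 1:n + 1][::-1]                # [x[n], ..., x[n-M+1]]
        quad = [w[i] * w[j] for i in range(M) for j in range(i, M)]
        rows.append(np.concatenate([w, quad]))
    return np.array(rows)

def ista(Phi, y, lam, n_iter=2000):
    """Minimize 0.5/N * ||y - Phi @ theta||^2 + lam * ||theta||_1 by soft-thresholding."""
    N = len(y)
    step = N / np.linalg.norm(Phi, 2) ** 2          # 1 / Lipschitz constant of the gradient
    theta = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = theta - step * (Phi.T @ (Phi @ theta - y)) / N
        theta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return theta

# Hypothetical sparse kernel: one linear and two quadratic terms are active
rng = np.random.default_rng(2)
M = 4
x = rng.standard_normal(1000)
Phi = volterra_features(x, M)                       # 4 linear + 10 quadratic columns
theta_true = np.zeros(Phi.shape[1])
theta_true[[0, 5, 9]] = [1.0, -0.7, 0.5]
y = Phi @ theta_true                                # noiseless output

theta_hat = ista(Phi, y, lam=0.01)
support = np.flatnonzero(np.abs(theta_hat) > 0.1)
```

The number of Volterra terms grows quickly with memory length and order, which is why a sparsity penalty of this kind is attractive for keeping the model tractable.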