Sparseness-controlled adaptive algorithms for supervised and unsupervised system identification
In single-channel hands-free telephony, the acoustic coupling between the loudspeaker and
the microphone can be strong and this generates echoes that can degrade user experience.
Therefore, effective acoustic echo cancellation (AEC) is necessary to maintain a stable
system and hence improve the perceived voice quality of a call. Traditionally, adaptive
filters have been deployed in acoustic echo cancellers to estimate the acoustic impulse
responses (AIRs) using adaptive algorithms. The performance of a range of well-known
algorithms is studied in the context of both AEC and network echo cancellation (NEC),
giving insight into their tracking performance under both time-invariant and time-varying
system conditions.
In the context of AEC, the level of sparseness in AIRs can vary greatly in a mobile
environment. When the response is strongly sparse, convergence of conventional
approaches is poor. Drawing on techniques originally developed for NEC, a class of time-domain
and frequency-domain AEC algorithms is proposed that can not only work
well in both sparse and dispersive circumstances, but also adapt dynamically to the level
of sparseness using a new sparseness-controlled approach.
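
To illustrate what "level of sparseness" can mean in practice, the sketch below computes one commonly used sparseness measure based on the ratio of the l1 and l2 norms of an impulse response. The abstract does not state which measure the thesis uses, so this choice, the function name `sparseness`, and the two toy responses are assumptions made for illustration only.

```python
import math

def sparseness(h):
    """Sparseness in [0, 1]: near 0 for a flat response, 1 for a single nonzero tap.

    Assumed definition: (L / (L - sqrt(L))) * (1 - ||h||_1 / (sqrt(L) * ||h||_2)).
    """
    L = len(h)
    l1 = sum(abs(x) for x in h)
    l2 = math.sqrt(sum(x * x for x in h))
    if L < 2 or l2 == 0.0:
        return 0.0  # convention chosen here for degenerate inputs (assumption)
    return (L / (L - math.sqrt(L))) * (1.0 - l1 / (math.sqrt(L) * l2))

# Toy responses (hypothetical): one dominant tap versus equal energy in every tap.
sparse_air = [0.0] * 64
sparse_air[5] = 1.0
dispersive_air = [1.0] * 64

xi_sparse = sparseness(sparse_air)
xi_dispersive = sparseness(dispersive_air)
```

A sparseness-controlled algorithm could evaluate such a measure on the current filter estimate and use it to steer its proportionate step-size parameters; how the thesis does this is not specified in the abstract.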
Since, as will be shown later, the early part of the acoustic echo path is sparse
while the late reverberant part of the acoustic path is dispersive, a novel approach to
an adaptive filter structure that consists of two time-domain partition blocks is proposed
such that different adaptive algorithms can be used for each part. By properly controlling
the mixing parameter for the partitioned blocks separately, where the block lengths are
controlled adaptively, the proposed partitioned block algorithm works well in both sparse
and dispersive time-varying circumstances.
New insight into the tracking performance of improved proportionate
NLMS (IPNLMS) is provided by deriving an expression for the mean-square error.
By employing the framework for both sparse and dispersive time-varying echo paths, this
work validates the analytic results in practical simulations for AEC.
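
For readers unfamiliar with IPNLMS, the following is a minimal sketch of the standard IPNLMS coefficient update applied to a noise-free toy identification problem. It is not the thesis's analysis or simulation setup: the helper names `fir` and `ipnlms_identify`, the parameter values, and the toy echo path are all assumptions.

```python
import random

def fir(x, h):
    """Output of an FIR filter h driven by x (zero initial state)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def ipnlms_identify(x, d, L, mu=0.5, alpha=0.0, delta=1e-4, eps=1e-8):
    """Estimate an L-tap response from input x and desired signal d using IPNLMS."""
    h = [0.0] * L
    buf = [0.0] * L  # most recent L input samples, newest first
    for n in range(len(x)):
        buf = [x[n]] + buf[:-1]
        e = d[n] - sum(hi * xi for hi, xi in zip(h, buf))
        l1 = sum(abs(hi) for hi in h)
        # Per-tap gain: a uniform (NLMS-like) term plus a term proportional to |h_l|.
        k = [(1 - alpha) / (2 * L) + (1 + alpha) * abs(hi) / (2 * l1 + eps)
             for hi in h]
        norm = sum(ki * xi * xi for ki, xi in zip(k, buf)) + delta
        h = [hi + mu * e * ki * xi / norm for hi, ki, xi in zip(h, k, buf)]
    return h

random.seed(0)
h_true = [0.0] * 16
h_true[2], h_true[5] = 1.0, -0.5           # hypothetical sparse echo path
x = [random.gauss(0.0, 1.0) for _ in range(3000)]
d = fir(x, h_true)                          # noise-free echo signal
h_est = ipnlms_identify(x, d, L=16)
misalignment = sum((a - b) ** 2 for a, b in zip(h_true, h_est))
```

The parameter `alpha` trades off the two gain terms: values near -1 make the update behave like NLMS, while values near 1 make it more strongly proportionate.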
The time-domain, second-order-statistics-based blind SIMO identification algorithms,
which exploit the cross-relation method, are investigated, and a technique with proportionate
step-size control for both sparse and dispersive system identification is
developed.
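
The cross-relation idea behind such blind SIMO methods can be shown with a small noise-free, two-channel example: because both microphone signals share the same source, filtering channel 1's output with channel 2's response equals filtering channel 2's output with channel 1's response. The sketch below only checks this identity; it is not an identification algorithm, and the channel values and function names are hypothetical.

```python
import random

def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def cross_relation_error(x1, x2, g1, g2):
    """Energy of x1 * g2 - x2 * g1 for candidate channel estimates g1, g2."""
    u = convolve(x1, g2)
    v = convolve(x2, g1)
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))

random.seed(1)
s = [random.gauss(0.0, 1.0) for _ in range(200)]   # unknown source signal
h1 = [1.0, 0.5, 0.0, -0.2]                          # hypothetical channel 1
h2 = [0.3, 0.0, 0.8, 0.1]                           # hypothetical channel 2
x1, x2 = convolve(s, h1), convolve(s, h2)

err_true = cross_relation_error(x1, x2, h1, h2)     # essentially zero
err_wrong = cross_relation_error(x1, x2, h2, h1)    # channels swapped: large
```

Any common scaling of `h1` and `h2` also gives zero error, which is why blind estimates obtained this way are only defined up to a scale factor.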
System Identification with Applications in Speech Enhancement
With the increasing popularity of integrating hands-free telephony on mobile portable devices
and the rapid development of voice over internet protocol, identification of acoustic
systems has become desirable for compensating distortions introduced to speech signals
during transmission, and hence enhancing the speech quality. The objective of this research
is to develop system identification algorithms for speech enhancement applications
including network echo cancellation and speech dereverberation.
A supervised adaptive algorithm for sparse system identification is developed for
network echo cancellation. Building on the selective-tap updating framework for the
normalized least mean squares (NLMS) algorithm, the MMax and sparse partial update
tap-selection strategies are exploited in the frequency domain to achieve fast convergence
performance with low computational complexity. By demonstrating how
the sparseness of the network impulse response varies in the transformed domain, the
multidelay filtering structure is incorporated to reduce the algorithmic delay.
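
As a rough illustration of MMax tap selection, the sketch below updates only the M coefficients whose corresponding input samples have the largest magnitudes at each iteration. It is a plain time-domain version written for clarity; the thesis applies the selection in the frequency domain with a multidelay structure, which is not reproduced here. The function names, parameter values, and toy response are assumptions.

```python
import random

def fir(x, h):
    """Output of an FIR filter h driven by x (zero initial state)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def mmax_nlms_identify(x, d, L, M, mu=0.5, delta=1e-6):
    """NLMS in which only the M taps with the largest |x| are updated per sample."""
    h = [0.0] * L
    buf = [0.0] * L  # most recent L input samples, newest first
    for n in range(len(x)):
        buf = [x[n]] + buf[:-1]
        e = d[n] - sum(hi * xi for hi, xi in zip(h, buf))
        # MMax selection: indices of the M largest-magnitude input samples.
        selected = sorted(range(L), key=lambda i: abs(buf[i]), reverse=True)[:M]
        norm = sum(xi * xi for xi in buf) + delta
        for i in selected:
            h[i] += mu * e * buf[i] / norm
    return h

random.seed(2)
h_true = [0.0] * 16
h_true[3], h_true[4] = 0.9, -0.4            # hypothetical sparse network response
x = [random.gauss(0.0, 1.0) for _ in range(4000)]
d = fir(x, h_true)                           # noise-free echo signal
h_est = mmax_nlms_identify(x, d, L=16, M=8)
misalignment = sum((a - b) ** 2 for a, b in zip(h_true, h_est))
```

Updating M of L taps reduces the number of coefficient updates per sample; a full sort is used here only for brevity, and practical implementations typically use cheaper running-sort schemes.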
Blind identification of SIMO acoustic systems for speech dereverberation in the
presence of common zeros is then investigated. First, the problem of common zeros is
defined and extended to include the presence of near-common zeros. Two clustering algorithms
are developed to quantify the number of these zeros so as to facilitate the study
of their effect on blind system identification and speech dereverberation. To mitigate this
effect, two algorithms are developed: a two-stage algorithm based on channel
decomposition, which identifies common and non-common zeros sequentially, and a forced
spectral diversity approach, which combines spectral shaping filters and channel undermodelling
to derive a modified system that leads to improved dereverberation performance.
Additionally, a solution to the scale factor ambiguity problem in subband-based blind
system identification is developed, which motivates further research on subband-based
dereverberation techniques. Comprehensive simulations and discussions demonstrate
the effectiveness of the aforementioned algorithms. A discussion on possible directions
of prospective research on system identification techniques concludes this thesis.
Enhanced Blind Maximum Ratio Combining in Broadcasting Systems
We propose an enhanced blind maximum ratio combiner (BMRC) that allows transmit-signal-independent diversity combining in multi-antenna receivers. The underlying Multi-Channel Frequency Least Mean Squares (MCFLMS) algorithm has reasonable computational complexity and iteratively estimates the channel impulse response for each receive antenna by means of second-order statistics. In the literature, the MCFLMS algorithm is mainly applied to audio signals. In this work, we describe several enhancements to this algorithm that ensure proper convergence with oversampled communication signals distorted by frequency-selective fast-fading channels. In addition, we provide bit error rate (BER) simulation results for a 1x2 SIMO DVB-T2 system and show that our blind MRC can even outperform conventional pilot-based MRC at the receiver side.
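
The combining step itself can be sketched as follows, assuming one complex channel gain per antenna (for example, per subcarrier in an OFDM system) is already available. In the paper these gains come from blind MCFLMS estimates; here they are simply given, and the function name and numeric values are hypothetical.

```python
def mrc_combine(received, channels):
    """Maximum ratio combining of one received sample per antenna.

    Each branch is weighted by the complex conjugate of its channel gain and
    the sum is normalised by the total channel power.
    """
    gain = sum(abs(h) ** 2 for h in channels)
    return sum(h.conjugate() * r for h, r in zip(channels, received)) / gain

# 1x2 SIMO toy example without noise (values are illustrative only).
symbol = complex(1, -1)
channels = [complex(0.8, 0.3), complex(-0.2, 0.5)]
received = [h * symbol for h in channels]
estimate = mrc_combine(received, channels)
```

With noise present, this weighting maximises the output signal-to-noise ratio for independent, equal-power branch noise, which is why the accuracy of the channel estimates matters for the combiner's performance.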
Under-modelled blind system identification for time delay estimation in reverberant environments
In multichannel systems, acoustic time delay estimation (TDE) is a challenging problem in reverberant environments. Although blind system identification (BSI)-based methods have been proposed which utilize a realistic signal model for the room impulse response (RIR), their TDE performance depends strongly on that of the BSI, which is often inaccurate in practice when the identified responses are under-modelled. In this paper, we propose a new under-modelled BSI-based method for TDE in reverberant environments. An under-modelled BSI algorithm is derived, which is based on maximizing the cross-correlation of the cross-filtered signals rather than minimizing the cross-relation error, and also exploits the sparsity of the early part of the RIR. For TDE, this new criterion can be viewed as a generalization of conventional cross-correlation-based TDE methods by considering a more realistic model for the early RIR. Depending on the microphone spacing, only a short early part of each RIR is identified, and the time delays are estimated based on the peak locations in the identified early RIRs. Experiments in different reverberant environments with speech source signals demonstrate the effectiveness of the proposed method.
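
For context, the conventional cross-correlation-based TDE that the proposed criterion generalizes can be sketched as a search for the lag that maximizes the cross-correlation between two microphone signals. This is only the baseline under an idealised, reverberation-free model (a pure delay), not the paper's under-modelled BSI method; the function name and test signals are assumptions.

```python
import random

def estimate_delay(x1, x2, max_lag):
    """Return the lag (in samples) at which x2 best matches a delayed copy of x1."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for n in range(len(x1)):
            m = n + lag
            if 0 <= m < len(x2):
                val += x1[n] * x2[m]   # correlation of x1[n] with x2[n + lag]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

random.seed(3)
true_delay = 3                                   # hypothetical inter-microphone delay
x1 = [random.gauss(0.0, 1.0) for _ in range(200)]
x2 = [0.0] * true_delay + x1[:-true_delay]       # x2[n] = x1[n - true_delay]
delay_est = estimate_delay(x1, x2, max_lag=10)
```

In a reverberant room each microphone signal contains many reflections in addition to the direct path, so the single-peak assumption behind this baseline breaks down, which is the motivation the abstract gives for modelling the early RIR explicitly.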