Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function
This paper addresses the problems of blind channel identification and
multichannel equalization for speech dereverberation and noise reduction. The
time-domain cross-relation method is not suitable for blind room impulse
response identification, due to the near-common zeros of the long impulse
responses. We extend the cross-relation method to the short-time Fourier
transform (STFT) domain, in which the time-domain impulse responses are
approximately represented by the convolutive transfer functions (CTFs) with
far fewer coefficients. The CTFs suffer from the common zeros caused by the
oversampled STFT. We propose to identify CTFs based on the STFT with
oversampled signals and critically sampled CTFs, which is a good compromise
between the frequency aliasing of the signals and the common zeros problem of
CTFs. In addition, a normalization of the CTFs is proposed to remove the gain
ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for
multichannel equalization, in which the sparsity of speech signals is
exploited. We propose to perform inverse filtering by minimizing the
$\ell_1$-norm of the source signal with the relaxed $\ell_2$-norm fitting error
between the microphone signals and the convolution of the estimated source
signal and the CTFs used as a constraint. This method is advantageous in that
the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance
corresponding to the noise power, and the tolerance can be automatically set.
The experiments confirm the efficiency of the proposed method even under
conditions with high reverberation levels and intense noise. Comment: 13 pages, 5 figures, 5 tables
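The equalization step described in this abstract can be sketched as a constrained sparse-recovery problem, taking the sparsity-promoting norm to be $\ell_1$ and the fitting-error norm to be $\ell_2$. The notation below is an assumption made for illustration and is not taken from the paper: within one frequency band, $\mathbf{s}$ collects the STFT coefficients of the source over frames, $\mathbf{x}_m$ those of microphone $m$, $\mathbf{h}_m$ is the identified CTF of channel $m$, $\star$ denotes convolution along the frame index, and $\epsilon$ is a tolerance tied to the noise power.

```latex
\hat{\mathbf{s}} \;=\; \arg\min_{\mathbf{s}} \; \|\mathbf{s}\|_{1}
\quad \text{subject to} \quad
\sum_{m=1}^{M} \bigl\| \mathbf{x}_m - \mathbf{h}_m \star \mathbf{s} \bigr\|_{2}^{2} \;\le\; \epsilon
```

Setting $\epsilon = 0$ would force an exact fit to the noisy microphone signals; a larger $\epsilon$ lets the estimate ignore a residual of roughly the size of the noise, which is the noise-reduction mechanism the abstract describes.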
Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem by modeling the
acoustics of the reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustics from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting a joint
sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering the acoustic channels. The experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition. Comment: 31 pages
Robust equalization of multichannel acoustic systems
In most real-world acoustical scenarios, speech signals captured by distant microphones from a source are reverberated due to multipath propagation, and the reverberation may impair speech intelligibility. Speech dereverberation can be achieved
by equalizing the channels from the source to microphones. Equalization systems can
be computed using estimates of multichannel acoustic impulse responses. However,
the estimates obtained from system identification always include errors; the fact that
an equalization system is able to equalize the estimated multichannel acoustic system does not mean that it is able to equalize the true system. The objective of this
thesis is to propose and investigate robust equalization methods for multichannel
acoustic systems in the presence of system identification errors.
Equalization systems can be computed using the multiple-input/output inverse theorem or multichannel least-squares method. However, equalization systems
obtained from these methods are very sensitive to system identification errors. A
study of the multichannel least-squares method with respect to two classes of characteristic channel zeros is conducted. Accordingly, a relaxed multichannel
least-squares method is proposed. Channel shortening in connection with the
multiple-input/output inverse theorem and the relaxed multichannel least-squares method is
discussed.
Two algorithms taking into account the system identification errors are developed. Firstly, an optimally-stopped weighted conjugate gradient algorithm is
proposed. A conjugate gradient iterative method is employed to compute the equalization system. The iteration process is stopped optimally with respect to system identification errors. Secondly, a system-identification-error-robust equalization
method exploring the use of error models is presented, which incorporates system
identification error models in the weighted multichannel least-squares formulation.
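The sensitivity described in this abstract can be illustrated with a minimal sketch of multichannel least-squares equalization. It is not the thesis's implementation: the function names, the random 8-tap channels, and the 0.01 perturbation level are all assumptions chosen for illustration. Inverse filters are designed once from the true channels and once from slightly perturbed "estimated" channels, and both sets are then applied to the true channels.

```python
import numpy as np

def convolution_matrix(h, n_cols):
    """Matrix C such that C @ g equals np.convolve(h, g) for len(g) == n_cols."""
    h = np.asarray(h, dtype=float)
    C = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(h), j] = h
    return C

def least_squares_equalizer(channels, filter_len, delay=0):
    """Inverse filters g_m minimizing || sum_m h_m * g_m - delayed impulse ||_2."""
    H = np.hstack([convolution_matrix(h, filter_len) for h in channels])
    target = np.zeros(H.shape[0])
    target[delay] = 1.0
    g, *_ = np.linalg.lstsq(H, target, rcond=None)
    return np.split(g, len(channels))

def equalized_response(channels, filters):
    """Overall response: sum over channels of (channel convolved with its filter)."""
    return sum(np.convolve(h, g) for h, g in zip(channels, filters))

# Hypothetical two-channel system with 8-tap responses (illustrative only).
rng = np.random.default_rng(0)
true_channels = [rng.standard_normal(8) for _ in range(2)]
# "Estimated" channels: the true ones plus a small identification error.
estimated_channels = [h + 0.01 * rng.standard_normal(8) for h in true_channels]

impulse = np.zeros(8 + 7 - 1)
impulse[0] = 1.0

# Filters designed from the true channels, applied to the true channels.
exact_filters = least_squares_equalizer(true_channels, filter_len=7)
error_exact = np.linalg.norm(equalized_response(true_channels, exact_filters) - impulse)

# Filters designed from the estimated channels, applied to the true channels.
mismatched_filters = least_squares_equalizer(estimated_channels, filter_len=7)
error_mismatch = np.linalg.norm(
    equalized_response(true_channels, mismatched_filters) - impulse
)

print(f"error with exact channels:     {error_exact:.2e}")
print(f"error with estimated channels: {error_mismatch:.2e}")
```

With two 8-tap channels and 7-tap filters the system matrix is square, so exact equalization is possible when the channels share no common zeros. The second error shows how much of that is lost when the design uses imperfect channel estimates.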
Sparseness-controlled adaptive algorithms for supervised and unsupervised system identification
In single-channel hands-free telephony, the acoustic coupling between the loudspeaker and
the microphone can be strong and this generates echoes that can degrade user experience.
Therefore, effective acoustic echo cancellation (AEC) is necessary to maintain a stable
system and hence improve the perceived voice quality of a call. Traditionally, adaptive
filters have been deployed in acoustic echo cancellers to estimate the acoustic impulse
responses (AIRs) using adaptive algorithms. The performance of a range of well-known
algorithms is studied in the context of both AEC and network echo cancellation (NEC).
This work presents insights into their tracking performance under both time-invariant and time-varying
system conditions.
In the context of AEC, the level of sparseness in AIRs can vary greatly in a mobile
environment. When the response is strongly sparse, convergence of conventional
approaches is poor. Drawing on techniques originally developed for NEC, a class of time-domain
and frequency-domain AEC algorithms is proposed that not only work
well in both sparse and dispersive circumstances, but also adapt dynamically to the level
of sparseness using a new sparseness-controlled approach.
It is shown that the early part of the acoustic echo path is sparse
while the late reverberant part of the acoustic path is dispersive. Accordingly,
a novel adaptive filter structure that consists of two time-domain partition blocks is proposed
such that different adaptive algorithms can be used for each part. By properly controlling
the mixing parameter for the partitioned blocks separately, where the block lengths are
controlled adaptively, the proposed partitioned block algorithm works well in both sparse
and dispersive time-varying circumstances.
New insight into the tracking performance of improved proportionate
NLMS (IPNLMS) is presented by deriving an expression for the mean-square error.
By employing the framework for both sparse and dispersive time-varying echo paths, this
work validates the analytic results in practical simulations for AEC.
Time-domain, second-order-statistics-based blind SIMO identification algorithms,
which exploit the cross-relation method, are investigated, and a technique with proportionate
step-size control for both sparse and dispersive system identification is then
developed.
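For readers unfamiliar with the IPNLMS algorithm named in this abstract, the following is a minimal sketch of the standard time-domain update applied to a synthetic sparse echo path. It does not reproduce the thesis's sparseness-controlled or partitioned-block variants. The parameter values, the 64-tap echo path, and the noise-free setup are assumptions made for illustration.

```python
import numpy as np

def ipnlms(x, d, num_taps, mu=0.5, alpha=0.0, delta=1e-2, eps=1e-8):
    """Improved proportionate NLMS: each tap's step size is a mix of a
    uniform term and a term proportional to the tap's current magnitude."""
    h = np.zeros(num_taps)        # current filter estimate
    x_buf = np.zeros(num_taps)    # [x[n], x[n-1], ..., x[n-num_taps+1]]
    errors = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        e = d[n] - h @ x_buf      # a priori error
        # Per-tap gains; alpha = -1 gives NLMS-like uniform gains,
        # alpha close to 1 gives strongly proportionate gains.
        k = (1 - alpha) / (2 * num_taps) \
            + (1 + alpha) * np.abs(h) / (2 * np.sum(np.abs(h)) + eps)
        h = h + mu * e * k * x_buf / (x_buf @ (k * x_buf) + delta)
        errors[n] = e
    return h, errors

# Hypothetical sparse 64-tap echo path with three active taps.
rng = np.random.default_rng(1)
h_true = np.zeros(64)
h_true[[3, 10, 25]] = [1.0, -0.5, 0.25]

x = rng.standard_normal(4000)              # white far-end signal
d = np.convolve(x, h_true)[:len(x)]        # noise-free echo

h_est, errors = ipnlms(x, d, num_taps=64)
misalignment = np.linalg.norm(h_est - h_true) / np.linalg.norm(h_true)
print(f"normalized misalignment: {misalignment:.2e}")
```

The mixing parameter `alpha` is the quantity that a sparseness-controlled scheme would adapt to the measured sparseness of the estimate. Here it is held fixed.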
System Identification with Applications in Speech Enhancement
With the increasing popularity of integrating hands-free telephony on mobile portable devices
and the rapid development of voice over internet protocol, identification of acoustic
systems has become desirable for compensating distortions introduced to speech signals
during transmission, and hence enhancing the speech quality. The objective of this research
is to develop system identification algorithms for speech enhancement applications
including network echo cancellation and speech dereverberation.
A supervised adaptive algorithm for sparse system identification is developed for
network echo cancellation. Based on the framework of the selective-tap updating scheme
for the normalized least mean squares algorithm, the MMax and sparse partial update
tap-selection strategies are exploited in the frequency domain to achieve fast convergence
performance with low computational complexity. By demonstrating how
the sparseness of the network impulse response varies in the transformed domain, the
multidelay filtering structure is incorporated to reduce the algorithmic delay.
Blind identification of SIMO acoustic systems for speech dereverberation in the
presence of common zeros is then investigated. First, the problem of common zeros is
defined and extended to include the presence of near-common zeros. Two clustering algorithms
are developed to quantify the number of these zeros so as to facilitate the study
of their effect on blind system identification and speech dereverberation. To mitigate this
effect, two algorithms are developed: the two-stage algorithm based on channel
decomposition identifies common and non-common zeros sequentially, and the forced
spectral diversity approach combines spectral shaping filters and channel undermodelling
to derive a modified system that leads to improved dereverberation performance.
Additionally, a solution to the scale factor ambiguity problem in subband-based blind system identification is developed, which motivates further research on subband-based
dereverberation techniques. Comprehensive simulations and discussions demonstrate
the effectiveness of the aforementioned algorithms. A discussion on possible directions
of prospective research on system identification techniques concludes this thesis.
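As background to the blind SIMO identification problem in this abstract, the sketch below shows one well-known approach, the two-channel cross-relation method. It is offered as an illustration and is not necessarily the formulation used in the thesis. The function names, the 5-tap random channels, the noise-free signals, and the assumption that the channel length is known exactly are all illustrative. The method relies on the identity x1 * h2 = x2 * h1, which holds because both sides equal s * h1 * h2.

```python
import numpy as np

def data_matrix(x, L):
    """Rows are [x[n], x[n-1], ..., x[n-L+1]] for n = L-1 .. len(x)-1,
    so (data_matrix(x, L) @ h)[i] equals np.convolve(x, h)[L - 1 + i]."""
    return np.array([x[n - L + 1:n + 1][::-1] for n in range(L - 1, len(x))])

def cross_relation_identify(x1, x2, L):
    """Estimate two length-L channels, up to a common scale factor, as the
    right singular vector of [X2, -X1] with the smallest singular value."""
    A = np.hstack([data_matrix(x2, L), -data_matrix(x1, L)])
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    v = vt[-1]
    return v[:L], v[L:]

# Hypothetical noise-free two-channel system driven by an unknown source.
rng = np.random.default_rng(2)
L = 5
h1, h2 = rng.standard_normal(L), rng.standard_normal(L)
s = rng.standard_normal(400)
x1, x2 = np.convolve(s, h1), np.convolve(s, h2)

h1_est, h2_est = cross_relation_identify(x1, x2, L)

# The estimate is only defined up to scale, so compare directions.
truth = np.concatenate([h1, h2])
estimate = np.concatenate([h1_est, h2_est])
similarity = abs(truth @ estimate) / (np.linalg.norm(truth) * np.linalg.norm(estimate))
print(f"cosine similarity to the true channels: {similarity:.6f}")
```

When the two channels share a common zero, the matrix `[X2, -X1]` has more than one (near-)null direction and the smallest singular vector no longer identifies the channels uniquely. The recovered scale is arbitrary, which is the single-band counterpart of the scale factor ambiguity mentioned for the subband case.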
Sub-Nyquist Sampling: Bridging Theory and Practice
Sampling theory encompasses all aspects related to the conversion of
continuous-time signals to discrete streams of numbers. The famous
Shannon-Nyquist theorem has become a landmark in the development of digital
signal processing. In modern applications, an increasing number of functions
is being pushed forward to sophisticated software algorithms, leaving only
those delicate finely-tuned tasks for the circuit level.
In this paper, we review sampling strategies which target reduction of the
ADC rate below Nyquist. Our survey covers classic works from the early 50's of
the previous century through recent publications from the past several years.
The prime focus is bridging theory and practice, that is, to pinpoint the
potential of sub-Nyquist strategies to emerge from the math to the hardware. In
that spirit, we integrate contemporary theoretical viewpoints, which study
signal modeling in a union of subspaces, together with a taste of practical
aspects, namely how the avant-garde modalities boil down to concrete signal
processing systems. Our hope is that this presentation style will attract the
interest of both researchers and engineers in the hope of promoting the
sub-Nyquist premise into practical applications, and encouraging further
research into this exciting new frontier. Comment: 48 pages, 18 figures, to appear in IEEE Signal Processing Magazine