439 research outputs found
Robust equalization of multichannel acoustic systems
In most real-world acoustical scenarios, speech signals captured by distant microphones from a source are reverberated due to multipath propagation, and the reverberation may impair speech intelligibility. Speech dereverberation can be achieved
by equalizing the channels from the source to microphones. Equalization systems can
be computed using estimates of multichannel acoustic impulse responses. However,
the estimates obtained from system identification always include errors; the fact that
an equalization system is able to equalize the estimated multichannel acoustic system does not mean that it is able to equalize the true system. The objective of this
thesis is to propose and investigate robust equalization methods for multichannel
acoustic systems in the presence of system identification errors.
Equalization systems can be computed using the multiple-input/output inverse theorem or multichannel least-squares method. However, equalization systems
obtained from these methods are very sensitive to system identification errors. A
study of the multichannel least-squares method with respect to two classes of characteristic channel zeros is conducted. Accordingly, a relaxed multichannel least-
squares method is proposed. Channel shortening in connection with the multiple-
input/output inverse theorem and the relaxed multichannel least-squares method is
discussed.
Two algorithms taking into account the system identification errors are developed. Firstly, an optimally-stopped weighted conjugate gradient algorithm is
proposed. A conjugate gradient iterative method is employed to compute the equalization system. The iteration process is stopped optimally with respect to system identification errors. Secondly, a system-identification-error-robust equalization
method exploring the use of error models is presented, which incorporates system
identification error models in the weighted multichannel least-squares formulation
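As a rough illustration of the multichannel least-squares idea in this abstract, the sketch below computes equalizing filters for two short random channels and then repeats the computation from perturbed channel estimates. It is a minimal sketch, not the thesis's method: the helper names (`convolution_matrix`, `mint_equalizer`), the channel lengths, and the perturbation level are all assumptions chosen for illustration.

```python
import numpy as np

def convolution_matrix(h, n_cols):
    """Matrix C of shape (len(h)+n_cols-1, n_cols) with C @ g == np.convolve(h, g)."""
    h = np.asarray(h, dtype=float)
    C = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(h), j] = h
    return C

def mint_equalizer(channels, filt_len, delay=0):
    """Least-squares filters g_m such that sum_m h_m * g_m approximates a delayed impulse."""
    H = np.hstack([convolution_matrix(h, filt_len) for h in channels])
    d = np.zeros(H.shape[0])
    d[delay] = 1.0                       # target: unit impulse at the chosen delay
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g.reshape(len(channels), filt_len)

rng = np.random.default_rng(0)
h_true = rng.standard_normal((2, 8))     # two hypothetical 8-tap channels

# With 2 channels of length 8, filters of length 7 make H square (14 x 14).
g = mint_equalizer(h_true, filt_len=7)
equalized = sum(np.convolve(h, gm) for h, gm in zip(h_true, g))

# Design from slightly wrong estimates, then apply to the true channels.
h_est = h_true + 0.01 * rng.standard_normal(h_true.shape)
g_est = mint_equalizer(h_est, filt_len=7)
equalized_mismatch = sum(np.convolve(h, gm) for h, gm in zip(h_true, g_est))
```

Comparing `equalized` with `equalized_mismatch` against the target impulse gives a feel for the sensitivity to identification errors that the abstract describes.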
Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function
This paper addresses the problems of blind channel identification and
multichannel equalization for speech dereverberation and noise reduction. The
time-domain cross-relation method is not suitable for blind room impulse
response identification, due to the near-common zeros of the long impulse
responses. We extend the cross-relation method to the short-time Fourier
transform (STFT) domain, in which the time-domain impulse responses are
approximately represented by the convolutive transfer functions (CTFs) with
far fewer coefficients. The CTFs suffer from the common zeros caused by the
oversampled STFT. We propose to identify CTFs based on the STFT with the
oversampled signals and the critically sampled CTFs, which is a good compromise
between the frequency aliasing of the signals and the common zeros problem of
CTFs. In addition, a normalization of the CTFs is proposed to remove the gain
ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for
multichannel equalization, in which the sparsity of speech signals is
exploited. We propose to perform inverse filtering by minimizing the
$\ell_1$-norm of the source signal with the relaxed $\ell_2$-norm fitting error
between the microphone signals and the convolution of the estimated source
signal and the CTFs used as a constraint. This method is advantageous in that
the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance
corresponding to the noise power, and the tolerance can be automatically set.
The experiments confirm the efficiency of the proposed method even under
conditions with high reverberation levels and intense noise.
Comment: 13 pages, 5 figures, 5 tables
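The cross-relation principle that this abstract starts from can be shown in the time domain for very short filters. The sketch below is an assumption-laden illustration, not the paper's STFT-domain method: it uses noise-free signals, two hypothetical 5-tap channels, and made-up helper names. It relies on the identity x1 * h2 = x2 * h1, which holds when both microphone signals come from the same source.

```python
import numpy as np

def convolution_matrix(x, n_cols):
    """Matrix C of shape (len(x)+n_cols-1, n_cols) with C @ h == np.convolve(x, h)."""
    x = np.asarray(x, dtype=float)
    C = np.zeros((len(x) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(x), j] = x
    return C

def cross_relation_identify(x1, x2, n_taps):
    """Estimate (h1, h2) up to a common scale from x1 * h2 - x2 * h1 = 0."""
    A = np.hstack([convolution_matrix(x1, n_taps), -convolution_matrix(x2, n_taps)])
    # The right singular vector of the smallest singular value spans the null space.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    v = vt[-1]
    h2_hat, h1_hat = v[:n_taps], v[n_taps:]
    return h1_hat, h2_hat

rng = np.random.default_rng(1)
s = rng.standard_normal(200)                       # hypothetical source signal
h1, h2 = rng.standard_normal(5), rng.standard_normal(5)
x1, x2 = np.convolve(s, h1), np.convolve(s, h2)    # noise-free microphone signals

h1_hat, h2_hat = cross_relation_identify(x1, x2, n_taps=5)

# The estimate has an arbitrary gain and sign, so compare directions only.
est = np.concatenate([h1_hat, h2_hat])
ref = np.concatenate([h1, h2])
similarity = abs(est @ ref) / (np.linalg.norm(est) * np.linalg.norm(ref))
```

The arbitrary scale in the result is the time-domain counterpart of the gain ambiguity that the abstract removes by normalization across sub-bands.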
Raking the Cocktail Party
We present the concept of an acoustic rake receiver---a microphone beamformer
that uses echoes to improve the noise and interference suppression. The rake
idea is well-known in wireless communications; it involves constructively
combining different multipath components that arrive at the receiver antennas.
Unlike spread-spectrum signals used in wireless communications, speech signals
are not orthogonal to their shifts. Therefore, we focus on the spatial
structure, rather than temporal. Instead of explicitly estimating the channel,
we create correspondences between early echoes in time and image sources in
space. These multiple sources of the desired and the interfering signal offer
additional spatial diversity that we can exploit in the beamformer design.
We present several "intuitive" and optimal formulations of acoustic rake
receivers, and show theoretically and numerically that the rake formulation of
the maximum signal-to-interference-and-noise beamformer offers significant
performance boosts in terms of noise and interference suppression. Beyond
signal-to-noise ratio, we observe gains in terms of the perceptual
evaluation of speech quality (PESQ) metric for the speech quality. We
accompany the paper by the complete simulation and processing chain written in
Python. The code and the sound samples are available online at
http://lcav.github.io/AcousticRakeReceiver/.
Comment: 12 pages, 11 figures, Accepted for publication in IEEE Journal on
Selected Topics in Signal Processing (Special Issue on Spatial Audio)
Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function
This paper addresses the problem of speech separation and enhancement from
multichannel convolutive and noisy mixtures, assuming known mixing
filters. We propose to perform the speech separation and enhancement task in
the short-time Fourier transform domain, using the convolutive transfer
function (CTF) approximation. Compared to time-domain filters, the CTF has far
fewer taps; consequently, it has fewer near-common zeros among channels and lower
computational complexity. The work proposes three speech-source recovery
methods, namely: i) the multichannel inverse filtering method, i.e. the
multiple input/output inverse theorem (MINT), is exploited in the CTF domain,
and for the multi-source case, ii) a beamforming-like multichannel inverse
filtering method applying single source MINT and using power minimization,
which is suitable whenever the source CTFs are not all known, and iii) a
constrained Lasso method, where the sources are recovered by minimizing the
$\ell_1$-norm to impose their spectral sparsity, with the constraint that the
$\ell_2$-norm fitting cost, between the microphone signals and the mixing model
involving the unknown source signals, is less than a tolerance. The noise can
be reduced by setting the tolerance according to the noise power. Experiments under
various acoustic conditions are carried out to evaluate the three proposed
methods. The comparison between them as well as with the baseline methods is
presented.
Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language
Processing
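The sparsity-promoting recovery in method iii) can be illustrated with a small real-valued toy problem. The sketch below is not the paper's solver: it swaps the constrained formulation for the closely related penalized Lasso, 0.5 * ||x - A s||_2^2 + lam * ||s||_1, and solves it with plain ISTA (iterative soft thresholding). The matrix `A`, the sparse vector `s_true`, the noise level, and `lam` are all assumptions for illustration; the paper works with complex STFT coefficients and CTF convolutions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, x, lam, n_iter=500):
    """Minimize 0.5 * ||x - A s||_2^2 + lam * ||s||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the gradient
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ s - x)                # gradient of the quadratic fitting term
        s = soft_threshold(s - step * grad, step * lam)
    return s

def objective(A, x, s, lam):
    return 0.5 * np.sum((x - A @ s) ** 2) + lam * np.sum(np.abs(s))

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 100))              # hypothetical mixing operator
s_true = np.zeros(100)
s_true[[5, 37, 80]] = [2.0, -1.5, 1.0]          # sparse "source"
x = A @ s_true + 0.01 * rng.standard_normal(40) # noisy observation

lam = 0.1
s_hat = ista(A, x, lam, n_iter=2000)
```

In the penalized form, `lam` plays a role loosely analogous to the tolerance in the abstract: a larger value accepts a larger fitting error in exchange for a sparser estimate.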
Frequency Spreading Equalization in Multicarrier Massive MIMO
Application of filter bank multicarrier (FBMC) as an effective method for
signaling over massive MIMO channels has been recently proposed. This paper
further expands the application of FBMC to massive MIMO by applying frequency
spreading equalization (FSE) to these channels. FSE allows us to achieve a more
accurate equalization. Hence, a higher number of bits per symbol can be
transmitted and the bandwidth of each subcarrier can be widened. Widening the
bandwidth of each subcarrier leads to (i) higher bandwidth efficiency; (ii)
lower complexity; (iii) lower sensitivity to carrier frequency offset (CFO);
(iv) reduced peak-to-average power ratio (PAPR); and (v) reduced latency. All
these appealing advantages have a direct impact on the digital as well as
analog circuitry that is needed for the system implementation. In this paper,
we develop the mathematical formulation of the minimum mean square error (MMSE)
FSE for massive MIMO systems. This analysis guides us to decide on the number
of subcarriers that will be sufficient for practical channel models.
Comment: Accepted in IEEE ICC 2015 - Workshop on 5G & Beyond - Enabling
Technologies and Applications
Multicarrier communication over underwater acoustic channels with nonuniform Doppler shifts
Author Posting. © IEEE, 2008. This article is posted here by permission of IEEE for personal use, not for redistribution. The definitive version was published in IEEE Journal of Oceanic Engineering 33 (2008): 198-209, doi:10.1109/JOE.2008.920471.
Underwater acoustic (UWA) channels are wideband in nature due to the small ratio of the carrier frequency to the signal bandwidth, which introduces frequency-dependent Doppler shifts. In this paper, we treat the channel as having a common Doppler scaling factor on all propagation paths, and propose a two-step approach to mitigating the Doppler effect: 1) nonuniform Doppler compensation via resampling that converts a "wideband" problem into a "narrowband" problem and 2) high-resolution uniform compensation of the residual Doppler. We focus on zero-padded orthogonal frequency-division multiplexing (OFDM) to minimize the transmission power. Null subcarriers are used to facilitate Doppler compensation, and pilot subcarriers are used for channel estimation. The receiver is based on block-by-block processing, and does not rely on channel dependence across OFDM blocks; thus, it is suitable for fast-varying UWA channels. The data from two shallow-water experiments near Woods Hole, MA, are used to demonstrate the receiver performance. Excellent performance results are obtained even when the transmitter and the receiver are moving at a relative speed of up to 10 kn, at which the Doppler shifts are greater than the OFDM subcarrier spacing. These results suggest that OFDM is a viable option for high-rate communications over wideband UWA channels with nonuniform Doppler shifts.
B. Li and S. Zhou are supported by the ONR YIP grant N00014-07-1-0805 and the NSF grant ECCS-0725562. M. Stojanovic is supported by the ONR grant N00014-07-1-0202. L. Freitag is supported by the ONR grants N00014-02-6-0201 and N00014-07-1-0229. P. Willett is supported by the ONR grant N00014-07-1-0055.
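The first step of the two-step approach, compensation by resampling, can be sketched on a single tone. This is only an illustration under stated assumptions: the received signal is modeled as r(t) = s((1 + a) t) with a known scaling factor `a`, the sample rate and tone frequency are arbitrary, and linear interpolation stands in for whatever resampler the authors used. The residual (second-step) compensation and the estimation of `a` are not shown.

```python
import numpy as np

def doppler_resample(r, a):
    """Undo the time scaling r[n] = s((1 + a) n) by evaluating r at n / (1 + a).

    Uses linear interpolation. For a < 0 the query points run past the end of
    the record, where np.interp holds the last sample value.
    """
    n = np.arange(len(r))
    return np.interp(n / (1.0 + a), n, r)

fs = 48_000.0        # hypothetical sample rate in Hz
f0 = 1_000.0         # hypothetical tone frequency in Hz
a = 3e-3             # assumed (known) Doppler scaling factor

t = np.arange(4096) / fs
reference = np.cos(2 * np.pi * f0 * t)               # transmitted tone
received = np.cos(2 * np.pi * f0 * (1.0 + a) * t)    # time-compressed copy

compensated = doppler_resample(received, a)

err_before = np.max(np.abs(received - reference))
err_after = np.max(np.abs(compensated - reference))
```

Because the time axis itself is rescaled, every frequency component is corrected by the same factor, which is why the abstract describes this step as turning a "wideband" problem into a "narrowband" one.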
Time-domain equalization for underwater acoustic OFDM systems with insufficient cyclic prefix
Master's
MASTER OF ENGINEERING
Reverberation reduction in a room for multiple positions
Reverberation in a room occurs when the direct-path sound from a sound source undergoes multiple reflections from the walls of the room before reaching the listener. The room impulse response (RIR), which captures the effects of the room, can be measured and represented digitally on a computer. A filter is designed to cancel the effects of the room using the information in the RIR. This filter is called an equalization filter and is usually placed between the source signal and the loudspeaker to perform the equalization. The RIR changes with source and listener location; hence, an equalization filter designed for one RIR will not perform equalization for multiple positions. This thesis explores methods to perform equalization for multiple positions. One of the simplest methods is spatial averaging equalization, which was used to perform the equalization for multiple positions. Equalizing an RIR in this way is concerned only with flattening the frequency spectrum and stabilizing the inverse RIR by using its minimum-phase component. Other methods are explored that consider the masking effects of the human auditory system, which relate to the perception of sound by the human ear. One such method is impulse response shortening/reshaping, which emphasizes the direct-path component of the RIR relative to the remaining components using p-norm and infinity-norm optimization, solved with an iterative algorithm. This concept is extended to reshaping RIRs for multiple positions, following the idea of spatial averaging equalization, by using RIRs measured at different positions. --Abstract, page iii
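The spatial averaging idea can be sketched in a few lines. The example below is a magnitude-only illustration under several assumptions that are not taken from the thesis: the RIRs are synthetic (decaying noise with a unit direct path), the average is an RMS average of magnitude spectra, and a small `floor` regularizes the inversion. It does not build the minimum-phase filter the abstract mentions.

```python
import numpy as np

def average_magnitude(rirs, n_fft=1024):
    """RMS-average the magnitude responses of RIRs measured at several positions."""
    mags = np.abs(np.fft.rfft(rirs, n=n_fft, axis=1))
    return np.sqrt(np.mean(mags ** 2, axis=0))

def spatial_average_inverse(rirs, n_fft=1024, floor=1e-3):
    """Magnitude response of an equalizer that inverts the spatially averaged response."""
    avg = average_magnitude(rirs, n_fft)
    return 1.0 / np.maximum(avg, floor)      # floor avoids dividing by deep notches

rng = np.random.default_rng(3)
decay = np.exp(-np.arange(256) / 40.0)
rirs = rng.standard_normal((4, 256)) * decay  # four synthetic positions
rirs[:, 0] = 1.0                              # unit direct path at each position

eq_mag = spatial_average_inverse(rirs)

# Spread (in dB) of the averaged response before and after equalization.
mags = np.abs(np.fft.rfft(rirs, n=1024, axis=1))
avg_before = np.sqrt(np.mean(mags ** 2, axis=0))
avg_after = np.sqrt(np.mean((mags * eq_mag) ** 2, axis=0))
flat_before = np.std(20 * np.log10(avg_before))
flat_after = np.std(20 * np.log10(avg_after))
```

The averaged response becomes flat by construction, but each individual position is only partially equalized, which is the trade-off behind designing one filter for multiple positions.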