439 research outputs found

    Application of channel shortening to acoustic channel equalization in the presence of noise and estimation error

    Robust equalization of multichannel acoustic systems

    In most real-world acoustical scenarios, speech signals captured by distant microphones from a source are reverberated due to multipath propagation, and the reverberation may impair speech intelligibility. Speech dereverberation can be achieved by equalizing the channels from the source to microphones. Equalization systems can be computed using estimates of multichannel acoustic impulse responses. However, the estimates obtained from system identification always include errors; the fact that an equalization system is able to equalize the estimated multichannel acoustic system does not mean that it is able to equalize the true system. The objective of this thesis is to propose and investigate robust equalization methods for multichannel acoustic systems in the presence of system identification errors. Equalization systems can be computed using the multiple-input/output inverse theorem or multichannel least-squares method. However, equalization systems obtained from these methods are very sensitive to system identification errors. A study of the multichannel least-squares method with respect to two classes of characteristic channel zeros is conducted. Accordingly, a relaxed multichannel least-squares method is proposed. Channel shortening in connection with the multiple-input/output inverse theorem and the relaxed multichannel least-squares method is discussed. Two algorithms taking into account the system identification errors are developed. Firstly, an optimally-stopped weighted conjugate gradient algorithm is proposed. A conjugate gradient iterative method is employed to compute the equalization system. The iteration process is stopped optimally with respect to system identification errors. Secondly, a system-identification-error-robust equalization method exploring the use of error models is presented, which incorporates system identification error models in the weighted multichannel least-squares formulation.
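As a rough illustration of the multichannel least-squares idea mentioned in this abstract, the sketch below solves for equalization filters whose combined response with the channels approximates a unit impulse. It is not the thesis's algorithm: the helper names `conv_matrix` and `mint_equalizer`, the two short example channels, and the equalizer length are all assumptions chosen for illustration, and the channels are taken to be known exactly (no system identification error).

```python
import numpy as np

def conv_matrix(h, n_cols):
    """Return the (len(h) + n_cols - 1) x n_cols convolution matrix of h."""
    h = np.asarray(h, dtype=float)
    H = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        H[j:j + len(h), j] = h  # column j is h delayed by j samples
    return H

def mint_equalizer(channels, eq_len, delay=0):
    """Least-squares filters g_m with sum_m (h_m * g_m) close to a delayed impulse.

    Assumes all channels have the same length (hypothetical simplification).
    """
    H = np.hstack([conv_matrix(h, eq_len) for h in channels])
    d = np.zeros(H.shape[0])
    d[delay] = 1.0  # target: unit impulse at the chosen delay
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return np.split(g, len(channels))

# Two illustrative 3-tap channels with no common zeros (made-up values).
h1 = [1.0, 0.5, 0.2]
h2 = [1.0, -0.4, 0.1]
g1, g2 = mint_equalizer([h1, h2], eq_len=2)

# Combined response of the equalized two-channel system.
equalized = np.convolve(h1, g1) + np.convolve(h2, g2)
```

With exact channels and no common zeros, `equalized` is numerically a unit impulse. Applying the same `g1`, `g2` to slightly perturbed channels is one way to observe the sensitivity to identification errors that the abstract describes.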

    Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function

    This paper addresses the problems of blind channel identification and multichannel equalization for speech dereverberation and noise reduction. The time-domain cross-relation method is not suitable for blind room impulse response identification, due to the near-common zeros of the long impulse responses. We extend the cross-relation method to the short-time Fourier transform (STFT) domain, in which the time-domain impulse responses are approximately represented by the convolutive transfer functions (CTFs) with far fewer coefficients. The CTFs suffer from the common zeros caused by the oversampled STFT. We propose to identify CTFs based on the STFT with the oversampled signals and the critically sampled CTFs, which is a good compromise between the frequency aliasing of the signals and the common zeros problem of CTFs. In addition, a normalization of the CTFs is proposed to remove the gain ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for multichannel equalization, in which the sparsity of speech signals is exploited. We propose to perform inverse filtering by minimizing the $\ell_1$-norm of the source signal, with the relaxed $\ell_2$-norm fitting error between the microphone signals and the convolution of the estimated source signal and the CTFs used as a constraint. This method is advantageous in that the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance corresponding to the noise power, and the tolerance can be automatically set. The experiments confirm the efficiency of the proposed method even under conditions with high reverberation levels and intense noise. Comment: 13 pages, 5 figures, 5 tables
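The inverse-filtering step described in this abstract can be written schematically as a constrained sparse-recovery problem. The notation below is an assumption introduced only for illustration and is not taken from the paper: $\mathbf{x}$ stacks the microphone STFT coefficients in one sub-band, $\mathbf{H}$ is the convolution operator built from the identified CTFs, $\mathbf{s}$ is the source STFT coefficient sequence, and $\epsilon$ is the tolerance tied to the noise power.

```latex
\hat{\mathbf{s}} = \arg\min_{\mathbf{s}} \; \|\mathbf{s}\|_1
\quad \text{subject to} \quad
\|\mathbf{x} - \mathbf{H}\mathbf{s}\|_2^2 \le \epsilon
```

Setting $\epsilon = 0$ would demand an exact fit to the noisy microphone signals; a larger $\epsilon$ lets the fit ignore energy at the level of the noise, which is the noise-reduction mechanism the abstract refers to.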

    Raking the Cocktail Party

    We present the concept of an acoustic rake receiver, a microphone beamformer that uses echoes to improve the noise and interference suppression. The rake idea is well-known in wireless communications; it involves constructively combining different multipath components that arrive at the receiver antennas. Unlike spread-spectrum signals used in wireless communications, speech signals are not orthogonal to their shifts. Therefore, we focus on the spatial structure, rather than temporal. Instead of explicitly estimating the channel, we create correspondences between early echoes in time and image sources in space. These multiple sources of the desired and the interfering signal offer additional spatial diversity that we can exploit in the beamformer design. We present several "intuitive" and optimal formulations of acoustic rake receivers, and show theoretically and numerically that the rake formulation of the maximum signal-to-interference-and-noise beamformer offers significant performance boosts in terms of noise and interference suppression. Beyond signal-to-noise ratio, we observe gains in terms of the perceptual evaluation of speech quality (PESQ) metric for the speech quality. We accompany the paper with the complete simulation and processing chain written in Python. The code and the sound samples are available online at http://lcav.github.io/AcousticRakeReceiver/. Comment: 12 pages, 11 figures, Accepted for publication in IEEE Journal on Selected Topics in Signal Processing (Special Issue on Spatial Audio)

    Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function

    This paper addresses the problem of speech separation and enhancement from multichannel convolutive and noisy mixtures, assuming known mixing filters. We propose to perform the speech separation and enhancement task in the short-time Fourier transform domain, using the convolutive transfer function (CTF) approximation. Compared to time-domain filters, the CTF has far fewer taps; consequently it has fewer near-common zeros among channels and lower computational complexity. The work proposes three speech-source recovery methods, namely: i) the multichannel inverse filtering method, i.e. the multiple input/output inverse theorem (MINT), is exploited in the CTF domain, and for the multi-source case, ii) a beamforming-like multichannel inverse filtering method applying single source MINT and using power minimization, which is suitable whenever the source CTFs are not all known, and iii) a constrained Lasso method, where the sources are recovered by minimizing the $\ell_1$-norm to impose their spectral sparsity, with the constraint that the $\ell_2$-norm fitting cost, between the microphone signals and the mixing model involving the unknown source signals, is less than a tolerance. The noise can be reduced by setting a tolerance onto the noise power. Experiments under various acoustic conditions are carried out to evaluate the three proposed methods. The comparison between them as well as with the baseline methods is presented. Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

    Frequency Spreading Equalization in Multicarrier Massive MIMO

    Application of filter bank multicarrier (FBMC) as an effective method for signaling over massive MIMO channels has been recently proposed. This paper further expands the application of FBMC to massive MIMO by applying frequency spreading equalization (FSE) to these channels. FSE allows us to achieve a more accurate equalization. Hence, a higher number of bits per symbol can be transmitted and the bandwidth of each subcarrier can be widened. Widening the bandwidth of each subcarrier leads to (i) higher bandwidth efficiency; (ii) lower complexity; (iii) lower sensitivity to carrier frequency offset (CFO); (iv) reduced peak-to-average power ratio (PAPR); and (v) reduced latency. All these appealing advantages have a direct impact on the digital as well as analog circuitry that is needed for the system implementation. In this paper, we develop the mathematical formulation of the minimum mean square error (MMSE) FSE for massive MIMO systems. This analysis guides us to decide on the number of subcarriers that will be sufficient for practical channel models. Comment: Accepted in IEEE ICC 2015 - Workshop on 5G & Beyond - Enabling Technologies and Applications

    Multicarrier communication over underwater acoustic channels with nonuniform Doppler shifts

    Author Posting. © IEEE, 2008. This article is posted here by permission of IEEE for personal use, not for redistribution. The definitive version was published in IEEE Journal of Oceanic Engineering 33 (2008): 198-209, doi:10.1109/JOE.2008.920471. Underwater acoustic (UWA) channels are wideband in nature due to the small ratio of the carrier frequency to the signal bandwidth, which introduces frequency-dependent Doppler shifts. In this paper, we treat the channel as having a common Doppler scaling factor on all propagation paths, and propose a two-step approach to mitigating the Doppler effect: 1) nonuniform Doppler compensation via resampling that converts a "wideband" problem into a "narrowband" problem and 2) high-resolution uniform compensation of the residual Doppler. We focus on zero-padded orthogonal frequency-division multiplexing (OFDM) to minimize the transmission power. Null subcarriers are used to facilitate Doppler compensation, and pilot subcarriers are used for channel estimation. The receiver is based on block-by-block processing, and does not rely on channel dependence across OFDM blocks; thus, it is suitable for fast-varying UWA channels. The data from two shallow-water experiments near Woods Hole, MA, are used to demonstrate the receiver performance. Excellent performance results are obtained even when the transmitter and the receiver are moving at a relative speed of up to 10 kn, at which the Doppler shifts are greater than the OFDM subcarrier spacing. These results suggest that OFDM is a viable option for high-rate communications over wideband UWA channels with nonuniform Doppler shifts. B. Li and S. Zhou are supported by the ONR YIP grant N00014-07-1-0805 and the NSF grant ECCS-0725562. M. Stojanovic is supported by the ONR grant N00014-07-1-0202. L. Freitag is supported by the ONR grants N00014-02-6-0201 and N00014-07-1-0229. P. Willett is supported by the ONR grant N00014-07-1-0055.
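The first step of the two-step approach in this abstract, compensating a common Doppler scaling factor by resampling, can be sketched as follows. This is only a toy illustration and not the paper's receiver: the function name `resample_doppler`, the use of linear interpolation, the single test tone, and the values of `fs`, `f0`, and `a` are all assumptions, and the scaling factor is taken as known rather than estimated. The second step (fine compensation of the residual Doppler) is not shown.

```python
import numpy as np

def resample_doppler(received, a):
    """Undo a common Doppler scaling factor a by resampling.

    Assumes received[n] = s((1 + a) * n / fs); returns an estimate of
    s(n / fs) using linear interpolation between received samples.
    """
    received = np.asarray(received, dtype=float)
    n = np.arange(len(received))
    # Sample index n / (1 + a) of the received signal corresponds to s(n / fs).
    return np.interp(n / (1.0 + a), n, received)

# Hypothetical test signal: a single tone, time-compressed by (1 + a).
fs, f0, a = 8000.0, 50.0, 3e-3
t = np.arange(2000) / fs
reference = np.cos(2 * np.pi * f0 * t)            # what was transmitted
received = np.cos(2 * np.pi * f0 * (1 + a) * t)   # Doppler-scaled version
compensated = resample_doppler(received, a)

err_before = np.max(np.abs(received - reference))
err_after = np.max(np.abs(compensated - reference))
```

In this toy setup `err_after` is far smaller than `err_before`; the remaining error comes from the linear interpolation, and a practical receiver would likely use a higher-quality resampler.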

    Time-domain equalization for underwater acoustic OFDM systems with insufficient cyclic prefix

    Master's; Master of Engineering

    Reverberation reduction in a room for multiple positions

    Reverberation in a room occurs when the direct path sound from a sound source undergoes multiple reflections from the walls of the room before reaching the listener. The response of the room to an impulse can be measured; this room impulse response (RIR) captures the effects of the room and can be represented digitally on a computer. A filter is designed to cancel the effects of the room using the information in the room impulse response. This filter is called an equalization filter and is usually placed between the source signal and loudspeaker to perform the equalization. The RIR changes for varying source and listener locations, hence an equalization filter designed for one RIR will not perform equalization for multiple positions. This thesis explores methods to perform equalization for multiple positions. One of the simplest methods is spatial averaging equalization, which was used to perform the equalization for multiple positions. Equalizing the RIR in this way is only concerned with flattening the frequency spectrum and stabilizing the inverse RIR by looking at its minimum-phase component. Other methods are explored which consider the masking effects of the human auditory system, which relate to the perception of sound by the human ear. One such method is impulse response shortening/reshaping, which emphasizes the direct path component in the RIR relative to the rest of the components using an iterative p-norm and infinity-norm optimization algorithm. This concept is extended to reshaping the RIR for multiple positions, using the idea from spatial averaging equalization of combining RIRs measured at different positions. (Abstract, page iii)
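A minimal sketch of the spatial averaging idea mentioned in this abstract is given below: the magnitude responses of RIRs from several positions are averaged, and a regularized inverse of that average is used as a magnitude-only equalizer. This is an assumed simplification and not the thesis's implementation; the function name `spatial_average_equalizer`, the FFT size, the regularization constant `reg`, and the three short made-up RIRs are hypothetical, and the minimum-phase reconstruction of the inverse filter is omitted.

```python
import numpy as np

def spatial_average_equalizer(rirs, n_fft=512, reg=1e-3):
    """Return (average magnitude response, regularized inverse gains).

    rirs: 2-D array-like, one room impulse response per row (one per position).
    reg:  small constant that keeps the inverse bounded at spectral dips.
    """
    spectra = np.fft.rfft(np.asarray(rirs, dtype=float), n=n_fft, axis=1)
    avg_mag = np.mean(np.abs(spectra), axis=0)  # average over positions
    eq_gain = 1.0 / (avg_mag + reg)             # magnitude-only inverse
    return avg_mag, eq_gain

# Three made-up, very short "RIRs" standing in for measurements at three positions.
rirs = [
    [1.0, 0.5, 0.25, 0.0],
    [1.0, 0.0, 0.30, 0.1],
    [0.8, 0.4, 0.00, 0.2],
]
avg_mag, eq_gain = spatial_average_equalizer(rirs)

# Applying the gains to the averaged response should give a nearly flat spectrum.
flattened = avg_mag * eq_gain
```

The equalizer flattens the averaged response, not the response at any single position, which is the trade-off that makes one filter usable across several listening positions.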