33 research outputs found

    Adaptive Algorithms for Intelligent Acoustic Interfaces

    Modern speech communications are evolving in a new direction that involves users in a more perceptive way: the immersive experience, which may be considered the “last-mile” problem of telecommunications. One of the main features of immersive communication is distant talking, i.e. hands-free (in the broad sense) speech communication without body-worn or tethered microphones, which takes place in a multi-source environment where interfering signals may degrade the communication quality and the intelligibility of the desired speech source. Intelligent acoustic interfaces may be used to preserve speech quality. An intelligent acoustic interface may comprise multiple microphones and loudspeakers, and its distinctive feature is that it models the acoustic channel in order to adapt to user requirements and environmental conditions. This is why intelligent acoustic interfaces are based on adaptive filtering algorithms. Acoustic path modelling entails a set of problems that have to be taken into account when designing an adaptive filtering algorithm. Such problems may be generated by either a linear or a nonlinear process and can be tackled by linear or nonlinear adaptive algorithms, respectively. In this work we consider such modelling problems and propose novel, effective adaptive algorithms that allow acoustic interfaces to be robust against interfering signals, thus preserving the perceived quality of the desired speech signals. As regards linear adaptive algorithms, a class of adaptive filters based on the sparse nature of the acoustic impulse response has recently been proposed. We adopt this class of adaptive filters, named proportionate adaptive filters, and derive a general framework from which any linear adaptive algorithm can be obtained. Using this framework we also propose some efficient proportionate adaptive algorithms, expressly designed to tackle problems of a linear nature. On the other hand, in order to address problems deriving from a nonlinear process, we propose a novel filtering model that performs a nonlinear transformation by means of functional links. Using this nonlinear model, we propose functional link adaptive filters, which provide an efficient solution to the modelling of a nonlinear acoustic channel. Finally, we introduce robust filtering architectures based on adaptive combinations of filters that allow acoustic interfaces to adapt more effectively to environmental conditions, thus providing a powerful means for immersive speech communications.
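
    The proportionate family referred to above assigns each coefficient a step size roughly proportional to its magnitude, so that the few large taps of a sparse acoustic impulse response converge quickly. As a point of reference only, here is a minimal sketch of the well-known improved proportionate NLMS (IPNLMS) update; the step size mu, the proportionality parameter alpha and the regularization delta are illustrative choices, not values taken from the thesis.

```python
import numpy as np

def ipnlms_update(w, x, d, mu=0.5, alpha=0.0, delta=1e-6):
    """One improved-proportionate-NLMS (IPNLMS) step.

    w : current filter coefficients, shape (L,)
    x : the L most recent input samples, newest first, shape (L,)
    d : desired (microphone) sample
    Returns the updated coefficients and the a priori error.
    """
    L = len(w)
    e = d - np.dot(w, x)                      # a priori error
    # per-coefficient gains: large taps adapt faster, small taps still move
    g = (1 - alpha) / (2 * L) + (1 + alpha) * np.abs(w) / (2 * np.sum(np.abs(w)) + delta)
    w = w + mu * e * g * x / (np.dot(g * x, x) + delta)
    return w, e
```

    For the nonlinear case described above, a functional link filter applies a comparable linear update to a nonlinear expansion of the input (for example, trigonometric functional links) rather than to the raw samples.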

    Adaptive beamforming and switching in smart antenna systems

    The ever-increasing requirement to provide large bandwidth and seamless data access to commuters has posed new challenges to wireless solution providers. The characteristics of the communication channel between mobile clients and the base station change rapidly as the travelling speed of vehicles increases. Smart antenna systems with adaptive beamforming and switching technology are the key component for tackling these challenges. As a spatial filter, the beamformer has long been widely used in wireless communication, radar, acoustics and medical imaging systems to enhance the signal received from a particular look direction while suppressing noise and interference from other directions. Adaptive beamforming algorithms provide the capability to track the varying nature of the communication channel. However, the conventional adaptive beamformer assumes that the Direction of Arrival (DOA) of the signal of interest changes slowly, although the interference direction may change dynamically. The proliferation of High Speed Rail (HSR) and of seamless wireless communication between infrastructure (roadside and trackside equipment) and vehicles (train, car, boat, etc.) poses a unique challenge for adaptive beamforming because of the rapid change of the DOA. For an HSR train travelling at 250 km/h, the DOA can change by up to 4° per millisecond. To address these challenges, faster algorithms are needed to calculate the beamforming weights from the rapidly changing DOA. In this dissertation, two strategies are adopted. The first is to improve the speed of the weight calculation. The second is to improve the speed of DOA estimation for the impinging signal by exploiting the predefined, constrained routes typical of the transportation market. Based on these concepts, various algorithms for beampattern generation and adaptive weight control are evaluated and investigated in this thesis. The well-known Generalized Sidelobe Cancellation (GSC) architecture is adopted in this dissertation, but it faces a serious signal-cancellation problem when the estimated DOA deviates from the actual DOA, which is severe in high-mobility scenarios such as those of the transportation market. Algorithms to improve various parts of the GSC are proposed in this dissertation. First, a Cyclic Variable Step Size (CVSS) algorithm for adjusting the Least Mean Square (LMS) step size, which is simple to implement, is proposed and evaluated. Second, a Kalman-filter-based solution that fuses different sensor information for faster estimation and tracking of the DOA is investigated and proposed. Third, to address the DOA mismatch caused by rapid DOA changes, a fast blocking-matrix generation algorithm named the Simplified Zero Placement Algorithm (SZPA) is proposed to mitigate signal cancellation in the GSC. Fourth, to make the beam pattern robust against DOA mismatch, a fast algorithm for generating a flat beam pattern, named Zero Placement Flat Top (ZPFT), is proposed for the fixed beamforming path of the GSC. Finally, to evaluate the effectiveness and performance of the beamforming algorithms, wireless channel simulation is needed. One challenging aspect of wireless simulation is the coupling between the Probability Density Function (PDF) and the Power Spectral Density (PSD) of a random variable. In this regard, a simplified solution for simulating non-Gaussian wireless channels is proposed, proved and evaluated for effectiveness. With the above optimizations, controlled simulations show that the flat-top beampattern can be generated 380 times faster than with an iterative optimization method, and the blocking matrix can be generated 9 times faster than with the standard SVD method, while the same overall optimum performance is achieved.
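
    For concreteness, the sketch below implements a generic narrowband GSC: a fixed distortionless beamformer, a blocking matrix orthogonal to the presumed steering vector, and an NLMS-adapted sidelobe canceller. It is not the dissertation's design; the constant step size mu stands in for the proposed CVSS scheme, and the SVD-based blocking matrix stands in for the proposed SZPA construction.

```python
import numpy as np

def run_gsc(X, steer, mu=0.05):
    """Generic narrowband generalized sidelobe canceller (GSC) sketch.

    X     : complex array snapshots, shape (num_snapshots, num_sensors)
    steer : presumed steering vector of the desired signal, shape (num_sensors,)
    mu    : NLMS step size (an illustrative constant)
    Returns the beamformer output, one complex sample per snapshot.
    """
    n_sens = X.shape[1]
    w_q = steer / np.vdot(steer, steer).real           # fixed beamformer, w_q^H steer = 1
    # blocking matrix: columns span the space orthogonal to the steering vector
    _, _, vh = np.linalg.svd(steer.conj().reshape(1, -1))
    B = vh[1:].conj().T                                # shape (n_sens, n_sens - 1)
    w_a = np.zeros(n_sens - 1, dtype=complex)          # adaptive sidelobe-canceller weights
    y = np.empty(X.shape[0], dtype=complex)
    for k, x in enumerate(X):
        d = np.vdot(w_q, x)                            # fixed-beamformer output
        u = B.conj().T @ x                             # interference-only reference
        y[k] = d - np.vdot(w_a, u)                     # GSC output (a priori error)
        w_a += mu * u * np.conj(y[k]) / (np.vdot(u, u).real + 1e-9)   # NLMS update
    return y
```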

    Multichannel Speech Enhancement

    Robust adaptive filtering algorithms for system identification and array signal processing in non-Gaussian environment

    This dissertation proposes four new algorithms based on fractionally lower-order statistics for adaptive filtering in a non-Gaussian interference environment. The first is the affine projection sign algorithm (APSA), based on L₁-norm minimization, which combines the ability to decorrelate colored inputs with the ability to suppress divergence when an outlier occurs. The second is the variable-step-size normalized sign algorithm (VSS-NSA), which adjusts its step size automatically by matching the L₁ norm of the a posteriori error to that of the noise. The third adopts the same variable-step-size scheme but extends L₁ minimization to Lp minimization, generalizing the variable-step-size normalized fractionally lower-order moment (VSS-NFLOM) algorithms. Instead of a variable step size, a variable order is another way to facilitate adaptive algorithms when no a priori statistics are available, which leads to the fourth algorithm, the variable-order least mean pth norm (VO-LMP) algorithm. These algorithms are applied to system identification for impulsive interference suppression, echo cancellation, and noise reduction. They are also applied to a phased-array radar system with space-time adaptive processing (beamforming) to combat heavy-tailed non-Gaussian clutter. The proposed algorithms are tested by extensive computer simulations. The results demonstrate significant performance improvements in terms of convergence rate, steady-state error, computational simplicity, and robustness against impulsive noise and interference --Abstract, page iv
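
    Of the four algorithms listed above, the APSA has the simplest form, so a minimal sketch of its weight update is given below to illustrate how replacing the error with its sign buys robustness to impulsive noise. The step size mu, the projection order implied by the shape of X, and the regularization delta are illustrative, not the values used in the dissertation.

```python
import numpy as np

def apsa_update(w, X, d, mu=0.01, delta=1e-6):
    """One affine projection sign algorithm (APSA) step.

    w : filter coefficients, shape (L,)
    X : the P most recent input vectors as columns, shape (L, P)
    d : the P corresponding desired samples, shape (P,)
    Only the sign of the error vector enters the update, so a single
    impulsive outlier cannot produce an arbitrarily large step.
    """
    e = d - X.T @ w                 # a priori error vector
    s = np.sign(e)
    w = w + mu * (X @ s) / np.sqrt(s @ (X.T @ X) @ s + delta)
    return w, e
```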

    Single- and multi-microphone speech dereverberation using spectral enhancement

    In speech communication systems, such as voice-controlled systems, hands-free mobile telephones, and hearing aids, the received microphone signals are degraded by room reverberation, background noise, and other interferences. This signal degradation may render the speech totally unintelligible and decreases the performance of automatic speech recognition systems. In the context of this work, reverberation is the process of multi-path propagation of an acoustic sound from its source to one or more microphones. The received microphone signal generally consists of a direct sound, reflections that arrive shortly after the direct sound (commonly called early reverberation), and reflections that arrive after the early reverberation (commonly called late reverberation). Reverberant speech can be described as sounding distant, with noticeable echo and colouration. These detrimental perceptual effects are primarily caused by late reverberation and generally increase with increasing distance between the source and the microphone. Conversely, early reverberation tends to improve the intelligibility of speech; in combination with the direct sound it is sometimes referred to as the early speech component. Reducing the detrimental effects of reflections is evidently of considerable practical importance, and is the focus of this dissertation. More specifically, the dissertation deals with dereverberation techniques, i.e., signal processing techniques that reduce the detrimental effects of reflections. In the dissertation, novel single- and multi-microphone speech dereverberation algorithms are developed that aim at the suppression of late reverberation, i.e., at estimation of the early speech component. This is done via so-called spectral enhancement techniques that require a specific measure of the late reverberant signal. This measure, called the spectral variance, can be estimated directly from the received (possibly noisy) reverberant signal(s) using a statistical reverberation model and a limited amount of a priori knowledge about the acoustic channel(s) between the source and the microphone(s). In our work an existing single-channel statistical reverberation model serves as a starting point. The model is characterized by one parameter that depends on the acoustic characteristics of the environment. We show that the spectral variance estimator based on this model can only be used when the source-microphone distance is larger than the so-called critical distance, which is, crudely speaking, the distance at which the direct sound power equals the total reflective power. A generalization of the statistical reverberation model, in which the direct sound is incorporated, is developed. This model requires one additional parameter that is related to the ratio between the direct sound energy and the energy of all reflections. The generalized model is used to derive a novel spectral variance estimator. When the novel estimator is used for dereverberation rather than the existing estimator, and the source-microphone distance is smaller than the critical distance, the dereverberation performance is significantly increased. Single-microphone systems only exploit the temporal and spectral diversity of the received signal. Reverberation, of course, also induces spatial diversity. To additionally exploit this diversity, multiple microphones must be used, and their outputs must be combined by a suitable spatial processor such as the so-called delay-and-sum beamformer. It is not a priori evident whether spectral enhancement is best performed before or after the spatial processor. For this reason we investigate both possibilities, as well as merging the spatial processor and the spectral enhancement technique. An advantage of the latter option is that the spectral variance estimator can be further improved. Our experiments show that the use of multiple microphones affords a significant improvement in perceptual speech quality. The applicability of the theory developed in this dissertation is demonstrated using a hands-free communication system. Since hands-free systems are often used in noisy and reverberant environments, the received microphone signal contains not only the desired signal but also interferences such as room reverberation caused by the desired source, background noise, and a far-end echo signal resulting from the sound produced by the loudspeaker. Usually an acoustic echo canceller is used to cancel the far-end echo, and a post-processor is additionally used to suppress background noise and residual echo, i.e., echo that could not be cancelled by the echo canceller. In this work a novel structure and post-processor for an acoustic echo canceller are developed. The post-processor suppresses late reverberation caused by the desired source, residual echo, and background noise. The late reverberation and late residual echo are estimated using the generalized statistical reverberation model. Experimental results convincingly demonstrate the benefits of the proposed system for suppressing late reverberation, residual echo and background noise. The proposed structure and post-processor have low computational complexity and a highly modular structure, can be seamlessly integrated into existing hands-free communication systems, and afford a significant increase in listening comfort and speech intelligibility.
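
    To make the spectral-enhancement idea concrete, the sketch below predicts the late-reverberant spectral variance from the reverberant spectrum a fixed number of frames earlier, using a basic exponential-decay (Polack-type) statistical model, and removes it with a simple spectral-subtraction gain. It uses only the basic single-channel model, not the generalized model with the direct-path parameter developed in the dissertation; early_ms, g_min and the assumption of a known T60 are illustrative choices.

```python
import numpy as np

def suppress_late_reverb(stft, t60, hop, fs, early_ms=50.0, g_min=0.1):
    """Single-channel late-reverberation suppression sketch.

    stft : complex STFT of the reverberant signal, shape (frames, bins)
    t60  : reverberation time in seconds (assumed known or estimated)
    hop  : STFT hop size in samples
    fs   : sampling frequency in Hz
    """
    delta = 3.0 * np.log(10.0) / t60                          # decay constant of the model
    n_e = max(1, int(round(early_ms * 1e-3 * fs / hop)))      # frames up to the "late" boundary
    decay = np.exp(-2.0 * delta * n_e * hop / fs)             # energy decay over n_e frames

    power = np.abs(stft) ** 2
    late_var = np.zeros_like(power)
    late_var[n_e:] = decay * power[:-n_e]                     # predicted late-reverberant variance

    # spectral-subtraction gain with a floor to limit musical noise
    gain = np.maximum(1.0 - np.sqrt(late_var / (power + 1e-12)), g_min)
    return gain * stft
```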

    Spatio-Temporal Analysis of Spontaneous Speech with Microphone Arrays

    Accurate detection, localization and tracking of multiple moving speakers permit a wide spectrum of applications. Techniques are required that are versatile, robust to environmental variations, and not constraining for non-technical end-users. Based on distant recordings of spontaneous multiparty conversations, this thesis focuses on the use of microphone arrays to address the question "Who spoke where and when?". The speed, versatility and robustness of the proposed techniques are tested on a variety of real indoor recordings, including multiple moving speakers as well as seated speakers in meetings. Optimized implementations are provided in most cases. We propose to discretize the physical space into a few sectors and, for each time frame, to determine which sectors contain active acoustic sources (Where? When?). A topological interpretation of beamforming is proposed, which permits both the average acoustic energy in a sector to be evaluated at negligible cost and a speaker to be located precisely within an active sector. One additional contribution that goes beyond the field of microphone arrays is a generic, automatic threshold selection method, which does not require any training data. On the speaker detection task, the new approach is dramatically superior to the more classical approach where a threshold is set on training data. We incorporate the new approach into an integrated system for multispeaker detection-localization. Another generic contribution is a principled, threshold-free framework for short-term clustering of multispeaker location estimates, which also permits detecting where and when multiple trajectories intersect. On multi-party meeting recordings, using distant microphones only, short-term clustering yields a speaker segmentation performance similar to that of close-talking microphones. The resulting short speech segments are then grouped into speaker clusters (Who?) through an extension of the Bayesian Information Criterion to the merging of multiple modalities. On meeting recordings, the speaker clustering performance is significantly improved by merging the classical mel-cepstrum information with the short-term speaker location information. Finally, a close analysis of the speaker clustering results suggests that future research should investigate the effect of human acoustic radiation characteristics on the overall transmission channel when a speaker is a few metres away from a microphone.
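
    The sector-based detection step described above can be pictured as computing a delay-and-sum (steered-response-power) energy for a handful of look directions inside each sector and thresholding the result. The sketch below is a far-field, single-frame illustration of that idea; the sector grid, the plane-wave assumption, the averaging over directions and the sign convention of the steering delays are assumptions of this sketch, not the exact parametrization used in the thesis.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def sector_srp(frame_stft, freqs, mic_xyz, sector_dirs):
    """Sector-based steered-response-power sketch for one STFT frame.

    frame_stft  : complex microphone spectra, shape (n_mics, n_bins)
    freqs       : bin centre frequencies in Hz, shape (n_bins,)
    mic_xyz     : microphone positions in metres, shape (n_mics, 3)
    sector_dirs : dict mapping a sector name to unit look directions,
                  each an array of shape (n_dirs, 3), sampling that sector
    Returns the average delay-and-sum output power per sector.
    """
    out = {}
    for name, dirs in sector_dirs.items():
        # far-field propagation delay of each mic for each look direction
        tau = -(dirs @ mic_xyz.T) / SPEED_OF_SOUND              # (n_dirs, n_mics)
        # steering phases that re-align the microphone spectra
        phases = np.exp(2j * np.pi * freqs[None, None, :] * tau[:, :, None])
        # delay-and-sum: align, sum over microphones, take the power
        summed = np.einsum('dmb,mb->db', phases, frame_stft) / mic_xyz.shape[0]
        out[name] = float(np.mean(np.abs(summed) ** 2))
    return out
```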

    Remote Sensing

    This dual conception of remote sensing led us to the idea of preparing two different books: in addition to the first book, which presents recent advances in remote sensing applications, this book is devoted to new techniques for data processing, sensors and platforms. We do not intend this book to cover all aspects of remote sensing techniques and platforms, since that would be an impossible task for a single volume. Instead, we have collected a number of high-quality, original and representative contributions in those areas.