200 research outputs found

    Adaptive Algorithms for Intelligent Acoustic Interfaces

    Get PDF
    Modern speech communications are evolving towards a new direction which involves users in a more perceptive way. That is the immersive experience, which may be considered as the “last-mile” problem of telecommunications. One of the main feature of immersive communications is the distant-talking, i.e. the hands-free (in the broad sense) speech communications without bodyworn or tethered microphones that takes place in a multisource environment where interfering signals may degrade the communication quality and the intelligibility of the desired speech source. In order to preserve speech quality intelligent acoustic interfaces may be used. An intelligent acoustic interface may comprise multiple microphones and loudspeakers and its peculiarity is to model the acoustic channel in order to adapt to user requirements and to environment conditions. This is the reason why intelligent acoustic interfaces are based on adaptive filtering algorithms. The acoustic path modelling entails a set of problems which have to be taken into account in designing an adaptive filtering algorithm. Such problems may be basically generated by a linear or a nonlinear process and can be tackled respectively by linear or nonlinear adaptive algorithms. In this work we consider such modelling problems and we propose novel effective adaptive algorithms that allow acoustic interfaces to be robust against any interfering signals, thus preserving the perceived quality of desired speech signals. As regards linear adaptive algorithms, a class of adaptive filters based on the sparse nature of the acoustic impulse response has been recently proposed. We adopt such class of adaptive filters, named proportionate adaptive filters, and derive a general framework from which it is possible to derive any linear adaptive algorithm. Using such framework we also propose some efficient proportionate adaptive algorithms, expressly designed to tackle problems of a linear nature. On the other side, in order to address problems deriving from a nonlinear process, we propose a novel filtering model which performs a nonlinear transformations by means of functional links. Using such nonlinear model, we propose functional link adaptive filters which provide an efficient solution to the modelling of a nonlinear acoustic channel. Finally, we introduce robust filtering architectures based on adaptive combinations of filters that allow acoustic interfaces to more effectively adapt to environment conditions, thus providing a powerful mean to immersive speech communications

    Adaptive Algorithms for Intelligent Acoustic Interfaces

    Get PDF
    Modern speech communications are evolving towards a new direction which involves users in a more perceptive way. That is the immersive experience, which may be considered as the “last-mile” problem of telecommunications. One of the main feature of immersive communications is the distant-talking, i.e. the hands-free (in the broad sense) speech communications without bodyworn or tethered microphones that takes place in a multisource environment where interfering signals may degrade the communication quality and the intelligibility of the desired speech source. In order to preserve speech quality intelligent acoustic interfaces may be used. An intelligent acoustic interface may comprise multiple microphones and loudspeakers and its peculiarity is to model the acoustic channel in order to adapt to user requirements and to environment conditions. This is the reason why intelligent acoustic interfaces are based on adaptive filtering algorithms. The acoustic path modelling entails a set of problems which have to be taken into account in designing an adaptive filtering algorithm. Such problems may be basically generated by a linear or a nonlinear process and can be tackled respectively by linear or nonlinear adaptive algorithms. In this work we consider such modelling problems and we propose novel effective adaptive algorithms that allow acoustic interfaces to be robust against any interfering signals, thus preserving the perceived quality of desired speech signals. As regards linear adaptive algorithms, a class of adaptive filters based on the sparse nature of the acoustic impulse response has been recently proposed. We adopt such class of adaptive filters, named proportionate adaptive filters, and derive a general framework from which it is possible to derive any linear adaptive algorithm. Using such framework we also propose some efficient proportionate adaptive algorithms, expressly designed to tackle problems of a linear nature. On the other side, in order to address problems deriving from a nonlinear process, we propose a novel filtering model which performs a nonlinear transformations by means of functional links. Using such nonlinear model, we propose functional link adaptive filters which provide an efficient solution to the modelling of a nonlinear acoustic channel. Finally, we introduce robust filtering architectures based on adaptive combinations of filters that allow acoustic interfaces to more effectively adapt to environment conditions, thus providing a powerful mean to immersive speech communications

    Online source separation in reverberant environments exploiting known speaker locations

    Get PDF
    This thesis concerns blind source separation techniques using second order statistics and higher order statistics for reverberant environments. A focus of the thesis is algorithmic simplicity with a view to the algorithms being implemented in their online forms. The main challenge of blind source separation applications is to handle reverberant acoustic environments; a further complication is changes in the acoustic environment such as when human speakers physically move. A novel time-domain method which utilises a pair of finite impulse response filters is proposed. The method of principle angles is defined which exploits a singular value decomposition for their design. The pair of filters are implemented within a generalised sidelobe canceller structure, thus the method can be considered as a beamforming method which cancels one source. An adaptive filtering stage is then employed to recover the remaining source, by exploiting the output of the beamforming stage as a noise reference. A common approach to blind source separation is to use methods that use higher order statistics such as independent component analysis. When dealing with realistic convolutive audio and speech mixtures, processing in the frequency domain at each frequency bin is required. As a result this introduces the permutation problem, inherent in independent component analysis, across the frequency bins. Independent vector analysis directly addresses this issue by modeling the dependencies between frequency bins, namely making use of a source vector prior. An alternative source prior for real-time (online) natural gradient independent vector analysis is proposed. A Student's t probability density function is known to be more suited for speech sources, due to its heavier tails, and is incorporated into a real-time version of natural gradient independent vector analysis. The final algorithm is realised as a real-time embedded application on a floating point Texas Instruments digital signal processor platform. Moving sources, along with reverberant environments, cause significant problems in realistic source separation systems as mixing filters become time variant. A method which employs the pair of cancellation filters, is proposed to cancel one source coupled with an online natural gradient independent vector analysis technique to improve average separation performance in the context of step-wise moving sources. This addresses `dips' in performance when sources move. Results show the average convergence time of the performance parameters is improved. Online methods introduced in thesis are tested using impulse responses measured in reverberant environments, demonstrating their robustness and are shown to perform better than established methods in a variety of situations

    The Subwoofer Room Impulse Response (SUBRIR) Database

    Full text link

    Polynomial eigenvalue decomposition for multichannel broadband signal processing

    Get PDF
    This article is devoted to the polynomial eigenvalue decomposition (PEVD) and its applications in broadband multichannel signal processing, motivated by the optimum solutions provided by the eigenvalue decomposition (EVD) for the narrow-band case [1], [2]. In general, the successful techniques from narrowband problems can also be applied to broadband ones, leading to improved solutions. Multichannel broadband signals arise at the core of many essential commercial applications such as telecommunications, speech processing, healthcare monitoring, astronomy and seismic surveillance, and military technologies like radar, sonar and communications [3]. The success of these applications often depends on the performance of signal processing tasks, including data compression [4], source localization [5], channel coding [6], signal enhancement [7], beamforming [8], and source separation [9]. In most cases and for narrowband signals, performing an EVD is the key to the signal processing algorithm. Therefore, this paper aims to introduce PEVD as a novel mathematical technique suitable for many broadband signal processing applications

    Spatial Multizone Soundfield Reproduction Design

    No full text
    It is desirable for people sharing a physical space to access different multimedia information streams simultaneously. For a good user experience, the interference of the different streams should be held to a minimum. This is straightforward for the video component but currently difficult for the audio sound component. Spatial multizone soundfield reproduction, which aims to provide an individual sound environment to each of a set of listeners without the use of physical isolation or headphones, has drawn significant attention of researchers in recent years. The realization of multizone soundfield reproduction is a conceptually challenging problem as currently most of the soundfield reproduction techniques concentrate on a single zone. This thesis considers the theory and design of a multizone soundfield reproduction system using arrays of loudspeakers in given complex environments. We first introduce a novel method for spatial multizone soundfield reproduction based on describing the desired multizone soundfield as an orthogonal expansion of formulated basis functions over the desired reproduction region. This provides the theoretical basis of both 2-D (height invariant) and 3-D soundfield reproduction for this work. We then extend the reproduction of the multizone soundfield over the desired region to reverberant environments, which is based on the identification of the acoustic transfer function (ATF) from the loudspeaker over the desired reproduction region using sparse methods. The simulation results confirm that the method leads to a significantly reduced number of required microphones for an accurate multizone sound reproduction compared with the state of the art, while it also facilitates the reproduction over a wide frequency range. In addition, we focus on the improvements of the proposed multizone reproduction system with regard to practical implementation. The so-called 2.5D multizone oundfield reproduction is considered to accurately reproduce the desired multizone soundfield over a selected 2-D plane at the height approximately level with the listener’s ears using a single array of loudspeakers with 3-D reverberant settings. Then, we propose an adaptive reverberation cancelation method for the multizone soundfield reproduction within the desired region and simplify the prior soundfield measurement process. Simulation results suggest that the proposed method provides a faster convergence rate than the comparative approaches under the same hardware provision. Finally, we conduct the real-world implementation based on the proposed theoretical work. The experimental results show that we can achieve a very noticeable acoustic energy contrast between the signals recorded in the bright zone and the quiet zone, especially for the system implementation with reverberation equalization

    Wavelet-based filtration procedure for denoising the predicted CO2 waveforms in smart home within the Internet of Things

    Get PDF
    The operating cost minimization of smart homes can be achieved with the optimization of the management of the building's technical functions by determination of the current occupancy status of the individual monitored spaces of a smart home. To respect the privacy of the smart home residents, indirect methods (without using cameras and microphones) are possible for occupancy recognition of space in smart homes. This article describes a newly proposed indirect method to increase the accuracy of the occupancy recognition of monitored spaces of smart homes. The proposed procedure uses the prediction of the course of CO2 concentration from operationally measured quantities (temperature indoor and relative humidity indoor) using artificial neural networks with a multilayer perceptron algorithm. The mathematical wavelet transformation method is used for additive noise canceling from the predicted course of the CO2 concentration signal with an objective increase accuracy of the prediction. The calculated accuracy of CO2 concentration waveform prediction in the additive noise-canceling application was higher than 98% in selected experiments.Web of Science203art. no. 62
    corecore