150 research outputs found

    Combined Sparse Regularization for Nonlinear Adaptive Filters

    Full text link
    Nonlinear adaptive filters often show some sparse behavior due to the fact that not all the coefficients are equally useful for the modeling of any nonlinearity. Recently, a class of proportionate algorithms has been proposed for nonlinear filters to leverage sparsity of their coefficients. However, the choice of the norm penalty of the cost function may be not always appropriate depending on the problem. In this paper, we introduce an adaptive combined scheme based on a block-based approach involving two nonlinear filters with different regularization that allows to achieve always superior performance than individual rules. The proposed method is assessed in nonlinear system identification problems, showing its effectiveness in taking advantage of the online combined regularization.Comment: This is a corrected version of the paper presented at EUSIPCO 2018 and published on IEEE https://ieeexplore.ieee.org/document/855295

    Adaptive Algorithms for Intelligent Acoustic Interfaces

    Get PDF
    Modern speech communications are evolving towards a new direction which involves users in a more perceptive way. That is the immersive experience, which may be considered as the “last-mile” problem of telecommunications. One of the main feature of immersive communications is the distant-talking, i.e. the hands-free (in the broad sense) speech communications without bodyworn or tethered microphones that takes place in a multisource environment where interfering signals may degrade the communication quality and the intelligibility of the desired speech source. In order to preserve speech quality intelligent acoustic interfaces may be used. An intelligent acoustic interface may comprise multiple microphones and loudspeakers and its peculiarity is to model the acoustic channel in order to adapt to user requirements and to environment conditions. This is the reason why intelligent acoustic interfaces are based on adaptive filtering algorithms. The acoustic path modelling entails a set of problems which have to be taken into account in designing an adaptive filtering algorithm. Such problems may be basically generated by a linear or a nonlinear process and can be tackled respectively by linear or nonlinear adaptive algorithms. In this work we consider such modelling problems and we propose novel effective adaptive algorithms that allow acoustic interfaces to be robust against any interfering signals, thus preserving the perceived quality of desired speech signals. As regards linear adaptive algorithms, a class of adaptive filters based on the sparse nature of the acoustic impulse response has been recently proposed. We adopt such class of adaptive filters, named proportionate adaptive filters, and derive a general framework from which it is possible to derive any linear adaptive algorithm. Using such framework we also propose some efficient proportionate adaptive algorithms, expressly designed to tackle problems of a linear nature. On the other side, in order to address problems deriving from a nonlinear process, we propose a novel filtering model which performs a nonlinear transformations by means of functional links. Using such nonlinear model, we propose functional link adaptive filters which provide an efficient solution to the modelling of a nonlinear acoustic channel. Finally, we introduce robust filtering architectures based on adaptive combinations of filters that allow acoustic interfaces to more effectively adapt to environment conditions, thus providing a powerful mean to immersive speech communications

    Adaptive Algorithms for Intelligent Acoustic Interfaces

    Get PDF
    Modern speech communications are evolving towards a new direction which involves users in a more perceptive way. That is the immersive experience, which may be considered as the “last-mile” problem of telecommunications. One of the main feature of immersive communications is the distant-talking, i.e. the hands-free (in the broad sense) speech communications without bodyworn or tethered microphones that takes place in a multisource environment where interfering signals may degrade the communication quality and the intelligibility of the desired speech source. In order to preserve speech quality intelligent acoustic interfaces may be used. An intelligent acoustic interface may comprise multiple microphones and loudspeakers and its peculiarity is to model the acoustic channel in order to adapt to user requirements and to environment conditions. This is the reason why intelligent acoustic interfaces are based on adaptive filtering algorithms. The acoustic path modelling entails a set of problems which have to be taken into account in designing an adaptive filtering algorithm. Such problems may be basically generated by a linear or a nonlinear process and can be tackled respectively by linear or nonlinear adaptive algorithms. In this work we consider such modelling problems and we propose novel effective adaptive algorithms that allow acoustic interfaces to be robust against any interfering signals, thus preserving the perceived quality of desired speech signals. As regards linear adaptive algorithms, a class of adaptive filters based on the sparse nature of the acoustic impulse response has been recently proposed. We adopt such class of adaptive filters, named proportionate adaptive filters, and derive a general framework from which it is possible to derive any linear adaptive algorithm. Using such framework we also propose some efficient proportionate adaptive algorithms, expressly designed to tackle problems of a linear nature. On the other side, in order to address problems deriving from a nonlinear process, we propose a novel filtering model which performs a nonlinear transformations by means of functional links. Using such nonlinear model, we propose functional link adaptive filters which provide an efficient solution to the modelling of a nonlinear acoustic channel. Finally, we introduce robust filtering architectures based on adaptive combinations of filters that allow acoustic interfaces to more effectively adapt to environment conditions, thus providing a powerful mean to immersive speech communications

    Combinations of adaptive filters

    Get PDF
    Adaptive filters are at the core of many signal processing applications, ranging from acoustic noise supression to echo cancelation [1], array beamforming [2], channel equalization [3], to more recent sensor network applications in surveillance, target localization, and tracking. A trending approach in this direction is to recur to in-network distributed processing in which individual nodes implement adaptation rules and diffuse their estimation to the network [4], [5].The work of Jerónimo Arenas-García and Luis Azpicueta-Ruiz was partially supported by the Spanish Ministry of Economy and Competitiveness (under projects TEC2011-22480 and PRI-PIBIN-2011-1266. The work of Magno M.T. Silva was partially supported by CNPq under Grant 304275/2014-0 and by FAPESP under Grant 2012/24835-1. The work of Vítor H. Nascimento was partially supported by CNPq under grant 306268/2014-0 and FAPESP under grant 2014/04256-2. The work of Ali Sayed was supported in part by NSF grants CCF-1011918 and ECCS-1407712. We are grateful to the colleagues with whom we have shared discussions and coauthorship of papers along this research line, especially Prof. Aníbal R. Figueiras-Vidal

    Study of L0-norm constraint normalized subband adaptive filtering algorithm

    Full text link
    Limited by fixed step-size and sparsity penalty factor, the conventional sparsity-aware normalized subband adaptive filtering (NSAF) type algorithms suffer from trade-off requirements of high filtering accurateness and quicker convergence behavior. To deal with this problem, this paper proposes variable step-size L0-norm constraint NSAF algorithms (VSS-L0-NSAFs) for sparse system identification. We first analyze mean-square-deviation (MSD) statistics behavior of the L0-NSAF algorithm innovatively in according to a novel recursion form and arrive at corresponding expressions for the cases that background noise variance is available and unavailable, where correlation degree of system input is indicated by scaling parameter r. Based on derivations, we develop an effective variable step-size scheme through minimizing the upper bounds of the MSD under some reasonable assumptions and lemma. To realize performance improvement, an effective reset strategy is incorporated into presented algorithms to tackle with non-stationary situations. Finally, numerical simulations corroborate that the proposed algorithms achieve better performance in terms of estimation accurateness and tracking capability in comparison with existing related algorithms in sparse system identification and adaptive echo cancellation circumstances.Comment: 15 pages,15 figure

    Linear and nonlinear room compensation of audio rendering systems

    Full text link
    [EN] Common audio systems are designed with the intent of creating real and immersive scenarios that allow the user to experience a particular acoustic sensation that does not depend on the room he is perceiving the sound. However, acoustic devices and multichannel rendering systems working inside a room, can impair the global audio effect and thus the 3D spatial sound. In order to preserve the spatial sound characteristics of multichannel rendering techniques, adaptive filtering schemes are presented in this dissertation to compensate these electroacoustic effects and to achieve the immersive sensation of the desired acoustic system. Adaptive filtering offers a solution to the room equalization problem that is doubly interesting. First of all, it iteratively solves the room inversion problem, which can become computationally complex to obtain when direct methods are used. Secondly, the use of adaptive filters allows to follow the time-varying room conditions. In this regard, adaptive equalization (AE) filters try to cancel the echoes due to the room effects. In this work, we consider this problem and propose effective and robust linear schemes to solve this equalization problem by using adaptive filters. To do this, different adaptive filtering schemes are introduced in the AE context. These filtering schemes are based on three strategies previously introduced in the literature: the convex combination of filters, the biasing of the filter weights and the block-based filtering. More specifically, and motivated by the sparse nature of the acoustic impulse response and its corresponding optimal inverse filter, we introduce different adaptive equalization algorithms. In addition, since audio immersive systems usually require the use of multiple transducers, the multichannel adaptive equalization problem should be also taken into account when new single-channel approaches are presented, in the sense that they can be straightforwardly extended to the multichannel case. On the other hand, when dealing with audio devices, consideration must be given to the nonlinearities of the system in order to properly equalize the electroacoustic system. For that purpose, we propose a novel nonlinear filtered-x approach to compensate both room reverberation and nonlinear distortion with memory caused by the amplifier and loudspeaker devices. Finally, it is important to validate the algorithms proposed in a real-time implementation. Thus, some initial research results demonstrate that an adaptive equalizer can be used to compensate room distortions.[ES] Los sistemas de audio actuales están diseñados con la idea de crear escenarios reales e inmersivos que permitan al usuario experimentar determinadas sensaciones acústicas que no dependan de la sala o situación donde se esté percibiendo el sonido. Sin embargo, los dispositivos acústicos y los sistemas multicanal funcionando dentro de salas, pueden perjudicar el efecto global sonoro y de esta forma, el sonido espacial 3D. Para poder preservar las características espaciales sonoras de los sistemas de reproducción multicanal, en esta tesis se presentan los esquemas de filtrado adaptativo para compensar dichos efectos electroacústicos y conseguir la sensación inmersiva del sistema sonoro deseado. El filtrado adaptativo ofrece una solución al problema de salas que es interesante por dos motivos. Por un lado, resuelve de forma iterativa el problema de inversión de salas, que puede llegar a ser computacionalmente costoso para los métodos de inversión directos existentes. Por otro lado, el uso de filtros adaptativos permite seguir las variaciones cambiantes de los efectos de la sala de escucha. A este respecto, los filtros de ecualización adaptativa (AE) intentan cancelar los ecos introducidos por la sala de escucha. En esta tesis se considera este problema y se proponen esquemas lineales efectivos y robustos para resolver el problema de ecualización mediante filtros adaptativos. Para conseguirlo, se introducen diferentes esquemas de filtrado adaptativo para AE. Estos esquemas de filtrado se basan en tres estrategias ya usadas en la literatura: la combinación convexa de filtros, el sesgado de los coeficientes del filtro y el filtrado basado en bloques. Más especificamente y motivado por la naturaleza dispersiva de las respuestas al impulso acústicas y de sus correspondientes filtros inversos óptimos, se presentan diversos algoritmos adaptativos de ecualización específicos. Además, ya que los sistemas de audio inmersivos requieren usar normalmente múltiples trasductores, se debe considerar también el problema de ecualización multicanal adaptativa cuando se diseñan nuevas estrategias de filtrado adaptativo para sistemas monocanal, ya que éstas deben ser fácilmente extrapolables al caso multicanal. Por otro lado, cuando se utilizan dispositivos acústicos, se debe considerar la existencia de no linearidades en el sistema elactroacústico, para poder ecualizarlo correctamente. Por este motivo, se propone un nuevo modelo no lineal de filtrado-x que compense a la vez la reverberación introducida por la sala y la distorsión no lineal con memoria provocada por el amplificador y el altavoz. Por último, es importante validar los algoritmos propuestos mediante implementaciones en tiempo real, para asegurarnos que pueden realizarse. Para ello, se presentan algunos resultados experimentales iniciales que muestran la idoneidad de la ecualización adaptativa en problemas de compensación de salas.[CA] Els sistemes d'àudio actuals es dissenyen amb l'objectiu de crear ambients reals i immersius que permeten a l'usuari experimentar una sensació acústica particular que no depèn de la sala on està percebent el so. No obstant això, els dispositius acústics i els sistemes de renderització multicanal treballant dins d'una sala poden arribar a modificar l'efecte global de l'àudio i per tant, l'efecte 3D del so a l'espai. Amb l'objectiu de conservar les característiques espacials del so obtingut amb tècniques de renderització multicanal, aquesta tesi doctoral presenta esquemes de filtrat adaptatiu per a compensar aquests efectes electroacústics i aconseguir una sensació immersiva del sistema acústic desitjat. El filtrat adaptatiu presenta una solució al problema d'equalització de sales que es interessant baix dos punts de vista. Per una banda, el filtrat adaptatiu resol de forma iterativa el problema inversió de sales, que pot arribar a ser molt complexe computacionalment quan s'utilitzen mètodes directes. Per altra banda, l'ús de filtres adaptatius permet fer un seguiment de les condicions canviants de la sala amb el temps. Més concretament, els filtres d'equalització adaptatius (EA) intenten cancel·lar els ecos produïts per la sala. A aquesta tesi, considerem aquest problema i proposem esquemes lineals efectius i robustos per a resoldre aquest problema d'equalització mitjançant filtres adaptatius. Per aconseguir-ho, diferent esquemes de filtrat adaptatiu es presenten dins del context del problema d'EA. Aquests esquemes de filtrat es basen en tres estratègies ja presentades a l'estat de l'art: la combinació convexa de filtres, el sesgat dels pesos del filtre i el filtrat basat en blocs. Més concretament, i motivat per la naturalesa dispersa de la resposta a l'impuls acústica i el corresponent filtre òptim invers, presentem diferents algorismes d'equalització adaptativa. A més a més, com que els sistemes d'àudio immersiu normalment requereixen l'ús de múltiples transductors, cal considerar també el problema d'equalització adaptativa multicanal quan es presenten noves solucions de canal simple, ja que aquestes s'han de poder estendre fàcilment al cas multicanal. Un altre aspecte a considerar quan es treballa amb dispositius d'àudio és el de les no linealitats del sistema a l'hora d'equalitzar correctament el sistema electroacústic. Amb aquest objectiu, a aquesta tesi es proposa una nova tècnica basada en filtrat-x no lineal, per a compensar tant la reverberació de la sala com la distorsió no lineal amb memòria introduïda per l'amplificador i els altaveus. Per últim, és important validar la implementació en temps real dels algorismes proposats. Amb aquest objectiu, alguns resultats inicials demostren la idoneïtat de l'equalització adaptativa en problemes de compensació de sales.Fuster Criado, L. (2015). Linear and nonlinear room compensation of audio rendering systems [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/5945

    Sparse nonlinear optimization for signal processing and communications

    Get PDF
    This dissertation proposes three classes of new sparse nonlinear optimization algorithms for network echo cancellation (NEC), 3-D synthetic aperture radar (SAR) image reconstruction, and adaptive turbo equalization in multiple-input multiple-output (MIMO) underwater acoustic (UWA) communications, respectively. For NEC, the proposed two proportionate affine projection sign algorithms (APSAs) utilize the sparse nature of the network impulse response (NIR). Benefiting from the characteristics of l₁-norm optimization, affine projection, and proportionate matrix, the new algorithms are more robust to impulsive interferences and colored input than the conventional adaptive algorithms. For 3-D SAR image reconstruction, the proposed two compressed sensing (CS) approaches exploit the sparse nature of the SAR holographic image. Combining CS with the range migration algorithms (RMAs), these approaches can decrease the load of data acquisition while recovering satisfactory 3-D SAR image through l₁-norm optimization. For MIMO UWA communications, a robust iterative channel estimation based minimum mean-square-error (MMSE) turbo equalizer is proposed for large MIMO detection. The MIMO channel estimation is performed jointly with the MMSE equalizer and the maximum a posteriori probability (MAP) decoder. The proposed MIMO detection scheme has been tested by experimental data and proved to be robust against tough MIMO channels. --Abstract, page iv

    Steady-State Performance of an Adaptive Combined MISO Filter Using the Multichannel Affine Projection Algorithm

    Get PDF
    The combination of adaptive filters is an effective approach to improve filtering performance. In this paper, we investigate the performance of an adaptive combined scheme between two adaptive multiple-input single-output (MISO) filters, which can be easily extended to the case of multiple outputs. In order to generalize the analysis, we consider the multichannel affine projection algorithm (APA) to update the coefficients of the MISO filters, which increases the possibility of exploiting the capabilities of the filtering scheme. Using energy conservation relations, we derive a theoretical behavior of the proposed adaptive combination scheme at steady state. Such analysis entails some further theoretical insights with respect to the single channel combination scheme. Simulation results prove both the validity of the theoretical steady-state analysis and the effectiveness of the proposed combined scheme.The work of Danilo Comminiello, Michele Scarpiniti and Aurelio Uncini has been supported by the project: “Vehicular Fog energy-efficient QoS mining and dissemination of multimedia Big Data streams (V-FoG and V-Fog2)”, funded by Sapienza University of Rome Bando 2016 and 2017. The work of Michele Scarpiniti and Aurelio Uncini has been also supported by the project: “GAUChO – A Green Adaptive Fog Computing and networking Architectures” funded by the MIUR Progetti di Ricerca di Rilevante Interesse Nazionale (PRIN) Bando 2015, grant 2015YPXH4W_004. The work of Luis A. Azpicueta-Ruiz is partially supported by the Spanish Ministry of Economy and Competitiveness (under grant DAMA (TIN2015-70308-REDT) and grants TEC2014-52289-R and TEC2017-83838-R), and by the European Union

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    Get PDF
    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the users signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the in- coming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non- minimum phase Room Impulse Response (RIR). We describe the process by which percep- tually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique

    An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony

    Get PDF
    In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative acoustic echo reduction approach to existing approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the users signal from a mixture also containing acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform domain. One of these bases is for the spectral energy of the acoustic echo signal, and is formed from the in- coming far-end user’s speech, while the other basis is for the spectral energy of the near-end speaker, and is trained with speech data a priori. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address Double-Talk Detection (DTD) for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal that is produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on doubletalk. Using a standard evaluation technique, the proposed algorithm is shown to have comparable detection performance to an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction, with the proposed DTD algorithm generating minimal false doubletalk indications upon initiation and in response to room changes in comparison to the existing conventional DTD. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non- minimum phase Room Impulse Response (RIR). We describe the process by which percep- tually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other approach modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique
    corecore