74 research outputs found

    System approach to robust acoustic echo cancellation through semi-blind source separation based on independent component analysis

    We live in a dynamic world full of noises and interferences. The conventional acoustic echo cancellation (AEC) framework based on the least mean square (LMS) algorithm by itself lacks the ability to handle many secondary signals that interfere with the adaptive filtering process, e.g., local speech and background noise. In this dissertation, we build a foundation for what we refer to as the system approach to signal enhancement as we focus on the AEC problem. We first propose the residual echo enhancement (REE) technique that utilizes the error recovery nonlinearity (ERN) to "enhance" the filter estimation error prior to the filter adaptation. The single-channel AEC problem can be viewed as a special case of semi-blind source separation (SBSS) where one of the source signals is partially known, i.e., the far-end microphone signal that generates the near-end acoustic echo. SBSS optimized via independent component analysis (ICA) leads to the system combination of the LMS algorithm with the ERN that allows for continuous and stable adaptation even during double talk. Second, we extend the system perspective to the decorrelation problem for AEC, where we show that the REE procedure can be applied effectively in a multi-channel AEC (MCAEC) setting to indirectly assist the recovery of lost AEC performance due to inter-channel correlation, known generally as the "non-uniqueness" problem. We develop a novel, computationally efficient technique of frequency-domain resampling (FDR) that effectively alleviates the non-uniqueness problem directly while introducing minimal distortion to signal quality and statistics. We also apply the system approach to the multi-delay filter (MDF) that suffers from the inter-block correlation problem. Finally, we generalize the MCAEC problem in the SBSS framework and discuss many issues related to the implementation of an SBSS system.
We propose a constrained batch-online implementation of SBSS that stabilizes the convergence behavior even in the worst-case scenario of a single far-end talker along with the non-uniqueness condition on the far-end mixing system. The proposed techniques are developed from a pragmatic standpoint, motivated by real-world problems in acoustic and audio signal processing. Generalization of the orthogonality principle to the system level of an AEC problem allows us to relate AEC to source separation that seeks to maximize the independence, hence implicitly the orthogonality, not only between the error signal and the far-end signal, but also among all signals involved. The system approach, for which the REE paradigm is just one realization, enables the encompassing of many traditional signal enhancement techniques in an analytically consistent yet practically effective manner for solving the enhancement problem in a very noisy and disruptive acoustic mixing environment. PhD. Committee Chair: Biing-Hwang Juang; Committee Member: Brani Vidakovic; Committee Member: David V. Anderson; Committee Member: Jeff S. Shamma; Committee Member: Xiaoli M
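
For orientation, the conventional LMS-style echo canceller that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration only: it uses the normalized LMS (NLMS) variant, a hypothetical three-tap echo path and a noise-free microphone signal, and it does not implement the REE, ERN or SBSS techniques proposed in the dissertation. All function and variable names are assumptions made for this sketch.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate with normalized LMS.

    Returns the error (echo-cancelled) signal and the final filter weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # residual after subtracting the estimated echo
        power = sum(h * h for h in history) + eps  # normalization term
        weights = [w + step * e * h / power for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights


# Toy scenario (assumption): white far-end signal, short echo path,
# no local speech and no background noise.
random.seed(0)
true_path = [0.6, -0.3, 0.1]  # hypothetical echo path
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(true_path[k] * far_end[n - k] for k in range(len(true_path)) if n - k >= 0)
    for n in range(len(far_end))
]
errors, weights = nlms_echo_canceller(far_end, mic)
```

In this idealized setting the weights approach the echo path. Adding local speech or noise to `mic` is the situation the abstract says the plain LMS framework handles poorly.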

    The analysis and improvement of focused source reproduction with wave field synthesis

    This thesis presents a treatise on the rendering of focused sources using wave field synthesis (WFS). The thesis describes the fundamental theory of WFS and presents a thorough derivation of focused source driving functions including monopoles, dipoles and pistonic sources. The principal characteristics of focused sources, including array truncation, spatial aliasing, pre-echo artefacts, colouration and amplitude errors, are analysed in depth and a new spatial aliasing criterion is presented for focused sources. Additionally, a new secondary source selection protocol is presented allowing for directed and symmetrically rendered sources. This thesis also describes how the low frequency rendering of focused sources is limited by the focusing ability of the loudspeaker array and thus derives a formula to predict the focusing limits and the corresponding focal shift that occurs at low frequencies and with short arrays. Subsequently a frequency dependent position correction is derived which increases the positional accuracy of the source. Other characteristics and issues with the rendering of focused sources are also described including the use of large arrays, rendering of moving focused sources, issues with multiple focused sources in the scene, the phase response, and the focal point size of the focused sound field. The perceptual characteristics are also covered, with a review of the literature and a series of subjective tests into the localisation of focused sources. It is shown that an improvement in the localisation can be achieved by including the virtual first order images as point sources into the WFS rendering. Practical rendering of focused sources is generally done in compromised scenarios such as in non-anechoic, reverberant rooms which contain various scattering objects.
These issues are also covered in this thesis with the aid of finite difference time domain models which allow the characterisation of room effects on the reproduced field; it is shown that room effects can actually even out spatial aliasing artefacts and therefore reduce the perception of colouration. Scattering objects can also be included in the model, thus the effects of scattering are also shown and a method of correcting for the scattering is suggested. Also covered is the rendering of focused sources using elevated arrays which can introduce position errors in the rendering.

    Understanding sorting algorithms using music and spatial distribution

    This thesis is concerned with the communication of information using auditory techniques. In particular, a music-based interface has been used to communicate the operation of a number of sorting algorithms to users. This auditory interface has been further enhanced by the creation of an auditory scene including a sound wall, which enables the auditory interface to utilise music parameters in conjunction with 2D/3D spatial distribution to communicate the essential processes in the algorithms. The sound wall has been constructed from a grid of measurements using a human head to create a spatial distribution. The algorithm designer can therefore communicate events using pitch, rhythm and timbre and associate these with particular positions in space. A number of experiments have been carried out to investigate the usefulness of music and the sound wall in communicating information relevant to the algorithms. Further, user understanding of the six algorithms has been tested. In all experiments the effects of previous musical experience have been allowed for. The results show that users can utilise musical parameters in understanding algorithms and that in all cases improvements have been observed using the sound wall. Different user performance was observed with different algorithms and it is concluded that certain types of information lend themselves more readily to communication through auditory interfaces than others. As a result of the experimental analysis, recommendations are given on how to improve the sound wall and user understanding by improved choice of the musical mappings.
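
The idea of attaching pitch and spatial position to algorithm events can be illustrated with a small sketch. The mapping below is hypothetical and is not the one used in the thesis: values are mapped linearly to MIDI note numbers, array positions are mapped to a left-to-right pan value, and bubble sort stands in for the sorting algorithms studied. No audio is produced; the function only returns a list of events that a synthesiser could play.

```python
def value_to_midi(value, low, high, base_note=48, span=24):
    """Linearly map a value in [low, high] to a MIDI note number."""
    if high == low:
        return base_note
    return base_note + round((value - low) * span / (high - low))


def sonify_bubble_sort(data):
    """Run bubble sort and record one auditory event per comparison.

    Each event holds the kind of step, the array position, a pan value in
    [-1, 1] (assumed stand-in for a position on the sound wall) and the
    pitches of the two compared values.
    """
    items = list(data)
    if not items:
        return [], []
    low, high = min(items), max(items)
    n = len(items)
    events = []
    for end in range(n - 1, 0, -1):
        for i in range(end):
            a, b = items[i], items[i + 1]
            swapped = a > b
            if swapped:
                items[i], items[i + 1] = b, a
            events.append({
                "kind": "swap" if swapped else "compare",
                "position": i,
                "pan": -1.0 + 2.0 * i / (n - 1),
                "notes": (value_to_midi(a, low, high), value_to_midi(b, low, high)),
            })
    return items, events


sorted_items, events = sonify_bubble_sort([3, 1, 2])
```

Distinguishing swaps from plain comparisons by event kind leaves room to assign them different timbres or rhythms, in the spirit of the pitch, rhythm and timbre mapping the abstract describes.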

    The creation of a binaural spatialization tool

    The main focus of the research presented within this thesis is, as the title suggests, binaural spatialization. Binaural technology and, especially, the binaural recording technique are not particularly recent. Nevertheless, the interest in this technology has lately become substantial due to the increase in the calculation power of personal computers, which started to allow the complete and accurate real-time simulation of three-dimensional sound-fields over headphones. The goals of this body of research have been determined in order to provide elements of novelty and of contribution to the state of the art in the field of binaural spatialization. A brief summary of these is found in the following list:
    • The development and implementation of a binaural spatialization technique with Distance Simulation, based on the individual simulation of the distance cues and Binaural Reverb, in turn based on the weighted mix between the signals convolved with the different HRIR and BRIR sets;
    • The development and implementation of a characterization process for modifying a BRIR set in order to simulate different environments with different characteristics in terms of frequency response and reverb time;
    • The creation of a real-time and offline binaural spatialization application, implementing the techniques cited in the previous points, and including a set of multichannel (and Ambisonics)-to-binaural conversion tools;
    • The performance of a perceptual evaluation stage to verify the effectiveness, realism, and quality of the techniques developed; and
    • The application and use of the developed tools within both scientific and artistic “case studies”.
In the following chapters, sections, and subsections, the research performed between January 2006 and March 2010 will be described, outlining the different stages before, during, and after the development of the software platform, analysing the results of the perceptual evaluations and drawing conclusions that could, in the future, be considered the starting point for new and innovative research projects.
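
The weighted mix between signals convolved with HRIR and BRIR sets, mentioned in the abstract, can be sketched roughly as below. This is an assumption-laden illustration rather than the thesis implementation: the impulse responses are toy placeholder values, the single `reverb_weight` parameter is a hypothetical way to express the mix, and direct (non-FFT) convolution is used for brevity.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * max(len(signal) + len(impulse_response) - 1, 0)
    for n, s in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += s * h
    return out


def binaural_mix(mono, hrir, brir, reverb_weight):
    """Render a mono signal to (left, right) as a weighted HRIR/BRIR mix.

    hrir and brir are (left, right) pairs of impulse responses;
    reverb_weight in [0, 1] sets the share of the reverberant (BRIR) path.
    """
    if not 0.0 <= reverb_weight <= 1.0:
        raise ValueError("reverb_weight must lie in [0, 1]")
    channels = []
    for ear in (0, 1):
        dry = convolve(mono, hrir[ear])
        wet = convolve(mono, brir[ear])
        length = max(len(dry), len(wet))
        dry += [0.0] * (length - len(dry))  # zero-pad to a common length
        wet += [0.0] * (length - len(wet))
        channels.append(
            [(1.0 - reverb_weight) * d + reverb_weight * w for d, w in zip(dry, wet)]
        )
    return channels[0], channels[1]


# Toy impulse responses (placeholders, not measured data).
hrir = ([1.0, 0.5], [0.5, 0.25])
brir = ([1.0, 0.0, 0.5], [0.5, 0.0, 0.25])
left, right = binaural_mix([1.0], hrir, brir, reverb_weight=0.5)
```

A real-time tool would normally replace the direct convolution with partitioned FFT convolution; the mixing step itself stays the same.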

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019

    International audience

    The development of a design tool for 5-speaker surround sound decoders

    This thesis presents the development of a software-based decoder design tool (DDT) for producing Ambisonic decoders optimised for playback over 5-speaker layouts. The research specifically focuses on developing decoders for irregular layouts with loudspeakers at a constant radial distance from the central listening position. It was motivated by the desire to provide better surround sound over the standard ITU 5-speaker layout for listeners in the sweet spot and off-centre positions. A wide-ranging literature review is presented revealing the need for such work. The DDT employs the Tabu Search algorithm to seek improved decoder parameters according to a multi-objective fitness function. The fitness function encapsulates criteria from psychoacoustic models as a set of objectives. In order to ensure the objectives were treated equally, a method known as 'range-removal' was used for the first time in Ambisonic decoder design. A companion technique termed 'importance' allows the systematic prioritisation of range-removed objectives giving a designer control over desired decoder criteria. Additional elements exist in the DDT that can be turned on or off in different combinations. They include: a novel component for producing decoders with even performance by angle, a novel component for producing performance that correlates with the pattern of human spatial resolution estimated in previous Minimum Audible Angle experiments, and the ability to produce frequency dependent or independent decoders of different orders. Moreover, the user of the DDT can optimise performance for a single listener or multiple distributed listeners. To make the DDT as interactive as possible searches can optionally run on a High Performance Computer. This thesis also details the extensive testing of Ambisonic decoders for the ITU layout.
Decoders have been assessed subjectively in listening tests and objectively using binaural measurements, which has verified the methods developed in this research and the DDT's concept. Furthermore, decoders derived by the DDT have been compared to existing decoders and the results show they give equal or better performance. The development of a fully-functioning DDT which incorporates techniques for range-removal, importance, even performance by angle, minimum audible angle, off-centre listeners and their use in any combination represent the key outcomes of this work. EThOS - Electronic Theses Online Service; GB; United Kingdom
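
As a rough illustration of the Tabu Search strategy named in the abstract, the sketch below minimises a toy two-parameter objective on an integer grid. It is a generic textbook-style variant under stated assumptions: the objective, neighbourhood, tabu-list length and all names are hypothetical, and nothing here reproduces the DDT's psychoacoustic fitness function, range-removal or importance weighting.

```python
def tabu_search(objective, start, step=1, iterations=50, tabu_size=5):
    """Minimise objective over integer tuples with a short-term tabu list.

    At each iteration the best non-tabu neighbour is accepted, even if it
    is worse than the current point, and the best point seen is remembered.
    """
    current = tuple(start)
    best, best_score = current, objective(current)
    tabu = [current]  # recently visited points that may not be revisited
    for _ in range(iterations):
        neighbours = []
        for i in range(len(current)):
            for delta in (-step, step):
                candidate = current[:i] + (current[i] + delta,) + current[i + 1:]
                if candidate not in tabu:
                    neighbours.append(candidate)
        if not neighbours:
            break
        current = min(neighbours, key=objective)
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)  # forget the oldest tabu entry
        score = objective(current)
        if score < best_score:
            best, best_score = current, score
    return best, best_score


def toy_objective(params):
    """Hypothetical stand-in for a decoder fitness function (lower is better)."""
    return (params[0] - 3) ** 2 + (params[1] + 2) ** 2


best, best_score = tabu_search(toy_objective, start=(0, 0))
```

Accepting the best non-tabu neighbour even when it is worse is what lets the search leave local optima, while the remembered best point preserves the result.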