442 research outputs found

    Matrix of Polynomials Model based Polynomial Dictionary Learning Method for Acoustic Impulse Response Modeling

    Get PDF
    We study the problem of dictionary learning for signals that can be represented as polynomials or polynomial matrices, such as convolutive signals with time delays or acoustic impulse responses. Recently, we developed a method for polynomial dictionary learning based on the fact that a polynomial matrix can be expressed as a polynomial with matrix coefficients, where the coefficient of the polynomial at each time lag is a scalar matrix. However, a polynomial matrix can be also equally represented as a matrix with polynomial elements. In this paper, we develop an alternative method for learning a polynomial dictionary and a sparse representation method for polynomial signal reconstruction based on this model. The proposed methods can be used directly to operate on the polynomial matrix without having to access its coefficients matrices. We demonstrate the performance of the proposed method for acoustic impulse response modeling.Comment: 5 pages, 2 figure

    Binary Sparse Coding of Convolutive Mixtures for Sound Localization and Separation via Spatialization

    Get PDF
    We propose a sparse coding approach to address the problem of source-sensor localization and speech reconstruction. This approach relies on designing a dictionary of spatialized signals by projecting the microphone array recordings into the array manifolds characterized for different locations in a reverberant enclosure using the image model. Sparse representation over this dictionary enables identifying the subspace of the actual recordings and its correspondence to the source and sensor locations. The speech signal is reconstructed by inverse filtering the acoustic channels associated to the array manifolds. We provide rigorous analysis on the optimality of speech reconstruction by elucidating the links between inverse filtering and source separation followed by deconvolution. This procedure is evaluated for localization, reconstruction and recognition of simultaneous speech sources using real data recordings. The results demonstrate the effectiveness of the proposed approach and compare favorably against beamforming and independent component analysis techniques

    A Tutorial on Auditory Attention Identification Methods

    Get PDF
    Auditory attention identification methods attempt to identify the sound source of a listener's interest by analyzing measurements of electrophysiological data. We present a tutorial on the numerous techniques that have been developed in recent decades, and we present an overview of current trends in multivariate correlation-based and model-based learning frameworks. The focus is on the use of linear relations between electrophysiological and audio data. The way in which these relations are computed differs. For example, canonical correlation analysis (CCA) finds a linear subset of electrophysiological data that best correlates to audio data and a similar subset of audio data that best correlates to electrophysiological data. Model-based (encoding and decoding) approaches focus on either of these two sets. We investigate the similarities and differences between these linear model philosophies. We focus on (1) correlation-based approaches (CCA), (2) encoding/decoding models based on dense estimation, and (3) (adaptive) encoding/decoding models based on sparse estimation. The specific focus is on sparsity-driven adaptive encoding models and comparing the methodology in state-of-the-art models found in the auditory literature. Furthermore, we outline the main signal processing pipeline for how to identify the attended sound source in a cocktail party environment from the raw electrophysiological data with all the necessary steps, complemented with the necessary MATLAB code and the relevant references for each step. Our main aim is to compare the methodology of the available methods, and provide numerical illustrations to some of them to get a feeling for their potential. A thorough performance comparison is outside the scope of this tutorial

    Listening to Distances and Hearing Shapes:Inverse Problems in Room Acoustics and Beyond

    Get PDF
    A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources--channel--receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract \emph{room}, and to replace it by more familiar \emph{point sets}. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know \emph{a priori} the matching between the peaks and the points in space, and solving the inverse problem is achieved by \emph{echo sorting}---a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the ``Can one hear the shape of a room?'' question, and we answer it with a qualified ``yes''. Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization. Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location---for example a finger snap---and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: We develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply it to various signal processing problems on the sphere

    Sparsity in Linear Predictive Coding of Speech

    Get PDF
    nrpages: 197status: publishe

    Convolutive Blind Source Separation Methods

    Get PDF
    In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy, wherein many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    Get PDF
    The implicit objective of the biennial "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 has gathered about 70 international participants and has featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference.Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1
    • 

    corecore