157 research outputs found

    Informed Source Separation from compressed mixtures using spatial Wiener filter and quantization noise estimation

    In a previous work, we proposed an Informed Source Separation system based on Wiener filtering for active listening of music from uncompressed (16-bit PCM) multichannel mix signals. In the present work, the system is improved to work with (MPEG-2 AAC) compressed mix signals: quantization noise is estimated from the AAC bitstream at the decoder and explicitly taken into account in the source separation process. In addition, a direct MDCT-to-STFT transform is used to optimize the computational efficiency of the process in the STFT domain from AAC-decoded MDCT coefficients.
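The noise-aware Wiener filtering idea in this abstract can be illustrated with a minimal sketch. It is a single-channel simplification, not the authors' spatial (multichannel) filter: the function name `wiener_separate`, the flattened list of time-frequency bins, and the assumption that source and noise power spectra are simply given as inputs (rather than decoded from side information or the AAC bitstream) are all hypothetical.

```python
from typing import List


def wiener_separate(
    mix: List[complex],
    source_psds: List[List[float]],
    noise_psd: List[float],
    eps: float = 1e-12,
) -> List[List[complex]]:
    """Estimate each source from the mixture with a Wiener-style gain.

    mix         -- mixture STFT coefficients, one complex value per bin
    source_psds -- per-source power spectra, same length as mix (assumed known)
    noise_psd   -- noise power per bin (assumed given here, e.g. quantization noise)
    """
    estimates = []
    for psd in source_psds:
        est = []
        for k, x in enumerate(mix):
            # Total power in this bin: all sources plus the noise term.
            total = sum(p[k] for p in source_psds) + noise_psd[k]
            gain = psd[k] / (total + eps)
            est.append(gain * x)
        estimates.append(est)
    return estimates


if __name__ == "__main__":
    mixture = [2 + 0j, 4j]
    psds = [[1.0, 3.0], [1.0, 1.0]]
    print(wiener_separate(mixture, psds, noise_psd=[0.0, 0.0]))
```

With a zero noise term the source estimates sum back to the mixture; a nonzero noise term shrinks every gain, which is the effect of accounting for the noise explicitly.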

    Rétroingénierie du son pour l'écoute active et autres applications (Reverse audio engineering for active listening and other applications)

    This work deals with the problem of reverse audio engineering for active listening. The format under consideration is the audio CD. The musical content is viewed as the result of a chain of composition, recording, mixing, and mastering. The inversion of the latter two stages constitutes the core of the problem at hand. The audio signal is treated as a post-nonlinear mixture; thus, the mixture is decompressed before being decomposed into audio tracks. The problem is tackled in an informed context: the inversion is accompanied by information that is specific to the content production. In this manner, the quality of the inversion is significantly improved. The size of this information is reduced by means of quantization and coding methods and some facts from psychoacoustics. The proposed methods are applicable in real time and have low complexity. The obtained results advance the state of the art and contribute new insights. Source: BORDEAUX1-Bib.electronique (335229901) / Sudoc, France.

    Quantization-aware Parameter Estimation for Audio Upmixing

    Upmixing consists in extracting audio objects out of their downmix, given some parameters computed beforehand at a coding stage. It is an important task in audio processing with many applications in the entertainment industry. One particularly successful approach for this purpose is to compress the audio objects through nonnegative matrix factorization (NMF) parameters at the coder, to be used for separating the downmix at the decoder. In this paper, we focus on such NMF methods for audio compression, which operate at very low parameter bitrates. In existing methods, parameter estimation and quantization are conducted independently. Here, we propose two extensions: first, we jointly estimate and quantize the parameters at the coder to ensure good reconstruction at the decoder. Second, we propose a parameter refinement method, operated at the decoder, that benefits from priors induced by quantization to yield better performance. We show that our contributions outperform existing baseline methods.
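As a rough illustration of the baseline this abstract contrasts itself with (parameters estimated first, then quantized independently), the sketch below quantizes already-estimated NMF factors and rebuilds the model spectrogram at the decoder. The uniform dB-domain quantizer, the 3 dB step, and the helper names `quantize_log`, `dequantize_log`, and `matmul` are assumptions made for illustration; the paper's joint estimation and decoder-side refinement are not reproduced here.

```python
import math
from typing import List

Matrix = List[List[float]]


def quantize_log(values: Matrix, step_db: float = 3.0, floor: float = 1e-6) -> List[List[int]]:
    """Map nonnegative parameters to integer indices on a uniform dB grid."""
    return [
        [round(10 * math.log10(max(v, floor)) / step_db) for v in row]
        for row in values
    ]


def dequantize_log(indices: List[List[int]], step_db: float = 3.0) -> Matrix:
    """Recover approximate parameter values from the integer indices."""
    return [[10 ** (i * step_db / 10) for i in row] for row in indices]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, used to form the NMF model V = W H."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


if __name__ == "__main__":
    W = [[1.0, 0.5], [0.25, 2.0]]   # spectral templates (frequency x component)
    H = [[0.5, 1.0, 0.0], [2.0, 0.1, 1.0]]  # activations (component x frame)
    # Coder side: transmit only the integer indices.
    W_idx, H_idx = quantize_log(W), quantize_log(H)
    # Decoder side: rebuild the object's model spectrogram from the indices.
    V_hat = matmul(dequantize_log(W_idx), dequantize_log(H_idx))
    print(V_hat)
```

The decoded model spectrogram would then drive a time-frequency mask on the downmix; the quantization error introduced here is what a joint estimate-and-quantize scheme tries to compensate for.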

    Object-based Modeling of Audio for Coding and Source Separation

    This thesis studies several data decomposition algorithms for obtaining an object-based representation of an audio signal. The estimation of the representation parameters is coupled with audio-specific criteria, such as the spectral redundancy, sparsity, perceptual relevance and spatial position of sounds. The objective is to obtain an audio signal representation that is composed of meaningful entities called audio objects that reflect the properties of real-world sound objects and events. The estimation of the object-based model is based on magnitude spectrogram redundancy using non-negative matrix factorization with extensions to multichannel and complex-valued data. The benefits of working with object-based audio representations over the conventional time-frequency bin-wise processing are studied. The two main applications of the object-based audio representations proposed in this thesis are spatial audio coding and sound source separation from multichannel microphone array recordings. In the proposed spatial audio coding algorithm, the audio objects are estimated from the multichannel magnitude spectrogram. The audio objects are used for recovering the content of each original channel from a single downmixed signal, using time-frequency filtering. The perceptual relevance of modeling the audio signal is considered in the estimation of the parameters of the object-based model, and the sparsity of the model is utilized in encoding its parameters. Additionally, a quantization of the model parameters is proposed that reflects the perceptual relevance of each quantized element. The proposed object-based spatial audio coding algorithm is evaluated via listening tests, comparing its overall perceptual quality to that of conventional time-frequency block-wise methods at the same bitrates. The proposed approach is found to produce comparable coding efficiency while providing additional functionality via the object-based coding domain representation, such as the blind separation of the mixture of sound sources in the encoded channels. For the sound source separation from multichannel audio recorded by a microphone array, a method combining an object-based magnitude model and spatial covariance matrix estimation is considered. A direction-of-arrival-based model for the spatial covariance matrices of the sound sources is proposed. Unlike the conventional approaches, the estimation of the parameters of the proposed spatial covariance matrix model ensures a spatially coherent solution for the spatial parameterization of the sound sources. The separation quality is measured with objective criteria, and the proposed method is shown to improve over state-of-the-art sound source separation methods on recordings made with a small microphone array.

    Source Separation in the Presence of Side-information

    The source separation problem involves the separation of unknown signals from their mixture. This problem is relevant in a wide range of applications, including audio signal processing, communication, biomedical signal processing and art investigation, to name a few. There is a vast literature on this problem, which relies either on strong assumptions about the source signals or on the availability of additional data. This thesis proposes new algorithms for source separation with side information, where one observes the linear superposition of two source signals plus two additional signals that are correlated with the mixed ones. The first algorithm is based on two ingredients: first, we learn a Gaussian mixture model (GMM) for the joint distribution of a source signal and the corresponding correlated side information signal; second, we separate the signals using standard, computationally efficient conditional mean estimators. This also puts forth new recovery guarantees for this source separation algorithm. In particular, under the assumption that the signals can be perfectly described by a GMM, we characterize necessary and sufficient conditions for reliable source separation in the asymptotic low-noise regime as a function of the geometry of the underlying signals and their interaction. It is shown that if the subspaces spanned by the innovation components of the source signals with respect to the side information signals have zero intersection, then, provided that we observe a certain number of linear measurements from the mixture, we can reliably separate the sources; otherwise we cannot. The second algorithm is based on deep learning, where we introduce a novel self-supervised algorithm for the source separation problem. Source separation is intrinsically unsupervised, and the lack of training data makes it a difficult task for artificial intelligence to solve. The proposed framework takes advantage of the available data and delivers near-perfect separation results in real data scenarios. Our proposed frameworks, which provide new ways to incorporate side information to aid the solution of the source separation problem, are also employed in a real-world art investigation application involving the separation of mixtures of X-ray images. The simulation results showcase the superiority of our algorithm over other state-of-the-art algorithms.
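The conditional mean estimator mentioned for the first algorithm can be sketched in its simplest form. This is a scalar toy version under stated assumptions: a joint GMM over one source sample `x` and one side-information sample `y`, with known component parameters. The function name and the tuple layout are hypothetical, and the thesis setting (vector signals, two sources observed through a mixture) is not reproduced.

```python
import math
from typing import List, Tuple

# Each component: (weight, mean_x, mean_y, cov_xy, var_y). Hypothetical layout.
Component = Tuple[float, float, float, float, float]


def gmm_conditional_mean(y: float, components: List[Component]) -> float:
    """Return E[x | y] under a joint Gaussian mixture over (x, y)."""
    num = 0.0
    den = 0.0
    for weight, mean_x, mean_y, cov_xy, var_y in components:
        # Unnormalized posterior responsibility of this component given y.
        lik = weight * math.exp(-0.5 * (y - mean_y) ** 2 / var_y)
        lik /= math.sqrt(2 * math.pi * var_y)
        # Per-component Gaussian conditional mean of x given y.
        cond = mean_x + cov_xy / var_y * (y - mean_y)
        num += lik * cond
        den += lik
    return num / den


if __name__ == "__main__":
    mixture = [(0.5, -1.0, -1.0, 0.0, 1.0), (0.5, 1.0, 1.0, 0.0, 1.0)]
    print(gmm_conditional_mean(0.3, mixture))
```

The estimate is a responsibility-weighted average of per-component linear estimators, which is why it stays cheap to compute once the mixture has been learned.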

    Recent Advances in Signal Processing

    Signal processing is a critical element in the majority of new technological inventions and challenges across a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages, respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Principled methods for mixtures processing

    This document is my thesis for the habilitation à diriger des recherches, the French diploma required to fully supervise Ph.D. students. It summarizes the research I did in the last 15 years and also provides the short-term research directions and applications I want to investigate. Regarding my past research, I first describe the work I did on probabilistic audio modeling, including the separation of Gaussian and α-stable stochastic processes. Then, I mention my work on deep learning applied to audio, which rapidly turned into a large effort for community service. Finally, I present my contributions in machine learning, with some works on hardware compressed sensing and probabilistic generative models. My research programme involves a theoretical part that revolves around probabilistic machine learning, and an applied part that concerns the processing of time series arising in both audio and life sciences.

    Design of large polyphase filters in the Quadratic Residue Number System


    Temperature aware power optimization for multicore floating-point units
