138 research outputs found

    An investigation of discrete-state discriminant approaches to single-sensor source separation

    This paper investigates a new scheme for single-sensor audio source separation. The framework is introduced by comparison with the existing Gaussian mixture model generative approach and focuses on the mixture states rather than on the source states, resulting in a discrete, joint-state discriminant approach. The study establishes the theoretical performance bounds of the proposed scheme, and an actual source separation system is designed. Performance is evaluated on a set of musical recordings, followed by a discussion that covers source correlation and the possible drawbacks of the method.

    Optimal spectral transportation with application to music transcription

    Many spectral unmixing methods rely on the non-negative decomposition of spectral data onto a dictionary of spectral templates. In particular, state-of-the-art music transcription systems decompose the spectrogram of the input signal onto a dictionary of representative note spectra. The typical measures of fit used to quantify the adequacy of the decomposition compare the data and template entries frequency-wise. As a result, small displacements of energy from one frequency bin to another, as well as variations of timbre, can disproportionately harm the fit. We address these issues by means of optimal transportation and propose a new measure of fit that treats the frequency distributions of energy holistically rather than frequency-wise. Building on the harmonic nature of sound, the new measure is invariant to shifts of energy to harmonically related frequencies, as well as to small and local displacements of energy. Equipped with this new measure of fit, the dictionary of note templates can be considerably simplified to a set of Dirac vectors located at the target fundamental frequencies (musical pitch values). This in turn gives rise to a very fast and simple decomposition algorithm that achieves state-of-the-art performance on real musical data.

    1 Context
    Many of today's spectral unmixing techniques rely on non-negative matrix decompositions. This concerns, for example, hyperspectral remote sensing (with applications in Earth observation, astronomy, chemistry, etc.) and audio signal processing. The spectral sample v_n (the spectrum of light observed at a given pixel n, or the audio spectrum in a given time frame n) is decomposed onto a dictionary W of elementary spectral templates, characteristic of pure materials or sound objects, such that v_n ≈ W h_n. The composition of sample n can be inferred from the non-negative expansion coefficients h_n. This paradigm has led to state-of-the-art results for various tasks (recognition, classification, denoising, separation) in the aforementioned areas, and in particular in music transcription, the central application of this paper. In state-of-the-art music transcription systems, the spectrogram V (with columns v_n) of a musical signal is decomposed onto a dictionary of pure notes (in so-called multi-pitch estimation) or chords. V typically consists of (power-)magnitude values of a regular short-time Fourier transform (Smaragdis and Brown, 2003). It may also consist of an audio-specific spectral transform such as the Mel-frequency transform, as in (Vincent et al., 2010), or the constant-Q transform, as in (Oudre et al., 2011). The success of the transcription system depends of course on the adequacy of the time-frequency transform and of the dictionary to represent the data V.
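
    The decomposition v_n ≈ W h_n can be illustrated with a minimal Python sketch. It assumes a fixed dictionary W and uses the classical multiplicative update for the squared Euclidean error. That error is a frequency-wise measure of fit of the kind the abstract contrasts with, not the optimal-transport measure the paper proposes. The function name `nonneg_decompose`, the random dictionary, and the iteration count are illustrative assumptions.

```python
import numpy as np

def nonneg_decompose(v, W, n_iter=500, eps=1e-12):
    """Estimate h >= 0 such that v is close to W @ h, with W held fixed."""
    h = np.ones(W.shape[1])
    for _ in range(n_iter):
        # Multiplicative update for the squared Euclidean error.
        # Every factor is non-negative, so h stays non-negative.
        h *= (W.T @ v) / (W.T @ (W @ h) + eps)
    return h

# Toy data (assumption): 8 random non-negative templates, 3 of them active.
rng = np.random.default_rng(0)
W = rng.random((64, 8))
h_true = np.array([0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.5])
v = W @ h_true

h = nonneg_decompose(v, W)
initial_error = np.linalg.norm(v - W @ np.ones(8))
final_error = np.linalg.norm(v - W @ h)
```

    Applying the same routine to every column v_n of a spectrogram V yields the activation matrix. In a transcription setting each column of W would hold the spectrum of one note.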

    Fast Solving of the Group-Lasso via Dynamic Screening

    We propose to extend the dynamic screening principle, initially designed for the Lasso, to the Group-Lasso.

    A Dynamic Screening Principle for the Lasso

    The Lasso is an optimization problem devoted to finding a sparse representation of some signal with respect to a predefined dictionary. An original and computationally efficient method is proposed here to solve this problem, based on a dynamic screening principle. It makes it possible to accelerate a large class of optimization algorithms by iteratively reducing the size of the dictionary during the optimization process, discarding elements that are provably known not to belong to the solution of the Lasso. The iterative reduction of the dictionary is what we call dynamic screening. As this screening step is inexpensive, the computational cost of the algorithm using our dynamic screening strategy is lower than that of the base algorithm. Numerical experiments on synthetic and real data support the relevance of this approach.
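
    The principle can be sketched in Python by embedding a screening step inside ISTA. The test used here is a basic safe sphere test, chosen for brevity and not necessarily the test used in the paper. The sphere is centered at y/lam and its radius is the distance from y/lam to a dual feasible point obtained by rescaling the current residual. An atom d is discarded when |d^T y|/lam + radius * ||d|| < 1. The function names, the `screen` flag, and the toy data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_lasso(D, y, lam, n_iter=500, screen=True):
    """ISTA for min_x 0.5*||y - D x||^2 + lam*||x||_1.

    With screen=True, atoms that pass the sphere test are removed
    from the working dictionary at every iteration.
    Returns the coefficient vector and the indices still active.
    """
    n_atoms = D.shape[1]
    x = np.zeros(n_atoms)
    active = np.arange(n_atoms)
    step = 1.0 / np.linalg.norm(D, 2) ** 2       # 1 / Lipschitz constant
    atom_norms = np.linalg.norm(D, axis=0)
    corr_y = np.abs(D.T @ y) / lam               # |d_j^T y| / lam, computed once
    for _ in range(n_iter):
        if active.size == 0:
            break
        Da = D[:, active]
        residual = y - Da @ x[active]
        corr = Da.T @ residual
        x[active] = soft_threshold(x[active] + step * corr, step * lam)
        if screen:
            # Dual feasible point: residual rescaled so that |Da^T theta| <= 1.
            theta = residual / max(lam, np.max(np.abs(corr)))
            radius = np.linalg.norm(y / lam - theta)
            keep = corr_y[active] + radius * atom_norms[active] >= 1.0
            x[active[~keep]] = 0.0
            active = active[keep]
    return x, active

# Toy problem (assumption): random unit-norm dictionary, 3-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((30, 100))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(100)
x_true[[3, 40, 77]] = [1.5, -2.0, 1.0]
y = D @ x_true + 0.01 * rng.standard_normal(30)
lam = 0.8 * np.max(np.abs(D.T @ y))

x_screened, active = ista_lasso(D, y, lam)
x_plain, _ = ista_lasso(D, y, lam, screen=False)
discarded = np.setdiff1d(np.arange(100), active)
```

    The screening step reuses the residual and the correlations that ISTA has already computed, so its extra cost per iteration is small. Both runs should reach the same solution, since the test only removes atoms whose coefficient is zero at the optimum.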

    Dynamic Screening: Accelerating First-Order Algorithms for the Lasso and Group-Lasso

    Recent computational strategies based on screening tests have been proposed to accelerate algorithms addressing penalized sparse regression problems such as the Lasso. Such approaches build upon the idea that it is worth dedicating a small computational effort to locating inactive atoms and removing them from the dictionary in a preprocessing stage, so that the regression algorithm, working with a smaller dictionary, converges faster to the solution of the initial problem. We believe that there is an even more efficient way to screen the dictionary and obtain a greater acceleration: inside each iteration of the regression algorithm, one may take advantage of the algorithm's computations to obtain a new screening test for free, with increasing screening effects along the iterations. The dictionary is hence screened dynamically instead of being screened statically, once and for all, before the first iteration. We formalize this dynamic screening principle in a general algorithmic scheme and apply it by embedding, inside a number of first-order algorithms, existing screening tests adapted to solve the Lasso and new screening tests to solve the Group-Lasso. Computational gains are assessed in a large set of experiments on synthetic data as well as real-world sounds and images. They show both the screening efficiency and the gain in terms of running time.

    Sparse underwater acoustic imaging: a case study

    Underwater acoustic imaging is traditionally performed with beamforming: beams are formed at emission to insonify limited angular regions; beams are (synthetically) formed at reception to form the image. We propose to exploit a natural sparsity prior to perform 3D underwater imaging using a newly built flexible-configuration sonar device. The computational challenges raised by the high dimensionality of the problem are highlighted, and we describe a strategy to overcome them. As a proof of concept, the proposed approach is used on real data acquired with the new sonar to obtain an image of an underwater target. We discuss the merits of the obtained image in comparison with standard beamforming, as well as the main challenges lying ahead and the bottlenecks that will need to be solved before sparse methods can be fully exploited in the context of underwater compressed 3D sonar imaging.

    The recent history of an insular bat population reveals an environmental disequilibrium and conservation concerns

    With the global pandemic of Covid-19, the putative threats related to the increasing contact between wild animals, including bats, and human populations have been highlighted. Bats are indeed known to carry several zoonoses, but at the same time, many species are currently facing the risk of extinction. In this context, being able to monitor the evolution of bat populations in the long term and predict future potential contact with humans has important implications for conservation and public health. In this study, we attempt to demonstrate the usefulness of a small-scale paleobiological approach to track the evolution of an insular population of Antillean fruit-eating bats (Brachyphylla cavernarum), known to carry zoonoses, by documenting the temporal evolution of a cave roosting site and its bat colony of approximately 250,000 individuals. To do so, we conducted a stratigraphic analysis of the sedimentary infilling of the cave, as well as a taphonomic and paleobiological analysis of the bone contents of the sediment. Additionally, we performed a neotaphonomic study of an assemblage of scats produced by cats that had consumed bats on-site. Our results reveal the effects of human-induced environmental disturbances, as well as conservation policies, on the bat colony. They also demonstrate that the roosting site is currently filling at a very fast pace, which may lead to the displacement of the bat colony and increased contact between bats and human populations in the near future. Our research outcomes advocate for a better consideration of retrospective paleobiological data to address conservation questions related to bat populations.

    Sparse reconstruction techniques for near-field underwater acoustic imaging

    The use of sparse priors has shown interesting potential in various acoustic or radar imaging applications. In this paper, sparse reconstruction is applied to underwater acoustic imaging using a newly built flexible sonar device. We investigate several models concerning the linear mapping between the image domain and the observation domain. In particular, we define a point-scatterer model in which the apparent back-scatter coefficient of a given reflector varies with respect to the specific emitter and receiver locations. To handle this problem, we adapt a multi-channel version of the orthogonal matching pursuit and apply it to real data in order to obtain images of an underwater target placed at a small distance from the sonar. The techniques are shown to overcome bottlenecks that are apparent with more standard approaches that assume far-field conditions when building the image.
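
    As a rough illustration of a multi-channel orthogonal matching pursuit, the Python sketch below implements a generic simultaneous OMP: all channels share one support, while each channel keeps its own coefficients on that support. This is a textbook variant and not the adaptation developed in the paper. The function name `somp` and the toy data are assumptions, and the toy dictionary is an orthonormal basis only so that exact recovery is guaranteed.

```python
import numpy as np

def somp(D, Y, n_atoms):
    """Simultaneous OMP.

    D: dictionary (samples x atoms), Y: observations (samples x channels).
    Returns the coefficient matrix (atoms x channels) and the shared support.
    """
    residual = Y.copy()
    support = []
    coeffs = np.zeros((0, Y.shape[1]))
    for _ in range(n_atoms):
        # Score each atom by the l2 norm of its correlations across channels.
        scores = np.linalg.norm(D.T @ residual, axis=1)
        if support:
            scores[support] = 0.0            # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Least-squares fit on the selected atoms, one column per channel.
        coeffs, *_ = np.linalg.lstsq(D[:, support], Y, rcond=None)
        residual = Y - D[:, support] @ coeffs
    X = np.zeros((D.shape[1], Y.shape[1]))
    X[support] = coeffs
    return X, support

# Toy data (assumption): orthonormal dictionary, 4 channels, 3 shared atoms.
rng = np.random.default_rng(1)
D, _ = np.linalg.qr(rng.standard_normal((32, 32)))
X_true = np.zeros((32, 4))
X_true[[2, 11, 20]] = rng.uniform(1.0, 2.0, size=(3, 4))
Y = D @ X_true

X_hat, support = somp(D, Y, n_atoms=3)
```

    Letting the coefficients differ from channel to channel is one simple way to accommodate a back-scatter coefficient that depends on the emitter and receiver pair, which is the situation described in the abstract.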

    Underwater acoustic imaging: sparse models and implementation issues

    ANR project: ANR-09-EMER-001. We present recent work on sparse models for underwater acoustic imaging and on the implementation of imaging methods with real data. By considering physical issues such as non-isotropic scattering and non-optimal calibration, we have designed several structured sparse models. Greedy algorithms are used to estimate the sparse representations. Our work includes the design of real experiments in a tank. Several series of data have been collected and processed. For such a realistic scenario, data and representations live in high-dimensional spaces. We introduce algorithmic adaptations to deal with the resulting computational issues. Finally, the imaging results obtained by our methods are compared to standard beamforming imaging.

    A Reproducible Research Framework for Audio Inpainting

    We introduce a unified framework for the restoration of distorted audio data, leveraging the Image Inpainting concept and covering existing audio applications. In this framework, termed Audio Inpainting, the distorted data is considered missing and its location is assumed to be known. We further introduce baseline approaches based on sparse representations. For this new audio inpainting concept, we provide reproducible-research tools including: the handling of audio inpainting tasks as inverse problems, embedded in a frame-based scheme similar to patch-based image processing; several experimental settings; speech and music material; OMP-like algorithms, with two dictionaries, for general audio inpainting or specifically-enhanced declipping.
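
    Treating one audio frame as an inverse problem with known missing locations can be sketched as follows in Python. The sketch runs a plain OMP on the reliable samples only, using a DCT dictionary restricted to those samples, and then synthesizes the missing samples from the selected atoms. It is a simplified illustration and not the interface of the tools described in the abstract. The names `dct_dictionary` and `inpaint_frame`, the frame length, and the one-atom toy signal are assumptions.

```python
import numpy as np

def dct_dictionary(n):
    """DCT-II cosine atoms as columns, each scaled to unit norm."""
    t = np.arange(n)[:, None] + 0.5
    k = np.arange(n)[None, :]
    D = np.cos(np.pi * t * k / n)
    return D / np.linalg.norm(D, axis=0)

def inpaint_frame(y, reliable, D, n_atoms):
    """Fill the samples of y where the boolean mask `reliable` is False."""
    Dr = D[reliable]                        # dictionary rows at reliable samples
    norms = np.linalg.norm(Dr, axis=0)      # restricted atoms are not unit-norm
    yr = y[reliable]
    residual = yr.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(n_atoms):
        scores = np.abs(Dr.T @ residual) / norms
        if support:
            scores[support] = 0.0           # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        coeffs, *_ = np.linalg.lstsq(Dr[:, support], yr, rcond=None)
        residual = yr - Dr[:, support] @ coeffs
    restored = y.copy()
    # Reliable samples are kept as observed; only the gap is synthesized.
    restored[~reliable] = D[~reliable][:, support] @ coeffs
    return restored

# Toy frame (assumption): a single DCT atom with a 10-sample gap.
n = 64
D = dct_dictionary(n)
clean = 1.5 * D[:, 7]
reliable = np.ones(n, dtype=bool)
reliable[20:30] = False
y = clean.copy()
y[~reliable] = 0.0

restored = inpaint_frame(y, reliable, D, n_atoms=1)
```

    In a frame-based scheme the same routine would be applied to overlapping frames, with the restored frames recombined by overlap-add. For declipping, the reliable mask would mark the samples whose magnitude lies below the clipping level.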