214 research outputs found

    Spectral analysis for nonstationary audio

    A new approach for the analysis of nonstationary signals is proposed, with a focus on audio applications. Following earlier contributions, nonstationarity is modeled via stationarity-breaking operators acting on Gaussian stationary random signals. The focus is on time warping and amplitude modulation, and an approximate maximum-likelihood approach based on suitable approximations in the wavelet transform domain is developed. This paper provides a theoretical analysis of the approximations and introduces JEFAS, a corresponding estimation algorithm. The latter is tested and validated on synthetic as well as real audio signals. Comment: IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, In pres
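
    The signal model described in this abstract (a stationary Gaussian signal acted on by time warping and amplitude modulation) can be illustrated with a small synthesis sketch. This is not the JEFAS estimation algorithm, only a way to generate test data of the kind it targets. The warping function `gamma`, the modulation `a`, the moving-average filter, and the `sqrt(gamma')` normalization are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def synthesize_warped_am(n=4096, seed=0):
    """Return (t, x, y): a stationary Gaussian signal x and a warped, modulated copy y."""
    rng = np.random.default_rng(seed)
    # Stationary Gaussian signal: white noise smoothed by a short moving average
    # (hypothetical choice of spectrum).
    x = np.convolve(rng.standard_normal(n), np.ones(8) / 8.0, mode="same")
    t = np.linspace(0.0, 1.0, n)
    # Hypothetical smooth, strictly increasing warping with gamma(0)=0, gamma(1)=1.
    gamma = t + 0.1 * np.sin(2.0 * np.pi * t) / (2.0 * np.pi)
    dgamma = 1.0 + 0.1 * np.cos(2.0 * np.pi * t)  # derivative of gamma, always > 0
    # Hypothetical positive amplitude modulation.
    a = 1.0 + 0.5 * np.sin(2.0 * np.pi * t)
    # Time warping by linear interpolation: x evaluated at gamma(t).
    warped = np.interp(gamma, t, x)
    # Assumed normalization: the sqrt(gamma') factor keeps the warping energy-preserving.
    y = a * np.sqrt(dgamma) * warped
    return t, x, y

t, x, y = synthesize_warped_am()
```

    An estimator in the spirit of the abstract would take `y` alone and try to recover `gamma`, `a`, and the spectrum of `x`; the sketch only covers the forward model.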

    Sparsity and persistence in time-frequency sound representations

    13 pages. International audience. It is a well-known fact that the time-frequency domain is very well adapted for representing audio signals. The two main features of time-frequency representations of many classes of audio signals are sparsity (signals are generally well approximated using a small number of coefficients) and persistence (significant coefficients are not isolated, and tend to form clusters). This contribution presents signal approximation algorithms that exploit these properties, in the framework of hierarchical probabilistic models. Given a time-frequency frame (i.e. a Gabor frame, or a union of several Gabor frames or time-frequency bases), coefficients are first gathered into groups. A group of coefficients is then modeled as a random vector, whose distribution is governed by a hidden state associated with the group. Algorithms for parameter inference and hidden state estimation from analysis coefficients are described. The role of the chosen dictionary, and more particularly its structure, is also investigated. The proposed approach bears some resemblance to variational approaches previously proposed by the authors (in particular the variational approach exploiting regularization terms based on mixed norms). In the framework of audio signal applications, the time-frequency frame under consideration is a union of two MDCT bases or two Gabor frames, in order to generate estimates for tonal and transient layers. Groups corresponding to tonal (resp. transient) coefficients are constant frequency (resp. constant time) time-frequency coefficients of a frequency-selective (resp. time-selective) MDCT basis or Gabor frame.

    Detecting single-trial EEG evoked potential using a wavelet domain linear mixed model: application to error potentials classification

    Objective. The main goal of this work is to develop a model for multi-sensor signals, such as MEG or EEG signals, that accounts for inter-trial variability and is suitable for the corresponding binary classification problems. An important constraint is that the model be simple enough to handle small and unbalanced datasets, as often encountered in BCI-type experiments. Approach. The method involves a linear mixed-effects statistical model, wavelet transform and spatial filtering, and aims at the characterization of localized discriminant features in multi-sensor signals. After discrete wavelet transform and spatial filtering, a projection onto the relevant wavelet and spatial channel subspaces is used for dimension reduction. The projected signals are then decomposed as the sum of a signal of interest (i.e. discriminant) and background noise, using a very simple Gaussian linear mixed model. Main results. Thanks to the simplicity of the model, the corresponding parameter estimation problem is simplified. Robust estimates of class-covariance matrices are obtained from small sample sizes and an effective Bayes plug-in classifier is derived. The approach is applied to the detection of error potentials in multichannel EEG data, in a very unbalanced situation (detection of rare events). Classification results prove the relevance of the proposed approach in such a context. Significance. The combination of linear mixed model, wavelet transform and spatial filtering for EEG classification is, to the best of our knowledge, an original approach, which is proven to be effective. This paper improves on earlier results on similar problems, and the three main ingredients all play an important role.

    On the time-frequency representation of operators and generalized Gabor multiplier approximations

    28 pages. International audience. Starting from a general operator representation in the time-frequency domain, this paper addresses the problem of approximating linear operators by operators that are diagonal or band-diagonal with respect to Gabor frames. A characterization of operators that can be realized as Gabor multipliers is given, and necessary conditions for the existence of (Hilbert-Schmidt) optimal Gabor multiplier approximations are discussed. An efficient method for the calculation of an operator's best approximation by a Gabor multiplier is derived. The spreading function of Gabor multipliers yields new error estimates for these approximations. Generalizations (multiple Gabor multipliers) are introduced for better approximation of overspread operators. The Riesz property of the projection operators involved in generalized Gabor multipliers is characterized, and a method for obtaining an operator's best approximation by a multiple Gabor multiplier is suggested. Finally, it is shown that in certain situations, generalized Gabor multipliers reduce to a finite sum of regular Gabor multipliers with adapted windows.

    Determining local transientness of audio signals

    International audience. We describe a new method for estimating the degree of “transientness” and “tonality” of a class of compound signals involving simultaneously transient and harmonic features. The key assumption is that both transient and tonal layers admit sparse expansions, respectively in wavelet and local cosine bases. The estimation is performed using a particular form of entropy (or theoretical dimension) functions. We provide theoretical estimates on the behavior of the proposed estimators, as well as numerical simulations. Audio signal coding provides a natural field of application.
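
    The idea of comparing sparsity across two bases through an entropy-based "theoretical dimension" can be sketched as follows. This is a simplified illustration, not the estimator of the paper: the standard (sample) basis stands in for the wavelet basis, a global orthonormal DCT-II stands in for the local cosine basis, and the ratio returned by `transientness` is a hypothetical index introduced here. The theoretical dimension is taken to be the exponential of the Shannon entropy of the normalized coefficient energies, and the input is assumed to be nonzero.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)  # rescale the constant row so that rows are orthonormal
    return c

def theoretical_dimension(coeffs):
    """exp(Shannon entropy) of normalized energies: about 1 if sparse, about n if flat."""
    p = np.abs(coeffs) ** 2
    p = p / p.sum()
    p = p[p > 0]  # convention: 0 * log(0) = 0
    return float(np.exp(-np.sum(p * np.log(p))))

def transientness(signal):
    """Hypothetical index in (0, 1): near 1 for time-sparse, near 0 for frequency-sparse."""
    signal = np.asarray(signal, dtype=float)
    d_time = theoretical_dimension(signal)  # sample basis (stand-in for wavelets)
    d_freq = theoretical_dimension(dct_matrix(len(signal)) @ signal)  # cosine basis
    return d_freq / (d_time + d_freq)

n = 64
spike = np.zeros(n)
spike[20] = 1.0              # purely transient test signal
tone = dct_matrix(n)[5, :]   # purely tonal test signal (one cosine basis vector)
print(transientness(spike), transientness(tone))
```

    The spike is perfectly sparse in the sample basis and spread out in the cosine basis, so its index is close to 1; the single cosine behaves the opposite way. Replacing the two stand-in bases with genuine wavelet and local cosine transforms would bring the sketch closer to the setting of the abstract.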

    A family of random waveform models for audio coding

    International audience. We study the behavior of hybrid random waveform models for audio signals, involving sparse random series of waveforms with random coefficients. Similar approaches have been considered in recent years. However, these generally do not rely on explicit models, and are of a more “algorithmic” nature. The models we propose allow us to analyze mathematical properties of such signals and the corresponding estimators, and to derive estimation algorithms that do not rely on complex optimization techniques.

    Representation of operators by sampling in the time-frequency domain

    International audience. Gabor multipliers are well-suited for the approximation of certain time-variant systems. However, this class of systems is rather restricted. To overcome this restriction, multiple Gabor multipliers, which allow for more than one synthesis window, are introduced. The influence of the various parameters involved on approximation quality is studied for both classical and multiple Gabor multipliers.