359 research outputs found
Denoising sound signals in a bioinspired non-negative spectro-temporal domain
The representation of sound signals at the cochlea and auditory cortical level has been studied as an alternative to classical analysis methods. In this work, we put forward a recently proposed feature extraction method called approximate auditory cortical representation, based on an approximation to the statistics of discharge patterns at the primary auditory cortex. The approach here proposed estimates a non-negative sparse coding with a combined dictionary of atoms. These atoms represent the spectro-temporal receptive fields of the auditory cortical neurons, and are calculated from the auditory spectrograms of clean signal and noise. The denoising is carried out on noisy signals by the reconstruction of the signal discarding the atoms corresponding to the noise. Experiments are presented using synthetic (chirps) and real data (speech), in the presence of additive noise. For the evaluation of the new method and its variants, we used two objective measures: the perceptual evaluation of speech quality and the segmental signal-to-noise ratio. Results show that the proposed method improves the quality of the signals, mainly under severe degradation.Fil: MartÃnez, César Ernesto. Consejo Nacional de Investigaciones CientÃficas y Técnicas. Centro CientÃfico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de IngenierÃa y Ciencias HÃdricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Goddard, J.. Universidad Autónoma Metropolitana; MéxicoFil: Di Persia, Leandro Ezequiel. Consejo Nacional de Investigaciones CientÃficas y Técnicas. Centro CientÃfico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de IngenierÃa y Ciencias HÃdricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Milone, Diego Humberto. Consejo Nacional de Investigaciones CientÃficas y Técnicas. Centro CientÃfico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de IngenierÃa y Ciencias HÃdricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones CientÃficas y Técnicas. Centro CientÃfico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de IngenierÃa y Ciencias HÃdricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Nacional de Entre RÃos. Facultad de IngenierÃa; Argentin
A dynamic latent variable model for source separation
We propose a novel latent variable model for learning latent bases for time-varying non-negative data. Our model uses a mixture multinomial as the likelihood function and proposes a Dirichlet distribution with dynamic parameters as a prior, which we call the dynamic Dirichlet prior. An expectation maximization (EM) algorithm is developed for estimating the parameters of the proposed model. Furthermore, we connect our proposed dynamic Dirichlet latent variable model (dynamic DLVM) to the two popular latent basis learning methods - probabilistic latent component analysis (PLCA) and non-negative matrix factorization (NMF). We show that (i) PLCA is a special case of the dynamic DLVM, and (ii) dynamic DLVM can be interpreted as a dynamic version of NMF. The effectiveness of the proposed model is demonstrated through extensive experiments on speaker source separation, and speech-noise separation. In both cases, our method performs better than relevant and competitive baselines. For speaker separation, dynamic DLVM shows 1.38 dB improvement in terms of source to interference ratio, and 1 dB improvement in source to artifact ratio
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Audio Source Separation with Discriminative Scattering Networks
In this report we describe an ongoing line of research for solving
single-channel source separation problems. Many monaural signal decomposition
techniques proposed in the literature operate on a feature space consisting of
a time-frequency representation of the input data. A challenge faced by these
approaches is to effectively exploit the temporal dependencies of the signals
at scales larger than the duration of a time-frame. In this work we propose to
tackle this problem by modeling the signals using a time-frequency
representation with multiple temporal resolutions. The proposed representation
consists of a pyramid of wavelet scattering operators, which generalizes
Constant Q Transforms (CQT) with extra layers of convolution and complex
modulus. We first show that learning standard models with this multi-resolution
setting improves source separation results over fixed-resolution methods. As
study case, we use Non-Negative Matrix Factorizations (NMF) that has been
widely considered in many audio application. Then, we investigate the inclusion
of the proposed multi-resolution setting into a discriminative training regime.
We discuss several alternatives using different deep neural network
architectures
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have
been an effective strategy to overcome such information overload. The utility
of recommender systems cannot be overstated, given its widespread adoption in
many web applications, along with its potential impact to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender system is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we provide and devise a taxonomy of deep learning based recommendation models,
along with providing a comprehensive summary of the state-of-the-art. Finally,
we expand on current trends and provide new perspectives pertaining to this new
exciting development of the field.Comment: The paper has been accepted by ACM Computing Surveys.
https://doi.acm.org/10.1145/328502
- …