Reconstruction of Speech Signals from their Unpredictable Points Manifold

Author: Daoudi Khalid
Khanagha Vahid
Pont Oriol
Turiel Antonio
Yahia Hussein
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 07/11/2011

This paper shows that a microcanonical approach to complexity, such as the Microcanonical Multiscale Formalism, provides new insights into the non-linear dynamics of speech, specifically in relation to the problem of classifying speech samples according to their information content. Central to the approach is the precise computation of Local Predictability Exponents (LPEs) according to a procedure based on the evaluation of the degree of reconstructibility around a given point. We show that LPEs are key quantities related to predictability in the framework of reconstructible systems: it is possible to reconstruct the whole speech signal by applying a reconstruction kernel to a small subset of points selected according to their LPE value. This provides a strong indication of the importance of the Unpredictable Points Manifold (UPM), already demonstrated for other types of complex signals. Experiments show that a UPM containing around 12% of the points provides very good perceptual reconstruction quality.

INRIA a CCSD electronic archive server

Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression

Author: Deleforge Antoine
Girin Laurent
Horaud Radu
Schechner Yoav
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2015

This paper addresses the problem of localizing audio sources using binaural measurements. We propose a supervised formulation that simultaneously localizes multiple sources at different locations. The approach is intrinsically efficient because, contrary to prior work, it relies neither on source separation nor on monaural segregation. The method starts with a training stage that establishes a locally-linear Gaussian regression model between the directional coordinates of all the sources and the auditory features extracted from binaural measurements. While fixed-length wide-spectrum sounds (white noise) are used for training to reliably estimate the model parameters, we show that the testing (localization) can be extended to variable-length sparse-spectrum sounds (such as speech), thus enabling a wide range of realistic applications. Indeed, we demonstrate that the method can be used for audio-visual fusion, namely to map speech signals onto images and hence to spatially align the audio and visual modalities, making it possible to discriminate between speaking and non-speaking faces. We release a novel corpus of real-room recordings that allow quantitative evaluation of the co-localization method in the presence of one or two sound sources. Experiments demonstrate increased accuracy and speed relative to several state-of-the-art methods. Comment: 15 pages, 8 figures

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot

Chaos in the segments from Korean traditional singing and western singing

Author: Lee Jeong-No
Lee Myeong-Hwa
Soh Kwang-Sup
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 28/10/1997

We investigate the time series of segments from a Korean traditional song, "Gwansanyungma", and a western song, "La Mamma Morta", using chaotic analysis techniques. We find that the phase portrait in the reconstructed state space of the time series of the segment from the Korean traditional song has a more complex structure than that of the segment from the western song. The segment from the Korean traditional song has a correlation dimension of 4.4 and two positive Lyapunov exponents, which show that the dynamics related to the Korean traditional song are a high-dimensional hyperchaotic process. On the other hand, the segment from the western song, with only one positive Lyapunov exponent and a correlation dimension of 2.5, exhibits low-dimensional chaotic behavior. Comment: 23 pages including 10 eps figures, latex, to appear in J. Acoust. Soc. A

arXiv.org e-Print Archive

Crossref

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Author: Cichocki A.
Lee N.
Mandic D.
Oseledets I. V.
Phan A-H.
Sugiyama M.
Zhao Q.
Publication venue: 'Now Publishers'
Publication date: 01/01/2017

Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. Particular emphasis is placed on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and on their physically meaningful interpretations, which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. Comment: 232 pages

arXiv.org e-Print Archive

Crossref

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Author: Cichocki A.
Phan A-H.
Zhao Q.
Lee N.
Oseledets I. V.
Sugiyama M.
Mandic D.
Publication date: 01/01/2017

arXiv.org e-Print Archive

Crossref

FigShare