2,752 research outputs found

    Multichannel dynamic modeling of non-Gaussian mixtures

    Full text link
    [EN] This paper presents a novel method that combines coupled hidden Markov models (HMM) and non Gaussian mixture models based on independent component analyzer mixture models (ICAMM). The proposed method models the joint behavior of a number of synchronized sequential independent component analyzer mixture models (SICAMM), thus we have named it generalized SICAMM (G-SICAMM). The generalization allows for flexible estimation of complex data densities, subspace classification, blind source separation, and accurate modeling of both local and global dynamic interactions. In this work, the structured result obtained by G-SICAMM was used in two ways: classification and interpretation. Classification performance was tested on an extensive number of simulations and a set of real electroencephalograms (EEG) from epileptic patients performing neuropsychological tests. G-SICAMM outperformed the following competitive methods: Gaussian mixture models, HMM, Coupled HMM, ICAMM, SICAMM, and a long short-term memory (LSTM) recurrent neural network. As for interpretation, the structured result returned by G-SICAMM on EEGs was mapped back onto the scalp, providing a set of brain activations. These activations were consistent with the physiological areas activated during the tests, thus proving the ability of the method to deal with different kind of data densities and changing non-stationary and non-linear brain dynamics. (C) 2019 Elsevier Ltd. All rights reserved.This work was supported by Spanish Administration (Ministerio de Economia y Competitividad) and European Union (FEDER) under grants TEC2014-58438-R and TEC2017-84743-P.Safont Armero, G.; Salazar Afanador, A.; Vergara Domínguez, L.; Gomez, E.; Villanueva, V. (2019). Multichannel dynamic modeling of non-Gaussian mixtures. Pattern Recognition. 93:312-323. https://doi.org/10.1016/j.patcog.2019.04.022S3123239

    Bayesian spectral modeling for multiple time series

    Get PDF
    We develop a novel Bayesian modeling approach to spectral density estimation for multiple time series. The log-periodogram distribution for each series is modeled as a mixture of Gaussian distributions with frequency-dependent weights and mean functions. The implied model for the log-spectral density is a mixture of linear mean functions with frequency-dependent weights. The mixture weights are built through successive differences of a logit-normal distribution function with frequency-dependent parameters. Building from the construction for a single spectral density, we develop a hierarchical extension for multiple time series. Specifically, we set the mean functions to be common to all spectral densities and make the weights specific to the time series through the parameters of the logit-normal distribution. In addition to accommodating flexible spectral density shapes, a practically important feature of the proposed formulation is that it allows for ready posterior simulation through a Gibbs sampler with closed form full conditional distributions for all model parameters. The modeling approach is illustrated with simulated datasets, and used for spectral analysis of multichannel electroencephalographic recordings (EEGs), which provides a key motivating application for the proposed methodology

    Sparsity and adaptivity for the blind separation of partially correlated sources

    Get PDF
    Blind source separation (BSS) is a very popular technique to analyze multichannel data. In this context, the data are modeled as the linear combination of sources to be retrieved. For that purpose, standard BSS methods all rely on some discrimination principle, whether it is statistical independence or morphological diversity, to distinguish between the sources. However, dealing with real-world data reveals that such assumptions are rarely valid in practice: the signals of interest are more likely partially correlated, which generally hampers the performances of standard BSS methods. In this article, we introduce a novel sparsity-enforcing BSS method coined Adaptive Morphological Component Analysis (AMCA), which is designed to retrieve sparse and partially correlated sources. More precisely, it makes profit of an adaptive re-weighting scheme to favor/penalize samples based on their level of correlation. Extensive numerical experiments have been carried out which show that the proposed method is robust to the partial correlation of sources while standard BSS techniques fail. The AMCA algorithm is evaluated in the field of astrophysics for the separation of physical components from microwave data.Comment: submitted to IEEE Transactions on signal processin

    Bibliographic Review on Distributed Kalman Filtering

    Get PDF
    In recent years, a compelling need has arisen to understand the effects of distributed information structures on estimation and filtering. In this paper, a bibliographical review on distributed Kalman filtering (DKF) is provided.\ud The paper contains a classification of different approaches and methods involved to DKF. The applications of DKF are also discussed and explained separately. A comparison of different approaches is briefly carried out. Focuses on the contemporary research are also addressed with emphasis on the practical applications of the techniques. An exhaustive list of publications, linked directly or indirectly to DKF in the open literature, is compiled to provide an overall picture of different developing aspects of this area

    Robust speaker diarization for meetings

    Get PDF
    Aquesta tesi doctoral mostra la recerca feta en l'àrea de la diarització de locutor per a sales de reunions. En la present s'estudien els algorismes i la implementació d'un sistema en diferit de segmentació i aglomerat de locutor per a grabacions de reunions a on normalment es té accés a més d'un micròfon per al processat. El bloc més important de recerca s'ha fet durant una estada al International Computer Science Institute (ICSI, Berkeley, Caligornia) per un període de dos anys.La diarització de locutor s'ha estudiat força per al domini de grabacions de ràdio i televisió. La majoria dels sistemes proposats utilitzen algun tipus d'aglomerat jeràrquic de les dades en grups acústics a on de bon principi no se sap el número de locutors òptim ni tampoc la seva identitat. Un mètode molt comunment utilitzat s'anomena "bottom-up clustering" (aglomerat de baix-a-dalt), amb el qual inicialment es defineixen molts grups acústics de dades que es van ajuntant de manera iterativa fins a obtenir el nombre òptim de grups tot i acomplint un criteri de parada. Tots aquests sistemes es basen en l'anàlisi d'un canal d'entrada individual, el qual no permet la seva aplicació directa per a reunions. A més a més, molts d'aquests algorisms necessiten entrenar models o afinar els parameters del sistema usant dades externes, el qual dificulta l'aplicabilitat d'aquests sistemes per a dades diferents de les usades per a l'adaptació.La implementació proposada en aquesta tesi es dirigeix a solventar els problemes mencionats anteriorment. Aquesta pren com a punt de partida el sistema existent al ICSI de diarització de locutor basat en l'aglomerat de "baix-a-dalt". Primer es processen els canals de grabació disponibles per a obtindre un sol canal d'audio de qualitat major, a més dínformació sobre la posició dels locutors existents. Aleshores s'implementa un sistema de detecció de veu/silenci que no requereix de cap entrenament previ, i processa els segments de veu resultant amb una versió millorada del sistema mono-canal de diarització de locutor. Aquest sistema ha estat modificat per a l'ús de l'informació de posició dels locutors (quan es tingui) i s'han adaptat i creat nous algorismes per a que el sistema obtingui tanta informació com sigui possible directament del senyal acustic, fent-lo menys depenent de les dades de desenvolupament. El sistema resultant és flexible i es pot usar en qualsevol tipus de sala de reunions pel que fa al nombre de micròfons o la seva posició. El sistema, a més, no requereix en absolute dades d´entrenament, sent més senzill adaptar-lo a diferents tipus de dades o dominis d'aplicació. Finalment, fa un pas endavant en l'ús de parametres que siguin mes robusts als canvis en les dades acústiques. Dos versions del sistema es van presentar amb resultats excel.lents a les evaluacions de RT05s i RT06s del NIST en transcripció rica per a reunions, a on aquests es van avaluar amb dades de dos subdominis diferents (conferencies i reunions). A més a més, es fan experiments utilitzant totes les dades disponibles de les evaluacions RT per a demostrar la viabilitat dels algorisms proposats en aquesta tasca.This thesis shows research performed into the topic of speaker diarization for meeting rooms. It looks into the algorithms and the implementation of an offline speaker segmentation and clustering system for a meeting recording where usually more than one microphone is available. The main research and system implementation has been done while visiting the International Computes Science Institute (ICSI, Berkeley, California) for a period of two years. Speaker diarization is a well studied topic on the domain of broadcast news recordings. Most of the proposed systems involve some sort of hierarchical clustering of the data into clusters, where the optimum number of speakers of their identities are unknown a priory. A very commonly used method is called bottom-up clustering, where multiple initial clusters are iteratively merged until the optimum number of clusters is reached, according to some stopping criterion. Such systems are based on a single channel input, not allowing a direct application for the meetings domain. Although some efforts have been done to adapt such systems to multichannel data, at the start of this thesis no effective implementation had been proposed. Furthermore, many of these speaker diarization algorithms involve some sort of models training or parameter tuning using external data, which impedes its usability with data different from what they have been adapted to.The implementation proposed in this thesis works towards solving the aforementioned problems. Taking the existing hierarchical bottom-up mono-channel speaker diarization system from ICSI, it first uses a flexible acoustic beamforming to extract speaker location information and obtain a single enhanced signal from all available microphones. It then implements a train-free speech/non-speech detection on such signal and processes the resulting speech segments with an improved version of the mono-channel speaker diarization system. Such system has been modified to use speaker location information (then available) and several algorithms have been adapted or created new to adapt the system behavior to each particular recording by obtaining information directly from the acoustics, making it less dependent on the development data.The resulting system is flexible to any meetings room layout regarding the number of microphones and their placement. It is train-free making it easy to adapt to different sorts of data and domains of application. Finally, it takes a step forward into the use of parameters that are more robust to changes in the acoustic data. Two versions of the system were submitted with excellent results in RT05s and RT06s NIST Rich Transcription evaluations for meetings, where data from two different subdomains (lectures and conferences) was evaluated. Also, experiments using the RT datasets from all meetings evaluations were used to test the different proposed algorithms proving their suitability to the task.Postprint (published version

    Robust 3-way tensor decomposition and extended state Kalman filtering to extract fetal ECG

    Get PDF
    International audienceThis paper addresses the problem of fetal electrocardiogram (ECG) extraction from multichannel recordings. The proposed two-step method, which is applicable to as few as two channels, relies on (i) a deterministic tensor decomposition approach, (ii) a Kalman filtering. Tensor decomposition criteria that are robust to outliers are proposed and used to better track weak traces of the fetal ECG. Then, the state parameters used within an extended realistic nonlinear dynamic model for extraction of N ECGs from M mixtures of several ECGs and noise are estimated from the loading matrices provided by the first step. Application of the proposed method on actual data shows its significantly superior performance in comparison to the classic methods

    Robust 3-way tensor decomposition and extended state Kalman filtering to extract fetal ECG

    No full text
    International audienceThis paper addresses the problem of fetal electrocardiogram (ECG) extraction from multichannel recordings. The proposed two-step method, which is applicable to as few as two channels, relies on (i) a deterministic tensor decomposition approach, (ii) a Kalman filtering. Tensor decomposition criteria that are robust to outliers are proposed and used to better track weak traces of the fetal ECG. Then, the state parameters used within an extended realistic nonlinear dynamic model for extraction of N ECGs from M mixtures of several ECGs and noise are estimated from the loading matrices provided by the first step. Application of the proposed method on actual data shows its significantly superior performance in comparison to the classic methods
    corecore