2,772 research outputs found

    Speech Synthesis Based on Hidden Markov Models


    Speech Recognition based Automatic Earthquake Detection and Classification

    Modern seismology continuously records ground motion with a globally distributed network of stations and is therefore considered a data-rich science. Extracting the data of current interest from these continuous recordings, whether earthquake signals, nuclear explosions, or similar events, is a challenge for the detection and classification algorithms used so far.

    Kalman tracking of linear predictor and harmonic noise models for noisy speech enhancement

    This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of successive speech frames and the mitigation of the processing artefact known as ‘musical noise’ or ‘musical tones’. The formant-tracking linear prediction (FTLP) model estimation consists of three stages: (a) speech pre-cleaning based on a spectral amplitude estimation, (b) formant-tracking across successive speech frames using the Viterbi method, and (c) Kalman filtering of the formant trajectories across successive speech frames. The HNM parameters for the excitation signal comprise the voiced/unvoiced decision, the fundamental frequency, the harmonics’ amplitudes and the variance of the noise component of the excitation. A frequency-domain pitch extraction method is proposed that searches for the peak signal-to-noise ratios (SNRs) at the harmonics. For each speech frame, several pitch candidates are calculated. An estimate of the pitch trajectory across successive frames is obtained using a Viterbi decoder. The trajectories of the noisy excitation harmonics across successive speech frames are modeled and denoised using Kalman filters. The proposed method is used to deconstruct noisy speech, de-noise its model parameters and then reconstitute speech from its cleaned parts. Experimental evaluations show the performance gains of the formant tracking, pitch extraction and noise reduction stages.
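
Stage (c) of the abstract above, Kalman filtering of a trajectory across successive frames, can be illustrated with a minimal sketch. It assumes a scalar random-walk state model, which the abstract does not specify. The function name `kalman_track` and the variances `process_var` and `obs_var` are hypothetical choices for illustration, not values from the paper.

```python
def kalman_track(observations, process_var=100.0, obs_var=2500.0):
    """Filter a noisy scalar trajectory (e.g. one formant frequency per frame).

    Assumed model (illustrative only):
        state_t = state_{t-1} + w_t,  w_t ~ N(0, process_var)
        obs_t   = state_t     + v_t,  v_t ~ N(0, obs_var)
    """
    estimate = observations[0]   # start from the first measurement
    variance = obs_var           # with the measurement's own uncertainty
    filtered = [estimate]
    for z in observations[1:]:
        # Predict: a random walk keeps the mean and inflates the uncertainty.
        variance += process_var
        # Update: blend prediction and new measurement via the Kalman gain.
        gain = variance / (variance + obs_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        filtered.append(estimate)
    return filtered


# Hypothetical per-frame formant estimates in Hz, with one outlier frame.
noisy_track = [500.0, 510.0, 495.0, 900.0, 505.0, 498.0]
smoothed_track = kalman_track(noisy_track)
```

A larger `obs_var` relative to `process_var` makes the filter trust its prediction more, so an isolated outlier frame pulls the smoothed track only part of the way toward the spurious value.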

    Multichannel dynamic modeling of non-Gaussian mixtures

    This paper presents a novel method that combines coupled hidden Markov models (HMM) and non-Gaussian mixture models based on independent component analyzer mixture models (ICAMM). The proposed method models the joint behavior of a number of synchronized sequential independent component analyzer mixture models (SICAMM); we have therefore named it generalized SICAMM (G-SICAMM). The generalization allows for flexible estimation of complex data densities, subspace classification, blind source separation, and accurate modeling of both local and global dynamic interactions. In this work, the structured result obtained by G-SICAMM was used in two ways: classification and interpretation. Classification performance was tested on an extensive number of simulations and a set of real electroencephalograms (EEG) from epileptic patients performing neuropsychological tests. G-SICAMM outperformed the following competing methods: Gaussian mixture models, HMM, coupled HMM, ICAMM, SICAMM, and a long short-term memory (LSTM) recurrent neural network. As for interpretation, the structured result returned by G-SICAMM on EEGs was mapped back onto the scalp, providing a set of brain activations. These activations were consistent with the physiological areas activated during the tests, thus demonstrating the ability of the method to deal with different kinds of data densities and changing non-stationary and non-linear brain dynamics. (C) 2019 Elsevier Ltd. All rights reserved. This work was supported by the Spanish Administration (Ministerio de Economia y Competitividad) and the European Union (FEDER) under grants TEC2014-58438-R and TEC2017-84743-P. Safont Armero, G.; Salazar Afanador, A.; Vergara Domínguez, L.; Gomez, E.; Villanueva, V. (2019). Multichannel dynamic modeling of non-Gaussian mixtures. Pattern Recognition. 93:312-323. https://doi.org/10.1016/j.patcog.2019.04.022

    Hidden markov structures for dynamic copulae

    This publication is freely accessible with permission of the rights owner under an Alliance licence and a national licence (funded by the DFG, German Research Foundation), respectively. Understanding the time series dynamics of a multi-dimensional dependency structure is a challenging task. Multivariate covariance-driven Gaussian or mixed normal time-varying models have only a limited ability to capture important features of the data such as heavy tails, asymmetry, and nonlinear dependencies. The present paper tackles this problem by proposing and analyzing a hidden Markov model (HMM) for hierarchical Archimedean copulae (HAC). The HAC constitute a wide class of models for multi-dimensional dependencies, and HMM is a statistical technique for describing regime-switching dynamics. HMM applied to HAC flexibly models multivariate non-Gaussian time series. We apply the expectation maximization (EM) algorithm for parameter estimation. Consistency results for both parameters and HAC structures are established in an HMM framework. The model is calibrated to exchange rate data with a VaR application. This example is motivated by a local adaptive analysis that yields a time-varying HAC model. We compare its forecasting performance with that of other classical dynamic models. In a second application, we model a rainfall process. This task is of particular theoretical and practical interest because of the specific structure of precipitation data and the atypical treatment it requires. Peer Reviewed.
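
The regime-switching idea behind an HMM can be sketched with the forward recursion, which is also the building block of the E-step in EM. The example below is a simplified illustration, not the paper's method: it assumes two regimes with univariate Gaussian emissions as a stand-in for the hierarchical Archimedean copulae. The function names and all parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(obs, init, trans, means, stds):
    """Scaled forward recursion for a Gaussian-emission HMM.

    Returns the filtered probabilities P(state_t | obs_1..t) for every t
    and the log-likelihood of the whole observation sequence.
    """
    n_states = len(init)
    prior = list(init)
    filtered = []
    log_lik = 0.0
    for x in obs:
        # Weight each regime's prior by how well it explains the observation.
        joint = [prior[k] * gaussian_pdf(x, means[k], stds[k])
                 for k in range(n_states)]
        norm = sum(joint)
        log_lik += math.log(norm)
        post = [j / norm for j in joint]
        filtered.append(post)
        # Propagate through the transition matrix to get the next prior.
        prior = [sum(post[i] * trans[i][k] for i in range(n_states))
                 for k in range(n_states)]
    return filtered, log_lik


# Hypothetical example: regime 0 is calm (small std), regime 1 is turbulent.
obs = [0.1, -0.2, 0.05, 3.0, -2.5, 2.8]
init = [0.5, 0.5]
trans = [[0.95, 0.05], [0.10, 0.90]]
probs, log_lik = forward_filter(obs, init, trans, means=[0.0, 0.0], stds=[0.5, 3.0])
```

In a full EM fit, a matching backward pass would give smoothed regime probabilities, which are then used to re-estimate the transition matrix and the emission parameters.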

    Modelling Digital Media Objects


    Hierarchical hidden Markov structure for dynamic correlations: the hierarchical RSDC model.

    This paper presents a new multivariate GARCH model with a time-varying conditional correlation structure, which is a generalization of the Regime Switching Dynamic Correlation (RSDC) model of Pelletier (2006). This model, which we name Hierarchical RSDC, is built on the hierarchical generalization of the hidden Markov model introduced by Fine et al. (1998). It can be viewed graphically as a tree structure with different types of states. The first are called production states, and they can emit observations, as in the classical Markov-switching approach. The second are called abstract states. They cannot emit observations but establish vertical and horizontal probabilities that define the dynamics of the hidden hierarchical structure. The main gain of this approach compared to the classical Markov-switching model is an increase in the granularity of the regimes. Our model is also compared to the new Double Smooth Transition Conditional Correlation GARCH model (DSTCC), a STAR approach for dynamic correlations proposed by Silvennoinen and Teräsvirta (2007). The reason is that, under certain assumptions, the DSTCC and our model represent two classical competing approaches to modeling regime switching. We also perform Monte Carlo simulations and apply the model in two empirical applications studying the conditional correlations of selected stock returns. Results show that the Hierarchical RSDC provides a good measure of the correlations and also has interesting explanatory power. Keywords: Multivariate GARCH; Dynamic correlations; Regime switching; Markov chain; Hidden Markov models; Hierarchical Hidden Markov models
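
The notion of production states that emit observations under a regime-specific correlation can be illustrated with a small simulation. This is a deliberately reduced sketch, not the Hierarchical RSDC model: it assumes a flat two-state Markov chain (no abstract states), two assets, and unit variances instead of GARCH dynamics. The function names and parameter values are hypothetical.

```python
import math
import random


def simulate_switching_correlation(n_steps, trans, correlations, seed=0):
    """Simulate bivariate returns whose correlation depends on a hidden regime."""
    rng = random.Random(seed)
    state = 0
    states, returns = [], []
    for _ in range(n_steps):
        rho = correlations[state]
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        # Cholesky factor of [[1, rho], [rho, 1]] applied to independent shocks.
        r1 = e1
        r2 = rho * e1 + math.sqrt(1.0 - rho * rho) * e2
        states.append(state)
        returns.append((r1, r2))
        # Draw the next regime from the current row of the transition matrix.
        state = 0 if rng.random() < trans[state][0] else 1
    return states, returns


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical regimes: low correlation (0.2) and high correlation (0.8).
trans = [[0.98, 0.02], [0.03, 0.97]]
states, returns = simulate_switching_correlation(5000, trans, (0.2, 0.8), seed=1)
low = [r for s, r in zip(states, returns) if s == 0]
high = [r for s, r in zip(states, returns) if s == 1]
```

Comparing `sample_correlation(low)` with `sample_correlation(high)` shows how the empirical correlation differs between regimes. A hierarchical version would add abstract states above these two production states to group them into finer or coarser regimes.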