Search CORE

30,071 research outputs found

Temporal Pattern Classification using Kernel Methods for Speech

Author: Chandrakala S.
Sekhar Chandra
Publication venue: 'Defence Scientific Information and Documentation Centre'
Publication date: 09/07/2010
Field of study

There are two paradigms for modelling the varying length temporal data namely, modelling the sequences of feature vectors as in the hidden Markov model-based approaches for speech recognition and modelling the sets of feature vectors as in the Gaussian mixture model (GMM)-based approaches for speech emotion recognition. In this paper, the methods using discrete hidden Markov models (DHMMs) in the kernel feature space and string kernel-based SVM classifier for classification of discretised representation of sequence of feature vectors obtained by clustering and vector quantisation in the kernel feature space are presented. The authors then present continuous density hidden Markov models (CDHMMs) in the explicit kernel feature space that use the continuous valued representation of features extracted from the temporal data. The methods for temporal pattern classification by mapping a varying length sequential pattern to a fixed-length sequential pattern and then using an SVM-based classifier for classification are also presented. The task of recognition of spoken letters in E-set, it is possible to build models that use a discretised representation and string kernel SVM based classification and obtain a classification performance better than that of models using the continuous valued representation is demonstrated. For modelling sets of vectors-based representation of temporal data, two approaches in a hybrid framework namely, the score vector-based approach and the segment modelling based approach are presented. In both approaches, a generative model-based method is used to obtain a fixed length pattern representation for a varying length temporal data and then a discriminative model is used for classification. These two approaches are studied for speech emotion recognition task. The segment modelling based approach gives a better performance than the score vector-based approach and the GMM-based classifiers for speech emotion recognition.Defence Science Journal, 2010, 60(4), pp.348-363, DOI:http://dx.doi.org/10.14429/dsj.60.49

Defence Science Journal

Phonocardiogram segmentation by using an hybrid RBF-HMM model

Author: Cardoso Manuel J.
Lima C. S.
Publication venue: 'ACTA Press'
Publication date: 16/02/2007
Field of study

This paper is concerned to the segmentation of heart sounds by using Radial-Basis Functions for acoustical modelling, combined with a Hidden Markov Model for heart sounds sequence modelling. The idea behind the use of RBF’s is to take advantage of the local approximations using exponentially decaying localized nonlinearities achieved by the Gaussian function, which increases the clustering power relatively to MLP’s. This neural model can be advantageous over the global approximations to nonlinear input-output mappings provided by Multilayer Perceptrons (MLP’s), especially when non-stationary processes need to be accurately modelled. The above described RBF’s properties combined with the non-stationary statistical properties of Hidden Markov Models can help in the detection of the T-wave which is fundamental for the detection of the second heart sound. The feature vectors are based on a MFCC based representation obtained from a spectral normalisation procedure, which showed better performance than the MFCC representation alone, in an Isolated Speech Recognition framework. Experimental results were evaluated on data collected from five different subjects, using CardioLab system and a Dash family patient monitor. The ECG leads I, II and III and an electronic stethoscope signal were sampled at 977 samples per second

Universidade do Minho: RepositoriUM

Generalized Species Sampling Priors with Latent Beta reinforcements

Author: Blei D.
Blei D.M.
Cardin N.
Curtis C.
Edoardo M. Airoldi
Fabrizio Leisen
Federico Bassetti
Ferguson J.D.
Fortini S.
Jara A.
Michele Guindani
Müller P.
Park J.H.
Sudderth E.B.
Sun W.
Thiago Costa
———
Publication venue
Publication date: 01/01/2014
Field of study

Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a {novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of independent Beta random variables. We compare their theoretical clustering properties with those of the Dirichlet Process and the two parameters Poisson-Dirichlet process. The proposed construction provides a complete characterization of the joint process, differently from existing work. We then propose the use of such process as prior distribution in a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte Carlo sampler for posterior inference. We evaluate the performance of the prior and the robustness of the resulting inference in a simulation study, providing a comparison with popular Dirichlet Processes mixtures and Hidden Markov Models. Finally, we develop an application to the detection of chromosomal aberrations in breast cancer by leveraging array CGH data.Comment: For correspondence purposes, Edoardo M. Airoldi's email is [email protected]; Federico Bassetti's email is [email protected]; Michele Guindani's email is [email protected] ; Fabrizo Leisen's email is [email protected]. To appear in the Journal of the American Statistical Associatio

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

PubMed Central

eScholarship - University of California

Kent Academic Repository

FigShare

Sequence Modelling For Analysing Student Interaction with Educational Systems

Author: Alstrup Stephen
Hansen Casper
Hansen Christian
Hjuler Niklas
Lioma Christina
Publication venue
Publication date: 25/06/2017
Field of study

The analysis of log data generated by online educational systems is an important task for improving the systems, and furthering our knowledge of how students learn. This paper uses previously unseen log data from Edulab, the largest provider of digital learning for mathematics in Denmark, to analyse the sessions of its users, where 1.08 million student sessions are extracted from a subset of their data. We propose to model students as a distribution of different underlying student behaviours, where the sequence of actions from each session belongs to an underlying student behaviour. We model student behaviour as Markov chains, such that a student is modelled as a distribution of Markov chains, which are estimated using a modified k-means clustering algorithm. The resulting Markov chains are readily interpretable, and in a qualitative analysis around 125,000 student sessions are identified as exhibiting unproductive student behaviour. Based on our results this student representation is promising, especially for educational systems offering many different learning usages, and offers an alternative to common approaches like modelling student behaviour as a single Markov chain often done in the literature.Comment: The 10th International Conference on Educational Data Mining 201

arXiv.org e-Print Archive

Copenhagen University Research Information System

Clustering Time Series from Mixture Polynomial Models with Discretised Data

Author: Bagnall AJ
Janacek GJ
Zhang M
Publication venue: University of East Anglia
Publication date: 01/01/2003
Field of study

Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly effect the quality of clusters formed. This paper evaluates a method of over-coming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation based distance metric. For data derived from this class of model we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that firstly discretisation does not significantly effect the accuracy of the clusters when there are no outliers and secondly it significantly increases the accuracy in the presence of outliers, even when the probability of outlier is very low

University of East Anglia digital repository