30,071 research outputs found
Temporal Pattern Classification using Kernel Methods for Speech
There are two paradigms for modelling the varying length temporal data namely, modelling the sequences of feature vectors as in the hidden Markov model-based approaches for speech recognition and modelling the sets of feature vectors as in the Gaussian mixture model (GMM)-based approaches for speech emotion recognition. In this paper, the methods using discrete hidden Markov models (DHMMs) in the kernel feature space and string kernel-based SVM classifier for classification of discretised representation of sequence of feature vectors obtained by clustering and vector quantisation in the kernel feature space are presented. The authors then present continuous density hidden Markov models (CDHMMs) in the explicit kernel feature space that use the continuous valued representation of features extracted from the temporal data. The methods for temporal pattern classification by mapping a varying length sequential pattern to a fixed-length sequential pattern and then using an SVM-based classifier for classification are also presented. The task of recognition of spoken letters in E-set, it is possible to build models that use a discretised representation and string kernel SVM based classification and obtain a classification performance better than that of models using the continuous valued representation is demonstrated. For modelling sets of vectors-based representation of temporal data, two approaches in a hybrid framework namely, the score vector-based approach and the segment modelling based approach are presented. In both approaches, a generative model-based method is used to obtain a fixed length pattern representation for a varying length temporal data and then a discriminative model is used for classification. These two approaches are studied for speech emotion recognition task. The segment modelling based approach gives a better performance than the score vector-based approach and the GMM-based classifiers for speech emotion recognition.Defence Science Journal, 2010, 60(4), pp.348-363, DOI:http://dx.doi.org/10.14429/dsj.60.49
Phonocardiogram segmentation by using an hybrid RBF-HMM model
This paper is concerned to the segmentation of heart sounds by using Radial-Basis Functions for acoustical modelling, combined with a Hidden Markov Model for heart sounds sequence modelling. The idea behind the use of RBF’s is to take advantage of the local approximations using exponentially decaying localized nonlinearities achieved by the Gaussian function, which increases the clustering power relatively to MLP’s. This neural model can be advantageous over the global approximations to nonlinear input-output mappings provided by Multilayer Perceptrons (MLP’s), especially when non-stationary processes need to be accurately modelled.
The above described RBF’s properties combined with the non-stationary statistical properties of Hidden Markov Models can help in the detection of the T-wave which is fundamental for the detection of the second heart sound.
The feature vectors are based on a MFCC based representation obtained from a spectral normalisation procedure, which showed better performance than the MFCC representation alone, in an Isolated Speech Recognition framework. Experimental results were evaluated on data collected from five different subjects, using CardioLab system and a Dash family patient monitor. The ECG leads I, II and III and an electronic stethoscope signal were sampled at 977 samples per second
Generalized Species Sampling Priors with Latent Beta reinforcements
Many popular Bayesian nonparametric priors can be characterized in terms of
exchangeable species sampling sequences. However, in some applications,
exchangeability may not be appropriate. We introduce a {novel and
probabilistically coherent family of non-exchangeable species sampling
sequences characterized by a tractable predictive probability function with
weights driven by a sequence of independent Beta random variables. We compare
their theoretical clustering properties with those of the Dirichlet Process and
the two parameters Poisson-Dirichlet process. The proposed construction
provides a complete characterization of the joint process, differently from
existing work. We then propose the use of such process as prior distribution in
a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte
Carlo sampler for posterior inference. We evaluate the performance of the prior
and the robustness of the resulting inference in a simulation study, providing
a comparison with popular Dirichlet Processes mixtures and Hidden Markov
Models. Finally, we develop an application to the detection of chromosomal
aberrations in breast cancer by leveraging array CGH data.Comment: For correspondence purposes, Edoardo M. Airoldi's email is
[email protected]; Federico Bassetti's email is
[email protected]; Michele Guindani's email is
[email protected] ; Fabrizo Leisen's email is
[email protected]. To appear in the Journal of the American
Statistical Associatio
Sequence Modelling For Analysing Student Interaction with Educational Systems
The analysis of log data generated by online educational systems is an
important task for improving the systems, and furthering our knowledge of how
students learn. This paper uses previously unseen log data from Edulab, the
largest provider of digital learning for mathematics in Denmark, to analyse the
sessions of its users, where 1.08 million student sessions are extracted from a
subset of their data. We propose to model students as a distribution of
different underlying student behaviours, where the sequence of actions from
each session belongs to an underlying student behaviour. We model student
behaviour as Markov chains, such that a student is modelled as a distribution
of Markov chains, which are estimated using a modified k-means clustering
algorithm. The resulting Markov chains are readily interpretable, and in a
qualitative analysis around 125,000 student sessions are identified as
exhibiting unproductive student behaviour. Based on our results this student
representation is promising, especially for educational systems offering many
different learning usages, and offers an alternative to common approaches like
modelling student behaviour as a single Markov chain often done in the
literature.Comment: The 10th International Conference on Educational Data Mining 201
Clustering Time Series from Mixture Polynomial Models with Discretised Data
Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly effect the quality of clusters formed. This paper evaluates a method of over-coming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation based distance metric. For data derived from this class of model we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that firstly discretisation does not significantly effect the accuracy of the clusters when there are no outliers and secondly it significantly increases the accuracy in the presence of outliers, even when the probability of outlier is very low
- …