Measuring the Influence of Observations in HMMs through the Kullback-Leibler Distance
We measure the influence of individual observations on the sequence of the
hidden states of the Hidden Markov Model (HMM) by means of the Kullback-Leibler
distance (KLD). Namely, we consider the KLD between the conditional
distribution of the hidden states' chain given the complete sequence of
observations and the conditional distribution of the hidden chain given all the
observations but the one under consideration. We introduce a linear-complexity
algorithm for computing the influence of all the observations. As an
illustration, we investigate the application of our algorithm to the problem of
detecting outliers in HMM data series.
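The leave-one-out idea can be sketched in numpy. This is a minimal illustration, not the paper's method: it recomputes the forward-backward pass for each left-out observation (brute force, rather than the paper's linear-time algorithm) and sums the KL divergence over per-time-step smoothed marginals rather than taking it over the full joint distribution of the hidden chain. All parameters below are invented toy values.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Posterior state marginals P(z_t | observations) for a discrete HMM.
    A left-out observation is encoded as None (its emission term is 1)."""
    T, N = len(obs), len(pi)
    emit = lambda t: np.ones(N) if obs[t] is None else B[:, obs[t]]
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * emit(0); alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * emit(t)
        alpha[t] /= alpha[t].sum()          # scale to avoid underflow
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (emit(t + 1) * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

def influence(pi, A, B, obs, k):
    """KL divergence between the smoothed marginals given all observations
    and given all observations except obs[k], summed over time steps."""
    full = forward_backward(pi, A, B, obs)
    loo = forward_backward(pi, A, B, obs[:k] + [None] + obs[k + 1:])
    return float(np.sum(full * np.log(full / loo)))
```

An observation that disagrees with its neighbours (an outlier candidate) shifts the smoothed marginals more when removed, so it receives a larger influence score.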
On improving the forecast accuracy of the hidden Markov model
The forecast accuracy of a hidden Markov model (HMM) may be low due, first, to the measure of forecast accuracy being ignored in the parameter-estimation method and, second, to overfitting caused by the large number of parameters that must be estimated. A general approach to forecasting is described which aims to resolve these two problems and so improve the forecast accuracy of the HMM. First, the application of extremum estimators to the HMM is proposed. Extremum estimators aim to improve the forecast accuracy of the HMM by minimising an estimate of the forecast error on the observed data. The forecast accuracy is measured by a score function, and the use of some general classes of score functions is proposed. This approach contrasts with the standard use of a minus log-likelihood score function. Second, penalised estimation for the HMM is described. The aim of penalised estimation is to reduce overfitting and so increase the forecast accuracy of the HMM. Penalties on both the state-dependent distribution parameters and the transition probability matrix are proposed. In addition, a number of cross-validation approaches for tuning the penalty function are investigated. Empirical assessment of the proposed approach on both simulated and real data demonstrated that, in terms of forecast accuracy, penalised HMMs fitted using extremum estimators generally outperformed unpenalised HMMs fitted using maximum likelihood.
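A penalised extremum-estimation objective of this flavour can be sketched as follows. This is an illustrative simplification, not the paper's formulation: a Gaussian HMM with known common variance, squared one-step-ahead forecast error as the score function, and a ridge penalty shrinking the state means together; the paper considers broader classes of scores and penalties, including penalties on the transition probability matrix.

```python
import numpy as np

def penalised_forecast_score(y, pi, A, mu, sigma, lam):
    """Mean squared one-step-ahead forecast error of a Gaussian HMM plus a
    ridge penalty on the state means (an illustrative penalty choice).
    An extremum estimator would minimise this over (pi, A, mu)."""
    sq_err = []
    filt = pi.copy()                       # filtered state distribution
    for t, yt in enumerate(y):
        # state-dependent Gaussian likelihoods of the new observation
        like = np.exp(-0.5 * ((yt - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        if t > 0:
            sq_err.append((yt - pred @ mu) ** 2)   # forecast made before seeing yt
        filt = filt * like
        filt /= filt.sum()
        pred = filt @ A                    # one-step-ahead state distribution
    penalty = lam * np.sum((mu - mu.mean()) ** 2)
    return float(np.mean(sq_err) + penalty)
```

In practice one would hand this objective to a numerical optimiser over the model parameters; with lam = 0 it reduces to an unpenalised forecast-error extremum estimator.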
Hierarchically-coupled hidden Markov models for learning kinetic rates from single-molecule data
We address the problem of analyzing sets of noisy time-varying signals that
all report on the same process but confound straightforward analyses due to
complex inter-signal heterogeneities and measurement artifacts. In particular
we consider single-molecule experiments which indirectly measure the distinct
steps in a biomolecular process via observations of noisy time-dependent
signals such as a fluorescence intensity or bead position. Straightforward
hidden Markov model (HMM) analyses attempt to characterize such processes in
terms of a set of conformational states, the transitions that can occur between
these states, and the associated rates at which those transitions occur; but
require ad-hoc post-processing steps to combine multiple signals. Here we
develop a hierarchically coupled HMM that allows experimentalists to deal with
inter-signal variability in a principled and automatic way. Our approach is a
generalized expectation maximization hyperparameter point estimation procedure
with variational Bayes at the level of individual time series that learns a
single interpretable representation of the overall data-generating process.
Toward autism recognition using hidden Markov models
Master of Science, Department of Computing and Information Sciences. David A. Gustafson. The use of hidden Markov models in autism recognition and analysis is investigated.
More specifically, we would like to be able to determine a person's level of autism (AS, HFA,
MFA, LFA) using hidden Markov models trained on observations taken from a subject's
behavior in an experiment. A preliminary model is described that includes the three mental
states self-absorbed, attentive, and joint-attentive. Furthermore, observations are included
that are more or less indicative of each of these states. Two experiments are described,
the first on a single subject and the second on two subjects. Data were collected from one
individual in the second experiment, observations were prepared as input to hidden
Markov models, and the resulting models were studied. Several questions
subsequently arose and tests, written in Java using the JaHMM hidden Markov model tool-
kit, were conducted to learn more about the hidden Markov models being used as autism
recognizers and the training algorithms being used to train them. The tests are described
along with the corresponding results and implications. Finally, suggestions are made for
future work. It turns out that we are not yet able to produce hidden Markov models that are
indicative of a person's level of autism; the problems encountered are discussed, and the
suggested future work is intended to further investigate the use of hidden Markov models
in autism recognition.
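The state-decoding step that such a recognizer relies on can be sketched with a Viterbi decoder. This is a hedged stand-in: the thesis used the JaHMM toolkit in Java, and every parameter below is invented for illustration; only the three-state structure (self-absorbed, attentive, joint-attentive) comes from the model described above.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden-state path of a discrete HMM (log domain)."""
    T, N = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[:, obs[0]])   # log delta at t = 0
    back = np.zeros((T, N), dtype=int)         # backpointers
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)     # scores[i, j]: prev i -> cur j
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical 3-state model: 0 = self-absorbed, 1 = attentive, 2 = joint-attentive.
pi = np.array([0.8, 0.1, 0.1])
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
B = np.array([[0.9, 0.05, 0.05],   # each state mostly emits its own behavioral cue
              [0.05, 0.9, 0.05],
              [0.05, 0.05, 0.9]])
states = viterbi(pi, A, B, [0, 0, 1, 2, 2])
```

Given a coded sequence of observed behaviors, the decoder recovers the mental-state path most consistent with the model; training those parameters from data is the harder step the thesis reports difficulties with.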
Non-Gaussian data modeling with hidden Markov models
In 2015, 2.5 quintillion bytes of data were generated daily worldwide, of which 90% were unstructured data that do not follow any pre-defined model. These data can be found in a great variety of formats, among them texts, images, audio tracks, and videos. With appropriate techniques, this massive amount of data is a goldmine from which one can extract a variety of meaningful embedded information. Among those techniques, machine learning algorithms allow multiple processing possibilities, from compact data representation, to data clustering, classification, analysis, and synthesis, to the detection of outliers. Data modeling is the first step for performing any of these tasks, and the accuracy and reliability of this initial step are thus crucial for subsequently building up a complete data processing framework. The principal motivation behind my work is the over-use of the Gaussian assumption for data modeling in the literature. Though this assumption is probably the best to make when no information about the data to be modeled is available, in most cases studying a few data properties would make other distributions a better assumption. In this thesis, I focus on proportional data that are most commonly known in the form of histograms and that naturally arise in a number of situations, such as in bag-of-words methods. These data are non-Gaussian, and their modeling with distributions belonging to the Dirichlet family, which have common properties, is expected to be more accurate. The models I focus on are hidden Markov models, well known for their ability to easily handle dynamic ordered multivariate data. They have been shown to be very effective in numerous fields for various applications over the last 30 years and have especially become a cornerstone in speech processing. Despite their extensive use in almost all computer vision areas, they are still mainly suited to Gaussian data modeling.
I propose here to theoretically derive different approaches for learning hidden Markov models based on mixtures of Dirichlet, generalized Dirichlet, and Beta-Liouville distributions, as well as on mixed data, and for applying them to real-world situations. Expectation-Maximization and variational learning approaches are studied and compared over several data sets, specifically for the task of detecting and localizing unusual events. Hybrid HMMs are proposed to model mixed data with the goal of detecting changes in satellite images corrupted by different noises. Finally, several parametric distances for comparing Dirichlet- and generalized Dirichlet-based HMMs are proposed and extensively tested for assessing their robustness. My experimental results show situations in which such models are worth using, but also reveal their strengths and limitations.
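The core modeling choice, replacing Gaussian state-dependent densities with Dirichlet ones for proportional data, can be sketched via the scaled forward recursion. This is a minimal illustration under invented toy parameters; the function names and the specific two-state setup are mine, not the thesis's.

```python
import numpy as np
from math import lgamma

def dirichlet_logpdf(x, alpha):
    """Log-density of a Dirichlet distribution at a proportional vector x
    (entries of x strictly positive and summing to 1)."""
    return (lgamma(alpha.sum()) - sum(lgamma(a) for a in alpha)
            + float(np.sum((alpha - 1.0) * np.log(x))))

def hmm_dirichlet_loglik(pi, A, alphas, X):
    """Log-likelihood of a sequence of proportional vectors X under an HMM
    whose state-dependent densities are Dirichlet (scaled forward pass)."""
    pred = pi.copy()                    # one-step-ahead state distribution
    loglik = 0.0
    for x in X:
        like = np.exp([dirichlet_logpdf(x, a) for a in alphas])
        joint = pred * like
        c = joint.sum()                 # P(x_t | x_{1:t-1})
        loglik += np.log(c)
        pred = (joint / c) @ A          # filter, then predict the next state
    return float(loglik)
```

Each row of X is a histogram living on the simplex, exactly the kind of data a Gaussian emission handles poorly, which is the thesis's motivating point.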
A Framework for Discovery and Diagnosis of Behavioral Transitions in Event-streams
Data stream mining techniques can be used to track user behaviors as they attempt to achieve their goals. Quality metrics over stream-mined models identify potential changes in user goal attainment. When the quality of some data-mined models varies significantly from nearby models, as defined by quality metrics, the user's behavior is automatically flagged as a potentially significant behavioral change. Decision tree, sequence pattern, and Hidden Markov modeling are used in this study. These three types of modeling can expose different aspects of a user's behavior. In the case of decision tree modeling, the specific changes in user behavior can be automatically characterized by differencing the data-mined decision-tree models. Sequence pattern modeling can shed light on how the user changes his sequence of actions, and Hidden Markov modeling can identify learning transition points. This research describes how model-quality monitoring and these three types of modeling, as a generic framework, can aid recognition and diagnosis of behavioral changes in a case study of cognitive rehabilitation via emailing. The data stream mining techniques mentioned are used to monitor patient goals as part of a clinical plan to aid cognitive rehabilitation. In this context, real-time data mining aids clinicians in tracking user behaviors as patients attempt to achieve their goals. This generic framework can be widely applicable to other real-time data-intensive analysis problems. To illustrate this, similar Hidden Markov modeling is used to analyze the transactional behavior of a telecommunication company for fraud detection. Fraud can similarly be considered a potentially significant change in transaction behavior.
Network Traffic Analysis Using Stochastic Grammars
Network traffic analysis is widely used to infer information from Internet traffic. This is possible even if the traffic is encrypted. Previous work uses traffic characteristics, such as port numbers, packet sizes, and frequency, without looking for more subtle patterns in the network traffic. In this work, we use stochastic grammars, hidden Markov models (HMMs) and probabilistic context-free grammars (PCFGs), as pattern recognition tools for traffic analysis. HMMs are widely used for pattern recognition and detection. We use an HMM inference approach. With inferred HMMs, we use confidence intervals (CIs) to detect whether a data sequence matches the HMM. To compare HMMs, we define a normalized Markov metric. A statistical test is used to determine model equivalence. Our metric systematically removes the least likely events from both HMMs until the remaining models are statistically equivalent. This defines the distance between models. We extend the use of HMMs to PCFGs, which have more expressive power. We estimate PCFG production probabilities from data. A statistical test is used for detection. We present three applications of HMM and PCFG detection to network traffic analysis. First, we infer the presence of protocol tunneling through the Tor (The Onion Router) anonymization network. The Markov metric quantifies the similarity of network traffic HMMs in Tor to identify the protocol. It also measures communication noise in the Tor network. Second, we use HMMs to detect centralized botnet traffic. We infer HMMs from botnet traffic data and detect botnet infections. Experimental results show that HMMs can accurately detect Zeus botnet traffic. To hide their locations better, newer botnets have P2P control structures. Hierarchical P2P botnets contain recursive and hierarchical patterns. Third, we use PCFGs to detect P2P botnet traffic. Experimentation on real-world traffic data shows that PCFGs can accurately differentiate between P2P botnet traffic and normal Internet traffic.
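The CI-based matching step can be sketched as follows. This is an assumption-laden illustration, not the dissertation's construction: it scores sequences by per-symbol log-likelihood under a discrete HMM and flags a match when the test score falls inside a normal-approximation interval built from training-sequence scores.

```python
import numpy as np

def avg_loglik(pi, A, B, obs):
    """Per-symbol log-likelihood of a discrete observation sequence under
    an HMM, computed with the scaled forward recursion."""
    pred = pi.copy()
    ll = 0.0
    for o in obs:
        joint = pred * B[:, o]
        c = joint.sum()              # P(o_t | o_{1:t-1})
        ll += np.log(c)
        pred = (joint / c) @ A
    return ll / len(obs)

def matches_model(pi, A, B, train_seqs, test_seq, z=1.96):
    """Flag test_seq as matching the HMM if its per-symbol log-likelihood
    lies within a z-sigma interval around the training-sequence scores
    (a simple normal-approximation CI, assumed here for illustration)."""
    scores = np.array([avg_loglik(pi, A, B, s) for s in train_seqs])
    m, s = scores.mean(), scores.std(ddof=1)
    return bool(abs(avg_loglik(pi, A, B, test_seq) - m) <= z * s)
```

Traffic whose symbol sequence (e.g., quantized packet sizes or timings) scores far outside the interval is treated as not generated by the inferred model, which is the detection decision the applications above rely on.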
Bayesian modelling of dependent data
Bayesian approaches to statistical modelling allow practitioners to coherently update prior beliefs based on observed data, and encode these updated beliefs in the posterior distribution. The resulting inferences are then supported by a meaningful measure of uncertainty, and as such the Bayesian statistical paradigm is attractive in a number of applications.
In a variety of these applications, however, the assumption that data is conditionally independent, given a common data generating process, is simply not well-founded. As such, it is essential that Bayesian approaches, which nevertheless retain their rigour, coherence, and practical appeal, be formulated and studied in these contexts.
In this thesis, we examine the Bayesian modelling of dependent data, both in theory and in practice. We place particular focus on semiparametric Hidden Markov models. These models are based upon the idea that many processes may possess a simple, Markovian, latent structure underlying dependence in data. This latent structure cannot be observed directly, and the relationship between this latent structure, and the observations through which our posterior distribution is formed, may be very complex and therefore not easily captured by finite dimensional models.
We study the theoretical properties of such models from a frequentist perspective, which may equally be viewed from the fully subjective Bayesian perspective as a notion of robustness to prior choice. In doing so, we propose the application of a modular Bayesian approach, the cut posterior, to semiparametric inference more broadly, showing its specific benefit in the model we study.
Alongside this line of work, we pursue a second, thematically related, line of work relating to Bayesian modelling of conflict data, which exhibits dependencies both in space and time. Much like the HMMs we study, such models can be interpreted as capturing an underlying structure in the data, and we will see that inference in such models provides us with interpretable and meaningful insights into conflict events