On Prediction Using Variable Order Markov Models
This paper is concerned with algorithms for prediction of discrete sequences
over a finite alphabet, using variable order Markov models. The class of such
algorithms is large and in principle includes any lossless compression
algorithm. We focus on six prominent prediction algorithms, including Context
Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic
Suffix Trees (PSTs). We discuss the properties of these algorithms and compare
their performance using real life sequences from three domains: proteins,
English text and music pieces. The comparison is made with respect to
prediction quality as measured by the average log-loss. We also compare
classification algorithms based on these predictors with respect to a number of
large protein classification tasks. Our results indicate that a "decomposed"
CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in
sequence prediction tasks. Somewhat surprisingly, a different algorithm, which
is a modification of the Lempel-Ziv compression algorithm, significantly
outperforms all algorithms on the protein classification problems.
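A minimal sketch of the back-off idea behind such variable-order predictors, evaluated with the average log-loss metric the abstract mentions. This is a simplified fallback scheme with Laplace smoothing, not the exact PPM escape mechanism or CTW mixture; all function names are illustrative:

```python
import math
from collections import defaultdict

def train_counts(seq, max_order):
    """Count (context, symbol) pairs for every context length up to max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, sym in enumerate(seq):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[tuple(seq[i - k:i])][sym] += 1
    return counts

def predict(counts, history, alphabet, max_order):
    """Back off from the longest matching context to shorter ones
    (a crude stand-in for PPM's escape probabilities)."""
    for k in range(min(max_order, len(history)), -1, -1):
        ctx = tuple(history[len(history) - k:])
        if ctx in counts:
            total = sum(counts[ctx].values())
            # Laplace smoothing so every symbol gets nonzero probability
            return {s: (counts[ctx][s] + 1) / (total + len(alphabet))
                    for s in alphabet}
    return {s: 1 / len(alphabet) for s in alphabet}

def average_log_loss(counts, seq, alphabet, max_order):
    """Average log-loss in bits per symbol, the paper's evaluation metric."""
    loss = 0.0
    for i, sym in enumerate(seq):
        probs = predict(counts, seq[:i], alphabet, max_order)
        loss -= math.log2(probs[sym])
    return loss / len(seq)
```

A lower average log-loss means the model assigns higher probability to the symbols that actually occur, which is why it doubles as a lossless-compression measure.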
Improving location prediction services for new users with probabilistic latent semantic analysis
Location prediction systems that attempt to determine the mobility patterns of individuals in their daily lives have become increasingly common in recent years. Approaches to this prediction task include eigenvalue decomposition [5], non-linear time series analysis of arrival times [10], and variable order Markov models [1]. However, these approaches all assume sufficient sets of training data. For new users, by definition, such data is typically not available, leading to poor predictive performance. Given that mobility is a highly personal behaviour, this represents a significant barrier to entry. Against this background, we present a novel framework to enhance prediction using information about the mobility habits of existing users. At the core of the framework is a hierarchical Bayesian model, a type of probabilistic latent semantic analysis [7], representing the intuition that the temporal features of a new user's location habits are likely to be similar to those of an existing user in the system. We evaluate this framework on the real-life location habits of 38 users in the Nokia Lausanne dataset, showing that accuracy is improved by 16%, relative to the state of the art, when predicting the next location of new users.
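The cold-start intuition above can be sketched with a nearest-neighbour stand-in: represent each user by an hour-of-day histogram of their visits and reuse the model of the most similar existing user. This is only an illustration of the idea; the paper instead infers a latent mixture over users with a hierarchical Bayesian (PLSA-style) model, and the `(hour, location)` visit format is a hypothetical choice:

```python
from math import sqrt

def temporal_profile(visits, n_bins=24):
    """Hour-of-day histogram of a user's check-ins, a crude stand-in for the
    temporal features modelled in the paper. visits: list of (hour, location)."""
    hist = [0.0] * n_bins
    for hour, _loc in visits:
        hist[hour % n_bins] += 1
    total = sum(hist)
    return [h / total for h in hist]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def borrow_model(new_user_visits, existing_users):
    """Cold-start fallback: pick the existing user whose temporal profile is
    closest to the new user's, then reuse that user's mobility model."""
    p = temporal_profile(new_user_visits)
    return max(existing_users,
               key=lambda u: cosine(p, temporal_profile(existing_users[u])))
```

A hard nearest-neighbour choice like this is brittle; the appeal of the latent-variable formulation is that a new user can be explained as a soft mixture of several existing users' habits.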
On the Inability of Markov Models to Capture Criticality in Human Mobility
We examine the non-Markovian nature of human mobility by exposing the
inability of Markov models to capture criticality in human mobility. In
particular, the assumed Markovian nature of mobility was used to establish a
theoretical upper bound on the predictability of human mobility (expressed as a
minimum error probability limit), based on temporally correlated entropy. Since
its inception, this bound has been widely used and empirically validated using
Markov chains. We show that recurrent-neural architectures can achieve
significantly higher predictability, surpassing this widely used upper bound.
In order to explain this anomaly, we shed light on several underlying
assumptions in previous research works that have resulted in this bias. By
evaluating the mobility predictability on real-world datasets, we show that
human mobility exhibits scale-invariant long-range correlations, bearing
similarity to a power-law decay. This is in contrast to the initial assumption
that human mobility follows an exponential decay. This assumption of
exponential decay coupled with Lempel-Ziv compression in computing Fano's
inequality has led to an inaccurate estimation of the predictability upper
bound. We show that this approach inflates the entropy, consequently lowering
the upper bound on human mobility predictability. We finally highlight that
this approach tends to overlook long-range correlations in human mobility. This
explains why recurrent-neural architectures that are designed to handle
long-range structural correlations surpass the previously computed upper bound
on mobility predictability.
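The pipeline the abstract critiques can be sketched in two steps: estimate the entropy rate of a location sequence with a Lempel-Ziv estimator, then solve Fano's inequality for the maximum predictability. This is a minimal sketch of that standard construction, assuming at least two distinct locations (N ≥ 2); the abstract's point is that the entropy estimate is inflated when the sequence has long-range correlations:

```python
import math

def _contains(hay, needle):
    return any(hay[j:j + len(needle)] == needle
               for j in range(len(hay) - len(needle) + 1))

def lz_entropy(seq):
    """Lempel-Ziv entropy-rate estimate in bits/symbol: n*log2(n) / sum(L_i),
    where L_i is the length of the shortest substring starting at position i
    that has not appeared earlier in the sequence."""
    n = len(seq)
    lam_sum = 0
    for i in range(n):
        k = 1
        while i + k <= n and _contains(seq[:i], seq[i:i + k]):
            k += 1
        lam_sum += k
    return n * math.log2(n) / lam_sum

def max_predictability(S, N):
    """Solve S = H(Pi) + (1 - Pi) * log2(N - 1) for Pi by bisection, where H is
    the binary entropy. This is the Fano-based predictability upper bound;
    N is the number of distinct locations (assumed >= 2)."""
    def f(p):
        h = 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + (1 - p) * math.log2(N - 1)
    lo, hi = 1.0 / N, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > S:   # f is decreasing on [1/N, 1]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Since the bound decreases as the entropy estimate grows, any systematic overestimate of the entropy (the abstract's claim about the exponential-decay assumption) directly lowers the reported predictability ceiling.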
Evaluating Variable Length Markov Chain Models for Analysis of User Web Navigation Sessions
Markov models have been widely used to represent and analyse user web
navigation data. In previous work we have proposed a method to dynamically
extend the order of a Markov chain model and a complementary method for
assessing the predictive power of such a variable length Markov chain. Herein,
we review these two methods and propose a novel method for measuring the
ability of a variable length Markov model to summarise user web navigation
sessions up to a given length. While the summarisation ability of a model is
important to enable the identification of user navigation patterns, the ability
to make predictions is important in order to foresee the next link choice of a
user after following a given trail so as, for example, to personalise a web
site. We present an extensive experimental evaluation providing strong evidence
that prediction accuracy increases linearly with summarisation ability.
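One way to picture "dynamically extending the order" is to keep a longer context only where it meaningfully changes the next-page distribution. The sketch below uses a KL-divergence threshold for that test; this is an illustrative heuristic, not the paper's actual extension criterion, and all names are hypothetical:

```python
import math
from collections import defaultdict, Counter

def fit_vlmc(sessions, kl_threshold=0.1):
    """Fit a first-order chain over navigation sessions, then keep a
    second-order context only when it shifts the next-page distribution
    (KL divergence above a threshold). A crude sketch of dynamic order
    extension; the paper's method differs in detail."""
    first = defaultdict(Counter)
    second = defaultdict(Counter)
    for s in sessions:
        for i in range(1, len(s)):
            first[s[i - 1]][s[i]] += 1
            if i >= 2:
                second[(s[i - 2], s[i - 1])][s[i]] += 1
    model = {}
    for ctx, cnt in second.items():
        base = first[ctx[1]]
        total, base_total = sum(cnt.values()), sum(base.values())
        kl = sum((c / total) * math.log2((c / total) / (base[p] / base_total))
                 for p, c in cnt.items())
        if kl > kl_threshold:
            model[ctx] = cnt          # the longer context earns its keep
    for page, cnt in first.items():
        model[(page,)] = cnt          # first-order contexts always kept
    return model

def predict_next(model, trail):
    """Predict the next link using the longest stored context that matches
    the end of the user's trail."""
    for k in (2, 1):
        ctx = tuple(trail[-k:])
        if len(trail) >= k and ctx in model:
            return model[ctx].most_common(1)[0][0]
    return None
```

In the usage below, the first-order statistics from page `a` are ambiguous, but the retained second-order contexts disambiguate the next link.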
Nonuniform Markov models
A statistical language model assigns probability to strings of arbitrary
length. Unfortunately, it is not possible to gather reliable statistics on
strings of arbitrary length from a finite corpus. Therefore, a statistical
language model must decide that each symbol in a string depends on at most a
small, finite number of other symbols in the string. In this report we propose
a new way to model conditional independence in Markov models. The central
feature of our nonuniform Markov model is that it makes predictions of varying
lengths using contexts of varying lengths. Experiments on the Wall Street
Journal reveal that the nonuniform model performs slightly better than the
classic interpolated Markov model. This result is somewhat remarkable because
both models contain identical numbers of parameters whose values are estimated
in a similar manner. The only difference between the two models is how they
combine the statistics of longer and shorter strings.
Keywords: nonuniform Markov model, interpolated Markov model, conditional
independence, statistical language model, discrete time series.
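For reference, the classic interpolated Markov model that the nonuniform model is compared against blends conditional probabilities from contexts of every length with fixed weights. A minimal sketch with Laplace smoothing and hypothetical function names (the paper's nonuniform model instead varies both the context length and the prediction length):

```python
from collections import defaultdict, Counter

def train(seq, max_order):
    """One count table per context length k = 0..max_order."""
    counts = [defaultdict(Counter) for _ in range(max_order + 1)]
    for i, sym in enumerate(seq):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[k][tuple(seq[i - k:i])][sym] += 1
    return counts

def interpolated_prob(counts_by_order, history, sym, alphabet, weights):
    """P(sym | history) as a fixed-weight mixture of the order-k estimates.
    weights must sum to 1 so the result is a valid distribution."""
    p = 0.0
    for k, w in enumerate(weights):
        ctx = tuple(history[len(history) - k:]) if k <= len(history) else None
        if ctx is not None and ctx in counts_by_order[k]:
            cnt = counts_by_order[k][ctx]
            # Laplace-smoothed order-k estimate
            p += w * (cnt[sym] + 1) / (sum(cnt.values()) + len(alphabet))
        else:
            p += w / len(alphabet)   # unseen context: fall back to uniform
    return p
```

Both this model and the nonuniform variant spend the same parameter budget; as the abstract notes, they differ only in how the statistics of longer and shorter strings are combined.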
Housing Market Crash Prediction Using Machine Learning and Historical Data
The 2008 housing crisis was caused by faulty banking policies and the use of credit derivatives of mortgages for investment purposes. In this project, we look into datasets that are the markers of a typical housing crisis. Using those datasets we build three machine learning models: linear regression, a Hidden Markov Model (HMM), and a Long Short-Term Memory (LSTM) network. After building the models, we carried out a comparative study of the predictions made by each. The linear regression model did not predict a housing crisis; instead, it showed that house prices would rise steadily, and its R-squared score is 0.76. The Hidden Markov Model predicted a fall in house prices, with an R-squared score of 0.706. Lastly, the LSTM showed that house prices would fall briefly but stabilize after that, with a fall less sharp than the one predicted by the HMM; its R-squared score is 0.9, the highest among the three models. Although the R-squared score does not say how accurate a model's forecasts are, it does say how closely a model fits the data, and by this measure the LSTM fits the data best. As the dataset used in all the models is the same, it is safe to say the prediction made by the LSTM is better than the others.
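The fit metric used to compare the three models is the coefficient of determination, R² = 1 − SS_res / SS_tot. A minimal implementation, illustrating the abstract's caveat that R² measures goodness of fit rather than forecasting accuracy:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.
    1.0 means the predictions match the data exactly; 0.0 means the model
    does no better than always predicting the mean of the data."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

Note that a constant mean predictor scores exactly 0, which is why a high R² on historical prices says the model tracks the series closely, not that its out-of-sample crash forecast is correct.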
Retrospective Higher-Order Markov Processes for User Trails
Users form information trails as they browse the web, checkin with a
geolocation, rate items, or consume media. A common problem is to predict what
a user might do next for the purposes of guidance, recommendation, or
prefetching. First-order and higher-order Markov chains have been widely used
methods to study such sequences of data. First-order Markov chains are easy to
estimate, but lack accuracy when history matters. Higher-order Markov chains,
in contrast, have too many parameters and suffer from overfitting the training
data. Fitting these parameters with regularization and smoothing only offers
mild improvements. In this paper we propose the retrospective higher-order
Markov process (RHOMP) as a low-parameter model for such sequences. This model
is a special case of a higher-order Markov chain where the transitions depend
retrospectively on a single history state instead of an arbitrary combination
of history states. There are two immediate computational advantages: the number
of parameters is linear in the order of the Markov chain and the model can be
fit to large state spaces. Furthermore, by providing a specific structure to
the higher-order chain, RHOMPs improve the model accuracy by efficiently
utilizing history states without risks of overfitting the data. We demonstrate
how to estimate a RHOMP from data and we demonstrate the effectiveness of our
method on various real application datasets spanning geolocation data, review
sequences, and business locations. The RHOMP model uniformly outperforms
higher-order Markov chains, Kneser-Ney regularization, and tensor
factorizations in terms of prediction accuracy.
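The structural idea can be sketched as a mixture over single history states: the next-state distribution is a weighted combination of one first-order transition table per lag, so the parameter count grows linearly with the order. This simplified sketch fixes the lag weights by hand, whereas the paper fits them jointly from data; all names are illustrative:

```python
from collections import defaultdict, Counter

def fit_rhomp(trails, order):
    """Estimate one first-order transition table per lag: tables[i] maps the
    state i+1 steps back to counts of the next state. Parameter count is
    linear in `order`, unlike a full higher-order chain."""
    tables = [defaultdict(Counter) for _ in range(order)]
    for t in trails:
        for j in range(order, len(t)):
            for i in range(order):
                tables[i][t[j - 1 - i]][t[j]] += 1
    return tables

def rhomp_prob(tables, history, sym, weights):
    """P(next = sym | history) as a weighted mixture over the per-lag tables;
    weights should sum to 1 (fixed here for simplicity, fitted in the paper)."""
    p = 0.0
    for i, w in enumerate(weights):
        cnt = tables[i].get(history[-1 - i], Counter())
        total = sum(cnt.values())
        p += w * (cnt[sym] / total if total else 0.0)
    return p
```

Because each lag contributes through a single history state, the model gets the benefit of history without the combinatorial parameter blow-up that makes full higher-order chains overfit.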