Sequence Modelling For Analysing Student Interaction with Educational Systems
The analysis of log data generated by online educational systems is an
important task for improving the systems, and furthering our knowledge of how
students learn. This paper uses previously unseen log data from Edulab, the
largest provider of digital learning for mathematics in Denmark, to analyse the
sessions of its users; 1.08 million student sessions are extracted from a
subset of the data. We propose to model students as a distribution of
different underlying student behaviours, where the sequence of actions from
each session belongs to an underlying student behaviour. We model student
behaviour as Markov chains, such that a student is modelled as a distribution
of Markov chains, which are estimated using a modified k-means clustering
algorithm. The resulting Markov chains are readily interpretable, and in a
qualitative analysis around 125,000 student sessions are identified as
exhibiting unproductive student behaviour. Based on our results, this student
representation is promising, especially for educational systems offering many
different learning usages, and offers an alternative to the common approach in
the literature of modelling student behaviour as a single Markov chain.
Comment: The 10th International Conference on Educational Data Mining, 2017
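The clustering idea the abstract describes can be sketched as a hard-assignment, k-means-style alternation: assign each session to the cluster Markov chain that best explains its action sequence, then re-estimate each chain from its assigned sessions. This is a minimal illustration under assumed details (integer action ids, Laplace smoothing, likelihood-based assignment); the paper's actual algorithm may differ.

```python
import numpy as np

def transition_counts(session, n_actions):
    """Count action-to-action transitions in one session (a list of action ids)."""
    C = np.zeros((n_actions, n_actions))
    for a, b in zip(session, session[1:]):
        C[a, b] += 1
    return C

def log_likelihood(session, P):
    """Log-probability of a session's transitions under transition matrix P."""
    return sum(np.log(P[a, b]) for a, b in zip(session, session[1:]))

def cluster_sessions(sessions, n_actions, k, n_iter=20, seed=0, alpha=1.0):
    """k-means-style clustering of sessions into k Markov chains:
    alternate between reassigning sessions to their best-fitting chain
    and re-estimating each chain from its assigned sessions."""
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, k, size=len(sessions))
    counts = [transition_counts(s, n_actions) for s in sessions]
    for _ in range(n_iter):
        # Update step: re-estimate each cluster's chain (alpha-smoothed counts)
        P = []
        for j in range(k):
            C = sum((c for c, a in zip(counts, assign) if a == j),
                    np.full((n_actions, n_actions), alpha))
            P.append(C / C.sum(axis=1, keepdims=True))
        # Assignment step: each session moves to the chain with highest likelihood
        new = np.array([int(np.argmax([log_likelihood(s, Pj) for Pj in P]))
                        for s in sessions])
        if np.array_equal(new, assign):
            break
        assign = new
    return assign, P
```

A student can then be represented as the empirical distribution of their sessions' cluster labels, rather than as a single Markov chain.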
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least squares temporal difference algorithm, LSTD(λ). We establish for the discounted cost criterion that the off-policy LSTD(λ) converges almost surely under mild, minimal conditions. We also analyze other convergence and boundedness properties of the iterates involved in the algorithm, and based on them, we suggest a modification in its practical implementation. Our analysis uses theories of both finite space Markov chains and Markov chains on topological spaces, in particular, the theory of e-chains.
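For context, the estimator the abstract analyzes builds on standard LSTD(λ): accumulate a matrix A and vector b along a trajectory using eligibility traces, then solve Aθ = b for the linear value-function weights. The sketch below is the plain on-policy version; the off-policy variant studied in the paper additionally weights the traces with importance-sampling ratios, and the function name and regularization term here are illustrative.

```python
import numpy as np

def lstd_lambda(transitions, phi, gamma, lam, reg=1e-8):
    """On-policy LSTD(lambda) over one trajectory.
    transitions: list of (state, reward, next_state) tuples.
    phi: feature map, state -> 1-D feature vector.
    Returns theta solving (A + reg*I) theta = b."""
    d = len(phi(transitions[0][0]))
    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace
    for s, r, s2 in transitions:
        z = gamma * lam * z + phi(s)                 # decay and bump the trace
        A += np.outer(z, phi(s) - gamma * phi(s2))   # temporal-difference features
        b += z * r
    return np.linalg.solve(A + reg * np.eye(d), b)
```

With tabular (one-hot) features and a trajectory from a deterministic two-state cycle with reward 1 per step and γ = 0.9, the solution recovers the true values V = 1/(1 − γ) = 10 for both states.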
Fast MCMC sampling for Markov jump processes and extensions
Markov jump processes (or continuous-time Markov chains) are a simple and
important class of continuous-time dynamical systems. In this paper, we tackle
the problem of simulating from the posterior distribution over paths in these
models, given partial and noisy observations. Our approach is an auxiliary
variable Gibbs sampler, and is based on the idea of uniformization. This sets
up a Markov chain over paths by alternately sampling a finite set of virtual
jump times given the current path and then sampling a new path given the set of
extant and virtual jump times using a standard hidden Markov model forward
filtering-backward sampling algorithm. Our method is exact and does not involve
approximations like time-discretization. We demonstrate how our sampler extends
naturally to MJP-based models like Markov-modulated Poisson processes and
continuous-time Bayesian networks and show significant computational benefits
over state-of-the-art MCMC samplers for these models.
Comment: Accepted at the Journal of Machine Learning Research (JMLR)
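The core ingredient of the sampler, uniformization, can be illustrated on its own: for a rate matrix Q and any Ω ≥ max_i |Q_ii|, the jump process is equivalent to a discrete-time chain with transition matrix B = I + Q/Ω whose jump times arrive as a Poisson(Ω) process, so the transient distribution satisfies p(t) = Σ_k Pois(k; Ωt) · p(0)Bᵏ. The sketch below checks only this identity; the alternating virtual-jump / forward filtering-backward sampling steps of the full Gibbs sampler are omitted.

```python
import numpy as np
from scipy.stats import poisson

def uniformized_transition_matrix(Q, omega):
    """B = I + Q/omega: the discrete-time chain of the uniformized process."""
    return np.eye(Q.shape[0]) + Q / omega

def transient_distribution(p0, Q, t, omega=None, kmax=200):
    """p(t) = sum_k Poisson(k; omega*t) * p0 @ B^k  (uniformization series)."""
    if omega is None:
        omega = 1.1 * np.max(-np.diag(Q))  # any omega >= max_i |Q_ii| works
    B = uniformized_transition_matrix(Q, omega)
    p = np.zeros_like(p0, dtype=float)
    pk = p0.astype(float)                  # p0 @ B^k, updated incrementally
    for k in range(kmax):
        p += poisson.pmf(k, omega * t) * pk
        pk = pk @ B
    return p
```

Because the series is exact (up to truncation of the Poisson tail), no time-discretization error is introduced, which is the property the sampler in the paper exploits.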
Faster quantum mixing for slowly evolving sequences of Markov chains
Markov chain methods are remarkably successful in computational physics,
machine learning, and combinatorial optimization. The cost of such methods
often reduces to the mixing time, i.e., the time required to reach the steady
state of the Markov chain, which scales as $\delta^{-1}$, the inverse of the
spectral gap $\delta$. It has long been conjectured that quantum computers offer nearly
generic quadratic improvements for mixing problems. However, except in special
cases, quantum algorithms achieve a run-time of $\widetilde{\mathcal{O}}(\sqrt{\delta^{-1}}\sqrt{N})$ for an $N$-state chain, which introduces a costly dependence on the Markov chain size
not present in the classical case. Here, we re-address the problem of mixing of
Markov chains when these form a slowly evolving sequence. This setting is akin
to the simulated annealing setting and is commonly encountered in physics,
material sciences and machine learning. We provide a quantum memory-efficient
algorithm with a run-time of $\widetilde{\mathcal{O}}(\sqrt{\delta^{-1}}\sqrt[4]{N})$,
neglecting logarithmic terms, which is an important improvement for large state
spaces. Moreover, our algorithms output quantum encodings of distributions,
which has advantages over classical outputs. Finally, we discuss the run-time
bounds of mixing algorithms and show that, under certain assumptions, our
algorithms are optimal.
Comment: 20 pages, 2 figures
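The two classical quantities the abstract relies on, the spectral gap δ and the ~δ⁻¹ mixing time, can be illustrated numerically (this is a classical illustration only, not the quantum algorithm): δ = 1 − |λ₂| for the transition matrix's second-largest eigenvalue modulus, and the distance to stationarity contracts roughly like (1 − δ)ᵗ.

```python
import numpy as np

def spectral_gap(P):
    """delta = 1 - |lambda_2| for a row-stochastic transition matrix P."""
    moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return 1.0 - moduli[1]

# Example two-state chain: eigenvalues are 1 and 0.7, so delta = 0.3,
# and the chain needs on the order of delta^{-1} steps to mix.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([2 / 3, 1 / 3])  # stationary distribution: pi @ P == pi

p = np.array([1.0, 0.0])       # start concentrated on state 0
for _ in range(60):            # distance to pi shrinks like 0.7**t
    p = p @ P
```

For this chain δ = 0.3, so after t steps the deviation from π is of order 0.7ᵗ; 60 iterations bring p within ~10⁻⁹ of π.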