Feature Reinforcement Learning: Part I: Unstructured MDPs
General-purpose, intelligent, learning agents cycle through sequences of
observations, actions, and rewards that are complex, uncertain, unknown, and
non-Markovian. On the other hand, reinforcement learning is well-developed for
small finite state Markov decision processes (MDPs). Up to now, extracting the
right state representations out of bare observations, that is, reducing the
general agent setup to the MDP framework, is an art that involves significant
effort by designers. The primary goal of this work is to automate the reduction
process and thereby significantly expand the scope of many existing
reinforcement learning algorithms and the agents that employ them. Before we
can think of mechanizing this search for suitable MDPs, we need a formal
objective criterion. The main contribution of this article is to develop such a
criterion. I also integrate the various parts into one learning algorithm.
Extensions to more realistic dynamic Bayesian networks are developed in Part
II. The role of POMDPs is also considered there.
Comment: 24 LaTeX pages, 5 diagrams
Many Roads to Synchrony: Natural Time Scales and Their Algorithms
We consider two important time scales, the Markov and cryptic orders, that
monitor how an observer synchronizes to a finitary stochastic process. We show
how to compute these orders exactly and that they are most efficiently
calculated from the epsilon-machine, a process's minimal unifilar model.
Surprisingly, though the Markov order is a basic concept from stochastic
process theory, it is not a probabilistic property of a process. Rather, it is
a topological property and, moreover, it is not computable from any
finite-state model other than the epsilon-machine. Via an exhaustive survey, we
close by demonstrating that infinite Markov and infinite cryptic orders are a
dominant feature in the space of finite-memory processes. We draw out the roles
played in statistical mechanical spin systems by these two complementary length
scales.
Comment: 17 pages, 16 figures:
http://cse.ucdavis.edu/~cmg/compmech/pubs/kro.htm. Santa Fe Institute Working
Paper 10-11-02
Learning models of plant behavior for anomaly detection and condition monitoring
Providing engineers and asset managers with a tool that can diagnose faults within transformers can greatly assist decision making on issues such as maintenance, performance, and safety. However, the onus has always been on personnel to decide accurately how serious a problem is and how urgently maintenance is required. Given the large volumes of data involved, faults may go unnoticed until serious damage has occurred. This paper proposes integrating a newly developed anomaly detection technique with an existing diagnosis system. By learning a Hidden Markov Model of healthy transformer behavior, unexpected operation, such as when a fault develops, can be flagged for attention. Faults can then be diagnosed using the existing system and maintenance scheduled as required, all at a much earlier stage than would previously have been possible.
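The abstract does not say how unexpected operation is scored, so the following is only a minimal sketch of one common way to flag anomalies with a Hidden Markov Model, not the paper's method. It assumes discrete observation symbols, model parameters that have already been learned from healthy data, and a hand-picked threshold. The names `log_likelihood`, `is_anomalous`, `START`, `TRANS`, `EMIT` and all numeric values are hypothetical.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs) under a discrete HMM."""
    n = len(start)
    # alpha[i] is proportional to P(state i, observations so far)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        total += math.log(scale)  # accumulate log of the scaling factors
        alpha = [a / scale for a in alpha]  # renormalise to avoid underflow
    return total

def is_anomalous(obs, start, trans, emit, threshold):
    """Flag a window whose per-symbol log-likelihood falls below threshold."""
    return log_likelihood(obs, start, trans, emit) / len(obs) < threshold

# Hypothetical 2-state "healthy" model over symbols 0, 1, 2 (assumed values)
START = [0.6, 0.4]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
EMIT = [[0.7, 0.29, 0.01], [0.3, 0.69, 0.01]]  # symbol 2 is rare when healthy

healthy = [0, 0, 1, 1, 0, 1, 0, 0]
faulty = [2, 2, 0, 2, 2, 1, 2, 2]

print(is_anomalous(healthy, START, TRANS, EMIT, threshold=-2.0))
print(is_anomalous(faulty, START, TRANS, EMIT, threshold=-2.0))
```

Normalising by window length keeps the threshold comparable across windows of different sizes. In practice the threshold would be chosen from the distribution of scores on held-out healthy data.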