    Feature Reinforcement Learning: Part I: Unstructured MDPs

    General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small finite state Markov decision processes (MDPs). Up to now, extracting the right state representations out of bare observations, that is, reducing the general agent setup to the MDP framework, is an art that involves significant effort by designers. The primary goal of this work is to automate the reduction process and thereby significantly expand the scope of many existing reinforcement learning algorithms and the agents that employ them. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in Part II. The role of POMDPs is also considered there. Comment: 24 LaTeX pages, 5 diagram

    Many Roads to Synchrony: Natural Time Scales and Their Algorithms

    We consider two important time scales, the Markov and cryptic orders, that monitor how an observer synchronizes to a finitary stochastic process. We show how to compute these orders exactly and that they are most efficiently calculated from the epsilon-machine, a process's minimal unifilar model. Surprisingly, though the Markov order is a basic concept from stochastic process theory, it is not a probabilistic property of a process. Rather, it is a topological property and, moreover, it is not computable from any finite-state model other than the epsilon-machine. Via an exhaustive survey, we close by demonstrating that infinite Markov and infinite cryptic orders are a dominant feature in the space of finite-memory processes. We draw out the roles played in statistical mechanical spin systems by these two complementary length scales. Comment: 17 pages, 16 figures: http://cse.ucdavis.edu/~cmg/compmech/pubs/kro.htm. Santa Fe Institute Working Paper 10-11-02

    Learning models of plant behavior for anomaly detection and condition monitoring

    Providing engineers and asset managers with a tool which can diagnose faults within transformers can greatly assist decision making on such issues as maintenance, performance and safety. However, the onus has always been on personnel to accurately decide how serious a problem is and how urgently maintenance is required. In dealing with the large volumes of data involved, it is possible that faults may not be noticed until serious damage has occurred. This paper proposes the integration of a newly developed anomaly detection technique with an existing diagnosis system. By learning a Hidden Markov Model of healthy transformer behavior, unexpected operation, such as when a fault develops, can be flagged for attention. Faults can then be diagnosed using the existing system and maintenance scheduled as required, all at a much earlier stage than would previously have been possible.
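
    The flagging idea in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the two-state model, the three discretized reading levels, every probability, and the threshold below are hypothetical values chosen for illustration, and the parameters are hard-coded rather than learned from healthy data as the abstract describes. The sketch scores a sequence with the scaled forward algorithm and flags it when its per-observation log-likelihood under the "healthy" model falls below the threshold.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM)."""
    n = len(start)
    # Initial step: prior over hidden states times emission of first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = sum(alpha)
    if total == 0.0:
        return float("-inf")
    loglik = math.log(total)
    alpha = [a / total for a in alpha]
    for symbol in obs[1:]:
        # Propagate through the transition matrix, then weight by emission.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
        total = sum(alpha)
        if total == 0.0:
            return float("-inf")
        loglik += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return loglik


def is_anomalous(obs, start, trans, emit, threshold):
    """Flag a sequence whose average log-likelihood is below threshold."""
    return log_likelihood(obs, start, trans, emit) / len(obs) < threshold


# Hypothetical "healthy" model: 2 hidden states, symbols 0=low, 1=medium, 2=high.
START = [0.6, 0.4]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
EMIT = [[0.7, 0.25, 0.05], [0.2, 0.7, 0.1]]
THRESHOLD = -1.5  # assumed value; in practice it would be tuned on held-out healthy data

healthy = [0, 0, 1, 0, 1, 1, 0, 0]
faulty = [2, 2, 2, 2, 2, 2, 2, 2]

if __name__ == "__main__":
    print(is_anomalous(healthy, START, TRANS, EMIT, THRESHOLD))
    print(is_anomalous(faulty, START, TRANS, EMIT, THRESHOLD))
```

    Dividing by the sequence length makes the score comparable across windows of different sizes, which is one simple design choice among several; a real system would also need a principled way to discretize sensor readings.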