
    Cover Tree Bayesian Reinforcement Learning

    This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high-dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with least squares policy iteration.
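
    As a rough illustration of the sampling-based loop described above, the sketch below maintains a single Bayesian linear-Gaussian model of the transition dynamics (a deliberate simplification of the paper's piecewise-linear context-tree posterior), updates it in closed form after each observed transition, and draws one plausible dynamics matrix per Thompson-sampling round. The class name BayesLinearDynamics and the toy demo are illustrative assumptions, not the authors' cover-tree implementation.

```python
import numpy as np

class BayesLinearDynamics:
    """Bayesian linear-Gaussian dynamics model s' ~ N(W s, noise_var * I),
    with an independent Gaussian prior on each row of W."""

    def __init__(self, dim, prior_var=10.0, noise_var=0.1):
        self.noise_var = noise_var
        # Sufficient statistics of the conjugate posterior (shared across rows).
        self.XtX = np.eye(dim) / prior_var     # posterior precision
        self.XtY = np.zeros((dim, dim))        # column j belongs to output dimension j

    def update(self, s, s_next):
        # Closed-form posterior update from a single observed transition.
        self.XtX += np.outer(s, s) / self.noise_var
        self.XtY += np.outer(s, s_next) / self.noise_var

    def sample_model(self, rng):
        # Thompson sampling: draw one plausible dynamics matrix from the posterior.
        cov = np.linalg.inv(self.XtX)
        mean = cov @ self.XtY                  # column j = posterior mean for output j
        rows = [rng.multivariate_normal(mean[:, j], cov) for j in range(mean.shape[1])]
        return np.vstack(rows)                 # W such that s' ≈ W @ s


# Tiny demo: recover a known linear system from noisy transitions.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.95]])
model = BayesLinearDynamics(dim=2)
s = rng.normal(size=2)
for _ in range(500):
    s_next = A_true @ s + np.sqrt(model.noise_var) * rng.normal(size=2)
    model.update(s, s_next)
    s = s_next
print(np.round(model.sample_model(rng), 2))    # posterior samples concentrate near A_true
```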

    The Neuropharmacology of Implicit Learning

    Two decades of pharmacological research on the human capacity to implicitly acquire knowledge as well as cognitive skills and procedures have yielded surprisingly few conclusive insights. We review the empirical literature on the neuropharmacology of implicit learning and evaluate the findings in the context of relevant computational models relating to neurotransmitters such as dopamine, serotonin, acetylcholine and noradrenaline. These include models of reinforcement learning, sequence production, and categorization. We conclude, based on the reviewed literature, that moderately elevated dopamine levels predict improved implicit acquisition and moderately decreased dopamine levels predict impaired implicit acquisition. These effects are most prominent in the dorsal striatum and are supported by a range of behavioral tasks in the empirical literature. Similar predictions can be made for serotonin, although there is as yet little support in the literature for serotonin involvement in classical implicit learning tasks. There is currently a lack of evidence for a role of the noradrenergic and cholinergic systems in implicit and related forms of learning. GABA modulators, including benzodiazepines, seem to affect implicit learning in a complex manner, and further research is needed. Finally, we identify allosteric AMPA receptor modulators as a potentially interesting target for future investigation of the neuropharmacology of procedural and implicit learning.

    States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning

    Reinforcement learning (RL) uses sequential experience with situations (“states”) and outcomes to assess actions. Whereas model-free RL uses this experience directly, in the form of a reward prediction error (RPE), model-based RL uses it indirectly, building a model of the state transition and outcome structure of the environment, and evaluating actions by searching this model. A state prediction error (SPE) plays a central role, reporting discrepancies between the current model and the observed state transitions. Using functional magnetic resonance imaging in humans solving a probabilistic Markov decision task, we found the neural signature of an SPE in the intraparietal sulcus and lateral prefrontal cortex, in addition to the previously well-characterized RPE in the ventral striatum. This finding supports the existence of two unique forms of learning signal in humans, which may form the basis of distinct computational strategies for guiding behavior.
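
    For readers unfamiliar with the two signals being contrasted, the snippet below computes them in their standard textbook forms on a toy three-state task: the model-free reward prediction error δ = r + γV(s') − V(s), and a model-based state prediction error 1 − T(s'|s,a) used to nudge an estimated transition model toward the observed successor. The toy task, step sizes, and update rules are generic illustrations, not the authors' experimental design.

```python
import numpy as np

gamma, alpha, eta = 0.9, 0.1, 0.1       # discount, value step size, model step size
V = np.zeros(3)                          # state values for a toy 3-state task
T = np.full((3, 2, 3), 1.0 / 3.0)        # estimated transition model T[s, a, s'], initially uniform

def model_free_update(s, r, s_next):
    """Temporal-difference update driven by the reward prediction error (RPE)."""
    rpe = r + gamma * V[s_next] - V[s]
    V[s] += alpha * rpe
    return rpe

def model_based_update(s, a, s_next):
    """Transition-model update driven by the state prediction error (SPE)."""
    spe = 1.0 - T[s, a, s_next]          # surprise about the observed successor state
    T[s, a] *= (1.0 - eta)               # decay all successor probabilities...
    T[s, a, s_next] += eta               # ...and shift probability mass toward s'
    return spe

# One rewarded transition from state 0 to state 2 under action 1.
print(model_free_update(0, r=1.0, s_next=2))   # RPE = 1.0 (the reward was not predicted)
print(model_based_update(0, a=1, s_next=2))    # SPE ≈ 0.67 (the successor was unexpected)
```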

    Predicting Policy Violations in Policy Based Proactive Systems Management

    The continuous development and advancement of networking, computing, software and web technologies have led to an explosive growth in distributed systems. To ensure better quality of service (QoS), management of large-scale distributed systems is important. The increasing complexity of distributed systems requires significantly higher levels of automation in system management. The core of autonomic computing is the ability to analyze data about the distributed system and to take actions. Such autonomic management should include some ability to anticipate potential problems and take action to avoid them; that is, it should be proactive. System management should be proactive in order to identify possible faults before they occur and before they can result in severe degradation in performance. In this thesis, our goal is to predict policy violations and take actions ahead of time in order to achieve proactive management in a policy-based system. We implemented different prediction algorithms to predict policy violations, and proactive actions are taken in the system based on the prediction decision. An adaptive proactive action approach is also introduced to increase the performance of the proactive management system.
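
    As a concrete (and heavily simplified) illustration of the predict-then-act loop described above, the sketch below extrapolates a monitored metric with a linear fit and triggers a proactive action when the policy threshold is forecast to be crossed within a short horizon. The CPU-utilisation metric, the 90% threshold, and scale_out() are hypothetical stand-ins, not the prediction algorithms or policies used in the thesis.

```python
import numpy as np

def predict_violation(history, threshold, horizon):
    """Fit a line to recent samples and test whether the policy threshold
    is predicted to be crossed within `horizon` future steps."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, deg=1)
    predicted = slope * (len(history) - 1 + horizon) + intercept
    return predicted >= threshold, predicted

def scale_out():
    # Hypothetical proactive action taken before the policy is actually violated.
    print("proactive action: adding capacity ahead of the predicted violation")

# Example: CPU utilisation (%) trending upward toward a 90% policy limit.
cpu_history = [62, 65, 69, 72, 76, 80]
violation_ahead, forecast = predict_violation(cpu_history, threshold=90, horizon=3)
if violation_ahead:
    scale_out()
print(f"forecast in 3 steps: {forecast:.1f}%")
```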