Cover Tree Bayesian Reinforcement Learning
This paper proposes an online tree-based Bayesian approach for reinforcement
learning. For inference, we employ a generalised context tree model. This
defines a distribution on multivariate Gaussian piecewise-linear models, which
can be updated in closed form. The tree structure itself is constructed using
the cover tree method, which remains efficient in high dimensional spaces. We
combine the model with Thompson sampling and approximate dynamic programming to
obtain effective exploration policies in unknown environments. The flexibility
and computational simplicity of the model render it suitable for many
reinforcement learning problems in continuous state spaces. We demonstrate this
in an experimental comparison with least-squares policy iteration.
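The exploration strategy the abstract combines with the tree model is Thompson sampling: draw a model from the posterior, then act greedily with respect to that draw. A minimal sketch of the principle, using a Bernoulli bandit with Beta posteriors rather than the paper's Gaussian piecewise-linear model (the arm probabilities and prior below are illustrative assumptions):

```python
import random

def thompson_step(successes, failures):
    """Pick an arm by sampling each arm's Beta(s+1, f+1) posterior and
    acting greedily on the sampled values."""
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda i: samples[i])

def run(true_probs, steps=5000, seed=0):
    random.seed(seed)
    n = len(true_probs)
    successes, failures = [0] * n, [0] * n
    for _ in range(steps):
        arm = thompson_step(successes, failures)
        if random.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

# Three hypothetical arms; exploration should concentrate on the best one.
successes, failures = run([0.3, 0.5, 0.7])
best = max(range(3), key=lambda i: successes[i] + failures[i])
```

Because the sampled posterior occasionally favours under-explored arms, the policy explores automatically without an explicit epsilon schedule.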
The Neuropharmacology of Implicit Learning
Two decades of pharmacologic research on the human capacity to implicitly acquire knowledge as well as cognitive skills and procedures have yielded surprisingly few conclusive insights. We review the empirical literature on the neuropharmacology of implicit learning. We evaluate the findings in the context of relevant computational models related to neurotransmitters such as dopamine, serotonin, acetylcholine and noradrenaline. These include models for reinforcement learning, sequence production, and categorization. We conclude, based on the reviewed literature, that one can predict improved implicit acquisition under moderately elevated dopamine levels and impaired implicit acquisition under moderately decreased dopamine levels. These effects are most prominent in the dorsal striatum, and are supported by a range of behavioral tasks in the empirical literature. Similar predictions can be made for serotonin, although support in the literature for serotonin involvement in classical implicit learning tasks is still lacking. There is currently a lack of evidence for a role of the noradrenergic and cholinergic systems in implicit and related forms of learning. GABA modulators, including benzodiazepines, seem to affect implicit learning in a complex manner, and further research is needed. Finally, we identify allosteric AMPA receptor modulators as a potentially interesting target for future investigation of the neuropharmacology of procedural and implicit learning.
States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning
Reinforcement learning (RL) uses sequential experience with situations (“states”) and outcomes to assess actions. Whereas model-free RL uses this experience directly, in the form of a reward prediction error (RPE), model-based RL uses it indirectly, building a model of the state transition and outcome structure of the environment, and evaluating actions by searching this model. A state prediction error (SPE) plays a central role, reporting discrepancies between the current model and the observed state transitions. Using functional magnetic resonance imaging in humans solving a probabilistic Markov decision task, we found the neural signature of an SPE in the intraparietal sulcus and lateral prefrontal cortex, in addition to the previously well-characterized RPE in the ventral striatum. This finding supports the existence of two distinct forms of learning signal in humans, which may form the basis of distinct computational strategies for guiding behavior.
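The two learning signals contrasted above can be written in a few lines. A hedged sketch (learning rate, value estimates, and the toy transition model are all illustrative assumptions, not values from the study): the RPE compares observed reward with a value estimate, while the SPE compares an observed state transition with the predicted transition distribution.

```python
alpha = 0.1  # learning rate (illustrative assumption)

# --- Model-free: RPE = observed reward - predicted value ---
V = 0.5                      # current value estimate for a state
reward = 1.0                 # observed outcome
rpe = reward - V             # reward prediction error
V = V + alpha * rpe          # TD-style value update

# --- Model-based: SPE = 1 - predicted probability of the observed transition ---
T = {"s1": 0.7, "s2": 0.3}   # predicted next-state distribution (assumed)
observed = "s2"
spe = 1.0 - T[observed]      # state prediction error
for s in T:                  # nudge the transition model toward the observation
    target = 1.0 if s == observed else 0.0
    T[s] = T[s] + alpha * (target - T[s])
```

Note that the RPE updates an estimate of *how good* a state is, while the SPE updates an estimate of *where the environment goes next*; the abstract's fMRI result is that these two signals appear in different brain regions.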
Predicting Policy Violations in Policy Based Proactive Systems Management
The continuous development and advancement of networking, computing, software and web technologies have led to explosive growth in distributed systems. To ensure better quality of service (QoS), the management of large-scale distributed systems is important. The increasing complexity of distributed systems requires significantly higher levels of automation in system management. The core of autonomic computing is the ability to analyze data about the distributed system and to take actions. Such autonomic management should include some ability to anticipate potential problems and take action to avoid them; that is, it should be proactive. System management should be proactive so that possible faults can be identified before they occur and before they result in severe degradation in performance. In this thesis, our goal is to predict policy violations and take actions ahead of time in order to achieve proactive management in a policy-based system. We implemented different prediction algorithms to predict policy violations. Based on the prediction decisions, proactive actions are taken in the system. An adaptive proactive-action approach is also introduced to increase the performance of the proactive management system.
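One simple form such a violation predictor could take is trend extrapolation: fit a linear trend to recent metric samples and flag a violation if the extrapolated value crosses the policy threshold within a lookahead window. This is a hedged sketch of the general idea, not the thesis's actual algorithm; the CPU-utilisation samples, 90% threshold, and lookahead are all illustrative assumptions.

```python
def predict_violation(samples, threshold, lookahead):
    """Least-squares linear fit over (time, value) pairs, then extrapolate
    `lookahead` steps past the last sample and compare to the threshold."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    predicted = slope * (n - 1 + lookahead) + intercept
    return predicted >= threshold, predicted

# Rising utilisation samples; a hypothetical policy caps utilisation at 90%.
violation, predicted = predict_violation(
    [62.0, 65.0, 69.0, 72.0, 76.0], threshold=90.0, lookahead=5)
```

Acting on `violation` before the threshold is actually crossed (e.g. migrating load or provisioning capacity) is what makes the management proactive rather than reactive.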