Search CORE

58,948 research outputs found

Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network

Author: Matsumoto Takazumi
Tani Jun
Publication venue: 'MDPI AG'
Publication date: 18/05/2020
Field of study

It is crucial to ask how agents can achieve goals by generating action plans using only partial models of the world acquired through habituated sensory-motor experiences. Although many existing robotics studies use a forward model framework, there are generalization issues with high degrees of freedom. The current study shows that the predictive coding (PC) and active inference (AIF) frameworks, which employ a generative model, can develop better generalization by learning a prior distribution in a low dimensional latent state space representing probabilistic structures extracted from well habituated sensory-motor trajectories. In our proposed model, learning is carried out by inferring optimal latent variables as well as synaptic weights for maximizing the evidence lower bound, while goal-directed planning is accomplished by inferring latent variables for maximizing the estimated lower bound. Our proposed model was evaluated with both simple and complex robotic tasks in simulation, which demonstrated sufficient generalization in learning with limited training data by setting an intermediate value for a regularization coefficient. Furthermore, comparative simulation results show that the proposed model outperforms a conventional forward model in goal-directed planning, due to the learned prior confining the search of motor plans within the range of habituated trajectories.Comment: 30 pages, 19 figure

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

OIST Institutional Repository

Taming Uncertainty in the Assurance Process of Self-Adaptive Systems: a Goal-Oriented Approach

Author: Caldas Ricardo Diniz
Pelliccione Patrizio
Rodrigues Genaína Nunes
Solano Gabriela Félix
Vogel Thomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Goals are first-class entities in a self-adaptive system (SAS) as they guide the self-adaptation. A SAS often operates in dynamic and partially unknown environments, which cause uncertainty that the SAS has to address to achieve its goals. Moreover, besides the environment, other classes of uncertainty have been identified. However, these various classes and their sources are not systematically addressed by current approaches throughout the life cycle of the SAS. In general, uncertainty typically makes the assurance provision of SAS goals exclusively at design time not viable. This calls for an assurance process that spans the whole life cycle of the SAS. In this work, we propose a goal-oriented assurance process that supports taming different sources (within different classes) of uncertainty from defining the goals at design time to performing self-adaptation at runtime. Based on a goal model augmented with uncertainty annotations, we automatically generate parametric symbolic formulae with parameterized uncertainties at design time using symbolic model checking. These formulae and the goal model guide the synthesis of adaptation policies by engineers. At runtime, the generated formulae are evaluated to resolve the uncertainty and to steer the self-adaptation using the policies. In this paper, we focus on reliability and cost properties, for which we evaluate our approach on the Body Sensor Network (BSN) implemented in OpenDaVINCI. The results of the validation are promising and show that our approach is able to systematically tame multiple classes of uncertainty, and that it is effective and efficient in providing assurances for the goals of self-adaptive systems

arXiv.org e-Print Archive

Chalmers Research

Network Plasticity as Bayesian Inference

Author: Habenschuss Stefan
Kappel David
Legenstein Robert
Maass Wolfgang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 20/04/2015
Field of study

General results from statistical learning theory suggest to understand not only brain computations, but also brain plasticity as probabilistic inference. But a model for that has been missing. We propose that inherently stochastic features of synaptic plasticity and spine motility enable cortical networks of neurons to carry out probabilistic inference by sampling from a posterior distribution of network configurations. This model provides a viable alternative to existing models that propose convergence of parameters to maximum likelihood values. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience, how cortical networks can generalize learned information so well to novel experiences, and how they can compensate continuously for unforeseen disturbances of the network. The resulting new theory of network plasticity explains from a functional perspective a number of experimental data on stochastic aspects of synaptic plasticity that previously appeared to be quite puzzling.Comment: 33 pages, 5 figures, the supplement is available on the author's web page http://www.igi.tugraz.at/kappe

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation

Author: Lawrence Carolin
Riezler Stefan
Sokolov Artem
Publication venue
Publication date: 01/01/2017
Field of study

The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback to translations that were predicted by another, historic SMT system. A challenge arises by the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT output space seemingly contradicts the theoretical requirements for counterfactual learning. We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. This can be achieved by additive and multiplicative control variates that avoid degenerate behavior in empirical risk minimization. Our simulation experiments show improvements of up to 2 BLEU points by counterfactual learning from deterministic bandit feedback.Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017, Copenhagen, Denmar

arXiv.org e-Print Archive

Crossref