4 research outputs found

    Risk-Sensitive Reinforcement Learning with Exponential Criteria

    Full text link
    While reinforcement learning has shown experimental success in a number of applications, it is known to be sensitive to noise and perturbations in the parameters of the system, leading to high variance in the total reward across episodes in slightly different environments. To introduce robustness, as well as sample efficiency, risk-sensitive reinforcement learning methods have been studied extensively. In this work, we provide a definition of robust reinforcement learning policies and formulate a risk-sensitive reinforcement learning problem to approximate them by solving an optimization problem with respect to a modified objective based on exponential criteria. In particular, we study a model-free risk-sensitive variation of the widely used Monte Carlo Policy Gradient algorithm and introduce a novel risk-sensitive online Actor-Critic algorithm based on solving a multiplicative Bellman equation using stochastic approximation updates. Analytical results suggest that the use of exponential criteria generalizes commonly used ad hoc regularization approaches, improves sample efficiency, and introduces robustness with respect to perturbations in the model parameters and the environment. The implementation, performance, and robustness properties of the proposed methods are evaluated in simulated experiments.
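    As an illustration of the first of these algorithms, the sketch below shows what a Monte Carlo Policy Gradient (REINFORCE) update driven by an exponential criterion could look like. It is a minimal sketch under stated assumptions, not the paper's exact method: the helper grad_log_pi, the trajectory format, and the objective J(theta) = (1/beta) * E[exp(beta * G)] are illustrative placeholders.

        import numpy as np

        def risk_sensitive_reinforce_step(theta, episodes, grad_log_pi, beta=-0.1, lr=1e-2, gamma=0.99):
            # One Monte Carlo policy-gradient ascent step on the exponential criterion
            #   J(theta) = (1/beta) * E[exp(beta * G)],  G = discounted episode return,
            # instead of the usual J(theta) = E[G]. Negative beta hedges against low
            # returns (risk-averse); positive beta is risk-seeking.
            # episodes: list of trajectories, each a list of (state, action, reward) tuples.
            # grad_log_pi(theta, s, a): gradient of log pi_theta(a|s) with respect to theta.
            grad = np.zeros_like(theta)
            for traj in episodes:
                G = sum((gamma ** t) * r for t, (_, _, r) in enumerate(traj))
                weight = np.exp(beta * G) / beta   # exponential weight replaces the plain return G
                for (s, a, _) in traj:
                    grad = grad + weight * grad_log_pi(theta, s, a)
            return theta + lr * grad / len(episodes)

    As beta approaches zero the exponential criterion reduces to the expected return, so the update reverts to the objective of standard REINFORCE.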

    Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees

    Full text link
    There is an emerging interest in generating robust counterfactual explanations that would remain valid if the model is updated or changed even slightly. Towards finding robust counterfactuals, existing literature often assumes that the original model m and the new model M are bounded in the parameter space, i.e., ‖Params(M) − Params(m)‖ < Δ. However, models can often change significantly in the parameter space with little to no change in their predictions or accuracy on the given dataset. In this work, we introduce a mathematical abstraction termed "naturally-occurring" model change, which allows for arbitrary changes in the parameter space such that the change in predictions on points that lie on the data manifold is limited. Next, we propose a measure -- that we call Stability -- to quantify the robustness of counterfactuals to potential model changes for differentiable models, e.g., neural networks. Our main contribution is to show that counterfactuals with a sufficiently high value of Stability, as defined by our measure, will remain valid after potential "naturally-occurring" model changes with high probability (leveraging concentration bounds for Lipschitz functions of independent Gaussians). Since our quantification depends on the local Lipschitz constant around a data point, which is not always available, we also examine practical relaxations of our proposed measure and demonstrate experimentally how they can be incorporated to find robust counterfactuals for neural networks that are close, realistic, and remain valid after potential model changes. This work also has interesting connections with model multiplicity, also known as the Rashomon effect.
    Comment: International Conference on Machine Learning (ICML), 202
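    The practical relaxations mentioned above suggest a simple Monte Carlo estimate. The sketch below is one such hypothetical relaxation, not the paper's exact Stability measure: it samples Gaussian perturbations around a counterfactual and scores it by how high and how stable the model's prediction remains in that neighbourhood.

        import numpy as np

        def empirical_stability(f, x_cf, sigma=0.1, n_samples=1000, rng=None):
            # f: callable mapping an (n, d) batch of inputs to prediction scores in [0, 1].
            # x_cf: the candidate counterfactual, a 1-D feature vector of length d.
            # Returns "mean prediction minus spread" over a Gaussian neighbourhood of x_cf;
            # a high value indicates a counterfactual more likely to stay valid under
            # small, naturally-occurring model changes.
            rng = np.random.default_rng() if rng is None else rng
            noise = rng.normal(scale=sigma, size=(n_samples, x_cf.shape[0]))
            scores = f(x_cf[None, :] + noise)
            return float(scores.mean() - scores.std())

    In practice one would retain only counterfactuals whose score stays well above the model's decision threshold across the sampled neighbourhood.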

    Robust Reinforcement Learning via Risk-Sensitivity

    No full text
    The objective of this research is to develop robust, resilient, and adaptive Reinforcement Learning (RL) systems that are generic, provide performance guarantees, and can generalize, reason, and improve in complex and unknown task environments. To achieve this objective, we focus on exploring the concept of risk-sensitivity in RL systems and its extensions to Multi-Agent (MA) systems. The development of robust reinforcement learning algorithms is crucial to address challenges such as model misspecification, parameter uncertainty, disturbances, and more. Risk-sensitive methods offer an approach to developing robust RL algorithms by hedging against undesirable outcomes in a probabilistic manner. The robustness properties of risk-sensitive controllers have long been established.

    We investigate risk-sensitive RL (as a generalization of risk-sensitive stochastic control) by theoretically analyzing the risk-sensitive exponential criterion (the exponential of the total reward) and the benefits and improvements that the introduction of risk-sensitivity brings to conventional RL. By considering exponential criteria as risk measures, we aim to enhance the reliability of the decision-making process. We explore the exponential criterion to better understand its representation, the implications of its optimization, and the behavioral characteristics exhibited by an agent optimizing this criterion, and we demonstrate the advantages of utilizing exponential criteria for the development of RL algorithms.

    We then shift our focus to developing algorithms that effectively leverage these exponential criteria. To do so, we first develop risk-sensitive RL algorithms within the framework of Markov Decision Processes (MDPs). We then broaden our scope by exploring the application of the Probabilistic Graphical Models (PGM) framework for developing risk-sensitive algorithms; within this context, we examine the PGM framework and its connection with the MDP framework. We proceed by exploring the effects of risk sensitivity on trust, collaboration, and cooperation in multi-agent systems. Finally, we investigate the concept of risk sensitivity and the robustness properties of risk-sensitive algorithms in decision-making and optimization domains beyond RL, focusing in particular on safe RL using risk-sensitive filters. Through this exploration, we seek to enhance the understanding and applicability of risk-sensitive approaches in various domains.
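    For context, a standard expansion of the exponential criterion (a textbook identity, not a result specific to this work) makes the mean-variance trade-off behind risk-sensitivity explicit; here R denotes the total (discounted) reward and β the risk parameter:

        \frac{1}{\beta} \log \mathbb{E}\!\left[ e^{\beta R} \right] = \mathbb{E}[R] + \frac{\beta}{2}\,\operatorname{Var}(R) + O(\beta^{2})

    Thus β < 0 penalizes reward variance (risk-averse and hence robustness-inducing), β > 0 rewards it (risk-seeking), and β → 0 recovers the standard expected-reward objective.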