
    Uncertainty-Based Out-of-Distribution Classification in Deep Reinforcement Learning

    Robustness to out-of-distribution (OOD) data is an important goal in building reliable machine learning systems. Especially in autonomous systems, wrong predictions for OOD inputs can cause safety-critical situations. As a first step towards a solution, we consider the problem of detecting such data in a value-based deep reinforcement learning (RL) setting. Modelling the problem as one-class classification, we propose UBOOD, a framework for uncertainty-based OOD classification. It builds on the observation that an agent's epistemic uncertainty is reduced for situations encountered during training (in-distribution) and is thus lower than for unencountered (OOD) situations. Because the framework is agnostic to the approach used for estimating epistemic uncertainty, it can be combined with different uncertainty estimation methods, e.g. approximate Bayesian inference or ensembling techniques. We further present a first viable solution for calculating a dynamic classification threshold, based on the uncertainty distribution of the training data. Evaluation shows that the framework produces reliable classification results when combined with ensemble-based estimators, while the combination with concrete dropout-based estimators fails to reliably detect OOD situations. In summary, UBOOD presents a viable approach for OOD classification in deep RL settings by leveraging the epistemic uncertainty of the agent's value function.
    Comment: arXiv admin note: text overlap with arXiv:1901.0221
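
A minimal sketch of the UBOOD idea with an ensemble-based uncertainty estimator: a state is classified as OOD when the epistemic uncertainty of the agent's Q-ensemble exceeds a threshold derived from the uncertainty distribution on training (in-distribution) states. The ensemble interface, the use of the standard deviation of greedy Q-values, and the percentile-based threshold are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def epistemic_uncertainty(q_ensemble, state):
    """Standard deviation of the greedy Q-value across ensemble members."""
    q_values = np.array([q(state) for q in q_ensemble])  # shape: (members, actions)
    greedy_values = q_values.max(axis=1)                  # each member's best action value
    return greedy_values.std()

def fit_threshold(q_ensemble, train_states, percentile=95):
    """Dynamic classification threshold from the training-data uncertainty distribution."""
    uncertainties = [epistemic_uncertainty(q_ensemble, s) for s in train_states]
    return np.percentile(uncertainties, percentile)

def is_ood(q_ensemble, state, threshold):
    """One-class decision: more uncertain than seen during training -> OOD."""
    return epistemic_uncertainty(q_ensemble, state) > threshold
```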

    The scenario coevolution paradigm: adaptive quality assurance for adaptive systems

    Systems are becoming increasingly adaptive, using techniques like machine learning to enhance their behavior on their own rather than only through human developers programming them. We analyze the impact the advent of these new techniques has on the discipline of rigorous software engineering, especially on the issue of quality assurance. To this end, we provide a general description of the processes related to machine learning and embed them into a formal framework for the analysis of adaptivity, recognizing that testing an adaptive system requires a new approach to adaptive testing. We introduce scenario coevolution as a design pattern describing how system and test can work as antagonists in the process of software evolution. While the general pattern applies to large-scale processes (including human developers further augmenting the system), we demonstrate all techniques on a smaller-scale example of an agent navigating a simple smart factory. We point out new aspects in software engineering for adaptive systems that may be tackled naturally using scenario coevolution. This work is a substantially extended take on Gabor et al. (International Symposium on Leveraging Applications of Formal Methods, Springer, pp 137–154, 2018).
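
A minimal sketch of the scenario coevolution loop described above: the adaptive system and a population of test scenarios act as antagonists, the system adapting to pass the scenarios and the scenario population evolving to keep exposing weaknesses. The function names, the selection scheme, and the half-population survivor rule are illustrative assumptions rather than the paper's formal framework.

```python
import random

def coevolve(system, scenarios, train, evaluate, mutate, generations=50):
    """Alternate between adapting the system and hardening the test scenarios."""
    for _ in range(generations):
        # 1) Adapt the system against the current scenario population.
        system = train(system, scenarios)
        # 2) Evolve the scenarios: those the system handles worst survive and mutate.
        scored = sorted(scenarios, key=lambda sc: evaluate(system, sc))
        survivors = scored[: max(1, len(scenarios) // 2)]
        offspring = [mutate(random.choice(survivors)) for _ in survivors]
        scenarios = survivors + offspring
    return system, scenarios
```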

    Resilient Multi-Agent Reinforcement Learning with Adversarial Value Decomposition

    We focus on resilience in cooperative multi-agent systems, where agents can change their behavior due to updates or failures of hardware and software components. Current state-of-the-art approaches to cooperative multi-agent reinforcement learning (MARL) have either focused on idealized settings without any changes or on very specialized scenarios where the number of changing agents is fixed, e.g., in extreme cases with only one productive agent. Therefore, we propose Resilient Adversarial value Decomposition with Antagonist-Ratios (RADAR). RADAR offers a value decomposition scheme to train competing teams of varying size for improved resilience against arbitrary agent changes. We evaluate RADAR in two cooperative multi-agent domains and show that RADAR achieves better worst-case performance w.r.t. arbitrary agent changes than state-of-the-art MARL approaches.
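
A minimal sketch of the RADAR training idea: each episode samples an antagonist ratio that splits the agents into a cooperative (protagonist) team and an adversarial (antagonist) team of varying size, so the decomposed team value is trained to be robust against arbitrary agent changes. The ratio set, the sampling scheme, and the learner interface are illustrative assumptions, not the paper's exact algorithm.

```python
import random

ANTAGONIST_RATIOS = [0.0, 0.25, 0.5]  # fraction of agents acting adversarially (assumed set)

def sample_teams(agent_ids, ratio):
    """Randomly assign a `ratio` fraction of the agents to the antagonist team."""
    n_adversaries = int(len(agent_ids) * ratio)
    antagonists = set(random.sample(agent_ids, n_adversaries))
    protagonists = [a for a in agent_ids if a not in antagonists]
    return protagonists, sorted(antagonists)

def train_episode(env, learner, agent_ids):
    """Run one training episode with a freshly sampled antagonist ratio."""
    ratio = random.choice(ANTAGONIST_RATIOS)
    protagonists, antagonists = sample_teams(agent_ids, ratio)
    # Protagonists maximise the decomposed team value; antagonists minimise it.
    # `learner.run` is a hypothetical MARL training interface, not part of the paper.
    return learner.run(env, protagonists=protagonists, antagonists=antagonists)
```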