3 research outputs found
Introducing an Improved Information-Theoretic Measure of Predictive Uncertainty
Applying a machine learning model for decision-making in the real world
requires distinguishing what the model knows from what it does not. A critical
factor in assessing the knowledge of a model is to quantify its predictive
uncertainty. Predictive uncertainty is commonly measured by the entropy of the
Bayesian model average (BMA) predictive distribution. Yet, the properness of
this current measure of predictive uncertainty has recently been questioned. We
provide new insights into these limitations. Our analyses show that the
current measure erroneously assumes that the BMA predictive distribution is
equivalent to the predictive distribution of the true model that generated the
dataset. Consequently, we introduce a theoretically grounded measure to
overcome these limitations. We experimentally verify the benefits of the
introduced measure of predictive uncertainty: it behaves more reasonably in
controlled synthetic tasks, and our evaluations on ImageNet demonstrate that it
is advantageous in real-world applications utilizing predictive uncertainty.
Comment: M3L & InfoCog Workshops NeurIPS 2
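For context, here is a minimal sketch of the commonly used entropy-of-the-BMA measure the abstract refers to, together with its standard decomposition into aleatoric and epistemic parts. It illustrates the baseline being questioned, not the paper's proposed measure; the array names are illustrative only.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of categorical distribution(s) along the last axis (in nats)."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def bma_uncertainties(member_probs):
    """Total, aleatoric, and epistemic uncertainty from per-model class probabilities."""
    bma = member_probs.mean(axis=0)            # BMA predictive distribution
    total = entropy(bma)                       # entropy of the BMA (the questioned measure)
    aleatoric = entropy(member_probs).mean()   # expected entropy of the individual models
    epistemic = total - aleatoric              # mutual-information (disagreement) term
    return total, aleatoric, epistemic

# Two models that disagree strongly -> low aleatoric, high epistemic uncertainty
member_probs = np.array([[0.9, 0.1],
                         [0.1, 0.9]])
print(bma_uncertainties(member_probs))
```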
Quantification of Uncertainty with Adversarial Models
Quantifying uncertainty is important for actionable predictions in real-world
applications. A crucial part of predictive uncertainty quantification is the
estimation of epistemic uncertainty, which is defined as an integral of the
product between a divergence function and the posterior. Current methods such
as Deep Ensembles or MC dropout underperform at estimating the epistemic
uncertainty, since they primarily consider the posterior when sampling models.
We suggest Quantification of Uncertainty with Adversarial Models (QUAM) to
better estimate the epistemic uncertainty. QUAM identifies regions where the
whole product under the integral is large, not just the posterior.
Consequently, QUAM has lower approximation error of the epistemic uncertainty
compared to previous methods. Models for which the product is large correspond
to adversarial models (not adversarial examples!). Adversarial models have both
a high posterior and a high divergence between their predictions and
those of a reference model. Our experiments show that QUAM excels in capturing
epistemic uncertainty for deep learning models and outperforms previous methods
on challenging tasks in the vision domain.
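The integral the abstract describes can be illustrated with a simple Monte Carlo estimate: a posterior-weighted divergence between candidate-model predictions and a reference model's prediction. The sketch below shows the posterior-sampling estimate that ensembles or MC dropout approximate; QUAM instead searches for adversarial models where the whole integrand (posterior times divergence) is large. Function and variable names are hypothetical, not the authors' implementation.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence KL(p || q) between categorical distributions along the last axis."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def epistemic_uncertainty(candidate_probs, reference_probs, posterior_weights):
    """Monte Carlo estimate of the posterior-weighted divergence to a reference model."""
    weights = posterior_weights / posterior_weights.sum()   # normalize posterior weights
    divergences = kl_divergence(candidate_probs, reference_probs)
    return float(np.sum(weights * divergences))

# Candidates that agree with the reference contribute little; a model with high
# posterior weight and a diverging prediction (an "adversarial model") dominates.
reference = np.array([0.8, 0.2])
candidates = np.array([[0.8, 0.2],
                       [0.75, 0.25],
                       [0.1, 0.9]])
weights = np.array([0.4, 0.4, 0.2])
print(epistemic_uncertainty(candidates, reference, weights))
```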
A Dataset Perspective on Offline Reinforcement Learning
The application of Reinforcement Learning (RL) in real-world environments can
be expensive or risky due to sub-optimal policies during training. In Offline
RL, this problem is avoided since interactions with an environment are
prohibited. Policies are learned from a given dataset, which solely determines
their performance. Despite this fact, how dataset characteristics influence
Offline RL algorithms has hardly been investigated. The dataset characteristics
are determined by the behavioral policy that samples this dataset. Therefore,
we characterize behavioral policies as exploratory if they yield
high expected information in their interaction with the Markov Decision Process
(MDP), and as exploitative if they achieve a high expected return. We implement two
corresponding empirical measures for the datasets sampled by the behavioral
policy in deterministic MDPs. The first empirical measure, SACo, is defined by
the normalized number of unique state-action pairs and captures exploration. The
second empirical measure, TQ, is defined by the normalized average trajectory
return and captures exploitation. Empirical evaluations show the effectiveness of TQ and
SACo. In large-scale experiments using our proposed measures, we show that the
unconstrained off-policy Deep Q-Network family requires datasets with high SACo
to find a good policy. Furthermore, experiments show that policy constraint
algorithms perform well on datasets with high TQ and SACo. Finally, the
experiments show that purely dataset-constrained Behavioral Cloning performs
competitively with the best Offline RL algorithms for datasets with high TQ.
Comment: Code: https://github.com/ml-jku/OfflineR