Estimating Uncertainty Online Against an Adversary
Assessing uncertainty is an important step towards ensuring the safety and
reliability of machine learning systems. Existing uncertainty estimation
techniques may fail when their modeling assumptions are not met, e.g. when the
data distribution differs from the one seen at training time. Here, we propose
techniques that assess a classification algorithm's uncertainty via calibrated
probabilities (i.e. probabilities that match empirical outcome frequencies in
the long run) and which are guaranteed to be reliable (i.e. accurate and
calibrated) on out-of-distribution input, including input generated by an
adversary. This represents an extension of classical online learning that
handles uncertainty in addition to guaranteeing accuracy under adversarial
assumptions. We establish formal guarantees for our methods, and we validate
them on two real-world problems: question answering and medical diagnosis from
genomic data.
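To make the calibration property concrete (predicted probabilities matching empirical outcome frequencies in the long run), here is a minimal sketch that measures calibration error over a stream of binary predictions. The binning scheme and the synthetic stream are illustrative assumptions, not the paper's adversarial algorithm.

```python
import numpy as np

def streaming_calibration_error(probs, outcomes, n_bins=10):
    """Expected calibration error over a stream of binary predictions."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # Gap between average predicted probability and observed frequency.
            gap = abs(probs[mask].mean() - outcomes[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of points in the bin
    return ece

# A well-calibrated stream should yield a small error.
rng = np.random.default_rng(0)
p = rng.uniform(size=5000)
y = (rng.uniform(size=5000) < p).astype(float)  # outcomes drawn with probability p
print(streaming_calibration_error(p, y))
```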
Calibrated Propensity Scores for Causal Effect Estimation
Propensity scores are commonly used to balance observed covariates while
estimating treatment effects. Estimates obtained through propensity score
weighting can be biased when the propensity score model cannot learn the true
treatment assignment mechanism. We argue that the probabilistic output of a
learned propensity score model should be calibrated, i.e. a predictive
treatment probability of 90% should correspond to 90% of individuals being
assigned to the treatment group. We propose simple recalibration techniques to
ensure this property. We investigate the theoretical properties of a calibrated
propensity score model and its role in unbiased treatment effect estimation. We
demonstrate improved causal effect estimation with calibrated propensity scores
in several tasks including high-dimensional genome-wide association studies,
where we also show reduced computational requirements when calibration is
applied to simpler propensity score models.
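As a rough sketch of the idea (not the paper's method), the snippet below recalibrates a propensity model post hoc with isotonic regression via scikit-learn's CalibratedClassifierCV and plugs the calibrated scores into an inverse-propensity-weighted effect estimate. The synthetic data-generating process is an assumption for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

rng = np.random.default_rng(0)
n, d = 5000, 10
X = rng.normal(size=(n, d))
true_p = 1.0 / (1.0 + np.exp(-X[:, 0]))         # true assignment mechanism
T = (rng.uniform(size=n) < true_p).astype(int)  # treatment indicator
Y = 2.0 * T + X[:, 1] + rng.normal(size=n)      # outcome; true effect is 2.0

# Propensity model with post-hoc isotonic recalibration (cross-validated).
propensity = CalibratedClassifierCV(
    LogisticRegression(max_iter=1000), method="isotonic", cv=5)
propensity.fit(X, T)
e = np.clip(propensity.predict_proba(X)[:, 1], 1e-3, 1 - 1e-3)

# Inverse propensity weighting with the calibrated scores.
ate = np.mean(T * Y / e) - np.mean((1 - T) * Y / (1 - e))
print(f"estimated average treatment effect: {ate:.2f}")  # should be near 2.0
```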
Adversarial Calibrated Regression for Online Decision Making
Accurately estimating uncertainty is an essential component of
decision-making and forecasting in machine learning. However, existing
uncertainty estimation methods may fail when data no longer follows the
distribution seen during training. Here, we introduce online uncertainty
estimation algorithms that are guaranteed to be reliable on arbitrary streams
of data points, including data chosen by an adversary. Specifically, our
algorithms perform post-hoc recalibration of a black-box regression model and
produce outputs that are provably calibrated -- i.e., an 80% confidence
interval will contain the true outcome 80% of the time -- and that have low
regret relative to the learning objective of the base model. We apply our
algorithms in the context of Bayesian optimization, an online model-based
decision-making task in which the data distribution shifts over time, and
observe accelerated convergence to improved optima. Our results suggest that
robust uncertainty quantification has the potential to improve online
decision-making.
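A minimal, offline sketch of post-hoc quantile recalibration in this spirit (not the paper's online adversarial algorithm): fit a monotone map from the base model's nominal confidence levels to the coverage observed on a calibration set, then read calibrated intervals off that map. The Gaussian base model and the grid-based inversion below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.isotonic import IsotonicRegression

def fit_recalibrator(cdf_at_y):
    """Monotone map from nominal level p to empirical frequency P(F(y) <= p)."""
    cdf_at_y = np.sort(cdf_at_y)
    empirical = np.arange(1, len(cdf_at_y) + 1) / len(cdf_at_y)
    return IsotonicRegression(y_min=0.0, y_max=1.0,
                              out_of_bounds="clip").fit(cdf_at_y, empirical)

# Calibration data from a deliberately overconfident Gaussian base model.
rng = np.random.default_rng(0)
y = rng.normal(scale=2.0, size=2000)          # true noise scale is 2.0
cdf_at_y = norm.cdf(y, loc=0.0, scale=1.0)    # the model believes the scale is 1.0
recal = fit_recalibrator(cdf_at_y)

# Invert the recalibration map on a grid to find calibrated quantile levels.
grid = np.linspace(0.0, 1.0, 1001)
recal_grid = recal.predict(grid)

def calibrated_quantile(level):
    idx = min(np.searchsorted(recal_grid, level), len(grid) - 1)
    return norm.ppf(grid[idx], loc=0.0, scale=1.0)  # invert the model's CDF

lo, hi = calibrated_quantile(0.10), calibrated_quantile(0.90)
print(f"calibrated 80% interval: [{lo:.2f}, {hi:.2f}]")  # roughly [-2.6, 2.6]
```

The recalibrated 80% interval widens to roughly the true 80% range of the data, whereas the overconfident base model's interval would only span about [-1.3, 1.3].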
Calibrated Uncertainty Estimation Improves Bayesian Optimization
Bayesian optimization is a sequential procedure for obtaining the global
optimum of black-box functions without knowing their true form a priori. Good
uncertainty estimates over the shape of the objective function are essential in
guiding the optimization process. However, these estimates can be inaccurate if
the true objective function violates assumptions made by its model (e.g.,
Gaussianity). This paper studies which uncertainties are needed in Bayesian
optimization models and argues that ideal uncertainties should be calibrated --
i.e., an 80% predictive interval should contain the true outcome 80% of the
time. We propose a simple algorithm for enforcing this property and show that
it enables Bayesian optimization to arrive at the global optimum in fewer
steps. We provide theoretical insights into the role of calibrated
uncertainties and demonstrate the improved performance of our method on
standard benchmark functions and hyperparameter optimization tasks.
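To make the calibration criterion concrete, the sketch below checks whether a Gaussian-process surrogate's predictive intervals achieve their nominal coverage on held-out evaluations of a toy objective. The kernel, noise level, and objective are illustrative assumptions, and this is a diagnostic rather than the paper's enforcement algorithm.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):
    # Toy noisy objective standing in for an expensive black-box function.
    return np.sin(3 * x) + 0.1 * np.random.default_rng(1).normal(size=x.shape)

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 3, size=(15, 1))
X_test = rng.uniform(0, 3, size=(200, 1))
y_train, y_test = objective(X_train[:, 0]), objective(X_test[:, 0])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-2)
gp.fit(X_train, y_train)
mu, sigma = gp.predict(X_test, return_std=True)

for level in (0.5, 0.8, 0.95):
    z = norm.ppf(0.5 + level / 2)  # interval half-width in standard deviations
    covered = np.abs(y_test - mu) <= z * sigma
    print(f"nominal {level:.0%} interval -> empirical coverage {covered.mean():.0%}")
```

If the empirical coverage falls well below the nominal level, the surrogate is overconfident, which is exactly the failure mode that recalibration is meant to correct before the intervals guide acquisition.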