Quantifying Epistemic Uncertainty in Deep Learning

Author: Huang Ziyi
Lam Henry
Zhang Haofeng
Publication date: 19/02/2022

Uncertainty quantification is at the core of the reliability and robustness of machine learning. In this paper, we provide a theoretical framework to dissect the uncertainty in deep learning, especially its epistemic component, into procedural variability (from the training procedure) and data variability (from the training data); to the best of our knowledge, this is the first such attempt in the literature. We then propose two approaches to estimate these uncertainties, one based on influence functions and one on batching. We demonstrate how our approaches overcome the computational difficulties in applying classical statistical methods. Experimental evaluations on multiple problem settings corroborate our theory and illustrate how our framework and estimation can provide direct guidance on modeling and data collection effort to improve deep learning performance.

arXiv.org e-Print Archive

Asymptotically Optimal Pure Exploration for Infinite-Armed Bandits

Author: Gong Xiao-Yue
Sellke Mark
Publication date: 03/06/2023

We study pure exploration with infinitely many bandit arms generated i.i.d. from an unknown distribution. Our goal is to efficiently select a single high quality arm whose average reward is, with probability $1-\delta$, within $\varepsilon$ of being among the top $\eta$-fraction of arms; this is a natural adaptation of the classical PAC guarantee for infinite action sets. We consider both the fixed confidence and fixed budget settings, aiming respectively for minimal expected and fixed sample complexity. For fixed confidence, we give an algorithm with expected sample complexity $O\left(\frac{\log (1/\eta)\log (1/\delta)}{\eta\varepsilon^2}\right)$. This is optimal except for the $\log (1/\eta)$ factor, and the $\delta$-dependence closes a quadratic gap in the literature. For fixed budget, we show the asymptotically optimal sample complexity as $\delta\to 0$ is $c^{-1}\log(1/\delta)\big(\log\log(1/\delta)\big)^2$ to leading order. Equivalently, the optimal failure probability given exactly $N$ samples decays as $\exp\big(-cN/\log^2 N\big)$, up to a factor $1\pm o_N(1)$ inside the exponent. The constant $c$ depends explicitly on the problem parameters (including the unknown arm distribution) through a certain Fisher information distance. Even the strictly super-linear dependence on $\log(1/\delta)$ was not known and resolves a question of Grossman and Moshkovitz (FOCS 2016, SIAM Journal on Computing 2020).
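
The equivalence between the fixed-budget sample complexity and the failure-probability decay stated in this abstract can be checked with a short calculation. The following is a sketch only; the shorthand $L = \log(1/\delta)$ is introduced here for convenience and is not notation from the abstract.

```latex
\begin{align*}
N &= \bigl(1+o(1)\bigr)\, c^{-1} L \,(\log L)^2, \qquad L = \log(1/\delta), \\
\log N &= \log L + 2\log\log L - \log c + o(1) = \bigl(1+o(1)\bigr)\log L, \\
L &= \bigl(1+o(1)\bigr)\,\frac{cN}{(\log L)^2} = \bigl(1 \pm o_N(1)\bigr)\,\frac{cN}{\log^2 N}, \\
\delta &= e^{-L} = \exp\!\Bigl(-\bigl(1 \pm o_N(1)\bigr)\,\frac{cN}{\log^2 N}\Bigr).
\end{align*}
```

The second line is what makes the inversion work: because $\log N$ and $\log L$ agree to leading order, $(\log L)^2$ can be replaced by $\log^2 N$ at the cost of a $1 \pm o_N(1)$ factor inside the exponent.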

arXiv.org e-Print Archive

Online Learning of Energy Consumption for Navigation of Electric Vehicles

Author: Åkerblom Niklas
Chen Yuxin
Haghir Chehreghani Morteza
Publication venue: 'Elsevier BV'
Publication date: 01/01/2023

Energy-efficient navigation constitutes an important challenge in electric vehicles, due to their limited battery capacity. We employ a Bayesian approach to model the energy consumption at road segments for efficient navigation. In order to learn the model parameters, we develop an online learning framework and investigate several exploration strategies such as Thompson Sampling and Upper Confidence Bound. We then extend our online learning framework to the multi-agent setting, where multiple vehicles adaptively navigate and learn the parameters of the energy model. We analyze Thompson Sampling and establish rigorous regret bounds on its performance in the single-agent and multi-agent settings, through an analysis of the algorithm under batched feedback. Finally, we demonstrate the performance of our methods via experiments on several real-world city road networks.
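
To make the Thompson Sampling idea in this abstract concrete, here is a minimal sketch rather than the paper's implementation. It assumes a Gaussian energy model per road segment with known noise and a conjugate Gaussian prior, and it chooses among a small hand-listed set of candidate routes instead of running a shortest-path search. The function name `thompson_sampling_routes` and all parameters are hypothetical.

```python
import numpy as np

def thompson_sampling_routes(paths, true_means, noise_sd, n_rounds, rng,
                             prior_mean=1.0, prior_var=1.0):
    """Thompson Sampling over candidate routes (illustrative assumptions only).

    paths: list of routes, each a list of edge indices.
    true_means: array of each edge's true mean energy (used only to simulate feedback).
    """
    n_edges = len(true_means)
    mean = np.full(n_edges, prior_mean)   # posterior mean per edge
    var = np.full(n_edges, prior_var)     # posterior variance per edge
    counts = np.zeros(len(paths), dtype=int)
    for _ in range(n_rounds):
        # Draw one plausible energy value per edge from the current posterior.
        sampled = rng.normal(mean, np.sqrt(var))
        # Drive the route that looks cheapest under the sampled energies.
        choice = min(range(len(paths)), key=lambda i: sampled[paths[i]].sum())
        counts[choice] += 1
        # Observe noisy energy on each traversed edge; conjugate Gaussian update.
        for e in paths[choice]:
            obs = rng.normal(true_means[e], noise_sd)
            precision = 1.0 / var[e] + 1.0 / noise_sd**2
            mean[e] = (mean[e] / var[e] + obs / noise_sd**2) / precision
            var[e] = 1.0 / precision
    return mean, counts
```

Sampling from the posterior, rather than using its mean, is what drives exploration: segments with few observations have wide posteriors and are occasionally sampled as cheap, so routes through them still get tried.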

Chalmers Research

Online Learning of Energy Consumption for Navigation of Electric Vehicles

Author: Chehreghani Morteza Haghir
Chen Yuxin
Åkerblom Niklas
Publication venue: 'Elsevier BV'
Publication date: 01/01/2023

arXiv.org e-Print Archive

Chalmers Research

Knowledge UChicago

Modeling Persistent Trends in Distributions

Author: Gifford David
Jaakkola Tommi
Mueller Jonas
Publication venue: 'Informa UK Limited'
Publication date: 24/05/2017

We present a nonparametric framework to model a short sequence of probability distributions that vary both due to underlying effects of sequential progression and confounding noise. To distinguish between these two types of variation and estimate the sequential-progression effects, our approach leverages an assumption that these effects follow a persistent trend. This work is motivated by the recent rise of single-cell RNA-sequencing experiments over a brief time course, which aim to identify genes relevant to the progression of a particular biological process across diverse cell populations. While classical statistical tools focus on scalar-response regression or order-agnostic differences between distributions, it is desirable in this setting to consider both the full distributions as well as the structure imposed by their ordering. We introduce a new regression model for ordinal covariates where responses are univariate distributions and the underlying relationship reflects consistent changes in the distributions over increasing levels of the covariate. This concept is formalized as a "trend" in distributions, which we define as an evolution that is linear under the Wasserstein metric. Implemented via a fast alternating projections algorithm, our method exhibits numerous strengths in simulations and analyses of single-cell gene expression data.
Comment: To appear in: Journal of the American Statistical Association

arXiv.org e-Print Archive

DSpace@MIT

FigShare

Generalized Probabilistic Bisection for Stochastic Root-Finding

Author: Rodriguez Hernandez Sergio
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

This thesis studies the stochastic root-finding problem, which consists of estimating the point x∗ that solves the equation h(x∗) = 0, where the function h : (0,1) → R is learned via a stochastic simulator (oracle). Instead of focusing on modeling h(·), we develop statistical methodologies that directly infer x∗ following a fully Bayesian approach. To do so, we investigate procedures that generalize the Probabilistic Bisection Algorithm (PBA) first introduced in Horstein (1963). The PBA is a one-dimensional stochastic root-finding routine which builds an explicit Bayesian representation (i.e., a posterior density) for x∗ based on the history of noisy function evaluations and sampling locations. The PBA starts by assuming that x∗ is the realized value of an absolutely continuous random variable, X∗ ∼ g0, with prior density g0. Then, it recursively updates a posterior, gn, leveraging the information provided by the signs (positive/negative) of the noisy function evaluations, which indicate the direction in which x∗ lies relative to a given location x. Due to observational noise, the oracle responses are correct only with probability p(x). Waeber et al. (2013) showed that sampling at the median of gn is an optimal sampling strategy and established exponential convergence of the posterior gn to a Dirac mass at the true x∗ under the very restrictive assumption that the probability of correct response p(x) is known and constant for all x; however, in the most general and practical settings the latter condition no longer holds and the only way to implement the PBA is to estimate p(·).
In the first part of this thesis, we state the Generalized PBA (G-PBA), where the above assumption is relaxed to the case where the sampling distribution of the oracle is unknown and location-dependent. Namely, as in standard PBA, we rely on a knowledge state to approximate the posterior of the root location. To implement the corresponding Bayesian updating, we also carry out inference of p(·). To this end we utilize batched querying in combination with a variety of frequentist and Bayesian estimators based on majority vote, as well as the underlying functional responses, if available. For guiding sampling selection we propose two families of sampling policies: batched Information Directed Sampling and Randomized Quantile Sampling, which is reminiscent of Thompson Sampling and a generalization of the median sampling used in classical PBA. The latter leads to the first main conclusion: the G-PBA is able to efficiently learn p(·) and X∗ simultaneously.
In the second part of this thesis, we propose to leverage the spatial structure of a typical oracle by constructing a non-parametric statistical surrogate for p(·) based on binomial regression. The latter leads to the second main conclusion: surrogate modeling makes it possible to determine the batch size for querying the oracle adaptively as a function of the estimated predictive uncertainty of p(·).
In the last part of this thesis, we present extensive numerical experiments in order to evaluate our sampling strategies (information-based or randomized). In particular we demonstrate the efficiency of randomized quantile sampling for balancing the exploration/exploitation component; moreover, we show that spatial surrogate modeling results in significant gains relative to the local estimators, as quantified by the improved quality of the resulting root estimates (namely lower absolute residuals, narrower credible intervals and dramatically higher probability coverage). Our work is motivated by the root-finding sub-routine in pricing of Bermudan financial derivatives, illustrated in the last section of this thesis.
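
The classical PBA described in this abstract (median sampling with a known, constant p) can be sketched in a few lines. This is an illustration, not the thesis's G-PBA. It assumes a discretized posterior on a uniform grid over (0, 1), a uniform prior, and an oracle that returns +1 when it indicates the root lies to the right of the query point. The names `probabilistic_bisection` and `make_noisy_oracle` are hypothetical.

```python
import numpy as np

def make_noisy_oracle(root, p, rng):
    """Simulated oracle: reports the true direction of the root with probability p."""
    def oracle(x):
        truth = 1 if root > x else -1
        return truth if rng.random() < p else -truth
    return oracle

def probabilistic_bisection(oracle, p, n_steps=200, grid_size=2001):
    """Classical PBA on a grid, assuming p is known, constant, and above 0.5."""
    grid = np.linspace(0.0, 1.0, grid_size)
    masses = np.full(grid_size, 1.0 / grid_size)   # discretized uniform prior g0
    for _ in range(n_steps):
        # Query at the posterior median.
        x = grid[np.searchsorted(np.cumsum(masses), 0.5)]
        z = oracle(x)
        right = grid > x
        # Bayes update: weight the side the oracle points to by p, the other by 1 - p.
        if z > 0:
            masses = np.where(right, p * masses, (1 - p) * masses)
        else:
            masses = np.where(right, (1 - p) * masses, p * masses)
        masses /= masses.sum()
    estimate = grid[np.searchsorted(np.cumsum(masses), 0.5)]
    return estimate, masses
```

A possible usage: build `oracle = make_noisy_oracle(0.3141, 0.75, np.random.default_rng(0))` and call `probabilistic_bisection(oracle, 0.75, n_steps=300)`; the returned posterior median should lie close to the simulated root. When p is unknown or varies with x, as in the generalized setting, the update weights would have to be estimated rather than fixed.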

eScholarship - University of California

Steady-State Co-Kriging Models

Author: Hemmati Sahar
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2017

In deterministic computer experiments, a computer code can often be run at different levels of complexity/fidelity, yielding a hierarchy of code levels. The higher the fidelity, and hence the computational cost, the more accurate the output data. Methods based on the co-kriging methodology (Cressie, 2015) for predicting the output of a high-fidelity computer code by combining data generated at varying levels of fidelity have become popular over the last two decades. For instance, Kennedy and O'Hagan (2000) first propose to build a metamodel for multi-level computer codes by using an auto-regressive model structure. Forrester et al. (2007) provide details on estimation of the model parameters and further investigate the use of co-kriging for multi-fidelity optimization based on the efficient global optimization algorithm (Jones et al., 1998). Qian and Wu (2008) propose a Bayesian hierarchical modeling approach for combining low-accuracy and high-accuracy experiments. More recently, Gratiet and Cannamela (2015) propose sequential design strategies using fast cross-validation techniques for multi-fidelity computer codes.
This research intends to extend the co-kriging metamodeling methodology to study steady-state simulation experiments. First, the mathematical structure of co-kriging is extended to take into account heterogeneous simulation output variances. Next, efficient steady-state simulation experimental designs are investigated for co-kriging to achieve a high prediction accuracy for estimation of steady-state parameters. Specifically, designs consisting of replicated longer simulation runs at a few design points and replicated shorter simulation runs at a larger set of design points will be considered. Designs with no replicated long simulation runs are also studied, along with different methods for calculating the output variance in the absence of replicated outputs.
The stochastic co-kriging (SCK) method is applied to an M/M/1 and an M/M/5 queueing system. In both examples, the prediction performance of the SCK model is promising. It is also shown that the SCK method provides better response surfaces compared to the SK method.
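
For orientation, the auto-regressive structure attributed to Kennedy and O'Hagan (2000) in this abstract is commonly written as below. The symbols $z_t$, $\rho_{t-1}$, $\delta_t$ and $s$ follow the usual multi-fidelity convention and are assumptions here, since the abstract does not give the thesis's own notation.

```latex
\[
z_t(x) = \rho_{t-1}\, z_{t-1}(x) + \delta_t(x), \qquad t = 2, \dots, s,
\]
```

Here $z_t(x)$ denotes the output of the level-$t$ code at input $x$ (with level $s$ the highest fidelity), $\rho_{t-1}$ is a scale factor linking adjacent levels, and $\delta_t(\cdot)$ is a Gaussian process, independent of the lower-level outputs, that models the discrepancy between levels.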

ProQuest OAI Repository

The Research Repository @ WVU (West Virginia University)