Complexity-Free Generalization via Distributionally Robust Optimization
Established approaches to obtain generalization bounds in data-driven
optimization and machine learning mostly build on solutions from empirical risk
minimization (ERM), which depend crucially on the functional complexity of the
hypothesis class. In this paper, we present an alternate route to obtain these
bounds on the solution from distributionally robust optimization (DRO), a
recent data-driven optimization framework based on worst-case analysis and the
notion of an ambiguity set to capture statistical uncertainty. In contrast to the
hypothesis class complexity in ERM, our DRO bounds depend on the ambiguity set
geometry and its compatibility with the true loss function. Notably, when using
maximum mean discrepancy as a DRO distance metric, our analysis implies, to the
best of our knowledge, the first generalization bound in the literature that
depends solely on the true loss function, entirely free of any complexity
measures or bounds on the hypothesis class.
Bounding Optimality Gap in Stochastic Optimization via Bagging: Statistical Efficiency and Stability
We study a statistical method to estimate the optimal value and the
optimality gap of a given solution in stochastic optimization, as an assessment
of the solution quality. Our approach is based on bootstrap aggregating, or
bagging, resampled sample average approximation (SAA). We show how this
approach leads to valid statistical confidence bounds for non-smooth
optimization. We also demonstrate its statistical efficiency and stability that
are especially desirable in limited-data situations, and compare these
properties with some existing methods. We present our theory that views SAA as
a kernel in an infinite-order symmetric statistic, which can be approximated
via bagging. We substantiate our theoretical findings with numerical results.
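As a rough illustration of the bagging idea described in this abstract, the following minimal sketch averages SAA optimal values over random subsamples of the data. It is not the authors' procedure: the toy quadratic loss, the subsampling without replacement, and the names `saa_optimal_value`, `bagged_optimal_value`, `subsample_size`, and `num_bags` are all assumptions chosen for illustration, and the sketch does not construct the confidence bounds discussed in the paper.

```python
import random
import statistics


def saa_optimal_value(sample):
    """Optimal value of the toy SAA problem min_x mean((x - xi)^2).

    Assumption: this quadratic loss is only a stand-in. Its minimizer is the
    sample mean, so the optimal value is the population-style variance.
    """
    return statistics.pvariance(sample)


def bagged_optimal_value(data, subsample_size, num_bags, seed=0):
    """Average SAA optimal values over random subsamples (without replacement)."""
    rng = random.Random(seed)
    values = [
        saa_optimal_value(rng.sample(data, subsample_size))
        for _ in range(num_bags)
    ]
    return statistics.fmean(values), values


# Synthetic data: standard normal draws, so the true optimal value is 1.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

estimate, values = bagged_optimal_value(data, subsample_size=50, num_bags=300)
print(f"bagged estimate of the optimal value: {estimate:.3f}")
```

Each subsample's SAA value plays the role of a kernel evaluated on a subset of the data, and averaging over many random subsets approximates the corresponding symmetric statistic. For a minimization problem, the SAA optimal value is biased downward in expectation, so an average of this kind tends to sit below the true optimal value.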
Distributional Robust Batch Contextual Bandits
Policy learning using historical observational data is an important problem
that has found widespread applications. Examples include selecting offers,
prices, or advertisements to send to customers, as well as selecting which
medication to prescribe to a patient. However, existing literature rests on the
crucial assumption that the future environment where the learned policy will be
deployed is the same as the past environment that has generated the data--an
assumption that is often false or too coarse an approximation. In this paper,
we lift this assumption and aim to learn a distributional robust policy with
incomplete (bandit) observational data. We propose a novel learning algorithm
that learns a policy robust to adversarial perturbations and unknown
covariate shifts. We first present a policy evaluation procedure in the
ambiguous environment and then give a performance guarantee based on the theory
of uniform convergence. We also give a heuristic algorithm to
solve the distributional robust policy learning problems efficiently.
Comment: The short version has been accepted in ICML 202