35,711 research outputs found
Biased Stochastic Gradient Descent for Conditional Stochastic Optimization
Conditional Stochastic Optimization (CSO) covers a variety of applications
ranging from meta-learning and causal inference to invariant learning. However,
constructing unbiased gradient estimates in CSO is challenging due to the
composition structure. As an alternative, we propose a biased stochastic
gradient descent (BSGD) algorithm and study the bias-variance tradeoff under
different structural assumptions. We establish the sample complexities of BSGD
for strongly convex, convex, and weakly convex objectives, under smooth and
non-smooth conditions. We also provide matching lower bounds of BSGD for convex
CSO objectives. Extensive numerical experiments are conducted to illustrate the
performance of BSGD on robust logistic regression, model-agnostic meta-learning
(MAML), and instrumental variable regression (IV).
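A minimal sketch of the biased-gradient idea on a toy CSO instance (the problem and constants below are illustrative assumptions, not taken from the paper): the inner expectation is replaced by a mini-batch average before applying the outer gradient, which introduces a bias that shrinks as the inner batch grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy CSO instance (illustrative, not from the paper):
#   minimize F(x) = f(E_eta[g(x; eta)])  with
#   f(u) = exp(u) + exp(-u)   (convex, minimized at u = 0)
#   g(x; eta) = x + eta,      eta ~ N(0, 1)
# so F(x) = 2*cosh(x), minimized at x = 0.

def bsgd(x0, steps=2000, lr=0.05, inner_batch=16):
    """Biased SGD: plug a mini-batch estimate of the inner
    expectation into the outer gradient. The estimator is biased
    because f' is nonlinear; the bias shrinks as inner_batch grows."""
    x = x0
    for _ in range(steps):
        eta = rng.standard_normal(inner_batch)
        u_hat = np.mean(x + eta)                        # estimate of E[g(x; eta)]
        grad = (np.exp(u_hat) - np.exp(-u_hat)) * 1.0   # f'(u_hat) * dg/dx
        x -= lr * grad
    return x

x_star = bsgd(3.0)
print(x_star)  # close to the minimizer x = 0, up to noise and bias
```

The bias-variance tradeoff shows up in `inner_batch`: a larger inner batch reduces the bias of `f'(u_hat)` but raises the per-iteration sample cost, which is exactly the tradeoff the paper's sample complexities quantify.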
Byzantine Stochastic Gradient Descent
This paper studies the problem of distributed stochastic optimization in an
adversarial setting where, out of the $m$ machines which allegedly compute
stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and
can behave arbitrarily and adversarially. Our main result is a variant of
stochastic gradient descent (SGD) which finds $\varepsilon$-approximate
minimizers of convex functions in $T = \tilde{O}\big(\frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2}\big)$ iterations. In contrast, traditional
mini-batch SGD needs $T = O\big(\frac{1}{\varepsilon^2 m}\big)$ iterations,
but cannot tolerate Byzantine failures. Further, we provide a lower bound
showing that, up to logarithmic factors, our algorithm is
information-theoretically optimal both in terms of sample complexity and time
complexity.
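A simple robust-aggregation sketch in this spirit (the quadratic objective, the coordinate-wise median aggregator, and all constants are my illustrative assumptions, standing in for the paper's more refined scheme): honest workers report noisy gradients, Byzantine workers report arbitrary vectors, and a robust aggregate still drives SGD toward the minimizer.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: m workers return stochastic gradients of
# F(x) = 0.5 * ||x||^2 (true gradient: x); an alpha-fraction are
# Byzantine and return arbitrary vectors. The coordinate-wise median
# is one simple robust aggregator, used here as a stand-in for the
# paper's algorithm.

def robust_sgd(x0, m=20, alpha=0.2, steps=500, lr=0.1):
    x = x0.copy()
    n_byz = int(alpha * m)
    for _ in range(steps):
        honest = x + rng.standard_normal((m - n_byz, x.size))      # grad = x + noise
        byzantine = 100.0 * rng.standard_normal((n_byz, x.size))   # arbitrary reports
        grads = np.vstack([honest, byzantine])
        agg = np.median(grads, axis=0)   # robust aggregation across workers
        x -= lr * agg
    return x

x = robust_sgd(np.full(5, 10.0))
print(np.linalg.norm(x))  # small: the median resists the Byzantine gradients
```

Averaging instead of taking the median would let a single Byzantine worker move the aggregate arbitrarily far, which is why plain mini-batch SGD tolerates no Byzantine failures.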
Data-driven Inverse Optimization with Imperfect Information
In data-driven inverse optimization an observer aims to learn the preferences
of an agent who solves a parametric optimization problem depending on an
exogenous signal. Thus, the observer seeks the agent's objective function that
best explains a historical sequence of signals and corresponding optimal
actions. We focus here on situations where the observer has imperfect
information, that is, where the agent's true objective function is not
contained in the search space of candidate objectives, where the agent suffers
from bounded rationality or implementation errors, or where the observed
signal-response pairs are corrupted by measurement noise. We formalize this
inverse optimization problem as a distributionally robust program minimizing
the worst-case risk that the {\em predicted} decision ({\em i.e.}, the decision
implied by a particular candidate objective) differs from the agent's {\em
actual} response to a random signal. We show that our framework offers rigorous
out-of-sample guarantees for different loss functions used to measure
prediction errors and that the emerging inverse optimization problems can be
exactly reformulated as (or safely approximated by) tractable convex programs
when a new suboptimality loss function is used. We show through extensive
numerical tests that the proposed distributionally robust approach to inverse
optimization often attains better out-of-sample performance than
state-of-the-art approaches.
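The suboptimality loss can be illustrated on a small finite-action instance (this discrete setup, the feature map, and the subgradient scheme are my assumptions for illustration; the paper works with a distributionally robust reformulation as a convex program): the observer scores a candidate objective by how suboptimal the observed actions are under it, and descends on that convex loss.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical finite-action inverse optimization instance (not the
# paper's setup): the agent minimizes a linear cost theta_true^T phi(a, s)
# over 5 actions; the observer recovers a consistent theta by subgradient
# descent on the (convex, piecewise-linear) suboptimality loss
#   l(theta; s, a_obs) = theta^T phi(a_obs, s) - min_a theta^T phi(a, s).

theta_true = np.array([1.0, -2.0, 0.5])

def phi(a, s):                      # features of an (action, signal) pair
    return np.array([a * s, a ** 2, a])

actions = np.linspace(-1, 1, 5)
signals = rng.uniform(-1, 1, 200)
observed = [actions[np.argmin([theta_true @ phi(a, s) for a in actions])]
            for s in signals]

theta = np.array([1.0, 0.0, 0.0])
for _ in range(300):
    g = np.zeros(3)
    for s, a_obs in zip(signals, observed):
        best = actions[np.argmin([theta @ phi(a, s) for a in actions])]
        g += phi(a_obs, s) - phi(best, s)   # subgradient of the suboptimality loss
    theta -= 0.01 * g / len(signals)
    theta /= np.linalg.norm(theta)          # normalize to rule out theta = 0

suboptimality = np.mean([theta @ phi(a_obs, s)
                         - min(theta @ phi(a, s) for a in actions)
                         for s, a_obs in zip(signals, observed)])
print(theta, suboptimality)
```

Note that the data only identifies the agent's objective up to a set of consistent candidates (any theta that makes every observed action optimal achieves zero suboptimality), which is one reason imperfect information and out-of-sample guarantees matter in this problem.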