Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods
This work proposes a universal and adaptive second-order method for
minimizing second-order smooth, convex functions. Our algorithm achieves
$O(\sigma/\sqrt{T})$ convergence when the oracle feedback is stochastic with
variance $\sigma^2$, and improves its convergence to $O(1/T^3)$ with
deterministic oracles, where $T$ is the number of iterations. Our method also
interpolates these rates without knowing the nature of the oracle a priori,
which is enabled by a parameter-free adaptive step-size that is oblivious to
the knowledge of smoothness modulus, variance bounds and the diameter of the
constrained set. To our knowledge, this is the first universal algorithm with
such global guarantees within the second-order optimization literature.
Comment: 32 pages, 4 figures, accepted at NeurIPS 202
Optimal and Fair Encouragement Policy Evaluation and Learning
In consequential domains, it is often impossible to compel individuals to
take treatment, so that optimal policy rules are merely suggestions in the
presence of human non-adherence to treatment recommendations. In these same
domains, there may be heterogeneity both in who takes up treatment and in
treatment efficacy. While optimal treatment
rules can maximize causal outcomes across the population, access parity
constraints or other fairness considerations can be relevant in the case of
encouragement. For example, in social services, a persistent puzzle is the gap
in take-up of beneficial services among those who may benefit from them the
most. When in addition the decision-maker has distributional preferences over
both access and average outcomes, the optimal decision rule changes. We study
causal identification, statistical variance-reduced estimation, and robust
estimation of optimal treatment rules, including under potential violations of
positivity. We consider fairness constraints such as demographic parity in
treatment take-up, and other constraints, via constrained optimization. Our
framework can be extended to handle algorithmic recommendations under an
often-reasonable covariate-conditional exclusion restriction, using our
robustness checks for lack of positivity in the recommendation. We develop a
two-stage algorithm for solving over parametrized policy classes under general
constraints to obtain variance-sensitive regret bounds. We illustrate the
methods in two case studies based on data from randomized encouragement to
enroll in insurance and from pretrial supervised release with electronic
monitoring.
Catalyst Acceleration for Gradient-Based Non-Convex Optimization
We introduce a generic scheme to solve nonconvex optimization problems using
gradient-based algorithms originally designed for minimizing convex functions.
Even though these methods may originally require convexity to operate, the
proposed approach allows one to use them on weakly convex objectives, which
covers a large class of non-convex functions typically appearing in machine
learning and signal processing. In general, the scheme is guaranteed to produce
a stationary point with a worst-case efficiency typical of first-order methods,
and when the objective turns out to be convex, it automatically accelerates in
the sense of Nesterov and achieves near-optimal convergence rate in function
values. These properties are achieved without assuming any knowledge about the
convexity of the objective, by automatically adapting to the unknown weak
convexity constant. We conclude the paper by showing promising experimental
results obtained by applying our approach to incremental algorithms such as
SVRG and SAGA for sparse matrix factorization and for learning neural networks.
- …