646 research outputs found

    Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods

    Full text link
    This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions. Our algorithm achieves O(σ/T)O(\sigma / \sqrt{T}) convergence when the oracle feedback is stochastic with variance σ2\sigma^2, and improves its convergence to O(1/T3)O( 1 / T^3) with deterministic oracles, where TT is the number of iterations. Our method also interpolates these rates without knowing the nature of the oracle apriori, which is enabled by a parameter-free adaptive step-size that is oblivious to the knowledge of smoothness modulus, variance bounds and the diameter of the constrained set. To our knowledge, this is the first universal algorithm with such global guarantees within the second-order optimization literature.Comment: 32 pages, 4 figures, accepted at NeurIPS 202

    Optimal and Fair Encouragement Policy Evaluation and Learning

    Full text link
    In consequential domains, it is often impossible to compel individuals to take treatment, so that optimal policy rules are merely suggestions in the presence of human non-adherence to treatment recommendations. In these same domains, there may be heterogeneity both in who responds in taking-up treatment, and heterogeneity in treatment efficacy. While optimal treatment rules can maximize causal outcomes across the population, access parity constraints or other fairness considerations can be relevant in the case of encouragement. For example, in social services, a persistent puzzle is the gap in take-up of beneficial services among those who may benefit from them the most. When in addition the decision-maker has distributional preferences over both access and average outcomes, the optimal decision rule changes. We study causal identification, statistical variance-reduced estimation, and robust estimation of optimal treatment rules, including under potential violations of positivity. We consider fairness constraints such as demographic parity in treatment take-up, and other constraints, via constrained optimization. Our framework can be extended to handle algorithmic recommendations under an often-reasonable covariate-conditional exclusion restriction, using our robustness checks for lack of positivity in the recommendation. We develop a two-stage algorithm for solving over parametrized policy classes under general constraints to obtain variance-sensitive regret bounds. We illustrate the methods in two case studies based on data from randomized encouragement to enroll in insurance and from pretrial supervised release with electronic monitoring

    Catalyst Acceleration for Gradient-Based Non-Convex Optimization

    Get PDF
    We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed approach allows one to use them on weakly convex objectives, which covers a large class of non-convex functions typically appearing in machine learning and signal processing. In general, the scheme is guaranteed to produce a stationary point with a worst-case efficiency typical of first-order methods, and when the objective turns out to be convex, it automatically accelerates in the sense of Nesterov and achieves near-optimal convergence rate in function values. These properties are achieved without assuming any knowledge about the convexity of the objective, by automatically adapting to the unknown weak convexity constant. We conclude the paper by showing promising experimental results obtained by applying our approach to incremental algorithms such as SVRG and SAGA for sparse matrix factorization and for learning neural networks
    corecore