
    Concentration of the Langevin Algorithm's Stationary Distribution

    A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $\eta > 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $\pi_\eta$ which differs from the stationary distribution $\pi$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of $\pi$ extend to $\pi_\eta$. In particular, while concentration properties such as isoperimetry and rapidly decaying tails are classically known for $\pi$, the analogous properties for $\pi_\eta$ are open questions with direct algorithmic implications. This note provides a first step in this direction by establishing concentration results for $\pi_\eta$ that mirror classical results for $\pi$. Specifically, we show that for any nontrivial stepsize $\eta > 0$, $\pi_\eta$ is sub-exponential (respectively, sub-Gaussian) when the potential is convex (respectively, strongly convex). Moreover, the concentration bounds we show are essentially tight. Key to our analysis is the use of a rotation-invariant moment generating function (aka Bessel function) to study the stationary dynamics of the Langevin Algorithm. This technique may be of independent interest because it enables directly analyzing the discrete-time stationary distribution $\pi_\eta$ without going through the continuous-time stationary distribution $\pi$ as an intermediary.
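As a concrete illustration of the object studied above, the Langevin Algorithm is the Euler-Maruyama discretization of the Langevin Diffusion for a potential $f$: each step takes a gradient step on $f$ plus Gaussian noise scaled by the stepsize. The sketch below is a generic implementation of that standard update, not code from the paper; the example potential is an illustrative choice.

```python
import numpy as np

def langevin_algorithm(grad_f, x0, eta, n_steps, rng=None):
    """Run the Langevin Algorithm: the discretization of the Langevin
    Diffusion with stepsize eta.  Each step is a gradient step on the
    potential f plus isotropic Gaussian noise of variance 2*eta."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - eta * grad_f(x) + np.sqrt(2.0 * eta) * noise
    return x

# Illustrative example: strongly convex potential f(x) = ||x||^2 / 2,
# whose gradient is x; here pi is standard Gaussian and the paper's
# result says pi_eta is sub-Gaussian for any nontrivial stepsize.
sample = langevin_algorithm(lambda x: x, np.zeros(2), eta=0.1,
                            n_steps=1000, rng=0)
```

Running the loop for many steps produces (approximate) samples from the discrete-time stationary distribution $\pi_\eta$, the object whose tails the note characterizes.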

    Near-linear convergence of the Random Osborne algorithm for Matrix Balancing

    We revisit Matrix Balancing, a pre-conditioning task used ubiquitously for computing eigenvalues and matrix exponentials. Since 1960, Osborne's algorithm has been the practitioners' algorithm of choice and is now implemented in most numerical software packages. However, its theoretical properties are not well understood. Here, we show that a simple random variant of Osborne's algorithm converges in near-linear time in the input sparsity. Specifically, it balances $K \in \mathbb{R}_{\geq 0}^{n \times n}$ after $O(m \epsilon^{-2} \log \kappa)$ arithmetic operations, where $m$ is the number of nonzeros in $K$, $\epsilon$ is the $\ell_1$ accuracy, and $\kappa = \sum_{ij} K_{ij} / (\min_{ij: K_{ij} \neq 0} K_{ij})$ measures the conditioning of $K$. Previous work had established near-linear runtimes either only for $\ell_2$ accuracy (a weaker criterion which is less relevant for applications), or through an entirely different algorithm based on (currently) impractical Laplacian solvers. We further show that if the graph with adjacency matrix $K$ is moderately connected--e.g., if $K$ has at least one positive row/column pair--then Osborne's algorithm initially converges exponentially fast, yielding an improved runtime $O(m \epsilon^{-1} \log \kappa)$. We also address numerical precision by showing that these runtime bounds still hold when using $O(\log(n\kappa/\epsilon))$-bit numbers. Our results are established through an intuitive potential argument that leverages a convex optimization perspective of Osborne's algorithm, and relates the per-iteration progress to the current imbalance as measured in Hellinger distance. Unlike previous analyses, we critically exploit log-convexity of the potential. Our analysis extends to other variants of Osborne's algorithm: along the way, we establish significantly improved runtime bounds for cyclic, greedy, and parallelized variants.

    Comment: v2: Fixed minor typos. Modified title for clarity. Corrected statement of Thm 6.1; this does not affect our main result.
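The core of Osborne's algorithm is a single-coordinate update: pick an index $i$ and rescale the $i$-th diagonal entry of the scaling so that row $i$ and column $i$ of the balanced matrix have equal (off-diagonal) sums. A minimal sketch of the random variant follows; uniform coordinate sampling and the dense recomputation of the balanced matrix are simplifying assumptions of this sketch, not the paper's near-linear-time implementation.

```python
import numpy as np

def random_osborne(K, n_iters, rng=None):
    """Random variant of Osborne's algorithm for Matrix Balancing.
    Returns a positive scaling vector d such that diag(d) K diag(1/d)
    has (approximately) equal row and column sums."""
    rng = np.random.default_rng(rng)
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    d = np.ones(n)
    for _ in range(n_iters):
        i = rng.integers(n)               # sample a coordinate uniformly
        B = np.outer(d, 1.0 / d) * K      # current balanced matrix
        r = B[i, :].sum() - B[i, i]       # off-diagonal row sum
        c = B[:, i].sum() - B[i, i]       # off-diagonal column sum
        if r > 0 and c > 0:
            d[i] *= np.sqrt(c / r)        # equalize row/column i
    return d
```

Each update exactly balances the chosen coordinate; the paper's analysis shows how quickly these local fixes drive down the global $\ell_1$ imbalance.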

    Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule

    Can we accelerate convergence of gradient descent without changing the algorithm -- just by carefully choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in $k^{\log_\rho 2} \approx k^{0.7864}$ iterations, where $\rho = 1 + \sqrt{2}$ is the silver ratio and $k$ is the condition number. This is intermediate between the textbook unaccelerated rate $k$ and the accelerated rate $\sqrt{k}$ due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous accelerated rate $\varepsilon^{-\log_\rho 2} \approx \varepsilon^{-0.7864}$. We conjecture and provide partial evidence that these rates are optimal among all possible stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is non-monotonic, fractal-like, and approximately periodic with period $k^{\log_\rho 2}$. This leads to a phase transition in the convergence rate: initially super-exponential (acceleration regime), then exponential (saturation regime).

    Comment: 7 figures
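The setting studied above is plain gradient descent where the only design freedom is the sequence of stepsizes. A minimal sketch of that template (the quadratic example and function names are illustrative, not from the paper):

```python
import numpy as np

def gd_with_schedule(grad_f, x0, stepsizes):
    """Plain gradient descent x_{t+1} = x_t - h_t * grad_f(x_t).
    The algorithm is unchanged; only the schedule (h_t) varies,
    which is the sole degree of freedom considered above."""
    x = np.asarray(x0, dtype=float)
    for h in stepsizes:
        x = x - h * grad_f(x)
    return x

# Illustrative run on f(x) = x^2 / 2 (so grad_f(x) = x, smoothness L = 1):
# any schedule, constant or non-monotonic, plugs into the same loop.
x_final = gd_with_schedule(lambda x: x, np.array([1.0]), [1.0] * 8)
```

The paper's contribution is the choice of the (non-monotonic, fractal-like) schedule fed into this loop, not any change to the loop itself.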

    Acceleration by Stepsize Hedging II: Silver Stepsize Schedule for Smooth Convex Optimization

    We provide a concise, self-contained proof that the Silver Stepsize Schedule proposed in Part I directly applies to smooth (non-strongly) convex optimization. Specifically, we show that with these stepsizes, gradient descent computes an $\epsilon$-minimizer in $O(\epsilon^{-\log_\rho 2}) = O(\epsilon^{-0.7864})$ iterations, where $\rho = 1 + \sqrt{2}$ is the silver ratio. This is intermediate between the textbook unaccelerated rate $O(\epsilon^{-1})$ and the accelerated rate $O(\epsilon^{-1/2})$ due to Nesterov in 1983. The Silver Stepsize Schedule is a simple explicit fractal: the $i$-th stepsize is $1 + \rho^{v(i)-1}$, where $v(i)$ is the 2-adic valuation of $i$. The design and analysis are conceptually identical to the strongly convex setting in Part I, but simplify remarkably in this specific setting.

    Comment: 10 pages, 3 figures
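The explicit formula quoted above is short enough to compute directly: the $i$-th stepsize is $1 + \rho^{v(i)-1}$ with $v(i)$ the 2-adic valuation of $i$ (stepsizes in units of $1/L$ for an $L$-smooth objective). A direct transcription:

```python
import numpy as np

def two_adic_valuation(i):
    """v(i): exponent of the largest power of 2 dividing i >= 1."""
    v = 0
    while i % 2 == 0:
        i //= 2
        v += 1
    return v

def silver_stepsizes(n):
    """First n stepsizes of the Silver Stepsize Schedule for smooth
    convex optimization: the i-th stepsize is 1 + rho**(v(i) - 1),
    where rho = 1 + sqrt(2) is the silver ratio."""
    rho = 1.0 + np.sqrt(2.0)
    return [1.0 + rho ** (two_adic_valuation(i) - 1)
            for i in range(1, n + 1)]
```

For example, the first stepsize is $1 + \rho^{-1} = \sqrt{2}$, and the schedule repeats its prefix fractally, with larger-than-usual steps at indices divisible by high powers of 2.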

    Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent

    We study first-order optimization algorithms for computing the barycenter of Gaussian distributions with respect to the optimal transport metric. Although the objective is geodesically non-convex, Riemannian GD empirically converges rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and SDP solvers. This stands in stark contrast to the best-known theoretical results for Riemannian GD, which depend exponentially on the dimension. In this work, we prove new geodesic convexity results which provide stronger control of the iterates, yielding a dimension-free convergence rate. Our techniques also enable the analysis of two related notions of averaging, the entropically-regularized barycenter and the geometric median, providing the first convergence guarantees for Riemannian GD for these problems.

    Comment: 48 pages, 8 figures
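A sketch of the Riemannian GD iteration for the barycenter of centered Gaussians $N(0, \Sigma_i)$: average the optimal-transport maps from the current iterate to each $\Sigma_i$, then push the iterate forward. This is one standard reading of the update (with stepsize 1 it reduces to the classical fixed-point iteration); the function names and the Euclidean-mean initialization are illustrative choices, not the paper's pseudocode.

```python
import numpy as np

def sqrtm_psd(A):
    """Square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bw_barycenter_gd(Sigmas, eta=1.0, n_iters=50):
    """Riemannian GD on the Bures-Wasserstein manifold for the
    barycenter of centered Gaussians N(0, Sigma_i)."""
    Sigma = np.mean(Sigmas, axis=0)   # initialize at the Euclidean mean
    n = len(Sigmas)
    dim = Sigma.shape[0]
    for _ in range(n_iters):
        root = sqrtm_psd(Sigma)
        root_inv = np.linalg.inv(root)
        # Optimal-transport map from N(0, Sigma) to N(0, S), averaged:
        T_mean = sum(root_inv @ sqrtm_psd(root @ S @ root) @ root_inv
                     for S in Sigmas) / n
        M = (1.0 - eta) * np.eye(dim) + eta * T_mean
        Sigma = M @ Sigma @ M          # pushforward of the iterate
    return Sigma
```

For isotropic inputs $\Sigma_i = \sigma_i^2 I$ the barycenter is $((\frac{1}{n}\sum_i \sigma_i)^2) I$, which gives a quick sanity check of the iteration.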

    Development and implementation of a prescription opioid registry across diverse health systems

    Objective: Develop and implement a prescription opioid registry in 10 diverse health systems across the US and describe trends in prescribed opioids between 2012 and 2018. Materials and Methods: Using electronic health record and claims data, we identified patients who had an outpatient fill for any prescription opioid, and/or an opioid use disorder diagnosis, between January 1, 2012 and December 31, 2018. The registry contains distributed files of prescription opioids, benzodiazepines and other select medications, opioid antagonists, clinical diagnoses, procedures, health services utilization, and health plan membership. Rates of outpatient opioid fills over the study period, standardized to health system demographic distributions, are described by age, gender, and race/ethnicity among members without cancer. Results: The registry includes 6 249 710 patients and over 40 million outpatient opioid fills. For the combined registry population, opioid fills declined from a high of 0.718 per member-year in 2013 to 0.478 in 2018, and morphine milligram equivalents (MMEs) per fill declined from 985 in 2012 to 758 in 2018. MMEs per member declined from 692 in 2012 to 362 in 2018. Conclusion: This study established a population-based opioid registry across 10 diverse health systems that can be used to address questions related to opioid use. Initial analyses showed large reductions in overall opioid use per member among the combined health systems. The registry will be used in future studies to answer a broad range of other critical public health questions relating to prescription opioid use.