
    Bidirectional compression in heterogeneous settings for distributed or federated learning with partial participation: tight convergence guarantees

    We introduce a framework, Artemis, to tackle the problem of learning in a distributed or federated setting with communication constraints and device partial participation. Several workers (randomly sampled) perform the optimization process using a central server to aggregate their computations. To alleviate the communication cost, Artemis allows the information sent in both directions (from the workers to the server and conversely) to be compressed, combined with a memory mechanism. It improves on existing algorithms that only consider unidirectional compression (to the server), use very strong assumptions on the compression operator, and often do not take device partial participation into account. We provide fast rates of convergence (linear up to a threshold) under weak assumptions on the stochastic gradients (noise variance bounded only at the optimal point) in a non-i.i.d. setting, highlight the impact of memory for unidirectional and bidirectional compression, and analyze Polyak-Ruppert averaging. We use convergence in distribution to obtain a lower bound on the asymptotic variance that highlights practical limits of compression. Finally, we provide experimental results to demonstrate the validity of our analysis.
    Comment: 56 pages, 4 theorems, 1 algorithm, source code on GitHub
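    The abstract describes compressing both the uplink (worker-to-server) and downlink (server-to-worker) messages, with a per-worker memory against which compressed differences are taken. The sketch below illustrates only that general pattern, not the paper's exact algorithm: the random-k compressor, the memory step size `alpha`, the least-squares gradients, and all function names are illustrative assumptions, and partial participation and the paper's step-size conditions are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparsify(v, k):
    """Unbiased random-k sparsification: keep k random coordinates, rescaled by d/k."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = (d / k) * v[idx]
    return out

def local_gradient(w, X, y):
    """Stand-in for a stochastic gradient: least-squares gradient on one worker's data."""
    return X.T @ (X @ w - y) / len(y)

def bidirectional_round(w, workers, memories, lr=0.05, alpha=0.5, k=2):
    """One round: compressed uplink against a per-worker memory, compressed downlink."""
    reconstructed = []
    for i, (X, y) in enumerate(workers):
        g = local_gradient(w, X, y)
        delta = sparsify(g - memories[i], k)       # uplink: compress the innovation w.r.t. memory
        reconstructed.append(memories[i] + delta)  # server's estimate of the local gradient
        memories[i] = memories[i] + alpha * delta  # memory update mirrored on worker and server
    g_hat = np.mean(reconstructed, axis=0)
    return w - lr * sparsify(g_hat, k), memories   # downlink: compress the aggregated direction

# Minimal usage on synthetic non-i.i.d. data from two workers.
d, n = 5, 20
workers = [(rng.normal(loc=m, size=(n, d)), rng.normal(size=n)) for m in (0.0, 1.0)]
w, memories = np.zeros(d), [np.zeros(d) for _ in workers]
for _ in range(100):
    w, memories = bidirectional_round(w, workers, memories)
```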

    Metric based up-scaling

    We consider divergence-form elliptic operators in dimension $n \geq 2$ with $L^\infty$ coefficients. Although solutions of these operators are only Hölder continuous, we show that they are differentiable ($C^{1,\alpha}$) with respect to harmonic coordinates. It follows that numerical homogenization can be extended to situations where the medium has no ergodicity at small scales and is characterized by a continuum of scales, by transferring a new metric, in addition to traditional averaged (homogenized) quantities, from subgrid scales into computational scales, and that error bounds can be given. This numerical homogenization method can also be used as a compression tool for differential operators.
    Comment: Final version. Accepted for publication in Communications on Pure and Applied Mathematics. Presented at CIMMS (March 2005), Socams 2005 (April), Oberwolfach, MPI Leipzig (May 2005), CIRM (July 2005). Higher-resolution figures are available at http://www.acm.caltech.edu/~owhadi
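    In standard notation, the setting can be summarized as follows. This is a schematic restatement under the usual uniform-ellipticity assumption; the precise hypotheses, boundary conditions, and the value of $\alpha$ are simplified relative to the paper.

```latex
% Divergence-form operator with rough coefficients (schematic restatement).
\[
  L u := -\,\nabla\!\cdot\big(a(x)\,\nabla u\big), \qquad
  a \in L^{\infty}, \quad \lambda I \le a(x) \le \Lambda I .
\]
% Harmonic coordinates: a-harmonic functions agreeing with the identity on the boundary.
\[
  \nabla\!\cdot\big(a(x)\,\nabla F_i\big) = 0 \ \text{in } \Omega, \qquad
  F_i = x_i \ \text{on } \partial\Omega, \qquad F = (F_1,\dots,F_n).
\]
% Regularity transfer: a solution of L u = f is only Hölder continuous in x,
% but becomes C^{1,\alpha} when viewed as a function of the harmonic coordinates.
\[
  u = v \circ F \quad \text{with} \quad v \in C^{1,\alpha}.
\]
```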

    EControl: Fast Distributed Optimization with Compression and Error Control

    Modern distributed training relies heavily on communication compression to reduce the communication overhead. In this work, we study algorithms employing a popular class of contractive compressors to reduce this overhead. However, the naive implementation often leads to unstable convergence or even exponential divergence due to the compression bias. Error Compensation (EC) is an extremely popular mechanism to mitigate these issues during the training of models enhanced by contractive compression operators. Compared to the effectiveness of EC in the data-homogeneous regime, the understanding of the practicality and theoretical foundations of EC in the data-heterogeneous regime is limited. Existing convergence analyses typically rely on strong assumptions such as bounded gradients, bounded data heterogeneity, or large batch accesses, which are often infeasible in modern machine learning applications. We resolve the majority of these issues by proposing EControl, a novel mechanism that regulates error compensation by controlling the strength of the feedback signal. We prove fast convergence for EControl in standard strongly convex, general convex, and nonconvex settings without any additional assumptions on the problem or data heterogeneity. We conduct extensive numerical evaluations to illustrate the efficacy of our method and support our theoretical findings.
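    The abstract does not spell out the update rule, so the sketch below only illustrates the underlying idea of error compensation with a tunable feedback strength. The parameter name `eta`, its placement in the update, and the top-k compressor are assumptions made for illustration and should not be read as the EControl recursion itself.

```python
import numpy as np

def top_k(v, k):
    """Contractive top-k compressor: keep the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compensated_step(w, grads, errors, lr=0.1, eta=0.5, k=2):
    """One round of error-compensated compressed SGD with a feedback-strength knob eta.

    eta = 1 recovers classical error feedback (compress gradient plus full residual);
    smaller eta damps the feedback signal. Sketch only, not the paper's exact update.
    """
    messages = []
    for i, g in enumerate(grads):
        msg = top_k(g + eta * errors[i], k)   # compress gradient plus scaled accumulated error
        errors[i] = errors[i] + g - msg       # store what was not transmitted this round
        messages.append(msg)
    return w - lr * np.mean(messages, axis=0), errors

# Minimal usage with synthetic gradients from two heterogeneous workers.
rng = np.random.default_rng(0)
w = np.zeros(5)
errors = [np.zeros(5) for _ in range(2)]
grads = [rng.normal(loc=m, size=5) for m in (1.0, -1.0)]
w, errors = compensated_step(w, grads, errors)
```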