Stable splittings, spaces of representations and almost commuting elements in Lie groups

Author: Alejandro Adem
Frederick R. Cohen
José Manuel Gómez
Publication venue: Cambridge University Press (CUP)
Publication date: 04/10/2010

In this paper the space of almost commuting elements in a Lie group is studied from a homotopical point of view. In particular, a stable splitting after one suspension is derived for these spaces and their quotients under conjugation. A complete description of the stable factors appearing in this splitting is provided for compact connected Lie groups of rank one. By using symmetric products, the colimits $\mathrm{Rep}(\mathbb{Z}^n, SU)$, $\mathrm{Rep}(\mathbb{Z}^n, U)$ and $\mathrm{Rep}(\mathbb{Z}^n, Sp)$ are explicitly described as finite products of Eilenberg-MacLane spaces. Comment: 37 pages. To appear in Math. Proc. Camb. Phil. So

arXiv.org e-Print Archive

Crossref

Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates

Author: Cohen Alejandro
Lang Natalie
Shlezinger Nir
Publication date: 27/03/2024

Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning. It typically involves a set of heterogeneous devices locally training neural network (NN) models in parallel with periodic centralized aggregations. As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers. Conventional approaches discard incomplete intra-model updates done by stragglers, alter the amount of local workload and architecture, or resort to asynchronous settings, all of which affect the trained model performance under tight training latency constraints. In this work, we propose straggler-aware layer-wise federated learning (SALF), which leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion. SALF allows stragglers to synchronously convey partial gradients, so that each layer of the global model is updated independently with a different contributing set of users. We provide a theoretical analysis, establishing convergence guarantees for the global model under mild assumptions on the distribution of the participating devices, revealing that SALF converges at the same asymptotic rate as FL with no timing limitations. This insight is matched with empirical observations, demonstrating the performance gains of SALF compared to alternative mechanisms mitigating the device heterogeneity gap in FL.
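The layer-wise update described in this abstract can be pictured with a minimal sketch. It assumes that a straggler reports gradients only for the layers its backward pass reached before the deadline, and that the server averages each layer over the users that reported it. The function name `layerwise_aggregate`, the dictionary layout and the plain averaging rule are illustrative assumptions, not the paper's exact update.

```python
from typing import Dict, List

Vector = List[float]


def layerwise_aggregate(
    global_model: List[Vector],
    user_grads: List[Dict[int, Vector]],
    lr: float,
) -> List[Vector]:
    """Update every layer with the mean gradient of the users that reported it.

    user_grads[k] maps a layer index to that user's gradient for the layer.
    A straggler simply omits the layers it did not reach (assumption).
    """
    new_model: List[Vector] = []
    for layer_idx, weights in enumerate(global_model):
        reported = [g[layer_idx] for g in user_grads if layer_idx in g]
        if not reported:
            # No user reached this layer: keep it unchanged for this round.
            new_model.append(list(weights))
            continue
        # Element-wise mean over the contributing users only.
        mean_grad = [sum(vals) / len(reported) for vals in zip(*reported)]
        new_model.append([w - lr * g for w, g in zip(weights, mean_grad)])
    return new_model


# Hypothetical three-layer model with one fast user and one straggler.
model = [[1.0, 1.0], [2.0], [3.0]]
fast_user = {0: [0.2, 0.4], 1: [1.0], 2: [2.0]}
straggler = {2: [4.0]}  # backpropagation only reached the last layer
updated = layerwise_aggregate(model, [fast_user, straggler], lr=0.5)
```

In this toy run the last layer is averaged over both users, while the first two layers are updated from the fast user alone, so each layer has its own contributing set.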

arXiv.org e-Print Archive

Distributed Computations with Layered Resolution

Author: Cohen Alejandro
Esfahanizadeh Homa
Médard Muriel
Shamai Shlomo
Publication date: 02/08/2022

Modern computationally heavy applications are often time-sensitive, demanding distributed strategies to accelerate them. On the other hand, distributed computing suffers in practice from the bottleneck of slow workers. Distributed coded computing is an attractive solution that adds redundancy such that a subset of distributed computations suffices to obtain the final result. However, the final result is still either obtained within a desired time or not, and in the latter case the resources that were spent are wasted. In this paper, we introduce the novel concept of layered-resolution distributed coded computations, such that lower resolutions of the final result are obtained from the collective results of the workers at an earlier stage than the final result. This innovation makes it possible to have more effective deadline-based systems, since even if a computational job is terminated because of timing, an approximated version of the final result can be released. Based on our theoretical and empirical results, the average execution delay for the first resolution is notably smaller than the one for the final resolution. Moreover, the probability of meeting a deadline is one for the first resolution in a setting where the final resolution exceeds the deadline almost all the time, which reduces the success rate of systems with no layering.
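One way to picture a lower-resolution result that becomes available before the final one is to split integer matrix entries into high and low bit parts. The sketch below does this for a matrix-vector product. The bit split, the `shift` parameter and all function names are assumptions made for illustration. The coding redundancy across workers is left out, and the construction is not claimed to be the paper's.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def split_layers(a: Matrix, shift: int) -> Tuple[Matrix, Matrix]:
    """Split each entry v into (v >> shift) and its low `shift` bits."""
    mask = (1 << shift) - 1
    high = [[v >> shift for v in row] for row in a]
    low = [[v & mask for v in row] for row in a]
    return high, low


def matvec(a: Matrix, x: List[int]) -> List[int]:
    """Plain integer matrix-vector product."""
    return [sum(v * xi for v, xi in zip(row, x)) for row in a]


def layered_matvec(a: Matrix, x: List[int], shift: int) -> Tuple[List[int], List[int]]:
    """Return (first resolution, final resolution) of a @ x."""
    high, low = split_layers(a, shift)
    # First resolution: uses only the high parts, so it can be released early.
    coarse = [h << shift for h in matvec(high, x)]
    # Final resolution: refine the coarse result with the low parts.
    fine = [c + l for c, l in zip(coarse, matvec(low, x))]
    return coarse, fine


a = [[37, 18], [5, 64]]
x = [1, 2]
coarse, fine = layered_matvec(a, x, shift=4)
# coarse == [64, 128] is the approximation; fine == [73, 133] equals a @ x.
```

If a deadline expires after the high-part products are done, the coarse vector can still be released as an approximation of the exact product.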

arXiv.org e-Print Archive

Stream Distributed Coded Computing

Author: Cohen Alejandro
Esfahanizadeh Homa
Médard Muriel
Thiran Guillaume
Publication date: 01/01/2021

Emerging large-scale and data-hungry algorithms require computations to be delegated from a central server to several worker nodes. One major challenge in distributed computations is to tackle delays and failures caused by stragglers. To address this challenge, introducing an efficient amount of redundant computations via distributed coded computation has received significant attention. Recent approaches in this area have mainly focused on introducing minimum computational redundancies to tolerate a certain number of stragglers. To the best of our knowledge, the current literature lacks a unified end-to-end design in a heterogeneous setting where the workers can vary in their computation and communication capabilities. The contribution of this paper is to devise a novel framework for joint scheduling-coding, in a setting where the workers and the arrival of stream computational jobs are based on stochastic models. In our initial joint scheme, we propose a systematic framework that illustrates how to select a set of workers and how to split the computational load among the selected workers based on their differences in order to minimize the average in-order job execution delay. Through simulations, we demonstrate that the performance of our framework is dramatically better than the performance of the naive method that splits the computational load uniformly among the workers, and that it is close to the ideal performance.
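The gap between a uniform split and a split that accounts for worker differences can be seen in a deliberately simplified sketch. It assumes fixed, known worker rates and no coding or stochastic arrivals, unlike the stochastic models in the abstract. The names `split_load`, `finish_time` and the rate values are hypothetical.

```python
from typing import List


def split_load(total: float, rates: List[float]) -> List[float]:
    """Give each worker a share of the load proportional to its rate."""
    rate_sum = sum(rates)
    return [total * r / rate_sum for r in rates]


def finish_time(loads: List[float], rates: List[float]) -> float:
    """The job is done when the slowest worker finishes its share."""
    return max(load / rate for load, rate in zip(loads, rates))


rates = [4.0, 2.0, 2.0]  # units of work per second (assumed values)
total = 8.0

proportional = split_load(total, rates)
uniform = [total / len(rates)] * len(rates)

t_proportional = finish_time(proportional, rates)
t_uniform = finish_time(uniform, rates)
```

With these assumed rates the proportional split lets all workers finish together, while the uniform split is held back by the two slower workers.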

arXiv.org e-Print Archive

DSpace@MIT

DIAL UCLouvain