    Uncoded caching and cross-level coded delivery for non-uniform file popularity

    Proactive content caching at user devices and coded delivery are studied under a non-uniform file popularity distribution. A novel centralized uncoded caching and coded delivery scheme, which can be applied to large file libraries, is proposed. The proposed cross-level coded delivery (CLCD) scheme is shown to achieve a lower average delivery rate than the state of the art. In the proposed CLCD scheme, the same subpacketization is used for all the files in the library in order to prevent additional zero-padding in the delivery phase, and, unlike existing schemes in the literature, two users requesting files from different popularity groups can be served by the same multicast message in order to reduce the delivery rate. Simulation results indicate a significant reduction in the average delivery rate for typical Zipf distribution parameter values.
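
    As background for how one multicast message can serve several users at once, the sketch below enumerates the XOR-coded messages of the classic centralized coded-caching delivery phase for a toy setting; the cross-level grouping and unified subpacketization that distinguish CLCD are not reproduced here, and the parameters and file labels are hypothetical.

```python
# Minimal sketch of XOR-based multicast delivery in centralized coded caching.
# Placement: each file is split into C(K, t) subfiles, one per t-subset of users,
# and user u caches every subfile whose index set contains u.
# Delivery: for each (t+1)-subset S of users, a single XOR serves all users in S.
from itertools import combinations

K, t = 4, 2                        # K users, cache parameter t (hypothetical values)
demands = ["A", "B", "C", "D"]     # file requested by each user (hypothetical labels)

def subfile(f, index_set):
    """Symbolic label of the subfile of file f cached by the users in index_set."""
    return f + "_{" + ",".join(str(u) for u in sorted(index_set)) + "}"

for S in combinations(range(K), t + 1):
    # User u in S is missing the subfile indexed by S \ {u}, which every other
    # user in S has cached, so the XOR below is decodable by all of them.
    parts = [subfile(demands[u], set(S) - {u}) for u in S]
    print(" XOR ".join(parts))
```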

    FedADC: Accelerated Federated Learning with Drift Control

    Federated learning (FL) has become the de facto framework for collaborative learning among edge devices with privacy concerns. At the core of the FL strategy is the use of stochastic gradient descent (SGD) in a distributed manner. Large-scale implementation of FL brings new challenges, such as incorporating acceleration techniques designed for SGD into the distributed setting, and mitigating the drift problem caused by the non-homogeneous distribution of local datasets. These two problems have been studied separately in the literature; in this paper, we show that it is possible to address both with a single strategy, without any major alteration to the FL framework or additional computation and communication load. To achieve this goal, we propose FedADC, an accelerated FL algorithm with drift control. We empirically illustrate the advantages of FedADC.
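
    The abstract does not spell out the update rule, so the sketch below only illustrates the two ingredients it says are combined: server-side momentum for acceleration, and mixing that global direction into the local steps to limit client drift. The function names, the toy quadratic objective, and the coefficients (lr, alpha, beta) are placeholders, not the FedADC specification.

```python
# Hedged sketch of an accelerated FL round with drift control (not the exact FedADC rule).
import numpy as np

def local_gradient(w, client_opt):
    # Toy quadratic loss 0.5 * ||w - client_opt||^2 per client, standing in for a real model.
    return w - client_opt

def local_update(w, client_opt, momentum, lr=0.1, alpha=0.5, steps=5):
    for _ in range(steps):
        g = local_gradient(w, client_opt)
        # Blend the local gradient with the global momentum direction so that clients
        # with skewed data do not drift too far from the global update direction.
        w = w - lr * ((1 - alpha) * g + alpha * momentum)
    return w

def server_round(w_global, momentum, clients, beta=0.9):
    updates = [w_global - local_update(w_global, c, momentum) for c in clients]
    avg_update = np.mean(updates, axis=0)
    momentum = beta * momentum + (1 - beta) * avg_update   # server-side momentum (acceleration)
    return w_global - momentum, momentum

rng = np.random.default_rng(0)
clients = [rng.normal(loc=i, size=3) for i in range(4)]    # heterogeneous client optima
w, m = np.zeros(3), np.zeros(3)
for _ in range(100):
    w, m = server_round(w, m, clients)
print(np.round(w, 2), np.round(np.mean(clients, axis=0), 2))  # w approaches the average optimum
```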

    Coded distributed computing with partial recovery

    Coded computation techniques provide robustness against straggling workers in distributed computing. However, most of the existing schemes require exact provisioning for the straggling behavior and ignore the computations carried out by straggling workers. Moreover, these schemes are typically designed to recover the desired computation results exactly, while in many machine learning and iterative optimization algorithms faster approximate solutions are known to improve the overall convergence time. In this paper, we first introduce a novel coded matrix-vector multiplication scheme, called coded computation with partial recovery (CCPR), which benefits from the advantages of both coded and uncoded computation schemes, and reduces both the computation time and the decoding complexity by allowing a trade-off between the accuracy and the speed of computation. We then extend this approach to the distributed implementation of more general computation tasks by proposing a coded communication scheme with partial recovery, in which the results of subtasks computed by the workers are coded before being communicated. Numerical simulations on a large linear regression task confirm the benefits of the proposed scheme in terms of the trade-off between computation accuracy and latency.
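
    The abstract gives the idea at a high level; the toy sketch below illustrates only the partial-recovery side of a hybrid uncoded/coded matrix-vector multiplication: each worker holds one uncoded row block, a single extra worker holds a sum-coded block, and the master uses whatever arrives by the deadline, leaving any remaining rows unrecovered. The block sizes, the straggler model, and the simple parity code are illustrative choices, not the CCPR construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, rows_per_block, dim = 4, 3, 8
A = rng.standard_normal((n_workers * rows_per_block, dim))
x = rng.standard_normal(dim)

blocks = np.split(A, n_workers)              # uncoded row blocks, one per worker
parity = sum(blocks)                         # one sum-coded block held by an extra worker

# Toy straggler model: each of the n_workers + 1 workers finishes in time w.p. 0.7.
finished = rng.random(n_workers + 1) < 0.7

y_hat = np.full(A.shape[0], np.nan)          # NaN marks not-yet-recovered rows
done = [i for i in range(n_workers) if finished[i]]
for i in done:
    y_hat[i * rows_per_block:(i + 1) * rows_per_block] = blocks[i] @ x

missing = [i for i in range(n_workers) if not finished[i]]
if len(missing) == 1 and finished[-1]:       # the parity result repairs one missing block
    i = missing[0]
    y_hat[i * rows_per_block:(i + 1) * rows_per_block] = (
        parity @ x - sum(blocks[j] @ x for j in done))

# Rows that are still missing are simply left unrecovered (partial recovery); an
# iterative algorithm, e.g. gradient descent, can proceed with this approximation.
print("unrecovered rows:", int(np.isnan(y_hat).sum()))
```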

    Gradient coding with dynamic clustering for straggler-tolerant distributed learning

    Distributed implementations are crucial for speeding up large-scale machine learning applications. Distributed gradient descent (GD) is widely employed to parallelize the learning task by distributing the dataset across multiple workers. A significant performance bottleneck for the per-iteration completion time in distributed synchronous GD is straggling workers. Coded distributed computation techniques have been introduced recently to mitigate stragglers and to speed up GD iterations by assigning redundant computations to workers. In this paper, we introduce a novel paradigm of dynamic coded computation, which assigns redundant data to workers so as to gain the flexibility to choose dynamically from among a set of possible codes depending on the past straggling behavior. In particular, we propose gradient coding (GC) with dynamic clustering, called GC-DC, which regulates the number of stragglers in each cluster by dynamically forming the clusters at each iteration. Under time-correlated straggling behavior, GC-DC adapts to the straggling behavior over time; in particular, at each iteration, GC-DC aims to distribute the stragglers across clusters as uniformly as possible based on their past behavior. For both homogeneous and heterogeneous worker models, we numerically show that GC-DC provides significant improvements in the average per-iteration completion time without increasing the communication load compared to the original GC scheme.
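
    The abstract describes the clustering objective but not the assignment rule, so the sketch below shows one simple way to realize it: track each worker's past straggling with an exponential moving average and re-cluster every iteration so that likely stragglers are spread evenly. The EMA, the round-robin rule, and all parameters are illustrative; the gradient code used inside each cluster is not shown.

```python
import numpy as np

def update_straggle_scores(scores, straggled, rho=0.8):
    """Exponential moving average of observed straggling; higher = more likely to straggle."""
    return rho * scores + (1 - rho) * straggled.astype(float)

def dynamic_clusters(scores, n_clusters):
    """Round-robin the workers, ordered from most to least likely straggler, so that
    each cluster receives a similar share of the likely stragglers."""
    order = np.argsort(-scores)
    clusters = [[] for _ in range(n_clusters)]
    for rank, w in enumerate(order):
        clusters[rank % n_clusters].append(int(w))
    return clusters

n_workers, n_clusters = 12, 3
scores = np.zeros(n_workers)
for it in range(5):
    straggled = np.random.default_rng(it).random(n_workers) < 0.3   # observed stragglers
    scores = update_straggle_scores(scores, straggled)
    print("iteration", it, "clusters:", dynamic_clusters(scores, n_clusters))
```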

    Mobility and popularity-aware coded small-cell caching

    In heterogeneous cellular networks with caching capability, due to the mobility of users and the storage constraints of small-cell base stations (SBSs), users may not be able to download all of their requested content from the SBSs within the content's delay deadline. In that case, the users are directed to the macro-cell base station (MBS) in order to satisfy the service quality requirement. Coded caching is exploited here to minimize the amount of data downloaded from the MBS, taking into account the mobility of the users as well as the popularity of the content. An optimal distributed caching policy is presented when the delay deadline is below a certain threshold, and a distributed greedy caching policy is proposed when the delay deadline is relaxed.

    Mobility-aware coded storage and delivery

    Content caching at small-cell base stations (SBSs) is a promising method to mitigate excessive backhaul load and delay, particularly for on-demand video streaming applications. A cache-enabled heterogeneous cellular network architecture is considered in this paper, where mobile users connect to multiple SBSs during a video downloading session, and the SBSs request files, or fragments of files, from the macro-cell base station (MBS) according to the user requests they receive. A novel coded storage and delivery scheme is introduced to reduce the load on the backhaul link from the MBS to the SBSs. The achievable backhaul delivery rate, as well as the number of sub-files required to achieve this rate, is studied for the proposed coded delivery scheme, and it is shown that the proposed scheme provides a significant reduction in the number of sub-files required, making it more viable for practical applications.

    Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers

    When gradient descent (GD) is scaled to many parallel computing servers (workers) for large-scale machine learning problems, its per-iteration computation time is limited by the straggling workers. Coded distributed GD (DGD) can tolerate straggling workers by assigning redundant computations to the workers, but in most existing schemes, each non-straggling worker transmits one message per iteration to the parameter server (master) after completing all of its computations. We instead allow multiple computations to be conveyed from each worker per iteration, in order to also exploit the computations carried out by straggling workers. We show that the average completion time per iteration can be reduced significantly at the cost of a reasonable increase in the communication load. We also propose a general coded DGD technique that can trade off the average computation time against the communication load.
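
    To make the multi-message idea concrete, the toy sketch below replays partial-gradient messages in arrival order: partitions are replicated across workers, each worker reports after every completed partition, and the iteration ends as soon as every partition has been heard from someone, so a slow worker's early computations still count. The replicated (rather than coded) assignment and the timing model are simplifications for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_workers, n_parts, r = 4, 4, 2            # each data partition is assigned to r workers
# Cyclic replicated assignment: worker w processes partitions w, w+1, ... (mod n_parts).
assignment = [[(w + j) % n_parts for j in range(r)] for w in range(n_workers)]

# Toy timing model: worker w finishes its k-th sub-task at time (k + 1) / speed[w].
speed = rng.exponential(1.0, n_workers) + 0.1
messages = sorted(((k + 1) / speed[w], w, assignment[w][k])
                  for w in range(n_workers) for k in range(r))

received = set()
for t, w, part in messages:                # replay partial-gradient messages in arrival order
    received.add(part)
    if len(received) == n_parts:           # every partition obtained from some worker
        print(f"full gradient available at t = {t:.2f}")
        break
```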

    Decentralized SGD with Over-the-Air Computation

    We consider multiple devices with local datasets that collaboratively learn a global model through device-to-device (D2D) communications. The conventional decentralized stochastic gradient descent (DSGD) solution for this problem assumes error-free orthogonal links among the devices, relying on an underlying communication protocol to take care of the noise, fading, and interference in the wireless medium. In this work, we show the suboptimality of this approach by designing the communication and learning protocols jointly. We first consider a point-to-point (P2P) communication scheme that schedules D2D transmissions in an orthogonal fashion to minimize interference. We then propose a novel over-the-air consensus scheme that exploits the signal superposition property of wireless transmission rather than avoiding interference. In the proposed OAC-MAC scheme, multiple nodes align their transmissions toward a single receiver node. For both schemes, we cast the scheduling problem as a graph coloring problem. We then numerically compare the two approaches on the distributed MNIST image classification task under various network conditions, and show that the OAC-MAC scheme attains faster convergence and higher final accuracy thanks to its improved robustness against channel fading and noise. We also introduce a noise-aware version of the OAC-MAC scheme that further improves the convergence speed and accuracy.
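
    As a minimal illustration of the over-the-air consensus step, the sketch below models simultaneous analog transmissions from aligned neighbours as a noisy sum at the receiver, which is exactly the quantity the averaging step of decentralized SGD needs. Power control, fading inversion, and the graph-coloring schedule described in the abstract are omitted; the mixing weight and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, noise_std, alpha = 10, 0.05, 0.5
neighbour_models = [rng.standard_normal(dim) for _ in range(3)]
own_model = rng.standard_normal(dim)

# All scheduled neighbours transmit their (analog) model vectors at once; the
# channel adds the signals (superposition) plus noise at the receiving node.
received = sum(neighbour_models) + noise_std * rng.standard_normal(dim)

# Consensus step of decentralized SGD: mix the own model with the noisy
# neighbourhood average obtained over the air.
consensus = (1 - alpha) * own_model + alpha * received / len(neighbour_models)
print(np.round(consensus, 3))
```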