Latency Analysis of Coded Computation Schemes over Wireless Networks
Large-scale distributed computing systems face two major bottlenecks that
limit their scalability: straggler delay caused by the variability of
computation times at different worker nodes and communication bottlenecks
caused by shuffling data across many nodes in the network. Recently, it has
been shown that codes can provide significant gains in overcoming these
bottlenecks. In particular, optimal coding schemes for minimizing latency in
distributed computation of linear functions and mitigating the effect of
stragglers were proposed for a wired network, where the workers can
simultaneously transmit messages to a master node without interference. In this
paper, we focus on the problem of coded computation over a wireless
master-worker setup with straggling workers, where only one worker can transmit
the result of its local computation back to the master at a time. We consider three
asymptotic regimes (determined by how the communication and computation times
are scaled with the number of workers) and precisely characterize the total
run-time of the distributed algorithm and optimum coding strategy in each
regime. In particular, for the regime of practical interest where the
computation and communication times of the distributed computing algorithm are
comparable, we show that the total run-time approaches a simple lower bound
that decouples computation and communication, and demonstrate that coded
schemes are $\Theta(\log(n))$ times faster than uncoded schemes.
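
The setup described in the abstract can be explored with a small simulation. The following is a minimal sketch, not the paper's model or analysis. It rests on several assumptions: worker compute times follow a shifted exponential scaled by the worker's share of the job, each transmission takes `comm_unit / k`, the master serves workers in the order they finish, and any `k` of the `n` results suffice (as with an MDS-style code, where `k = n` is the uncoded case). The names `run_time` and `average` and all parameters and default values are hypothetical.

```python
import random


def run_time(n, k, shift=1.0, rate=1.0, comm_unit=0.01, rng=random):
    """Simulated completion time when any k of n worker results suffice.

    k == n models an uncoded scheme; k < n models a coded scheme with
    redundancy. All distributions and scalings here are assumptions.
    """
    # Each worker handles a 1/k share of the job; its compute time is a
    # shifted exponential scaled by that share.
    finish = sorted((shift + rng.expovariate(rate)) / k for _ in range(n))
    comm = comm_unit / k  # time to send one worker's result to the master

    # Shared wireless channel: only one worker transmits at a time, so the
    # k fastest workers are served one after another in finishing order.
    channel_free = 0.0
    for t in finish[:k]:
        channel_free = max(channel_free, t) + comm
    return channel_free


def average(n, k, trials=2000, seed=0, **kw):
    """Average run_time over independent trials with a fixed seed."""
    rng = random.Random(seed)
    return sum(run_time(n, k, rng=rng, **kw) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 100
    for k in (25, 50, 75, 100):
        print(k, average(n, k))
```

Sweeping `k` with `average` gives a rough picture of how the best redundancy level moves as `comm_unit` and `shift` change relative to the random part of the compute time. The numbers it produces illustrate the assumed model only.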