OverSketch: Approximate Matrix Multiplication for the Cloud
We propose OverSketch, an approximate algorithm for distributed matrix
multiplication in serverless computing. OverSketch leverages ideas from matrix
sketching and high-performance computing to enable cost-efficient
multiplication that is resilient to faults and straggling nodes pervasive in
low-cost serverless architectures. We establish statistical guarantees on the
accuracy of OverSketch and empirically validate our results by solving a
large-scale linear program using interior-point methods and demonstrate a 34%
reduction in compute time on AWS Lambda.
Comment: Published in Proc. IEEE Big Data 2018. Updated version provides
details of distributed sketching and highlights other advantages of
OverSketc
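The sketching idea behind approximate matrix multiplication can be illustrated with a minimal example. This is only a sketch under stated assumptions, not the OverSketch algorithm itself: it uses plain Gaussian sketch blocks, and the function name `sketched_matmul` and its parameters are hypothetical. Each block stands for the work of one serverless worker, and skipping `n_drop` blocks mimics ignoring stragglers while the estimate stays unbiased after rescaling.

```python
import numpy as np


def sketched_matmul(A, B, sketch_dim, block, n_drop=0, seed=0):
    """Approximate A @ B from independent Gaussian sketch blocks.

    Illustrative assumption: each block S (n x block) has i.i.d. N(0, 1)
    entries, so E[S @ S.T] = block * I and the rescaled sum is an
    unbiased estimate of A @ B.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    n_blocks = sketch_dim // block
    kept = n_blocks - n_drop  # blocks whose results actually arrive
    if kept < 1:
        raise ValueError("at least one sketch block must be kept")

    total = np.zeros((A.shape[0], B.shape[1]))
    for _ in range(kept):
        S = rng.standard_normal((n, block))
        # One worker's task: multiply the two sketched factors.
        total += (A @ S) @ (S.T @ B)

    # Rescale by the sketch dimension that was actually used.
    return total / (kept * block)
```

Calling `sketched_matmul(A, A.T, sketch_dim=2000, block=100, n_drop=2)` on a 20 x 500 matrix `A` uses 18 of the 20 blocks; the error shrinks as more blocks are kept.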
Function-as-a-Service Performance Evaluation: A Multivocal Literature Review
Function-as-a-Service (FaaS) is one form of the serverless cloud computing
paradigm and is defined through FaaS platforms (e.g., AWS Lambda) executing
event-triggered code snippets (i.e., functions). Many studies that empirically
evaluate the performance of such FaaS platforms have started to appear but we
are currently lacking a comprehensive understanding of the overall domain. To
address this gap, we conducted a multivocal literature review (MLR) covering
112 studies from academic (51) and grey (61) literature. We find that existing
work mainly studies the AWS Lambda platform and focuses on micro-benchmarks
using simple functions to measure CPU speed and FaaS platform overhead (i.e.,
container cold starts). Further, we discover a mismatch between academic and
industrial sources on tested platform configurations, find that function
triggers remain insufficiently studied, and identify HTTP API gateways and
cloud storage as the most used external service integrations. Following
existing guidelines on experimentation in cloud systems, we discover many flaws
threatening the reproducibility of experiments presented in the surveyed
studies. We conclude with a discussion of gaps in literature and highlight
methodological suggestions that may serve to improve future FaaS performance
evaluation studies.
Comment: improvements including postprint update
Randomized Polar Codes for Anytime Distributed Machine Learning
We present a novel distributed computing framework that is robust to slow
compute nodes, and is capable of both approximate and exact computation of
linear operations. The proposed mechanism integrates the concepts of randomized
sketching and polar codes in the context of coded computation. We propose a
sequential decoding algorithm designed to handle real-valued data while
maintaining low computational complexity for recovery. Additionally, we provide
an anytime estimator that can generate provably accurate estimates even when
the set of available node outputs is not decodable. We demonstrate the
potential applications of this framework in various contexts, such as
large-scale matrix multiplication and black-box optimization. We present the
implementation of these methods on a serverless cloud computing system and
provide numerical results to demonstrate their scalability in practice,
including ImageNet-scale computations.
Rateless Codes for Near-Perfect Load Balancing in Distributed Matrix-Vector Multiplication
Large-scale machine learning and data mining applications require computer
systems to perform massive matrix-vector and matrix-matrix multiplication
operations that need to be parallelized across multiple nodes. The presence of
straggling nodes -- computing nodes that unpredictably slow down or fail -- is a
major bottleneck in such distributed computations. Ideal load balancing
strategies that dynamically allocate more tasks to faster nodes require
knowledge or monitoring of node speeds as well as the ability to quickly move
data. Recently proposed fixed-rate erasure coding strategies can handle
unpredictable node slowdown, but they ignore partial work done by straggling
nodes, thus resulting in a lot of redundant computation. We propose a
rateless fountain coding strategy that achieves the best of both worlds
-- we prove that its latency is asymptotically equal to ideal load balancing,
and it performs asymptotically zero redundant computations. Our idea is to
create linear combinations of the rows of the matrix and assign these
encoded rows to different worker nodes. The original matrix-vector product can
be decoded as soon as the nodes have collectively finished slightly more
row-vector products than the matrix has rows. We conduct experiments in three computing
environments: local parallel computing, Amazon EC2, and Amazon Lambda, which
show that rateless coding gives a speed-up over uncoded
schemes.
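The encode-then-decode idea in this abstract can be sketched as follows. This is a simplified stand-in under stated assumptions, not the paper's scheme: it uses dense Gaussian linear combinations decoded by least squares rather than a rateless fountain code, and the helper names `encode_rows` and `decode_product` are hypothetical. It only illustrates that the product is recoverable from whichever encoded row-vector products finish first, once there are slightly more of them than the matrix has rows.

```python
import numpy as np


def encode_rows(A, n_encoded, rng):
    """Return the coding matrix G and the encoded rows G @ A.

    Assumption for illustration: G has i.i.d. Gaussian entries.
    """
    G = rng.standard_normal((n_encoded, A.shape[0]))
    return G, G @ A


def decode_product(G_done, z_done):
    """Recover y = A @ x from the finished products z_done = G_done @ y."""
    y, *_ = np.linalg.lstsq(G_done, z_done, rcond=None)
    return y


rng = np.random.default_rng(0)
m, n = 30, 8
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)

# Encode into twice as many rows as the matrix has.
G, A_enc = encode_rows(A, 2 * m, rng)

# Simulate stragglers: only a random subset of m + 3 products finishes.
done = rng.permutation(2 * m)[: m + 3]
z_done = A_enc[done] @ x

y = decode_product(G[done], z_done)
```

Least-squares decoding costs far more than the decoding a fountain code allows, which is one reason a practical scheme would not use dense combinations like these.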
Serverless Computing for Scientific Applications
Serverless computing has become an important model in cloud computing and
influenced the design of many applications. Here, we provide our perspective on
what the recent landscape of serverless computing for scientific applications
looks like. We discuss the advantages and problems with serverless computing
for scientific applications, and based on the analysis of existing solutions
and approaches, we propose a science-oriented architecture for a serverless
computing framework that is based on the existing designs. Finally, we provide
an outlook on current trends and future directions.