Dependability in Aggregation by Averaging
Aggregation is an important building block of modern distributed
applications, allowing the determination of meaningful properties (e.g. network
size, total storage capacity, average load, majorities, etc.) that are used to
direct the execution of the system. However, the majority of the existing
aggregation algorithms exhibit significant dependability issues when considered
for use in real application environments. In this paper, we reveal some
dependability issues of aggregation algorithms based on iterative averaging
techniques, giving some directions to solve them. This class of algorithms is
considered robust (when compared to common tree-based approaches), being
independent of the routing topology used and providing an aggregation result
at all nodes. However, their robustness is strongly challenged and their
correctness often compromised, when changing the assumptions of their working
environment to more realistic ones. The correctness of this class of algorithms
relies on the maintenance of a fundamental invariant, commonly designated as
"mass conservation". We will argue that this main invariant is often broken in
practical settings, and that additional mechanisms and modifications are
required to maintain it, incurring some degradation of the algorithms'
performance. In particular, we discuss the behavior of three representative
algorithms (Push-Sum Protocol, Push-Pull Gossip protocol and Distributed Random
Grouping) under asynchronous and faulty (with message loss and node crashes)
environments. More specifically, we propose and evaluate two new versions of
the Push-Pull Gossip protocol, which solve its message interleaving problem
(evidenced even in a synchronous operation mode). Comment: 14 pages. Presented in Inforum 200
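The "mass conservation" invariant mentioned in this abstract can be illustrated with a minimal sketch of a synchronous push-sum style averaging round. This is not the authors' code: the function name `push_sum`, the `loss_prob` parameter and the uniform random peer selection are assumptions made for illustration. With no message loss the total of all local sums stays equal to the initial total; once messages are dropped, mass disappears and the invariant is broken.

```python
import random

def push_sum(values, rounds, loss_prob=0.0, seed=0):
    """Illustrative synchronous push-sum: each node holds a (sum, weight) pair."""
    rng = random.Random(seed)
    n = len(values)
    sums = list(values)
    weights = [1.0] * n
    for _ in range(rounds):
        next_sums = [0.0] * n
        next_weights = [0.0] * n
        for i in range(n):
            half_sum, half_weight = sums[i] / 2, weights[i] / 2
            # Each node keeps one half of its pair for itself.
            next_sums[i] += half_sum
            next_weights[i] += half_weight
            # The other half is pushed to a uniformly random peer (assumption).
            j = rng.randrange(n)
            if rng.random() >= loss_prob:
                next_sums[j] += half_sum
                next_weights[j] += half_weight
            # Otherwise the message is lost and its mass vanishes.
        sums, weights = next_sums, next_weights
    return sums, weights

values = [10.0, 20.0, 30.0, 40.0]
reliable_sums, reliable_weights = push_sum(values, rounds=50)
lossy_sums, lossy_weights = push_sum(values, rounds=50, loss_prob=0.5)
print("total mass, reliable:", sum(reliable_sums))
print("total mass, lossy:   ", sum(lossy_sums))
```

Each node's estimate of the average is `sums[i] / weights[i]`; the sketch only demonstrates the invariant, not the recovery mechanisms the paper proposes.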
Solving the At-Most-Once Problem with Nearly Optimal Effectiveness
We present and analyze a wait-free deterministic algorithm for solving the
at-most-once problem: how m fail-prone shared-memory processes can
asynchronously perform n jobs, each at most once. Our algorithmic strategy provides for the
first time nearly optimal effectiveness, which is a measure that expresses the
total number of jobs completed in the worst case. The effectiveness of our
algorithm equals n-2m+2. This is within an additive term of m of the
known effectiveness upper bound n-m+1 over all possible algorithms and improves
on the previously best known deterministic solutions that have effectiveness
only n-log m o(n). We also present an iterative version of our algorithm that
for any is both
effectiveness-optimal and work-optimal, for any constant . We
then employ this algorithm to provide a new algorithmic solution for the
Write-All problem which is work optimal for any
. Comment: Updated Version. A Brief Announcement was published in PODC 2011. An
Extended Abstract was published in the proceedings of ICDCN 2012. A full
version was published in Theoretical Computer Science, Volume 496, 22 July
2013, Pages 69 - 8
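As a small worked check of the two effectiveness expressions quoted in this abstract, the sketch below evaluates n-2m+2 and the upper bound n-m+1 for sample values. The function names and the sample values of n and m are hypothetical and chosen only for illustration.

```python
def achieved_effectiveness(n, m):
    """Effectiveness stated in the abstract for the presented algorithm."""
    return n - 2 * m + 2

def effectiveness_upper_bound(n, m):
    """Known upper bound on effectiveness stated in the abstract."""
    return n - m + 1

n, m = 1000, 10  # sample values, chosen only for illustration
gap = effectiveness_upper_bound(n, m) - achieved_effectiveness(n, m)
print(achieved_effectiveness(n, m), effectiveness_upper_bound(n, m), gap)
```

The difference between the two expressions is always m-1, which is consistent with the claim that the algorithm is within an additive term of m of the bound.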
Doing-it-All with Bounded Work and Communication
We consider the Do-All problem, where cooperating processors need to
complete similar and independent tasks in an adversarial setting. Here we
deal with a synchronous message passing system with processors that are subject
to crash failures. Efficiency of algorithms in this setting is measured in
terms of work complexity (also known as total available processor steps) and
communication complexity (total number of point-to-point messages). When work
and communication are considered to be comparable resources, then the overall
efficiency is meaningfully expressed in terms of effort defined as work +
communication. We develop and analyze a constructive algorithm that has work
and a nonconstructive
algorithm that has work . The latter result is close to the
lower bound on work. The effort of each of
these algorithms is proportional to its work when the number of crashes is
bounded above by , for some positive constant . We also present a
nonconstructive algorithm that has effort
Distributed Computing in the Asynchronous LOCAL model
The LOCAL model is among the main models for studying locality in the
framework of distributed network computing. This model is however subject to
pertinent criticisms, including the facts that all nodes wake up
simultaneously, operate in lockstep, and are failure-free. We show that
relaxing these hypotheses to some extent does not hurt local computing. In
particular, we show that, for any construction task associated with a locally
checkable labeling (LCL), if is solvable in rounds in the LOCAL model,
then remains solvable in rounds in the asynchronous LOCAL model.
This improves the result by Castañeda et al. [SSS 2016], which was restricted
to 3-coloring the rings. More generally, the main contribution of this paper is
to show that, perhaps surprisingly, asynchrony and failures in the computations
do not restrict the power of the LOCAL model, as long as the communications
remain synchronous and failure-free.
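To make the notion of a locally checkable labeling concrete, here is a minimal sketch for the 3-coloring of a ring mentioned above. Each node inspects only its own color and those of its two neighbors. The function names and the list-based ring representation are assumptions for illustration, and the sketch covers only the checking side, not the asynchronous construction studied in the paper.

```python
def locally_valid(colors, i):
    """Radius-1 check at node i of a ring: legal color, different from both neighbors."""
    n = len(colors)
    left = colors[(i - 1) % n]
    right = colors[(i + 1) % n]
    return colors[i] in (0, 1, 2) and colors[i] != left and colors[i] != right

def is_valid_3_coloring(colors):
    """A labeling is accepted only if every node's local check passes."""
    return all(locally_valid(colors, i) for i in range(len(colors)))

print(is_valid_3_coloring([0, 1, 0, 1, 2]))  # proper coloring of a 5-node ring
print(is_valid_3_coloring([0, 1, 0, 1, 0]))  # nodes 4 and 0 are adjacent and share color 0
```

The sketch assumes a ring with at least three nodes, given as a list where positions i and i+1 (modulo the length) are adjacent.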
Evaluation and Analysis of Distributed Graph-Parallel Processing Frameworks
A number of graph-parallel processing frameworks have been proposed in recent years to address the needs of processing complex, large-scale graph-structured datasets. Although significant performance improvements were reported for those frameworks, the comparative advantages of each framework over the others have not been fully studied, which impedes the best use of those frameworks for a specific graph computing task and setting. In this work, we conducted a systematic comparison study of parallel processing systems for large-scale graph computations, aiming to reveal the characteristics of those systems when running common graph algorithms on real-world datasets under the same conditions. We selected three popular graph-parallel processing frameworks (Giraph, GPS and GraphLab) for the study and also included a representative general data-parallel computing system, Spark, in the comparison in order to understand how well a general data-parallel system can run graph problems. We applied basic performance metrics measuring speed, resource utilization, and scalability to answer the basic question of which graph-parallel processing platform is better suited to which applications and datasets. Three widely used graph algorithms (clustering coefficient, shortest path length, and PageRank score) were used for benchmarking on the targeted computing systems. We ran those algorithms against three real-world network datasets with diverse characteristics and scales on a research cluster and obtained a number of interesting observations. For instance, all evaluated systems showed poor scalability (i.e., the runtime increases with more computing nodes) on small datasets, likely due to communication overhead. Further, out of the evaluated graph-parallel computing platforms, PowerGraph consistently exhibits better performance than others.
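For readers unfamiliar with the benchmark workloads, the following is a minimal single-machine sketch of PageRank by power iteration. It is not how Giraph, GPS, GraphLab or Spark implement it. The function name, the edge-list input, the damping factor of 0.85 and the even redistribution of rank from nodes without out-links are all assumptions made for illustration.

```python
def pagerank(edges, num_nodes, damping=0.85, iterations=50):
    """Illustrative PageRank by power iteration over a directed edge list."""
    out_links = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        out_links[src].append(dst)

    rank = [1.0 / num_nodes] * num_nodes
    for _ in range(iterations):
        # Every node starts the round with the teleportation share.
        new_rank = [(1.0 - damping) / num_nodes] * num_nodes
        for src, targets in enumerate(out_links):
            if targets:
                # Split this node's damped rank evenly among its out-links.
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Node without out-links: spread its rank over all nodes (assumption).
                share = damping * rank[src] / num_nodes
                for dst in range(num_nodes):
                    new_rank[dst] += share
        rank = new_rank
    return rank

cycle_ranks = pagerank([(0, 1), (1, 2), (2, 0)], num_nodes=3)
print(cycle_ranks)
```

In a graph-parallel framework the inner loop would typically be expressed as per-vertex message passing, which is where the communication overhead noted in the abstract arises.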