7,869 research outputs found
Computing in the RAIN: a reliable array of independent nodes
The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN-technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology
The Raincore API for clusters of networking elements
Clustering technology offers a way to increase overall reliability and performance of Internet information flow by strengthening one link in the chain without adding others. We have implemented this technology in a distributed computing architecture for network elements. The architecture, called Raincore, originated in the Reliable Array of Independent Nodes, or RAIN, research collaboration between the California Institute of Technology and the US National Aeronautics and Space Agency's Jet Propulsion Laboratory. The RAIN project focused on developing high-performance, fault-tolerant, portable clustering technology for spaceborne computing . The technology that emerged from this project became the basis for a spinoff company, Rainfinity, which has the exclusive intellectual property rights to the RAIN technology. The authors describe the Raincore conceptual architecture and distributed services, which are designed to make it easy for developers to port their applications to run on top of a cluster of networking elements. We include two applications: a Web server prototype that was part of the original RAIN research project and a commercial firewall cluster product from Rainfinity
Fault-Tolerant Spanners: Better and Simpler
A natural requirement of many distributed structures is fault-tolerance:
after some failures, whatever remains from the structure should still be
effective for whatever remains from the network. In this paper we examine
spanners of general graphs that are tolerant to vertex failures, and
significantly improve their dependence on the number of faults , for all
stretch bounds.
For stretch we design a simple transformation that converts every
-spanner construction with at most edges into an -fault-tolerant
-spanner construction with at most edges.
Applying this to standard greedy spanner constructions gives -fault tolerant
-spanners with edges. The previous
construction by Chechik, Langberg, Peleg, and Roddity [STOC 2009] depends
similarly on but exponentially on (approximately like ).
For the case and unit-length edges, an -approximation
algorithm is known from recent work of Dinitz and Krauthgamer [arXiv 2010],
where several spanner results are obtained using a common approach of rounding
a natural flow-based linear programming relaxation. Here we use a different
(stronger) LP relaxation and improve the approximation ratio to ,
which is, notably, independent of the number of faults . We further
strengthen this bound in terms of the maximum degree by using the \Lovasz Local
Lemma.
Finally, we show that most of our constructions are inherently local by
designing equivalent distributed algorithms in the LOCAL model of distributed
computation.Comment: 17 page
The Raincore Distributed Session Service for Networking Elements
Motivated by the explosive growth of the Internet, we study efficient and fault-tolerant distributed session layer
protocols for networking elements. These protocols are
designed to enable a network cluster to share the state
information necessary for balancing network traffic and
computation load among a group of networking elements.
In addition, in the presence of failures, they allow
network traffic to fail-over from failed networking
elements to healthy ones. To maximize the overall
network throughput of the networking cluster, we assume a unicast communication medium for these protocols. The Raincore Distributed Session Service is based on a fault-tolerant token protocol, and provides group membership, reliable multicast and mutual exclusion services in a networking environment. We show that this service provides atomic reliable multicast with consistent ordering. We also show that Raincore token protocol consumes less overhead than a broadcast-based protocol in this environment in terms of CPU task-switching. The Raincore technology was transferred to Rainfinity, a startup company that is focusing on software for Internet reliability and performance. Rainwall, Rainfinity’s first product, was developed using the Raincore Distributed Session Service. We present initial performance results of the Rainwall product that validates our design assumptions and goals
Checkpointing as a Service in Heterogeneous Cloud Environments
A non-invasive, cloud-agnostic approach is demonstrated for extending
existing cloud platforms to include checkpoint-restart capability. Most cloud
platforms currently rely on each application to provide its own fault
tolerance. A uniform mechanism within the cloud itself serves two purposes: (a)
direct support for long-running jobs, which would otherwise require a custom
fault-tolerant mechanism for each application; and (b) the administrative
capability to manage an over-subscribed cloud by temporarily swapping out jobs
when higher priority jobs arrive. An advantage of this uniform approach is that
it also supports parallel and distributed computations, over both TCP and
InfiniBand, thus allowing traditional HPC applications to take advantage of an
existing cloud infrastructure. Additionally, an integrated health-monitoring
mechanism detects when long-running jobs either fail or incur exceptionally low
performance, perhaps due to resource starvation, and proactively suspends the
job. The cloud-agnostic feature is demonstrated by applying the implementation
to two very different cloud platforms: Snooze and OpenStack. The use of a
cloud-agnostic architecture also enables, for the first time, migration of
applications from one cloud platform to another.Comment: 20 pages, 11 figures, appears in CCGrid, 201
Optimal Gossip with Direct Addressing
Gossip algorithms spread information by having nodes repeatedly forward
information to a few random contacts. By their very nature, gossip algorithms
tend to be distributed and fault tolerant. If done right, they can also be fast
and message-efficient. A common model for gossip communication is the random
phone call model, in which in each synchronous round each node can PUSH or PULL
information to or from a random other node. For example, Karp et al. [FOCS
2000] gave algorithms in this model that spread a message to all nodes in
rounds while sending only messages per node
on average.
Recently, Avin and Els\"asser [DISC 2013], studied the random phone call
model with the natural and commonly used assumption of direct addressing.
Direct addressing allows nodes to directly contact nodes whose ID (e.g., IP
address) was learned before. They show that in this setting, one can "break the
barrier" and achieve a gossip algorithm running in
rounds, albeit while using messages per node.
We study the same model and give a simple gossip algorithm which spreads a
message in only rounds. We also prove a matching lower bound which shows that this running time is best possible. In
particular we show that any gossip algorithm takes with high probability at
least rounds to terminate. Lastly, our algorithm can be
tweaked to send only messages per node on average with only
bits per message. Our algorithm therefore simultaneously achieves the optimal
round-, message-, and bit-complexity for this setting. As all prior gossip
algorithms, our algorithm is also robust against failures. In particular, if in
the beginning an oblivious adversary fails any nodes our algorithm still,
with high probability, informs all but surviving nodes
- …