Resource efficient redundancy using quorum-based cycle routing in optical networks
In this paper we propose a cycle redundancy technique that provides optical
networks with almost fault-tolerant point-to-point and multipoint-to-multipoint
communications. More importantly, the technique is shown to approximately halve
the necessary light-trail resources in the network while maintaining the
fault-tolerance and dependability expected from cycle-based routing. For
efficiency and distributed control, it is common in distributed systems and
algorithms to group nodes into intersecting sets referred to as quorum sets.
Optimal communication quorum sets forming optical cycles based on light-trails
have been shown to flexibly and efficiently route both point-to-point and
multipoint-to-multipoint traffic requests. Cycle routing techniques commonly
use pairs of cycles to achieve both routing and fault-tolerance, which
uses substantial resources and creates the potential for underutilization.
Instead, we intentionally utilize redundancy within the quorum cycles for
fault-tolerance such that almost every point-to-point communication occurs in
more than one cycle. The result is a set of cycles with 96.60% - 99.37% fault
coverage, while using 42.9% - 47.18% fewer resources.
Comment: 17th International Conference on Transparent Optical Networks
(ICTON), 5-9 July 2015. arXiv admin note: substantial text overlap with
arXiv:1608.05172, arXiv:1608.0516
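The quorum-set idea in the abstract above (nodes grouped into intersecting sets, with each pair of nodes sharing one or more sets) can be illustrated with a small sketch. It assumes a cyclic construction: one base set is shifted modulo the node count to produce one quorum per node. The base sets, the 7-node size, and all function names are illustrative choices, not taken from the paper, and the sketch ignores the optical layer (light-trails, cycle routing) entirely.

```python
from itertools import combinations

def cyclic_quorums(base, n):
    """Return the n cyclic shifts of `base` modulo n, one quorum per node."""
    return [frozenset((b + i) % n for b in base) for i in range(n)]

def all_quorums_intersect(quorums):
    """True if every two quorums share at least one node."""
    return all(q1 & q2 for q1, q2 in combinations(quorums, 2))

def pair_coverage(quorums, n):
    """Count, for each unordered node pair, how many quorums contain both nodes."""
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for q in quorums:
        for pair in combinations(sorted(q), 2):
            counts[pair] += 1
    return counts

N = 7  # illustrative network size (assumption)

# Base set (0, 1, 3): every node pair shares exactly one quorum.
quorums = cyclic_quorums((0, 1, 3), N)
coverage = pair_coverage(quorums, N)

# Larger base set (0, 1, 2, 4): every node pair shares two quorums,
# so losing one quorum still leaves each pair connected through another.
redundant = pair_coverage(cyclic_quorums((0, 1, 2, 4), N), N)
```

With the smaller base set each pair is covered once, so a single lost quorum disconnects some pairs; the larger base set covers each pair twice, which is the kind of built-in redundancy the abstract describes exploiting instead of pairing every cycle with a backup.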
The Raincore API for clusters of networking elements
Clustering technology offers a way to increase overall reliability and performance of Internet information flow by strengthening one link in the chain without adding others. We have implemented this technology in a distributed computing architecture for network elements. The architecture, called Raincore, originated in the Reliable Array of Independent Nodes, or RAIN, research collaboration between the California Institute of Technology and the US National Aeronautics and Space Administration's Jet Propulsion Laboratory. The RAIN project focused on developing high-performance, fault-tolerant, portable clustering technology for spaceborne computing. The technology that emerged from this project became the basis for a spinoff company, Rainfinity, which has the exclusive intellectual property rights to the RAIN technology. The authors describe the Raincore conceptual architecture and distributed services, which are designed to make it easy for developers to port their applications to run on top of a cluster of networking elements. We include two applications: a Web server prototype that was part of the original RAIN research project and a commercial firewall cluster product from Rainfinity.
The Raincore Distributed Session Service for Networking Elements
Motivated by the explosive growth of the Internet, we study efficient and fault-tolerant distributed session layer
protocols for networking elements. These protocols are
designed to enable a network cluster to share the state
information necessary for balancing network traffic and
computation load among a group of networking elements.
In addition, in the presence of failures, they allow
network traffic to fail-over from failed networking
elements to healthy ones. To maximize the overall
network throughput of the networking cluster, we assume a unicast communication medium for these protocols. The Raincore Distributed Session Service is based on a fault-tolerant token protocol, and provides group membership, reliable multicast and mutual exclusion services in a networking environment. We show that this service provides atomic reliable multicast with consistent ordering. We also show that the Raincore token protocol consumes less overhead than a broadcast-based protocol in this environment in terms of CPU task-switching. The Raincore technology was transferred to Rainfinity, a startup company that is focusing on software for Internet reliability and performance. Rainwall, Rainfinity’s first product, was developed using the Raincore Distributed Session Service. We present initial performance results of the Rainwall product that validate our design assumptions and goals.
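As a rough illustration of how a circulating token can give mutual exclusion, consistently ordered multicast, and simple failover at once, here is a toy in-memory sketch. It is not the Raincore protocol: the `TokenRing` class, its methods, and the failure-handling rule are all assumptions made for illustration, and real message passing, timeouts, and token-loss detection are left out.

```python
class TokenRing:
    """Toy token ring: only the token holder may multicast (hypothetical model)."""

    def __init__(self, node_ids):
        self.members = list(node_ids)   # current group membership, in ring order
        self.holder = self.members[0]   # node currently holding the token
        self.seq = 0                    # sequence number carried by the token
        self.delivered = {n: [] for n in self.members}  # per-node delivery log

    def multicast(self, sender, payload):
        """Deliver payload to all current members if sender holds the token."""
        if sender != self.holder:
            raise PermissionError(f"{sender} does not hold the token")
        self.seq += 1
        for n in self.members:
            self.delivered[n].append((self.seq, sender, payload))

    def pass_token(self):
        """Hand the token to the next member in ring order."""
        i = self.members.index(self.holder)
        self.holder = self.members[(i + 1) % len(self.members)]

    def fail(self, node):
        """Drop a failed node; if it held the token, its successor takes over.

        Assumes at least one member survives the failure.
        """
        i = self.members.index(node)
        self.members.remove(node)
        if node == self.holder:
            self.holder = self.members[i % len(self.members)]


ring = TokenRing(["a", "b", "c"])
ring.multicast("a", "m1")
ring.pass_token()           # token moves a -> b
ring.fail("b")              # b fails while holding the token; c takes over
ring.multicast("c", "m2")
```

Because the sequence number travels with the token, the surviving nodes `a` and `c` end up with identical delivery logs, which is the consistent-ordering property in miniature.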
HT-Paxos: High Throughput State-Machine Replication Protocol for Large Clustered Data Centers
Paxos is a prominent theory of state machine replication. Recent data
intensive systems that implement state machine replication generally require
high throughput. Earlier versions of Paxos, such as classical Paxos,
Fast Paxos and Generalized Paxos, focus mainly on fault tolerance and
latency but fall short in throughput and scalability. A major reason for
this is the heavyweight leader. By offloading the leader, we can further
increase the throughput of the system. Ring Paxos, Multi-Ring Paxos and S-Paxos are
a few prominent attempts in this direction for clustered data centers. In this
paper, we propose HT-Paxos, a variant of Paxos that is best suited
to any large clustered data center. HT-Paxos offloads the
leader significantly further and hence increases the throughput and scalability of
the system. At the same time, among high-throughput state-machine
replication protocols, HT-Paxos provides reasonably low latency and response
time.
Exploiting the Synergy Between Gossiping and Structured Overlays
In this position paper we argue for exploiting the synergy between gossip-based algorithms and structured overlay networks (SON). These two strands of research have both aimed at building fault-tolerant, dynamic, self-managing, and large-scale distributed systems. Despite the common goals, the two areas have, however, been relatively isolated. We focus on three problem domains where there is an untapped potential of using gossiping combined with SONs. We argue for applying gossip-based membership for ring-based SONs, such as Chord and Bamboo, to make them handle partition mergers and loopy networks. We argue that small world SONs, such as Accordion and Mercury, are especially well-suited for gossip-based membership management. The benefits would be better graph-theoretic properties. Finally, we argue that gossip-based algorithms could use the overlay constructed by SONs. For example, many unreliable broadcast algorithms for SONs could be augmented with anti-entropy protocols. Similarly, gossip-based aggregation could be used in SONs for network size estimation and load-balancing purposes.
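The network size estimation mentioned at the end of this abstract can be sketched with averaging gossip: one node starts with the value 1, all others with 0, and repeated pairwise averaging drives every value toward 1/N, so each node can estimate N as the reciprocal of its own value. The sketch below is a minimal simulation under stated assumptions: peers are picked uniformly at random from all nodes (a real deployment would more plausibly pick them from the overlay's neighbor tables), and the function names, node count, and round count are illustrative.

```python
import random

def gossip_average(n, rounds, seed=0):
    """Simulate push-pull averaging gossip among n nodes; return final values."""
    rng = random.Random(seed)
    values = [0.0] * n
    values[0] = 1.0                      # a single node seeds the total mass of 1
    for _ in range(rounds):
        for i in range(n):
            # Pick a random peer j != i (uniform sampling is an assumption).
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Both nodes replace their values with the pair's average;
            # the sum over all nodes stays constant.
            avg = (values[i] + values[j]) / 2.0
            values[i] = values[j] = avg
    return values

def size_estimates(values):
    """Each node's estimate of the network size: the reciprocal of its value."""
    return [1.0 / v if v > 0 else float("inf") for v in values]

NODES = 32                               # illustrative network size
values = gossip_average(NODES, rounds=60)
estimates = size_estimates(values)
```

The exchange never creates or destroys mass, so the values always sum to 1; convergence speed depends on how peers are sampled, which is where the choice of overlay would matter.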
Programming with process groups: Group and multicast semantics
Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects
Unidirectional Quorum-based Cycle Planning for Efficient Resource Utilization and Fault-Tolerance
In this paper, we propose a greedy cycle direction heuristic to improve the
generalized redundancy quorum cycle technique. When applied using
only single cycles rather than the standard paired cycles, the generalized
redundancy technique has been shown to almost halve the necessary
light-trail resources in the network. Our greedy heuristic improves this
cycle-based routing technique's fault-tolerance and dependability.
For efficiency and distributed control, it is common in distributed systems
and algorithms to group nodes into intersecting sets referred to as quorum
sets. Optimal communication quorum sets forming optical cycles based on
light-trails have been shown to flexibly and efficiently route both
point-to-point and multipoint-to-multipoint traffic requests. Cycle
routing techniques commonly use pairs of cycles to achieve both routing and
fault-tolerance, which uses substantial resources and creates the potential for
underutilization. Instead, we use a single cycle and intentionally utilize
redundancy within the quorum cycles such that every point-to-point
communication pairs occur in at least cycles. Without the paired
cycles the direction of the quorum cycles becomes critical to the fault
tolerance performance. For this we developed a greedy cycle direction heuristic
and our single fault network simulations show a reduction of missing pairs by
greater than 30%, which translates to significant improvements in fault
coverage.
Comment: Computer Communication and Networks (ICCCN), 2016 25th International
Conference on. arXiv admin note: substantial text overlap with
arXiv:1608.05172, arXiv:1608.05168, arXiv:1608.0517