    The Raincore Distributed Session Service for Networking Elements

    Motivated by the explosive growth of the Internet, we study efficient and fault-tolerant distributed session layer protocols for networking elements. These protocols are designed to enable a network cluster to share the state information necessary for balancing network traffic and computation load among a group of networking elements. In addition, in the presence of failures, they allow network traffic to fail over from failed networking elements to healthy ones. To maximize the overall network throughput of the networking cluster, we assume a unicast communication medium for these protocols. The Raincore Distributed Session Service is based on a fault-tolerant token protocol, and provides group membership, reliable multicast and mutual exclusion services in a networking environment. We show that this service provides atomic reliable multicast with consistent ordering. We also show that the Raincore token protocol incurs less CPU task-switching overhead than a broadcast-based protocol in this environment. The Raincore technology was transferred to Rainfinity, a startup company that is focusing on software for Internet reliability and performance. Rainwall, Rainfinity’s first product, was developed using the Raincore Distributed Session Service. We present initial performance results of the Rainwall product that validate our design assumptions and goals.

    Issues in designing transport layer multicast facilities

    Multicasting denotes a facility in a communications system for providing efficient delivery from a message's source to some well-defined set of locations using a single logical address. While modern network hardware supports multidestination delivery, first generation Transport Layer protocols (e.g., the DoD Transmission Control Protocol (TCP) (15) and ISO TP-4 (41)) did not anticipate the changes over the past decade in underlying network hardware, transmission speeds, and communication patterns that have enabled and driven the interest in reliable multicast. Much recent research has focused on integrating the underlying hardware multicast capability with the reliable services of Transport Layer protocols. Here, we explore the communication issues surrounding the design of such a reliable multicast mechanism. Approaches and solutions from the literature are discussed, and four experimental Transport Layer protocols that incorporate reliable multicast are examined.

    Scalability of broadcast performance in wireless network-on-chip

    Networks-on-Chip (NoCs) are currently the paradigm of choice to interconnect the cores of a chip multiprocessor. However, conventional NoCs may not suffice to fulfill the on-chip communication requirements of processors with hundreds or thousands of cores. The main reason is that the performance of such networks drops as the number of cores grows, especially in the presence of multicast and broadcast traffic. This not only limits the scalability of current multiprocessor architectures, but also sets a performance wall that prevents the development of architectures that generate moderate-to-high levels of multicast. In this paper, a Wireless Network-on-Chip (WNoC) where all cores share a single broadband channel is presented. Such a design is conceived to provide low latency and ordered delivery for multicast/broadcast traffic, in an attempt to complement a wireline NoC that will transport the rest of the communication flows. To assess the feasibility of this approach, the network performance of WNoC is analyzed as a function of the system size and the channel capacity, and then compared to that of wireline NoCs with embedded multicast support. Based on this evaluation, preliminary results on the potential performance of the proposed hybrid scheme are provided, together with guidelines for the design of MAC protocols for WNoC.

    The multidriver: A reliable multicast service using the Xpress Transfer Protocol

    A reliable multicast facility extends traditional point-to-point virtual circuit reliability to one-to-many communication. Such services can provide more efficient use of network resources, a powerful distributed name binding capability, and reduced latency in multidestination message delivery. These benefits will be especially valuable in real-time environments where reliable multicast can enable new applications and increase the availability and the reliability of data and services. We present a unique multicast service that exploits features in the next-generation, real-time transfer layer protocol, the Xpress Transfer Protocol (XTP). In its reliable mode, the service offers error, flow, and rate-controlled multidestination delivery of arbitrary-sized messages, with provision for the coordination of reliable reverse channels. Performance measurements on a single-segment Proteon ProNET-4 4 Mbps 802.5 token ring with heterogeneous nodes are discussed

    A Survey of Fault-Tolerance and Fault-Recovery Techniques in Parallel Systems

    Supercomputing systems today often come in the form of large numbers of commodity systems linked together into a computing cluster. These systems, like any distributed system, can have large numbers of independent hardware components cooperating or collaborating on a computation. Unfortunately, any of this vast number of components can fail at any time, resulting in potentially erroneous output. In order to improve the robustness of supercomputing applications in the presence of failures, many techniques have been developed to provide resilience to these kinds of system faults. This survey provides an overview of these various fault-tolerance techniques. Comment: 11 pages

    Study of architecture and protocols for reliable multicasting in packet switching networks

    Group multicast protocols have been challenged to provide scalable solutions that meet the following requirements: (i) reliable delivery from different sources to all destinations within a multicast group; (ii) congestion control among multiple asynchronous sources. Although it is mainly a transport layer task, reliable group multicasting depends on routing architectures as well. This dissertation covers issues of both the network and transport layers. Two routing architectures, tree and ring, are surveyed with a comparative study of their routing costs and their impact on upper-layer performance. Correspondingly, two generic transport protocol models are established for performance study. The tree-based protocol is rate-based and uses negative acknowledgment mechanisms for reliability control, while the ring-based protocol uses window-based flow control and positive acknowledgment schemes. The major performance measures observed in the study are network cost, multicast delay, throughput and efficiency. The results suggest that the tree architecture costs less at the network layer than the ring, and helps to minimize latency under light network load. Meanwhile, reliable group multicasting under heavy load can benefit from the ring architecture, which facilitates window-based flow and congestion control. Based on the comparative study, a new two-level hybrid architecture, Rings Interconnected with Tree Architecture (RITA), is presented. Here, a multicast group is partitioned into multiple clusters, with the ring as the intra-cluster architecture and the tree as the backbone architecture that implements inter-cluster multicasting. To balance performance measures such as delay and throughput, reliability and congestion control are accomplished at the transport layer with a hybrid use of rate-based and window-based protocols, which rely on negative and positive feedback mechanisms, respectively. Performance is compared in simulation against the tree- and ring-based approaches. Results are encouraging because RITA achieves throughput similar to that of the ring-based protocol, but with significantly lower delay. Finally, the multicast tree packing problem is discussed. In a network accommodating multiple concurrent multicast sessions, routing for an individual session can be optimized to minimize the competition with other sessions, rather than to minimize cost or delay. A packing lower bound and a heuristic are investigated. Simulations show that congestion can be reduced effectively with a limited increase in routing cost.

    A reliable totally-ordered group multicast protocol for mobile Internet


    The Scalability of Multicast Communication

    Multicast is a communication method which operates on groups of applications. Having multiple instances of an application which are addressed collectively using a unique multicast address allows elegant solutions to some of the more intractable problems in distributed programming, such as providing fault tolerance. However, as multicast techniques are applied in areas such as distributed operating systems, where the operating system may span a large number of hosts, or on faster network architectures, where the problems of congestion reduce the effectiveness of the technique, the scalability of multicast must be addressed if multicast is to gain wider application. The main scalability issue was considered to be packet loss due to buffer overrun. The most common cause of this buffer overrun is the mismatch between the packet arrival rate and the packet consumption rate at the multicast originator, the so-called implosion problem. This issue affects positively acknowledged and transactional protocols. As these two techniques are the most common protocol designs, it was felt that an investigation into the problems of these types of protocol would be most effective. A model for implosion was developed and simulated in order to investigate the parameters of implosion. A measure of this implosion was derived from the data. This index of implosion allows the severity of implosion to be described, as well as the location of the implosion in the model. The implosion index was derived by dividing the rate at which buffers were occupied by the rate at which packets were generated by the model. The value may then be used to predict the number of buffers required given the number of packets expected. A number of techniques were developed which may be used to offset implosion, either by artificially increasing the inter-packet gap, or by distributing replies so that no one host receives enough packets to cause an implosion. Of these alternatives, the latter offers the most promise, although it requires a large effort to maintain the resulting hierarchical structure in the presence of multiple failures.
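
    The implosion index described in this abstract is a simple ratio, so it can be illustrated with a short sketch. The function names, the per-second units, and the rounding up to whole buffers are assumptions made here for illustration; the abstract gives only the ratio itself and says the value may be used to predict the number of buffers required.

```python
import math


def implosion_index(buffer_occupation_rate: float,
                    packet_generation_rate: float) -> float:
    """Ratio of the buffer occupation rate to the packet generation rate.

    Both rates are assumed to be in the same unit (e.g. per second).
    """
    if packet_generation_rate <= 0:
        raise ValueError("packet_generation_rate must be positive")
    return buffer_occupation_rate / packet_generation_rate


def predicted_buffers(index: float, expected_packets: int) -> int:
    """Estimate buffers needed for a given number of expected packets.

    Assumption: the estimate is index * packets, rounded up to a whole buffer.
    """
    return math.ceil(index * expected_packets)


# Hypothetical figures: 50 buffers occupied per second while
# 200 packets are generated per second.
index = implosion_index(50.0, 200.0)
buffers = predicted_buffers(index, 1000)
print(index, buffers)
```

    With these made-up rates, a larger index would indicate a more severe implosion at the point being measured.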

    Design, Implementation, and Verification of the Reliable Multicast Protocol

    This document describes the Reliable Multicast Protocol (RMP) design, first implementation, and formal verification. RMP provides a totally ordered, reliable, atomic multicast service on top of an unreliable multicast datagram service. RMP is fully and symmetrically distributed so that no site bears an undue portion of the communications load. RMP provides a wide range of guarantees, from unreliable delivery to totally ordered delivery, to K-resilient, majority resilient, and totally resilient atomic delivery. These guarantees are selectable on a per-message basis. RMP provides many communication options, including virtual synchrony, a publisher/subscriber model of message delivery, a client/server model of delivery, mutually exclusive handlers for messages, and mutually exclusive locks. It has been commonly believed that total ordering of messages can only be achieved at great performance expense. RMP discounts this. The first implementation of RMP has been shown to provide high throughput performance on Local Area Networks (LAN). For two or more destinations on a single LAN, RMP provides higher throughput than any other protocol that does not use multicast or broadcast technology. The design, implementation, and verification activities of RMP have occurred concurrently. This has allowed the verification to maintain a high fidelity between the design model, the implementation model, and the verification model. The restrictions of implementation have influenced the design earlier than in normal sequential approaches. The protocol as a whole has matured more smoothly through the inclusion of several different perspectives in the product development.