4,130 research outputs found
Computing in the RAIN: a reliable array of independent nodes
The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN-technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology
AIRNET: A real-time comunications network for aircraft
A real-time local area network was developed for use on aircraft and space vehicles. It uses token ring technology to provide high throughput, low latency, and high reliability. The system was implemented on PCs and PC/ATs operating on PCbus, and on Intel 8086/186/286/386s operating on Multibus. A standard IEEE 802.2 logical link control interface was provided to (optional) upper layer software; this permits the controls designer to utilize standard communications protocols (e.g., ISO, TCP/IP) if time permits, or to utilize a very fast link level protocol directly if speed is critical. Both unacknowledged datagram and reliable virtual circuit services are supported. A station operating an 8 MHz Intel 286 as a host can generate a sustained load of 1.8 megabits per second per station, and a 100-byte message can be delivered from the transmitter's user memory to the receiver's user memory, including all operating system and network overhead, in under 4 milliseconds
Performance of voice and video conferencing over ATM and gigabit ethernet backbone networks
Gigabit Ethernet and ATM network technologies have been modeled as campus network
backbones for the simulation-based comparison of their performance. Real-time voice and
video conferencing traffic is used to compare the performance of both backbone
technologies in terms of response times and packet end-to-end delays. Simulation results
show that Gigabit Ethernet has been able to perform the same and in some cases better than
ATM as a backbone network for video and voice conferencing providing network designers
with a cheaper solution to meet the growing needs of bandwidth-hungry applications in a
campus environment
Proof-of-Concept Application - Annual Report Year 1
In this document the Cat-COVITE Application for use in the CATNETS Project is introduced and motivated. Furthermore an introduction to the catallactic middleware and Web Services Agreement (WS-Agreement) concepts is given as a basis for the future work. Requirements for the application of Cat-COVITE with in catallactic systems are analysed. Finally the integration of the Cat-COVITE application and the catallactic middleware is described. --Grid Computing
XTP for the NASA space station
The NASA Space Station is a truly international effort; therefore, its communications systems must conform to established international standards. Thus, NASA is requiring that each network-interface unit implement a full suite of ISO protocols. However, NASA is understandably concerned that a full ISO stack will not deliver performance consistent with the real-time demands of Space Station control systems. Therefore, as a research project, the suitability of the Xpress transfer protocol (XTP) is investigated along side a full ISO stack. The initial plans for implementing XTP and comparing its performance to ISO TP4 are described
A Configurable Transport Layer for CAF
The message-driven nature of actors lays a foundation for developing scalable
and distributed software. While the actor itself has been thoroughly modeled,
the message passing layer lacks a common definition. Properties and guarantees
of message exchange often shift with implementations and contexts. This adds
complexity to the development process, limits portability, and removes
transparency from distributed actor systems.
In this work, we examine actor communication, focusing on the implementation
and runtime costs of reliable and ordered delivery. Both guarantees are often
based on TCP for remote messaging, which mixes network transport with the
semantics of messaging. However, the choice of transport may follow different
constraints and is often governed by deployment. As a first step towards
re-architecting actor-to-actor communication, we decouple the messaging
guarantees from the transport protocol. We validate our approach by redesigning
the network stack of the C++ Actor Framework (CAF) so that it allows to combine
an arbitrary transport protocol with additional functions for remote messaging.
An evaluation quantifies the cost of composability and the impact of individual
layers on the entire stack
Issues in designing transport layer multicast facilities
Multicasting denotes a facility in a communications system for providing efficient delivery from a message's source to some well-defined set of locations using a single logical address. While modem network hardware supports multidestination delivery, first generation Transport Layer protocols (e.g., the DoD Transmission Control Protocol (TCP) (15) and ISO TP-4 (41)) did not anticipate the changes over the past decade in underlying network hardware, transmission speeds, and communication patterns that have enabled and driven the interest in reliable multicast. Much recent research has focused on integrating the underlying hardware multicast capability with the reliable services of Transport Layer protocols. Here, we explore the communication issues surrounding the design of such a reliable multicast mechanism. Approaches and solutions from the literature are discussed, and four experimental Transport Layer protocols that incorporate reliable multicast are examined
Crux: Locality-Preserving Distributed Services
Distributed systems achieve scalability by distributing load across many
machines, but wide-area deployments can introduce worst-case response latencies
proportional to the network's diameter. Crux is a general framework to build
locality-preserving distributed systems, by transforming an existing scalable
distributed algorithm A into a new locality-preserving algorithm ALP, which
guarantees for any two clients u and v interacting via ALP that their
interactions exhibit worst-case response latencies proportional to the network
latency between u and v. Crux builds on compact-routing theory, but generalizes
these techniques beyond routing applications. Crux provides weak and strong
consistency flavors, and shows latency improvements for localized interactions
in both cases, specifically up to several orders of magnitude for
weakly-consistent Crux (from roughly 900ms to 1ms). We deployed on PlanetLab
locality-preserving versions of a Memcached distributed cache, a Bamboo
distributed hash table, and a Redis publish/subscribe. Our results indicate
that Crux is effective and applicable to a variety of existing distributed
algorithms.Comment: 11 figure
- …