
    Fabric-on-a-Chip: Toward Consolidating Packet Switching Functions on Silicon

    The switching capacity of an Internet router is often dictated by the memory bandwidth required to buffer arriving packets. With the demand for greater capacity and improved service provisioning, inherent memory bandwidth limitations are encountered, rendering input-queued (IQ) switches and combined input and output queued (CIOQ) architectures more practical. Output-queued (OQ) switches, on the other hand, offer several highly desirable performance characteristics, including minimal average packet delay, controllable Quality of Service (QoS) provisioning and work-conservation under any admissible traffic conditions. However, the memory bandwidth requirements of such systems are O(NR), where N denotes the number of ports and R the data rate of each port. Clearly, for high port densities and data rates, this constraint dramatically limits the scalability of the switch. In an effort to retain the desirable attributes of output-queued switches, while significantly reducing the memory bandwidth requirements, distributed shared memory architectures, such as the parallel shared memory (PSM) switch/router, have recently received much attention. The principal advantage of the PSM architecture is derived from the use of slow-running memory units operating in parallel to distribute the memory bandwidth requirement. At the core of the PSM architecture is a memory management algorithm that determines, for each arriving packet, the memory unit in which it will be placed. However, to date, the computational complexity of this algorithm is O(N), thereby limiting the scalability of PSM switches. In an effort to overcome these scalability limitations, the goal of this dissertation is to extend existing shared-memory architecture results while introducing the notion of Fabric on a Chip (FoC). By taking advantage of recent advancements in integrated circuit technologies, FoC aims to facilitate the consolidation of as many packet switching functions as possible on a single chip. Accordingly, this dissertation introduces a novel pipelined memory management algorithm, which plays a key role in the context of on-chip output-queued switch emulation. We discuss in detail the fundamental properties of the proposed scheme, along with hardware-based implementation results that illustrate its scalability and performance attributes. To complement the main effort and further support the notion of FoC, we provide a performance analysis of output-queued cell switches with heterogeneous traffic. The result is a flexible tool for obtaining bounds on the memory requirements of output-queued switches under a wide range of traffic scenarios. Additionally, we present a reconfigurable high-speed hardware architecture for real-time generation of packets for the various traffic scenarios. The work presented in this thesis aims at providing pragmatic foundations for designing next-generation, high-performance Internet switches and routers.
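
    To make the placement problem concrete, the following is a minimal first-fit sketch of memory assignment in a PSM switch, assuming a pool of slow memory units that each sustain one read or one write per time slot. The class and method names are invented for illustration; this is the naive O(N) scan the abstract refers to, not the dissertation's pipelined algorithm.

```python
# Minimal first-fit placement sketch for a parallel shared memory (PSM) switch.
# Assumption: each memory unit performs at most one read or one write per slot.

class PSMPlacer:
    def __init__(self, num_memories):
        self.num_memories = num_memories
        # reads[t] = memory units already scheduled for a read in slot t
        self.reads = {}

    def place(self, departure_slot, written_this_slot):
        """Choose a memory unit for an arriving packet.

        written_this_slot -- memories already assigned to other arrivals in
                             the current slot (one write per memory per slot)
        departure_slot    -- slot in which the packet must be read out again
        """
        busy_at_departure = self.reads.get(departure_slot, set())
        # Linear scan over memories: this is the O(N) step that limits
        # scalability and that a pipelined scheme would replace.
        for m in range(self.num_memories):
            if m not in written_this_slot and m not in busy_at_departure:
                self.reads.setdefault(departure_slot, set()).add(m)
                return m
        raise RuntimeError("no conflict-free memory unit available")
```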

    On packet switch design


    Supporting distributed computation over wide area gigabit networks

    The advent of high bandwidth fibre optic links that may be used over very large distances has led to much research and development in the field of wide area gigabit networking. One problem that needs to be addressed is how loosely coupled distributed systems may be built over these links, allowing many computers worldwide to take part in complex calculations in order to solve "Grand Challenge" problems. The research conducted as part of this PhD has looked at the practicality of implementing a communication mechanism proposed by Craig Partridge called Late-binding Remote Procedure Calls (LbRPC). LbRPC is intended to export both code and data over the network to remote machines for evaluation, as opposed to traditional RPC mechanisms that only send parameters to pre-existing remote procedures. The ability to send code as well as data means that LbRPC requests can overcome one of the biggest problems in Wide Area Distributed Computer Systems (WADCS): the fixed latency due to the speed of light. As machines get faster, the fixed multi-millisecond round trip delay equates to ever-increasing numbers of CPU cycles. For a WADCS to be efficient, programs should minimise the number of network transits they incur. By allowing the application programmer to export arbitrary code to the remote machine, this may be achieved. This research has looked at the feasibility of supporting secure exportation of arbitrary code and data in heterogeneous, loosely coupled, distributed computing environments. It has investigated techniques for making placement decisions for the code in cases where there are a large number of widely dispersed remote servers that could be used. The latter has resulted in the development of a novel prototype LbRPC using multicast IP for implicit placement and a sequenced, multi-packet saturation multicast transport protocol. These prototypes show that it is possible to export code and data to multiple remote hosts, thereby removing the need to perform complex and error prone explicit process placement decisions.
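
    The contrast between the two request styles can be sketched in a few lines. The snippet below is illustrative only: the request layout, function names and serialisation are invented and do not reflect Partridge's proposal or the thesis prototype in detail; it merely shows why shipping code plus data collapses several round trips into one.

```python
# Illustrative contrast: conventional RPC sends only parameters, while a
# late-binding RPC (LbRPC) request ships code and data for remote evaluation.
import pickle

def conventional_rpc_request(procedure_name, args):
    # Only the parameters cross the network; the procedure must already
    # exist on the remote host.
    return pickle.dumps({"call": procedure_name, "args": args})

def lbrpc_request(source_code, entry_point, data):
    # Code and its input data cross the network together, so a chain of
    # dependent steps can run remotely in a single round trip.
    return pickle.dumps({"code": source_code, "entry": entry_point, "data": data})

# Three dependent refinement steps execute at the server: one round trip
# instead of three.
code = """
def refine(values):
    for _ in range(3):
        values = [v * 0.5 + 1.0 for v in values]
    return values
"""
request = lbrpc_request(code, "refine", [2.0, 4.0, 8.0])
```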

    Architectures and dynamic bandwidth allocation algorithms for next generation optical access networks


    DIVE on the internet

    This dissertation reports research and development of a platform for Collaborative Virtual Environments (CVEs). It has particularly focused on two major challenges: supporting the rapid development of scalable applications and easing their deployment on the Internet. This work employs a research method based on prototyping and refinement and promotes the use of this method for application development. A number of the solutions herein are in line with other CVE systems. One of the strengths of this work consists in a global approach to the issues raised by CVEs and the recognition that such complex problems are best tackled using a multi-disciplinary approach that understands both user and system requirements. CVE application deployment is aided by an overlay network that is able to complement any IP multicast infrastructure in place. Apart from complementing a weakly deployed worldwide multicast, this infrastructure provides for a certain degree of introspection, remote controlling and visualisation. As such, it forms an important aid in assessing the scalability of running applications. This scalability is further facilitated by specialised object distribution algorithms and an open framework for the implementation of novel partitioning techniques. CVE application development is eased by a scripting language, which enables rapid development and favours experimentation. This scripting language interfaces many aspects of the system and enables the prototyping of distribution-related components as well as user interfaces. It is the key construct of a distributed environment to which components, written in different languages, connect and on which they operate in a network-abstracted manner. The solutions proposed are exemplified and strengthened by three collaborative applications. The Dive room system is a virtual environment modelled after the room metaphor and supporting asynchronous and synchronous cooperative work. WebPath is a companion application to a Web browser that seeks to make the current history of page visits more visible and usable. Finally, the London travel demonstrator supports travellers by providing an environment where they can explore the city, utilise group collaboration facilities, rehearse particular journeys and access tourist information data.
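
    As one example of the kind of partitioning technique such an open framework could host, the sketch below shows region-based object distribution, a common strategy in CVEs: the virtual world is divided into cells, each cell maps to its own communication group, and a client subscribes only to the cells near its avatar. The cell size and group-naming scheme are assumptions for illustration, not DIVE's actual algorithms.

```python
# Region-based partitioning sketch: map world positions to cells, and cells
# to communication groups a client should join.
CELL_SIZE = 100.0  # virtual-space units per cell (assumed value)

def cell_of(position):
    x, y = position
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))

def groups_of_interest(position, radius=1):
    """Cells (and hence groups) a client subscribes to around its position."""
    cx, cy = cell_of(position)
    return {(cx + dx, cy + dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)}

# A client at (250, 40) joins the 9 groups surrounding cell (2, 0), and
# receives updates only for objects in that neighbourhood.
print(sorted(groups_of_interest((250.0, 40.0))))
```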

    Atomic Transfer for Distributed Systems

    Building applications and information systems increasingly means dealing with concurrency and faults stemming from distribution of system components. Atomic transactions are a well-known method for transferring the responsibility for handling concurrency and faults from developers to the software's execution environment, but incur considerable execution overhead. This dissertation investigates methods that shift some of the burden of concurrency control into the network layer, to reduce response times and increase throughput. It anticipates future programmable network devices, enabling customized high-performance network protocols. We propose Atomic Transfer (AT), a distributed algorithm to prevent race conditions due to messages crossing on a path of network switches. Switches check request messages for conflicts with response messages traveling in the opposite direction. Conflicting requests are dropped, sparing the request's receiving host from detecting and handling the conflict. AT is designed to perform well under high data contention, as concurrency control effort is balanced across a network instead of being handled by the contended endpoint hosts themselves. We use AT as the basis for a new optimistic transactional cache consistency algorithm, supporting execution of atomic applications caching shared data. We then present a scalable refinement, allowing hierarchical consistent caches with predictable performance despite high data update rates. We give detailed I/O Automata models of our algorithms along with correctness proofs. We begin with a simplified model, assuming static network paths and no message loss, and then refine it to support dynamic network paths and safe handling of message loss. We present a trie-based data structure for accelerating conflict-checking on switches, with benchmarks suggesting the feasibility of our approach from a performance standpoint.
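
    The switch-side check can be illustrated with a minimal sketch, under the assumption that each message advertises the data items it touches: a request is forwarded only if its items do not overlap those of a response recently seen travelling the other way. The class and field names are invented, and the real algorithm (including its trie-based acceleration and aging of state) is considerably more involved.

```python
# Hypothetical sketch of a switch-side conflict check in the spirit of
# Atomic Transfer: drop requests that conflict with crossing responses.
class ATSwitchPort:
    def __init__(self):
        # Items recently updated by responses that passed through in the
        # opposite direction; a real switch would bound and age this state.
        self.recently_updated = set()

    def on_response(self, response_items):
        self.recently_updated |= set(response_items)

    def on_request(self, request_items):
        """Return True to forward the request, False to drop it."""
        if self.recently_updated & set(request_items):
            return False  # crossing messages touch the same items: conflict
        return True

port = ATSwitchPort()
port.on_response({"x"})            # a response updating item "x" went by
assert port.on_request({"y"})      # independent request: forwarded
assert not port.on_request({"x"})  # conflicting request: dropped at the switch
```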