9 research outputs found

    Design and Performance of Scalable High-Performance Programmable Routers - Doctoral Dissertation, August 2002

    Get PDF
    The flexibility to adapt to new services and protocols without changes in the underlying hardware is and will increasingly be a key requirement for advanced networks. Introducing a processing component into the data path of routers and implementing packet processing in software provides this ability. In such a programmable router, a powerful processing infrastructure is necessary to achieve to level of performance that is comparable to custom silicon-based routers and to demonstrate the feasibility of this approach. This work aims at the general design of such programmable routers and, specifically, at the design and performance analysis of the processing subsystem. The necessity of programmable routers is motivated, and a router design is proposed. Based on the design, a general performance model is developed and quantitatively evaluated using a new network processor benchmark. Operational challenges, like scheduling of packets to processing engines, are addressed, and novel algorithms are presented. The results of this work give qualitative and quantitative insights into this new domain that combines issues from networking, computer architecture, and system design

    Network Processors and Next Generation Networks: Design, Applications, and Perspectives

    Get PDF
    Network Processors (NPs) are hardware platforms born as appealing solutions for packet processing devices in networking applications. Nowadays, a plethora of solutions exists, with no agreement on a common architecture. Each vendor has proposed its specific solution and no official standard still exists. The common features of all proposals are a hierarchy of processors, with a general purpose processor and several units specialized for packet processing, a series of memory devices with different sizes and latencies, a low-level programmability. The target is a platform for networking applications with low time to market and high time in market, thanks to a high flexibility and a programmability simpler than that of ASICs, for example. After about ten years since the "birth" of network processors, this research activity wants to make an analytical balance of their development and usage. Many authoritative opinions suggest that NPs have been "outdated" by multicore or manycore systems, which provide general purpose environments and some specialized cores. The main reasons of these negative opinions are the hard programmability of NPs, which often requires the knowledge of private microcode, or the excessive architectural limits, such as reduced memories and minimal instruction store. Our research shows that Network Processors can be appealing for different applications in networking area, and many interesting solutions can be obtained, which present very high performance, outscoring current solutions. However, the issues of hard programming and remarkable limits exist, and they could be alleviated only by providing almost a comprehensive programming environment and a proper design in terms of processing and memory resources. More e cient solutions can be surely provided, but the experience of network processors has produced an important legacy in developing packet processing engines. In this work, we have realized many devices for networking purposes based on NP platform, in order to understand the complexity of programming, the flexibility of design, the complexity of tasks that can be implemented, the maximum depth of packet processing, the performance of such devices, the real usefulness of NPs in network devices. All these features have been accurately analyzed and will be illustrated in this thesis. Many remarkable results have been obtained, which confirm the Network Processors as appealing solutions for network devices. Moreover, the research on NPs have lead us to analyze and solve more general issues, related for instance to multiprocessor systems or to processors with no big available memory. In particular, the latter issue lead us to design many interesting data structures for set representation and membership query, which are based on randomized techniques and allow for big memory savings

    Branch Prediction For Network Processors

    Get PDF
    Originally designed to favour flexibility over packet processing performance, the future of the programmable network processor is challenged by the need to meet both increasing line rate as well as providing additional processing capabilities. To meet these requirements, trends within networking research has tended to focus on techniques such as offloading computation intensive tasks to dedicated hardware logic or through increased parallelism. While parallelism retains flexibility, challenges such as load-balancing limit its scope. On the other hand, hardware offloading allows complex algorithms to be implemented at high speed but sacrifice flexibility. To this end, the work in this thesis is focused on a more fundamental aspect of a network processor, the data-plane processing engine. Performing both system modelling and analysis of packet processing functions; the goal of this thesis is to identify and extract salient information regarding the performance of multi-processor workloads. Following on from a traditional software based analysis of programme workloads, we develop a method of modelling and analysing hardware accelerators when applied to network processors. Using this quantitative information, this thesis proposes an architecture which allows deeply pipelined micro-architectures to be implemented on the data-plane while reducing the branch penalty associated with these architectures

    Characterizing, managing and monitoring the networks for the ATLAS data acquisition system

    Get PDF
    Particle physics studies the constituents of matter and the interactions between them. Many of the elementary particles do not exist under normal circumstances in nature. However, they can be created and detected during energetic collisions of other particles, as is done in particle accelerators. The Large Hadron Collider (LHC) being built at CERN will be the world's largest circular particle accelerator, colliding protons at energies of 14 TeV. Only a very small fraction of the interactions will give raise to interesting phenomena. The collisions produced inside the accelerator are studied using particle detectors. ATLAS is one of the detectors built around the LHC accelerator ring. During its operation, it will generate a data stream of 64 Terabytes/s. A Trigger and Data Acquisition System (TDAQ) is connected to ATLAS -- its function is to acquire digitized data from the detector and apply trigger algorithms to identify the interesting events. Achieving this requires the power of over 2000 computers plus an interconnecting network capable of sustaining a throughput of over 150 Gbit/s with minimal loss and delay. The implementation of this network required a detailed study of the available switching technologies to a high degree of precision in order to choose the appropriate components. We developed an FPGA-based platform (the GETB) for testing network devices. The GETB system proved to be flexible enough to be used as the ba sis of three different network-related projects. An analysis of the traffic pattern that is generated by the ATLAS data-taking applications was also possible thanks to the GETB. Then, while the network was being assembled, parts of the ATLAS detector started commissioning -- this task relied on a functional network. Thus it was imperative to be able to continuously identify existing and usable infrastructure and manage its operations. In addition, monitoring was required to detect any overload conditions with an indication where the excess demand was being generated. We developed tools to ease the maintenance of the network and to automatically produce inventory reports. We created a system that discovers the network topology and this permitted us to verify the installation and to track its progress. A real-time traffic visualization system has been built, allowing us to see at a glance which network segments are heavily utilized. Later, as the network achieves production status, it will be necessary to extend the monitoring to identify individual applications' use of the available bandwidth. We studied a traffic monitoring technology that will allow us to have a better understanding on how the network is used. This technology, based on packet sampling, gives the possibility of having a complete view of the network: not only its total capacity utilization, but also how this capacity is divided among users and software applicati ons. This thesis describes the establishment of a set of tools designed to characterize, monitor and manage complex, large-scale, high-performance networks. We describe in detail how these tools were designed, calibrated, deployed and exploited. The work that led to the development of this thesis spans over more than four years and closely follows the development phases of the ATLAS network: its design, its installation and finally, its current and future operation

    Online learning on the programmable dataplane

    Get PDF
    This thesis makes the case for managing computer networks with datadriven methods automated statistical inference and control based on measurement data and runtime observations—and argues for their tight integration with programmable dataplane hardware to make management decisions faster and from more precise data. Optimisation, defence, and measurement of networked infrastructure are each challenging tasks in their own right, which are currently dominated by the use of hand-crafted heuristic methods. These become harder to reason about and deploy as networks scale in rates and number of forwarding elements, but their design requires expert knowledge and care around unexpected protocol interactions. This makes tailored, per-deployment or -workload solutions infeasible to develop. Recent advances in machine learning offer capable function approximation and closed-loop control which suit many of these tasks. New, programmable dataplane hardware enables more agility in the network— runtime reprogrammability, precise traffic measurement, and low latency on-path processing. The synthesis of these two developments allows complex decisions to be made on previously unusable state, and made quicker by offloading inference to the network. To justify this argument, I advance the state of the art in data-driven defence of networks, novel dataplane-friendly online reinforcement learning algorithms, and in-network data reduction to allow classification of switchscale data. Each requires co-design aware of the network, and of the failure modes of systems and carried traffic. To make online learning possible in the dataplane, I use fixed-point arithmetic and modify classical (non-neural) approaches to take advantage of the SmartNIC compute model and make use of rich device local state. I show that data-driven solutions still require great care to correctly design, but with the right domain expertise they can improve on pathological cases in DDoS defence, such as protecting legitimate UDP traffic. In-network aggregation to histograms is shown to enable accurate classification from fine temporal effects, and allows hosts to scale such classification to far larger flow counts and traffic volume. Moving reinforcement learning to the dataplane is shown to offer substantial benefits to stateaction latency and online learning throughput versus host machines; allowing policies to react faster to fine-grained network events. The dataplane environment is key in making reactive online learning feasible—to port further algorithms and learnt functions, I collate and analyse the strengths of current and future hardware designs, as well as individual algorithms

    Analytical bounds on the threads in IXP1200 network processor

    No full text

    Analytical Bounds on the Threads in IXP1200 Network Processor

    No full text
    Increasing link speeds have placed enormous burden on the processing requirements and the processors are expected to carry out a variety of tasks. Network Processors (NP) [1] [2] is the blanket name given to the processors, which are traded for flexibility and performance. Network Processors are offered by a number of vendors; to take the main burden of processing requirement of network related operations from the conventional processors. The Network Processors cover a spectrum of design tradeoff, that span in between the custom ASIC and the general-purpose processors. IXP1200 (Intel’s network processor) is one among them. This paper focuses on deriving the analytical bounds on the optimum number of threads in IXP1200 at 1Gbps wire speed
    corecore