146 research outputs found

    A Modular Approach to Adaptive Reactive Streaming Systems

    Get PDF
    The latest generations of FPGA devices offer large resource counts that provide the headroom to implement large-scale and complex systems. However, there are increasing challenges for the designer, not just because of pure size and complexity, but also in harnessing effectively the flexibility and programmability of the FPGA. A central issue is the need to integrate modules from diverse sources to promote modular design and reuse. Further, the capability to perform dynamic partial reconfiguration (DPR) of FPGA devices means that implemented systems can be made reconfigurable, allowing components to be changed during operation. However, use of DPR typically requires low-level planning of the system implementation, adding to the design challenge. This dissertation presents ReShape: a high-level approach for designing systems by interconnecting modules, which gives a ‘plug and play’ look and feel to the designer, is supported by tools that carry out implementation and verification functions, and is carried through to support system reconfiguration during operation. The emphasis is on the inter-module connections and abstracting the communication patterns that are typical between modules – for example, the streaming of data that is common in many FPGA-based systems, or the reading and writing of data to and from memory modules. ShapeUp is also presented as the static precursor to ReShape. In both, the details of wiring and signaling are hidden from view, via metadata associated with individual modules. ReShape allows system reconfiguration at the module level, by supporting type checking of replacement modules and by managing the overall system implementation, via metadata associated with its FPGA floorplan. The methodology and tools have been implemented in a prototype for a broad domain-specific setting – networking systems – and have been validated on real telecommunications design projects

    system-level modeling of programmable packet processing systems

    Get PDF
    Computer networks are experiencing explosive growth which is reinforced by the recent exhaustion of the global IPv4 addresses space in 2011 and the tenfold increase in users from 1999 to 2013. The advent of cloud, mobile and IoT is only going to accelerate this growth. This accedes the need for flexible and scalable networks that process packets faster. Programmable packet processing systems have emerged as a solution which aim to find balance between flexibility of supporting different processing functions while maintaining a high processing capability. Designing architectures that support such paradigms is fairly complicated as decisions need to be made for evaluating trade-offs between flexibility and efficiency. Questions like what programmatic interfaces, services, applications and protocols are required need to be answered before synthesis of actual hardware. To evaluate such requirements modelling techniques are required to evaluate architecture decisions accurately early enough in the design phase. In this thesis, we propose a flexible system level modelling methodology for early validation, design and analysis of packet processing applications for programmable forwarding plane architectures. The hardware and software architecture is described in a high level language which can be used to describe forwarding planes from many core network processors to reconfigurable processing pipelines. Device architects can use this for design space exploration, prototyping and validation; where application developers can start pre-silicon application design, development and debugging to evaluate different hardware and software decisions in an industry with ever shrinking market windows

    Boosting XML Filtering with a Scalable FPGA-based Architecture

    Full text link
    The growing amount of XML encoded data exchanged over the Internet increases the importance of XML based publish-subscribe (pub-sub) and content based routing systems. The input in such systems typically consists of a stream of XML documents and a set of user subscriptions expressed as XML queries. The pub-sub system then filters the published documents and passes them to the subscribers. Pub-sub systems are characterized by very high input ratios, therefore the processing time is critical. In this paper we propose a "pure hardware" based solution, which utilizes XPath query blocks on FPGA to solve the filtering problem. By utilizing the high throughput that an FPGA provides for parallel processing, our approach achieves drastically better throughput than the existing software or mixed (hardware/software) architectures. The XPath queries (subscriptions) are translated to regular expressions which are then mapped to FPGA devices. By introducing stacks within the FPGA we are able to express and process a wide range of path queries very efficiently, on a scalable environment. Moreover, the fact that the parser and the filter processing are performed on the same FPGA chip, eliminates expensive communication costs (that a multi-core system would need) thus enabling very fast and efficient pipelining. Our experimental evaluation reveals more than one order of magnitude improvement compared to traditional pub/sub systems.Comment: CIDR 200

    Configurable data center switch architectures

    Get PDF
    In this thesis, we explore alternative architectures for implementing con_gurable Data Center Switches along with the advantages that can be provided by such switches. Our first contribution centers around determining switch architectures that can be implemented on Field Programmable Gate Array (FPGA) to provide configurable switching protocols. In the process, we identify a gap in the availability of frameworks to realistically evaluate the performance of switch architectures in data centers and contribute a simulation framework that relies on realistic data center traffic patterns. Our framework is then used to evaluate the performance of currently existing as well as newly proposed FPGA-amenable switch designs. Through collaborative work with Meng and Papaphilippou, we establish that only small-medium range switches can be implemented on today's FPGAs. Our second contribution is a novel switch architecture that integrates a custom in-network hardware accelerator with a generic switch to accelerate Deep Neural Network training applications in data centers. Our proposed accelerator architecture is prototyped on an FPGA, and a scalability study is conducted to demonstrate the trade-offs of an FPGA implementation when compared to an ASIC implementation. In addition to the hardware prototype, we contribute a light weight load-balancing and congestion control protocol that leverages the unique communication patterns of ML data-parallel jobs to enable fair sharing of network resources across different jobs. Our large-scale simulations demonstrate the ability of our novel switch architecture and light weight congestion control protocol to both accelerate the training time of machine learning jobs by up to 1.34x and benefit other latency-sensitive applications by reducing their 99%-tile completion time by up to 4.5x. As for our final contribution, we identify the main requirements of in-network applications and propose a Network-on-Chip (NoC)-based architecture for supporting a heterogeneous set of applications. Observing the lack of tools to support such research, we provide a tool that can be used to evaluate NoC-based switch architectures.Open Acces

    Zebra : Building Efficient Network Message Parsers for Embedded Systems

    Get PDF
    4 pagesInternational audienceSupporting standard text-based protocols in embedded systems is challenging because of the often limited computational resources that embedded systems provide. To overcome this issue, a promising approach is to build parsers directly in hardware. Unfortunately, developing such parsers is a daunting task for most developers as it is at the crossroads of several areas of expertise, such as low-level network programming, or hardware design. In this paper, we propose Zebra, a generative approach to drastically ease the development of hardware parsers and their use in network applications. To validate our approach, we have used Zebra to generate hardware parsers for widely used protocols, namely HTTP, SMTP, SIP, and RTSP. Our experiments show that Zebra-based parsers are up to 11 times faster than software-based parsers

    A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research

    Full text link
    With traditional networking, users can configure control plane protocols to match the specific network configuration, but without the ability to fundamentally change the underlying algorithms. With SDN, the users may provide their own control plane, that can control network devices through their data plane APIs. Programmable data planes allow users to define their own data plane algorithms for network devices including appropriate data plane APIs which may be leveraged by user-defined SDN control. Thus, programmable data planes and SDN offer great flexibility for network customization, be it for specialized, commercial appliances, e.g., in 5G or data center networks, or for rapid prototyping in industrial and academic research. Programming protocol-independent packet processors (P4) has emerged as the currently most widespread abstraction, programming language, and concept for data plane programming. It is developed and standardized by an open community and it is supported by various software and hardware platforms. In this paper, we survey the literature from 2015 to 2020 on data plane programming with P4. Our survey covers 497 references of which 367 are scientific publications. We organize our work into two parts. In the first part, we give an overview of data plane programming models, the programming language, architectures, compilers, targets, and data plane APIs. We also consider research efforts to advance P4 technology. In the second part, we analyze a large body of literature considering P4-based applied research. We categorize 241 research papers into different application domains, summarize their contributions, and extract prototypes, target platforms, and source code availability.Comment: Submitted to IEEE Communications Surveys and Tutorials (COMS) on 2021-01-2

    Reconfigurable Computing Systems for Robotics using a Component-Oriented Approach

    Get PDF
    Robotic platforms are becoming more complex due to the wide range of modern applications, including multiple heterogeneous sensors and actuators. In order to comply with real-time and power-consumption constraints, these systems need to process a large amount of heterogeneous data from multiple sensors and take action (via actuators), which represents a problem as the resources of these systems have limitations in memory storage, bandwidth, and computational power. Field Programmable Gate Arrays (FPGAs) are programmable logic devices that offer high-speed parallel processing. FPGAs are particularly well-suited for applications that require real-time processing, high bandwidth, and low latency. One of the fundamental advantages of FPGAs is their flexibility in designing hardware tailored to specific needs, making them adaptable to a wide range of applications. They can be programmed to pre-process data close to sensors, which reduces the amount of data that needs to be transferred to other computing resources, improving overall system efficiency. Additionally, the reprogrammability of FPGAs enables them to be repurposed for different applications, providing a cost-effective solution that needs to adapt quickly to changing demands. FPGAs' performance per watt is close to that of Application-Specific Integrated Circuits (ASICs), with the added advantage of being reprogrammable. Despite all the advantages of FPGAs (e.g., energy efficiency, computing capabilities), the robotics community has not fully included them so far as part of their systems for several reasons. First, designing FPGA-based solutions requires hardware knowledge and longer development times as their programmability is more challenging than Central Processing Units (CPUs) or Graphics Processing Units (GPUs). Second, porting a robotics application (or parts of it) from software to an accelerator requires adequate interfaces between software and FPGAs. Third, the robotics workflow is already complex on its own, combining several fields such as mechanics, electronics, and software. There have been partial contributions in the state-of-the-art for FPGAs as part of robotics systems. However, a study of FPGAs as a whole for robotics systems is missing in the literature, which is the primary goal of this dissertation. Three main objectives have been established to accomplish this. (1) Define all components required for an FPGAs-based system for robotics applications as a whole. (2) Establish how all the defined components are related. (3) With the help of Model-Driven Engineering (MDE) techniques, generate these components, deploy them, and integrate them into existing solutions. The component-oriented approach proposed in this dissertation provides a proper solution for designing and implementing FPGA-based designs for robotics applications. The modular architecture, the tool 'FPGA Interfaces for Robotics Middlewares' (FIRM), and the toolchain 'FPGA Architectures for Robotics' (FAR) provide a set of tools and a comprehensive design process that enables the development of complex FPGA-based designs more straightforwardly and efficiently. The component-oriented approach contributed to the state-of-the-art in FPGA-based designs significantly for robotics applications and helps to promote their wider adoption and use by specialists with little FPGA knowledge
    • …
    corecore