43 research outputs found

    A dynamically reconfigurable pattern matcher for regular expressions on FPGA

    In this article we describe how to extend a partially dynamically reconfigurable pattern matcher for regular expressions presented in previous work by Divyasree and Rajashekar [2]. The resulting extended pattern matcher is fully dynamically reconfigurable. First, the design is adapted for use with parameterisable configurations, a method for Dynamic Circuit Specialization. Using parameterisable configurations allows us to achieve the same area gains as the hand-crafted reconfigurable design, with the benefit that parameterisable configurations can be applied automatically. This results in a design that is more easily adaptable to specific applications and allows for easier design exploration. Additionally, the parameterisable configuration implementation is also generated automatically, which greatly reduces the design overhead of using dynamic reconfiguration. Second, we propose a number of extensions to the original design to overcome several limitations that constrain the dynamic reconfigurability of the pattern matcher. We propose two different solutions to dynamically change the character that is matched in a certain block. After these changes, the resulting pattern matcher is fully dynamically reconfigurable: all aspects of the implemented regular expression can be changed at run-time.

    ParaFPGA 2011 : high performance computing with multiple FPGAs : design, methodology and applications

    ParaFPGA 2011 marks the third mini-symposium devoted to the methodology, design and implementation of parallel applications using FPGAs. The focus of the contributions is mainly on organizing parallel applications in multiple FPGAs. This includes experiences from building a supercomputer with FPGAs, automatic and dedicated balancing of different tasks on heterogeneous FPGA constellations and designing optimal interconnects between collaborating FPGAs

    Mejora de la evaluación de expresiones regulares sobre hardware reconfigurable (Improving the evaluation of regular expressions on reconfigurable hardware)

    Since the birth of the Internet, the amount of data that systems process has grown exponentially, which is why these systems need to be fast, flexible and powerful. Communications keep raising the speed requirements for data processing, and FPGAs are ideal for this task. In data processing, a large share of time is devoted to pattern matching, frequently involving regular expression matching. As the number of patterns to be checked grows, so does the complexity of the hardware dedicated to recognizing them. The hardware therefore needs to be flexible so that it can adapt easily to the necessary changes. This project presents a VHDL code generator implemented in Java. The generated code describes a recognizer for several sets of regular expressions, supplied as parameters, which is then synthesized for an FPGA. The module takes several sets of regular expressions and generates the VHDL code describing the system that recognizes them. The code generator is flexible thanks to the modularity and upgradeability that software offers. The main advantage of this model is thus the possibility of combining the flexibility of software with the speed of hardware in order to create fast, low-cost recognizers in a flexible and easy way.

    Parallel computing 2011, ParCo 2011: book of abstracts

    This book contains the abstracts of the presentations at the conference Parallel Computing 2011, 30 August - 2 September 2011, Ghent, Belgium.

    Applications, tools and techniques on the road to exascale computing

    This volume of the book series “Advances in Parallel Computing” contains the proceedings of ParCo2011, the 14th biennial ParCo Conference, held from 31 August to 3 September 2011, in Ghent, Belgium. In an era when physical limitations have slowed down advances in the performance of single processing units, and new scientific challenges require exascale speed, parallel processing has gained momentum as a key gateway to HPC (High Performance Computing). Historically, the ParCo conferences have focused on three main themes: Algorithms, Architectures (both hardware and software) and Applications. Nowadays, the scenery has changed from traditional multiprocessor topologies to heterogeneous manycores, incorporating standard CPUs, GPUs (Graphics Processing Units) and FPGAs (Field Programmable Gate Arrays). These platforms are, at a higher abstraction level, integrated in clusters, grids, and clouds. This is reflected in the papers presented at the conference and the contributions as included in these proceedings. An increasing number of new algorithms are optimized for heterogeneous platforms and performance tuning is targeting extreme scale computing. Heterogeneous platforms utilising the compute power and energy efficiency of GPGPUs (General Purpose GPUs) are clearly becoming mainstream HPC systems for a large number of applications in a wide spectrum of application areas. These systems excel in areas such as complex system simulation, real-time image processing and visualisation, etc. High performance computing accelerators may well become the cornerstone of exascale computing applications such as 3-D turbulent combustion flows, nuclear energy simulations, brain research, financial and geophysical modelling. The exploration of new architectures, programming tools and techniques was evidenced by the mini-symposia “Parallel Computing with FPGAs” and “Exascale Programming Models”. 
The need for exascale hardware and software was also stressed in the industrial session, with contributions from Cray and the European exascale software initiative. Our sincere appreciation goes to the keynote speakers who gave their perspectives on the impact of parallel computing today and the road to exascale computing tomorrow. Our heartfelt thanks go to the authors for their valuable scientific contributions and to the programme committee who reviewed the papers and provided constructive remarks. The international audience was inspired by the quality of the presentations. Attendance and interaction were high, and the conference was an agora where many fruitful ideas were exchanged and explored. We wish to express our sincere thanks to the organizers for the smooth operation of the conference. The University conference centre Het Pand offered an excellent environment for the conference as it allowed delegates to interact informally and easily. A special word of thanks is due to the management and support staff of Het Pand for their proficient and friendly support. The organizers managed to put together an extensive social programme. This included a reception at the medieval Town Hall of Ghent as well as a memorable conference dinner. These social events stimulated interaction amongst delegates and resulted in many new contacts being made. Finally, we wish to thank the many supporters who assisted in the organization and successful running of the event. Erik D'Hollander, Ghent University, Belgium; Koen De Bosschere, Ghent University, Belgium; Gerhard R. Joubert, TU Clausthal, Germany; David Padua, University of Illinois, USA; Frans Peters, Philips Research, Netherlands.

    A Multi-FPGA Networking Architecture and Its Implementation

    FPGAs show great promise in accelerating compute-bound parallelizable applications by offloading kernels into programmable logic. However, FPGAs currently face significant hurdles to becoming a viable technology, due to both the capital outlay required for specialized hardware and the logic required to support the offloaded kernels on the FPGA. This thesis seeks to change that by making it easy to communicate with clusters of FPGAs over IP networks and by providing infrastructure for common application use cases, allowing authors to focus on their application rather than on the procurement and details of interacting with a specific FPGA. Our approach is twofold. First, we develop an FPGA IP network stack and bitfile management system allowing users to upload their logic to a server and have it run on FPGAs accessible through the Internet. Second, we engineer a programmable logic interface which authors can use to move data to their application kernels. This interface provides communication over the Internet as well as the scaffolding typically re-invented for each application by providing I/O between application logic, even if spread across different FPGAs. We utilize Partial Reconfiguration to divide the FPGAs into regions, each of which can host different applications from different users. We then provide a web service through which users can upload their FPGA logic. The service finds a spot for the logic on the FPGAs, reconfigures them to contain the logic, then sends the corresponding IP addresses back to the user. To ease development of the application pieces themselves, our framework abstracts away the complexity of communicating over IP networks as well as between different FPGAs. Instead we provide an interface to applications consisting simply of a RAM port. Applications write packets of data into the port, and they appear at the other end, whether that other end is across an IP network or on another FPGA. 
Finally, we demonstrate the feasibility and utility of our approach by implementing it on an array of Xilinx Virtex 5 FPGAs, linked together with GTP serial links and connected via Gigabit Ethernet. We port a compute-bound application based on regular expression string matching to the framework, demonstrating that our approach is feasible for implementing a realistic application.

    Identification of dynamic circuit specialization opportunities in RTL code

    Dynamic Circuit Specialization (DCS) optimizes a Field-Programmable Gate Array (FPGA) design by assuming that a set of its input signals is constant for a reasonable amount of time, leading to a smaller and faster FPGA circuit. When the signals actually change, a new circuit is loaded into the FPGA through runtime reconfiguration. The signals the design is specialized for are called parameters. For certain designs, parameters can be selected so that the DCS implementation is both smaller and faster than the original implementation. However, DCS also introduces an overhead that is difficult for the designer to take into account, making it hard to determine whether a design is improved by DCS or not. This article presents extensive results on a profiling methodology that analyses Register-Transfer Level (RTL) implementations of applications to check whether DCS would be beneficial. It proposes to use the functional density as a measure of the area efficiency of an implementation, as this measure captures both the overhead and the gains of a DCS implementation. The first step of the methodology is to analyse the dynamic behaviour of signals in the design, to find good parameter candidates. The overhead of DCS is highly dependent on this dynamic behaviour. A second stage calculates the functional density for each candidate and compares it to the functional density of the original design. The profiling methodology resulted in three implementations of a profiling tool, the DCS-RTL profiler. The execution time, accuracy, and quality of each implementation are assessed based on data from 10 RTL designs. All designs, except for the two 16-bit adaptable Finite Impulse Response (FIR) filters, are analysed in 1 hour or less.
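
The second-stage comparison can be illustrated with a small calculation. This is only a sketch: the abstract does not give its exact formula, so the code assumes the commonly used definition of functional density as the inverse of the area-time product, with the reconfiguration time amortised over the number of executions between parameter changes. All function names and numbers are hypothetical.

```python
def functional_density(area, exec_time, reconfig_time=0.0, runs_per_reconfig=1):
    """Functional density, assumed here to be 1 / (area * average time per run).

    reconfig_time is amortised over runs_per_reconfig executions, so a
    parameter that changes rarely makes the DCS overhead negligible.
    """
    if area <= 0 or exec_time <= 0 or runs_per_reconfig < 1:
        raise ValueError("area and exec_time must be positive, runs_per_reconfig >= 1")
    avg_time = exec_time + reconfig_time / runs_per_reconfig
    return 1.0 / (area * avg_time)


def dcs_is_beneficial(generic, specialized):
    """Return True if the specialized (DCS) design has the higher functional density."""
    return functional_density(**specialized) > functional_density(**generic)


# Hypothetical numbers: area in LUTs, times in microseconds.
generic = {"area": 1000, "exec_time": 10.0}
dcs = {"area": 600, "exec_time": 8.0, "reconfig_time": 5000.0}

rare_changes = dict(dcs, runs_per_reconfig=10_000)   # parameter is nearly constant
frequent_changes = dict(dcs, runs_per_reconfig=10)   # parameter changes often
```

Under these made-up numbers, `dcs_is_beneficial(generic, rare_changes)` holds while `dcs_is_beneficial(generic, frequent_changes)` does not, which mirrors the abstract's point that the overhead depends strongly on the dynamic behaviour of the candidate signals.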

    FPgrep and FPsed: Packet Payload Processors for Managing the Flow of Digital Content on Local Area Networks and the Internet

    As computer networks increase in speed, it becomes difficult to monitor and manage the transmitted digital content. To alleviate these problems, hardware-based search (FPgrep) and search-and-replace (FPsed) modules have been developed. FPgrep has the ability to scan packet payloads for a given set of regular expressions and pass or drop packets based on the payload contents. FPsed also scans packet payloads for a set of regular expressions and adds the ability to modify the payload if desired. The hardware circuits that implement the FPgrep and FPsed modules can be generated, compiled, and synthesized using a simple web interface. Once a module is created, it is programmed into logic on a Field Programmable Gate Array (FPGA). The FPgrep and FPsed modules use FPGAs to process packets at the full rate of Gigabit-speed networks. Both modules, along with several supporting applications, were developed and tested using the Field Programmable Port Extender (FPX) platform. Applications developed for the modules currently include a spam filter, virus protection, an information security filter, and a copyright enforcement function.
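
The pass/drop and search-and-replace behaviour described above can be mimicked in software. This is a sketch of the idea only, not the FPGA implementation: the function names `fpgrep` and `fpsed` are borrowed from the abstract for illustration, and their signatures and the `drop_on_match` flag are assumptions.

```python
import re


def fpgrep(payload: bytes, patterns, drop_on_match=True) -> bool:
    """Return True if the packet should be forwarded, False if it should be dropped.

    patterns is an iterable of bytes regular expressions. With drop_on_match
    set, any match drops the packet; otherwise only matching packets pass.
    """
    matched = any(re.search(p, payload) for p in patterns)
    return not matched if drop_on_match else matched


def fpsed(payload: bytes, rules) -> bytes:
    """Apply each (pattern, replacement) rule to the payload, in order."""
    for pattern, replacement in rules:
        payload = re.sub(pattern, replacement, payload)
    return payload
```

A filter of this kind would be called once per packet payload, for example `fpgrep(payload, [rb"v[i1]agra"])` for a toy spam rule; the hardware modules evaluate the expressions in logic to keep up with the line rate.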