669 research outputs found

    HyperFPGA: SoC-FPGA Cluster Architecture for Supercomputing and Scientific applications

    Get PDF
    Since their inception, supercomputers have addressed problems that far exceed those of a single computing device. Modern supercomputers are made up of tens of thousands of CPUs and GPUs in racks that are interconnected via elaborate and most of the time ad hoc networks. These large facilities provide scientists with unprecedented and ever-growing computing power capable of tackling more complex and larger problems. In recent years, the most powerful supercomputers have already reached megawatt power consumption levels, an important issue that challenges sustainability and shows the impossibility of maintaining this trend. With more pressure on energy efficiency, an alternative to traditional architectures is needed. Reconfigurable hardware, such as FPGAs, has repeatedly been shown to offer substantial advantages over the traditional supercomputing approach with respect to performance and power consumption. In fact, several works that advanced the field of heterogeneous supercomputing using FPGAs are described in this thesis \cite{survey-2002}. Each cluster and its architectural characteristics can be studied from three interconnected domains: network, hardware, and software tools, resulting in intertwined challenges that designers must take into account. The classification and study of the architectures illustrate the trade-offs of the solutions and help identify open problems and research lines, which in turn served as inspiration and background for the HyperFPGA. In this thesis, the HyperFPGA cluster is presented as a way to build scalable SoC-FPGA platforms to explore new architectures for improved performance and energy efficiency in high-performance computing, focusing on flexibility and openness. The HyperFPGA is a modular platform based on a SoM that includes power monitoring tools with high-speed general-purpose interconnects to offer a great level of flexibility and introspection. By exploiting the reconfigurability and programmability offered by the HyperFPGA infrastructure, which combines FPGAs and CPUs, with high-speed general-purpose connectors, novel computing paradigms can be implemented. A custom Linux OS and drivers, along with a custom script for hardware definition, provide a uniform interface from application to platform for a programmable framework that integrates existing tools. The development environment is demonstrated using the N-Queens problem, which is a classic benchmark for evaluating the performance of parallel computing systems. Overall, the results of the HyperFPGA using the N-Queens problem highlight the platform's ability to handle computationally intensive tasks and demonstrate its suitability for its use in supercomputing experiments.Since their inception, supercomputers have addressed problems that far exceed those of a single computing device. Modern supercomputers are made up of tens of thousands of CPUs and GPUs in racks that are interconnected via elaborate and most of the time ad hoc networks. These large facilities provide scientists with unprecedented and ever-growing computing power capable of tackling more complex and larger problems. In recent years, the most powerful supercomputers have already reached megawatt power consumption levels, an important issue that challenges sustainability and shows the impossibility of maintaining this trend. With more pressure on energy efficiency, an alternative to traditional architectures is needed. Reconfigurable hardware, such as FPGAs, has repeatedly been shown to offer substantial advantages over the traditional supercomputing approach with respect to performance and power consumption. In fact, several works that advanced the field of heterogeneous supercomputing using FPGAs are described in this thesis \cite{survey-2002}. Each cluster and its architectural characteristics can be studied from three interconnected domains: network, hardware, and software tools, resulting in intertwined challenges that designers must take into account. The classification and study of the architectures illustrate the trade-offs of the solutions and help identify open problems and research lines, which in turn served as inspiration and background for the HyperFPGA. In this thesis, the HyperFPGA cluster is presented as a way to build scalable SoC-FPGA platforms to explore new architectures for improved performance and energy efficiency in high-performance computing, focusing on flexibility and openness. The HyperFPGA is a modular platform based on a SoM that includes power monitoring tools with high-speed general-purpose interconnects to offer a great level of flexibility and introspection. By exploiting the reconfigurability and programmability offered by the HyperFPGA infrastructure, which combines FPGAs and CPUs, with high-speed general-purpose connectors, novel computing paradigms can be implemented. A custom Linux OS and drivers, along with a custom script for hardware definition, provide a uniform interface from application to platform for a programmable framework that integrates existing tools. The development environment is demonstrated using the N-Queens problem, which is a classic benchmark for evaluating the performance of parallel computing systems. Overall, the results of the HyperFPGA using the N-Queens problem highlight the platform's ability to handle computationally intensive tasks and demonstrate its suitability for its use in supercomputing experiments

    A Survey on FPGA-Based Heterogeneous Clusters Architectures

    Get PDF
    In recent years, the most powerful supercomputers have already reached megawatt power consumption levels, an important issue that challenges sustainability and shows the impossibility of maintaining this trend. To this date, the prevalent approach to supercomputing is dominated by CPUs and GPUs. Given their fixed architectures with generic instruction sets, they have been favored with lots of tools and mature workflows which led to mass adoption and further growth. However, reconfigurable hardware such as FPGAs has repeatedly proven that it offers substantial advantages over this supercomputing approach concerning performance and power consumption. In this survey, we review the most relevant works that advanced the field of heterogeneous supercomputing using FPGAs focusing on their architectural characteristics. Each work was divided into three main parts: network, hardware, and software tools. All implementations face challenges that involve all three parts. These dependencies result in compromises that designers must take into account. The advantages and limitations of each approach are discussed and compared in detail. The classification and study of the architectures illustrate the trade-offs of the solutions and help identify open problems and research lines

    A soft processor overlay with tightly-coupled FPGA accelerator

    Get PDF
    FPGA overlays are commonly implemented as coarse-grained reconfigurable architectures with a goal to improve designers’ productivity through balancing flexibility and ease of configuration of the underlying fabric. To truly facilitate full application acceleration, it is often necessary to also include a highly efficient processor that integrates and collaborates with the accelerators while maintaining the benefits of being implemented within the same overlay framework. This paper presents an open-source soft processor that is designed to tightly-couple with FPGA accelerators as part of an overlay framework. RISC-V is chosen as the instruction set for its openness and portability, and the soft processor is designed as a 4-stage pipeline to balance resource consumption and performance when implemented on FPGAs. The processor is generically implemented so as to promote design portability and compatibility across different FPGA platforms. Experimental results show that integrated software-hardware applications using the proposed tightly-coupled architecture achieve comparable performance as hardware-only accelerators while the proposed architecture provides additional run-time flexibility. The processor has been synthesized to both low-end and high-performance FPGA families from different vendors, achieving the highest frequency of 268:67MHz and resource consumption comparable to existing RISC-V designs.postprin
    • …
    corecore