
    EXPLOITING PARTIALLY RECONFIGURABLE FPGA FOR PERFORMANCE ADJUSTMENT IN THE RVC FRAMEWORK

    In this paper, we present a method for implementing a specific algorithm using the RVC framework and dynamic partial reconfiguration (DPR). DPR is a technique that allows modules in a design to be replaced at run time, while the RVC framework is based on RVC-CAL, a dedicated language for writing dataflow models. The studied algorithm is the Hadamard transform. Several dataflow models of the Hadamard transform can be used in the design process to favor either speed or power consumption. We show the steps required to implement and switch between two dataflow models of the Hadamard transform, a sequential model and a pipelined model. Our design allows the user to choose one of the two architectures according to their requirements for low power or high speed.
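For readers unfamiliar with the Hadamard transform named in this abstract, the following is a minimal Python sketch of the standard unnormalized fast Walsh-Hadamard transform. It is not the authors' RVC-CAL dataflow model, and the function name `hadamard_transform` is an illustrative assumption. Each pass of the outer loop is one butterfly stage; a sequential design could plausibly reuse one butterfly unit across stages, while a pipelined design could give each stage its own hardware, but that mapping is an assumption and is not taken from the paper.

```python
def hadamard_transform(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    Illustrative sketch only; the name and interface are assumptions.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    half = 1
    while half < n:  # one butterfly stage per iteration, log2(n) stages in total
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b  # butterfly: sum and difference
        half *= 2
    return out
```

Applying the function twice returns the input scaled by its length, which gives a quick sanity check on any implementation.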

    Smart technologies for effective reconfiguration: the FASTER approach

    Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy customers' needs as well as market and technology trends. Many contemporary products incorporate, along with the software part, hardware accelerators for reasons of performance and power efficiency. While adapting software is straightforward, adapting hardware to changing requirements is a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology that allows designers to easily implement a system specification on a platform which includes a general-purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that, for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems, providing additional novel verification features that are not available in existing tool flows.

    A HIERARCHICAL IMPLEMENTATION OF HADAMARD TRANSFORM USING RVC-CAL DATAFLOW PROGRAMMING AND DYNAMIC PARTIAL RECONFIGURATION

    This paper presents an efficient design method for implementing a hierarchical architecture of a Hadamard transform module. The proposed method is based on the RVC-CAL dataflow approach and the dynamic partial reconfiguration (DPR) technique. DPR allows part of the FPGA area to be reconfigured with different functionality at run time, and is a promising way to increase system performance. RVC-CAL is a dedicated language for writing dataflow models, introduced by the MPEG-RVC video standard. An RVC-CAL description is composed of a set of interconnected blocks (actors). Several dataflow models of the same application can be used in the design process. In this work, the hierarchical architecture of the Hadamard module is composed of three levels, each containing a set of blocks. DPR is applied between these blocks to switch from one level to another. To achieve this implementation, the Hadamard blocks are first described in RVC-CAL, and a dedicated RVC-CAL tool is used to generate their hardware description automatically. Then the DPR design flow is applied. Our design method uses Xilinx tools and a Virtex-5 FPGA board. To evaluate our implementation, we compare it with two other architectures in terms of area occupation, power consumption and execution time.

    Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

    In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows. Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

    Coarse-grained reconfigurable array architectures

    Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code

    Verifying service continuity in a satellite reconfiguration procedure: application to a satellite

    The paper discusses the use of the TURTLE UML profile to model and verify service continuity during dynamic reconfiguration of embedded software, and space-based telecommunication software in particular. TURTLE extends UML class diagrams with composition operators, and activity diagrams with temporal operators. Translating TURTLE to the formal description technique RT-LOTOS gives the profile a formal semantics and makes it possible to reuse verification techniques implemented by the RTL, the RT-LOTOS toolkit developed at LAAS-CNRS. The paper proposes a modeling and formal validation methodology based on TURTLE and RTL, and discusses its application to a payload software application in charge of an embedded packet switch. The paper demonstrates the benefits of using TURTLE to prove service continuity for dynamic reconfiguration of embedded software

    Execution modeling in self-aware FPGA-based architectures for efficient resource management

    SRAM-based FPGAs have significantly improved their performance and size with the use of newer and ultra-deep-submicron technologies, even though power consumption, together with a time-consuming initial configuration process, are still major concerns when targeting energy-efficient solutions. System self-awareness enables the use of strategies to enhance system performance and power optimization taking into account run-time metrics. This is of particular importance when dealing with reconfigurable systems that may make use of such information for efficient resource management, such as in the case of the ARTICo3 architecture, which fosters dynamic execution of kernels formed by multiple blocks of threads allocated in a variable number of hardware accelerators, combined with module redundancy for fault tolerance and other dependability enhancements, e.g. side-channel-attack protection. In this paper, a model for efficient dynamic resource management focused on both power consumption and execution times in the ARTICo3 architecture is proposed. The approach enables the characterization of kernel execution by using the model, providing additional decision criteria based on energy efficiency, so that resource allocation and scheduling policies may adapt to changing conditions. Two different platforms have been used to validate the proposal and show the generalization of the model: a high-performance wireless sensor node based on a Spartan-6 and a standard off-the-shelf development board based on a Kintex-7