13 research outputs found

    Module Relocation in Heterogeneous Reconfigurable Systems-on-Chip using the Xilinx Isolation Design Flow

    Get PDF
    International audienceHeterogeneous Reconfigurable Systems-on-Chip (HRSoC) contain as their name suggests, heterogeneous processing elements in a single chip. Namely, several processors, hardware accelerators as well as communication networks between all these components. In order to leverage the programming complexity of this kind of platform, applications are described with software threads, running on processors, and hardware threads, running on FPGA partitions. Combining techniques such as dynamic and partial reconfiguration and partial readback with the knowledge of the bitstream structure offer the ability to target several partitions using a unique configuration file. Such a feature permits to save critical memory resources. In this article, we propose to tackle the issue of designing fully independent partitions, and especially to avoid the routing conflicts which can occur when using the standard Xilinx FPGA design flow. To achieve the relocation process successfully, we propose a new design flow dedicated to the module relocation, using the standard tools and based on the Isolation Design Flow (IDF), a special flow provided by Xilinx for secure FPGA applications

    Modélisation, exploration et estimation de la consommation pour les architectures hétérogènes reconfigurables dynamiquement

    Get PDF
    L'utilisation des accélérateurs reconfigurables, pour la conception de system-on-chip hétérogènes, offre des possibilités intéressantes d'augmentation des performances et de réduction de la consommation d'énergie. En effet, ces accélérateurs sont couramment utilisés en complément d'un (ou de plusieurs) processeur(s) pour permettre de décharger celui-ci (ceux-ci) des calculs intensifs et des traitements de flots de données. Le concept de reconfiguration dynamique, supporté par certains constructeurs de FPGA, permet d'envisager des systèmes beaucoup plus flexibles en offrant notamment la possibilité de séquencer temporellement l'exécution de blocs de calcul sur la même surface de silicium, réduisant alors les besoins en ressources d'exécution. Cependant, la reconfiguration dynamique n'est pas sans impact sur les performances globales du système et il est difficile d'estimer la répercussion des décisions de configuration sur la consommation d'énergie. L'objectif principal de cette thèse consiste à proposer une méthodologie d'exploration permettant d'évaluer l'impact des choix d'implémentation des différentes tâches d'une application sur un system-on-chip contenant une ressource reconfigurable dynamiquement, en vue d'optimiser la consommation d'énergie ou le temps d'exécution. Pour cela, nous avons établi des modèles de consommation des composants reconfigurables, en particulier les FPGAs, qui permettent d'aider le concepteur dans son design. À l'aide d'une méthodologie de mesure sur Virtex-5, nous montrons dans un premier temps qu'il est possible de générer des accélérateurs matériels de tailles variées ayant des performances temporelles et énergétiques diverses. Puis, afin de quantifier les coûts d'implémentation de ces accélérateurs, nous construisons trois modèles de consommation de la reconfiguration dynamique partielle. Finalement, à partir des modèles définis et des accélérateurs produits, nous développons un algorithme d'exploration des solutions d'implémentation pour un système complet. En s'appuyant sur une plate-forme de modélisation à haut niveau, celui-ci analyse les coûts d'implémentation des tâches et leur exécution sur les différentes ressources disponibles (processeur ou région configurable). Les solutions offrant les meilleures performances en fonction des contraintes de conception sont retenues pour être exploitées.The use of reconfigurable accelerators when designing heterogeneous system-on-chip has the potential to increase performance and reduce energy consumption. Indeed, these accelerators are commonly a adjunct to one (or more) processor(s) and unload intensive computations and treatments. The concept of dynamic reconfiguration, supported by some FPGA vendors, allows to consider more flexible systems including the ability to sequence the execution of accelerators on the same silicon area, while reducing resource requirements. However, dynamic reconfiguration may impact overall system performance and it is hard to estimate the impact of configuration decisions on energy consumption.. The main objective of this thesis is to provide an exploration methodology to assess the impact of implementation choices of tasks of an application on a system-on-chip containing a dynamically reconfigurable resource, to optimize the energy consumption or the processing time. Therefore, we have established consumption models of reconfigurable components, particularly FPGAs, which assists the designer. Using a measurement methodology on Virtex-5, we first show the possibility to generate hardware accelerators of various sizes, execution time and energy consumption. Then, in order to quantify the implementation costs of these accelerators, we build three power models of the dynamic and partial reconfiguration. Finally, from these models, we develop an algorithm for the exploration of implementation and allocation possibilities for a complete system. Based on a high-level modeling platform, the implementation costs of the tasks and their performance on various resources (CPU or reconfigurable region) are analyzed. The solutions with the best characteristics, based on design constraints, are extracted.RENNES1-Bibl. électronique (352382106) / SudocSudocFranceF

    Circuit Merging versus Dynamic Partial Reconfiguration -The HoMade Implementation

    Get PDF
    International audienceOne goal of reconfiguration is to save power and occupied resources. In this paper we compare two different kinds of reconfiguration available on field-programmable gate arrays (FPGA) and we discuss their pros and cons. The first method that we study is circuit merging. This type of reconfiguration methods consists in sharing common resources between different circuits. The second method that we explore is dynamic partial reconfiguration (DPR). It is specific to some FPGA, allowing well defined reconfigurable parts to be modified during run-time. We show that DPR, when available, has good and more predictable result in terms of occupied area. There is still a huge overhead in term of time and power consumption during the reconfiguration phase. Therefore we show that circuit merging remains an interesting solution on FPGA because it is not vendor specific and the reconfiguration time is around a clock cycle. Besides, good merging algorithms exist even-though FPGA physical synthesis flow makes it hard to predict the real performance of the merged circuit during the optimization. We establish our comparison in the context of the HoMade processor

    Analysis of the reconfiguration latency and energy overheads for a Xilinx Virtex-5 FPGA

    Get PDF
    In this paper we have evaluated the overhead and the tradeoffs of a set of components usually included in a system with run-time partial reconfiguration implemented on a Xilinx Virtex-5. Our analysis shows the benefits of including a scratchpad memory inside the reconfiguration controller in order to improve the efficiency of the reconfiguration process. We have designed a simple controller for this scratchpad that includes support for prefetching and caching in order to further reduce both the energy and latency overhead

    Hardware Architectural Support for Caching Partitioned Reconfigurations in Reconfigurable Systems

    Get PDF
    The efficiency of the reconfiguration process in modern FPGAs can improve drastically if an on-chip configuration memory is included in the system because it can reduce both the reconfiguration latency and its energy consumption. However, FPGA on-chip memory resources are very limited. Thus, it is very important to manage them effectively in order to improve the reconfiguration process as much as possible even when the size of the on-chip configuration memory is small. This paper presents a hardware implementation of an on-chip configuration memory controller that efficiently manages run-time reconfigurations. In order to optimize the use of the on-chip memory, this controller includes support to deal with configurations that have been divided into blocks of customizable size. When a reconfiguration must be carried out, our controller provides the blocks stored on-chip and looks for the remaining blocks by accessing to the off-chip configuration memory. Moreover, it dynamically decides which blocks must be stored on-chip. To this end, the designed controller implements a simple but efficient technique that allows maximizing the benefits of the on-chip memories. Experimental results will demonstrate that its implementation cost is very affordable and that it introduces negligible run-time management overheads

    Zynq-Based Reconfigurable System for Real-Time Edge Detection of Noisy Video Sequences

    Get PDF
    We implement Zynq-based self-reconfigurable system to perform real-time edge detection of 1080p video sequences. While object edge detection is a fundamental tool in computer vision, noises in the video frames negatively affect edge detection results significantly. Moreover, due to the high computational complexity of 1080p video filtering operations, hardware implementation on reconfigurable hardware fabric is necessary. Here, the proposed embedded system utilizes dynamic reconfiguration capability of Zynq SoC so that partial reconfiguration of different filter bitstreams is performed during run-time according to the detected noise density level in the incoming video frames. Pratt’s Figure of Merit (PFOM) to evaluate the accuracy of edge detection is analyzed for various noise density levels, and we demonstrate that adaptive run-time reconfiguration of the proposed filter bitstreams significantly increases the accuracy of edge detection results while efficiently providing computing power to support real-time processing of 1080p video frames. Performance results on configuration time, CPU usage, and hardware resource utilization are also compared

    Design methodology addressing static/reconfigurable partitioning optimizing software defined radio (SDR) implementation through FPGA dynamic partial reconfiguration and rapid prototyping tools

    Get PDF
    The characteristics people request for communication devices become more and more demanding every day. And not only in those aspects dealing with communication speed, but also in such different characteristics as different communication standards compatibility, battery life, device size or price. Moreover, when this communication need is addressed by the industrial world, new characteristics such as reliability, robustness or time-to-market appear. In this context, Software Defined Radios (SDR) and evolutions such as Cognitive Radios or Intelligent Radios seem to be the technological answer that will satisfy all these requirements in a short and mid-term. Consequently, this PhD dissertation deals with the implementation of this type of communication system. Taking into account that there is no limitation neither in the implementation architecture nor in the target device, a novel framework for SDR implementation is proposed. This framework is made up of FPGAs, using dynamic partial reconfiguration, as target device and rapid prototyping tools as designing tool. Despite the benefits that this framework generates, there are also certain drawbacks that need to be analyzed and minimized to the extent possible. On this purpose, a SDR design methodology has been designed and tested. This methodology addresses the static/reconfigurable partitioning of the SDRs in order to optimize their implementation in the aforementioned framework. In order to verify the feasibility of both the design framework and the design methodology, several implementations have been carried out making use of them. A multi-standard modulator implementing WiFi, WiMAX and UMTS, a small-form-factor cognitive video transmission system and the implementation of several data coding functions over R3TOS, a hardware operating system developed by the University of Edinburgh, are these implementations.Las características que la gente exige a los dispositivos de comunicaciones son cada día más exigentes. Y no solo en los aspectos relacionados con la velocidad de comunicación, sino que también en diferentes características como la compatibilidad con diferentes estándares de comunicación, autonomía, tamaño o precio. Es más, cuando esta necesidad de comunicación se traslada al mundo industrial, aparecen nuevas características como fiabilidad, robustez o plazo de comercialización que también es necesario cubrir. En este contexto, las Radios Definidas por Software (SDR) y evoluciones como las Radios Cognitivas o Radios Inteligentes parecen la respuesta tecnológica que va a satisfacer estas necesidades a corto y medio plazo. Por ello, esta tesis doctoral aborda la implementación de este tipo de sistemas de comunicaciones. Teniendo en cuenta que no existe una limitación, ni en la arquitectura de implementación, ni en el tipo de dispositivo a usar, se propone un nuevo entrono de diseño formado por las FPGAs, haciendo uso de la reconfiguración parcial dinámica, y por las herramientas de prototipado rápido. A pesar de que este entorno de diseño ofrece varios beneficios, también genera algunos inconvenientes que es necesario analizar y minimizar en la medida de lo posible. Con este objetivo, se ha diseñado y verificado una metodología de diseño de SDRs. Esta metodología se encarga del particionado estático/reconfigurable de las SDRs para optimizar su implementación sobre el entrono de diseño antes comentado. Para verificar la viabilidad tanto del entorno, como de la metodología de diseño propuesta, se han realizado varias implementaciones que hacen uso de ambas cosas. Estas implementaciones son: un modulador multi-estándar que implementa WiFi, WiMAX y UMTS, un sistema cognitivo y compacto de transmisión de video y la implementación de varias funciones de codificación de datos sobre R3TOS, un sistema operativo hardware desarrollado por la Universidad de Edimburgo

    Enhancing Real-time Embedded Image Processing Robustness on Reconfigurable Devices for Critical Applications

    Get PDF
    Nowadays, image processing is increasingly used in several application fields, such as biomedical, aerospace, or automotive. Within these fields, image processing is used to serve both non-critical and critical tasks. As example, in automotive, cameras are becoming key sensors in increasing car safety, driving assistance and driving comfort. They have been employed for infotainment (non-critical), as well as for some driver assistance tasks (critical), such as Forward Collision Avoidance, Intelligent Speed Control, or Pedestrian Detection. The complexity of these algorithms brings a challenge in real-time image processing systems, requiring high computing capacity, usually not available in processors for embedded systems. Hardware acceleration is therefore crucial, and devices such as Field Programmable Gate Arrays (FPGAs) best fit the growing demand of computational capabilities. These devices can assist embedded processors by significantly speeding-up computationally intensive software algorithms. Moreover, critical applications introduce strict requirements not only from the real-time constraints, but also from the device reliability and algorithm robustness points of view. Technology scaling is highlighting reliability problems related to aging phenomena, and to the increasing sensitivity of digital devices to external radiation events that can cause transient or even permanent faults. These faults can lead to wrong information processed or, in the worst case, to a dangerous system failure. In this context, the reconfigurable nature of FPGA devices can be exploited to increase the system reliability and robustness by leveraging Dynamic Partial Reconfiguration features. The research work presented in this thesis focuses on the development of techniques for implementing efficient and robust real-time embedded image processing hardware accelerators and systems for mission-critical applications. Three main challenges have been faced and will be discussed, along with proposed solutions, throughout the thesis: (i) achieving real-time performances, (ii) enhancing algorithm robustness, and (iii) increasing overall system's dependability. In order to ensure real-time performances, efficient FPGA-based hardware accelerators implementing selected image processing algorithms have been developed. Functionalities offered by the target technology, and algorithm's characteristics have been constantly taken into account while designing such accelerators, in order to efficiently tailor algorithm's operations to available hardware resources. On the other hand, the key idea for increasing image processing algorithms' robustness is to introduce self-adaptivity features at algorithm level, in order to maintain constant, or improve, the quality of results for a wide range of input conditions, that are not always fully predictable at design-time (e.g., noise level variations). This has been accomplished by measuring at run-time some characteristics of the input images, and then tuning the algorithm parameters based on such estimations. Dynamic reconfiguration features of modern reconfigurable FPGA have been extensively exploited in order to integrate run-time adaptivity into the designed hardware accelerators. Tools and methodologies have been also developed in order to increase the overall system dependability during reconfiguration processes, thus providing safe run-time adaptation mechanisms. In addition, taking into account the target technology and the environments in which the developed hardware accelerators and systems may be employed, dependability issues have been analyzed, leading to the development of a platform for quickly assessing the reliability and characterizing the behavior of hardware accelerators implemented on reconfigurable FPGAs when they are affected by such faults

    Towards the development of flexible, reliable, reconfigurable, and high-performance imaging systems

    Get PDF
    Current FPGAs can implement large systems because of the high density of reconfigurable logic resources in a single chip. FPGAs are comprehensive devices that combine flexibility and high performance in the same platform compared to other platform such as General-Purpose Processors (GPPs) and Application Specific Integrated Circuits (ASICs). The flexibility of modern FPGAs is further enhanced by introducing Dynamic Partial Reconfiguration (DPR) feature, which allows for changing the functionality of part of the system while other parts are functioning. FPGAs became an important platform for digital image processing applications because of the aforementioned features. They can fulfil the need of efficient and flexible platforms that execute imaging tasks efficiently as well as the reliably with low power, high performance and high flexibility. The use of FPGAs as accelerators for image processing outperforms most of the current solutions. Current FPGA solutions can to load part of the imaging application that needs high computational power on dedicated reconfigurable hardware accelerators while other parts are working on the traditional solution to increase the system performance. Moreover, the use of the DPR feature enhances the flexibility of image processing further by swapping accelerators in and out at run-time. The use of fault mitigation techniques in FPGAs enables imaging applications to operate in harsh environments following the fact that FPGAs are sensitive to radiation and extreme conditions. The aim of this thesis is to present a platform for efficient implementations of imaging tasks. The research uses FPGAs as the key component of this platform and uses the concept of DPR to increase the performance, flexibility, to reduce the power dissipation and to expand the cycle of possible imaging applications. In this context, it proposes the use of FPGAs to accelerate the Image Processing Pipeline (IPP) stages, the core part of most imaging devices. The thesis has a number of novel concepts. The first novel concept is the use of FPGA hardware environment and DPR feature to increase the parallelism and achieve high flexibility. The concept also increases the performance and reduces the power consumption and area utilisation. Based on this concept, the following implementations are presented in this thesis: An implementation of Adams Hamilton Demosaicing algorithm for camera colour interpolation, which exploits the FPGA parallelism to outperform other equivalents. In addition, an implementation of Automatic White Balance (AWB), another IPP stage that employs DPR feature to prove the mentioned novelty aspects. Another novel concept in this thesis is presented in chapter 6, which uses DPR feature to develop a novel flexible imaging system that requires less logic and can be implemented in small FPGAs. The system can be employed as a template for any imaging application with no limitation. Moreover, discussed in this thesis is a novel reliable version of the imaging system that adopts novel techniques including scrubbing, Built-In Self Test (BIST), and Triple Modular Redundancy (TMR) to detect and correct errors using the Internal Configuration Access Port (ICAP) primitive. These techniques exploit the datapath-based nature of the implemented imaging system to improve the system's overall reliability. The thesis presents a proposal for integrating the imaging system with the Robust Reliable Reconfigurable Real-Time Heterogeneous Operating System (R4THOS) to get the best out of the system. The proposal shows the suitability of the proposed DPR imaging system to be used as part of the core system of autonomous cars because of its unbounded flexibility. These novel works are presented in a number of publications as shown in section 1.3 later in this thesis

    Hardware/Software Codesign of Embedded Systems with Reconfigurable and Heterogeneous Platforms

    Full text link
    corecore