84 research outputs found
A unified hardware/software runtime environment for FPGA-based reconfigurable computers using BORPH
Fulltext linkThis paper explores the design and implementation of BORPH, an operating system designed for FPGA-based reconfigurable computers. Hardware designs execute as normal UNIX processes under BORPH, having access to standard OS services, such as file system support. Hardware and software components of user designs may, therefore, run as communicating processes within BORPH's runtime environment. The familiar language independent UNIX kernel interface facilitates easy design reuse and rapid application development. To develop hardware designs, a Simulink-based design flow that integrates with BORPH is employed. Performances of BORPH on two on-chip systems implemented on a BEE2 platform are compared. © 2008 ACM.link_to_subscribed_fulltex
Design and Programming Methods for Reconfigurable Multi-Core Architectures using a Network-on-Chip-Centric Approach
A current trend in the semiconductor industry is the use of Multi-Processor Systems-on-Chip (MPSoCs) for a wide variety of applications such as image processing, automotive, multimedia, and robotic systems. Most applications gain performance advantages by executing parallel tasks on multiple processors due to the inherent parallelism. Moreover, heterogeneous structures provide high performance/energy efficiency, since application-specific processing elements (PEs) can be exploited. The increasing number of heterogeneous PEs leads to challenging communication requirements. To overcome this challenge, Networks-on-Chip (NoCs) have emerged as scalable on-chip interconnect. Nevertheless, NoCs have to deal with many design parameters such as virtual channels, routing algorithms and buffering techniques to fulfill the system requirements.
This thesis highly contributes to the state-of-the-art of FPGA-based MPSoCs and NoCs. In the following, the three major contributions are introduced.
As a first major contribution, a novel router concept is presented that efficiently utilizes communication times by performing sequences of arithmetic operations on the data that is transferred. The internal input buffers of the routers are exchanged with processing units that are capable of executing operations. Two different architectures of such processing units are presented. The first architecture provides multiply and accumulate operations which are often used in signal processing applications. The second architecture introduced as Application-Specific Instruction Set Routers (ASIRs) contains a processing unit capable of executing any operation and hence, it is not limited to multiply and accumulate operations. An internal processing core located in ASIRs can be developed in C/C++ using high-level synthesis.
The second major contribution comprises application and performance explorations of the novel router concept. Models that approximate the achievable speedup and the end-to-end latency of ASIRs are derived and discussed to show the benefits in terms of performance. Furthermore, two applications using an ASIR-based MPSoC are implemented and evaluated on a Xilinx Zynq SoC. The first application is an image processing algorithm consisting of a Sobel filter, an RGB-to-Grayscale conversion, and a threshold operation. The second application is a system that helps visually impaired people by navigating them through unknown indoor environments. A Light Detection and Ranging (LIDAR) sensor scans the environment, while Inertial Measurement Units (IMUs) measure the orientation of the user to generate an audio signal that makes the distance as well as the orientation of obstacles audible. This application consists of multiple parallel tasks that are mapped to an ASIR-based MPSoC. Both applications show the performance advantages of ASIRs compared to a conventional NoC-based MPSoC. Furthermore, dynamic partial reconfiguration in terms of relocation and security aspects are investigated.
The third major contribution refers to development and programming methodologies of NoC-based MPSoCs. A software-defined approach is presented that combines the design and programming of heterogeneous MPSoCs. In addition, a Kahn-Process-Network (KPN) –based model is designed to describe parallel applications for MPSoCs using ASIRs. The KPN-based model is extended to support not only the mapping of tasks to NoC-based MPSoCs but also the mapping to ASIR-based MPSoCs. A static mapping methodology is presented that assigns tasks to ASIRs and processors for a given KPN-model. The impact of external hardware components such as sensors, actuators and accelerators connected to the processors is also discussed which makes the approach of high interest for embedded systems
Operating System Concepts for Reconfigurable Computing: Review and Survey
One of the key future challenges for reconfigurable computing is to enable higher design productivity and a more easy way to use reconfigurable computing systems for users that are unfamiliar with the underlying concepts. One way of doing this is to provide standardization and abstraction, usually supported and enforced by an operating system. This article gives historical review and a summary on ideas and key concepts to include reconfigurable computing aspects in operating systems. The article also presents an overview on published and available operating systems targeting the area of reconfigurable computing. The purpose of this article is to identify and summarize common patterns among those systems that can be seen as de facto standard. Furthermore, open problems, not covered by these already available systems, are identified
RTRLIB : a high-level modeling tool for dynamically partially reconfigurable systems
Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2020.Reconfiguração dinâmica parcial é considerada uma interessante técnica a ser aplicada para o
aumento da flexibilidade de sistemas implementados em FPGA, em função da implementação
dinâmica de módulos de hardware enquanto o restante do circuito permanece em operação. Trata-
se de uma técnica utilizada em sistemas com requisitos muito restritos, como adaptabilidade,
robustez, consumo de potência, custo e tolerância à falhas. Entretanto, a complexidade de desen-
volvimento de sistemas com reconfiguração dinâmica parcial é consideravelmente alta quando
comparada à de sistemas com lógica totalmente estática. Nesse sentido, novas metodologias e
ferramentas de desenvolvimento são necessárias para reduzir a complexidade de implementação
desse tipo de sistema.
Nesse contexto, esse trabalho apresenta o RTRLib, uma ferramenta de modelagem em alto
nível para o desenvolvimento de sistemas com reconfiguração dinâmica parcial em dispositivos
Xilinx Zynq a partir da especificação e parametrização de alguns blocos. Sob condições específi-
cas, o RTRLib automaticamante produz os scripts de hardware e software para implementação da
solução utilizando o Vivado Design Suite e o SDK. Tais scripts são compostos pelos comandos
necessários para a implementação do sistema desde a criação do projeto de hardware até a criação
do arquivo de boot. Uma vez que o RTRLib é composto por IP-Cores previamente caracterizados,
a ferramenta também pode ser utilizada para a análise, em fase de modelagem, do sistema a ser
implementado, por meio da estimação de características importantes do sistema, como o consumo
de recursos e latência.
O presente trabalho também inclui novas funcionalidades implementadas no RTRLib no con-
texto do design de hardware e de software, como: generalização do script de hardware, mapea-
mento de IO, floorplanning por meio de uma GUI, criação de um gerador de script de software,
gerador de template de aplicação standalone que faz uso do partial reconfiguration controller
(PRC) e implementação de uma biblioteca para aplicações FreeRTOS.
Por fim, quatro estudos de casos foram implementados para demonstrar as funcionalidades da
ferramenta: um sistema de classificação de terrenos baseado em redes neurais, um sistema com
regressores lineares utilizado para controle de uma prótese miocinética de mão e, por último, uma
aplicação hipotética de um sistema com requisitos de tempo real.Partial dynamic reconfiguration is considered an interesting technique to increase flexibility in
FPGA designs due to the dynamic replacement of hardware modules while the remainder of the
circuit remains in operation. It is used in systems with hard requirements such as adaptability,
robustness, power consumption, cost, and fault-tolerance. However, the complexity to develop
dynamically partially reconfigurable systems in considerably higher comparing with static de-
signs. Therefore, new design methodologies and tools have been required to reduce the design
complexity of such systems.
In this context, this work presents the RTRLib, a high-level modeling tool for the development
of dynamically reconfigurable systems on Xilinx Zynq devices by a simple system specification
and parametrization of some blocks. Under specific conditions, RTRLib automatically generates
the hardware and software scripts to implement the solution using Vivado and SDK. These scripts
are composed by the sequential design steps from hardware project creation to the boot image
elaboration. Since RTRLib is composed of pre-characterized IP-Cores, the tool also can be used
to analyze the system behavior during the design process by the early estimation of essential
characteristics of the system such as resource consumption and latency.
The present work also includes the new functionalities implemented on RTRLib in the context
of the hardware and the software design, such as: hardware script generalization, IO mapping,
floorplanning by a GUI, software script creation, generator of a standalone template application
that uses PRC, and implementation of a FreeRTOS library application.
Finally, four case studies were implemented to demonstrate the tool capability: a system
for terrain classification based on neuron networks, a linear regressor system used to control a
myokinetic-based prosthetic hand, and a hypothetical real-time application
Optimising and evaluating designs for reconfigurable hardware
Growing demand for computational performance, and the rising cost for chip design and
manufacturing make reconfigurable hardware increasingly attractive for digital system implementation.
Reconfigurable hardware, such as field-programmable gate arrays (FPGAs),
can deliver performance through parallelism while also providing flexibility to enable
application builders to reconfigure them. However, reconfigurable systems, particularly
those involving run-time reconfiguration, are often developed in an ad-hoc manner. Such
an approach usually results in low designer productivity and can lead to inefficient designs.
This thesis covers three main achievements that address this situation. The first
achievement is a model that captures design parameters of reconfigurable hardware and
performance parameters of a given application domain. This model supports optimisations
for several design metrics such as performance, area, and power consumption. The second
achievement is a technique that enhances the relocatability of bitstreams for reconfigurable
devices, taking into account heterogeneous resources. This method increases the flexibility
of modules represented by these bitstreams while reducing configuration storage size and
design compilation time. The third achievement is a technique to characterise the power
consumption of FPGAs in different activity modes. This technique includes the evaluation
of standby power and dedicated low-power modes, which are crucial in meeting the
requirements for battery-based mobile devices
Just In Time Assembly (JITA) - A Run Time Interpretation Approach for Achieving Productivity of Creating Custom Accelerators in FPGAs
The reconfigurable computing community has yet to be successful in allowing programmers to access FPGAs through traditional software development flows. Existing barriers that prevent programmers from using FPGAs include: 1) knowledge of hardware programming models, 2) the need to work within the vendor specific CAD tools and hardware synthesis. This thesis presents a series of published papers that explore different aspects of a new approach being developed to remove the barriers and enable programmers to compile accelerators on next generation reconfigurable manycore architectures. The approach is entitled Just In Time Assembly (JITA) of hardware accelerators. The approach has been defined to allow hardware accelerators to be built and run through software compilation and run time interpretation outside of CAD tools and without requiring each new accelerator to be synthesized. The approach advocates the use of libraries of pre-synthesized components that can be referenced through symbolic links in a similar fashion to dynamically linked software libraries. Synthesis still must occur but is moved out of the application programmers software flow and into the initial coding process that occurs when programming patterns that define a Domain Specific Language (DSL) are first coded. Programmers see no difference between creating software or hardware functionality when using the DSL. A new run time interpreter is introduced to assemble the individual pre-synthesized hardware accelerators that comprise the accelerator functionality within a configurable tile array of partially reconfigurable slots at run time. Quantitative results are presented that compares utilization, performance, and productivity of the approach to what would be achieved by full custom accelerators created through traditional CAD flows using hardware programming models and passing through synthesis
Dynamic partial reconfiguration management for high performance and reliability in FPGAs
Modern Field-Programmable Gate Arrays (FPGAs) are no longer used to implement
small “glue logic” circuitries. The high-density of reconfigurable logic resources in
today’s FPGAs enable the implementation of large systems in a single chip. FPGAs
are highly flexible devices; their functionality can be altered by simply loading a new
binary file in their configuration memory. While the flexibility of FPGAs is
comparable to General-Purpose Processors (GPPs), in the sense that different
functions can be performed using the same hardware, the performance gain that can
be achieved using FPGAs can be orders of magnitudes higher as FPGAs offer the
ability for customisation of parallel computational architectures.
Dynamic Partial Reconfiguration (DPR) allows for changing the functionality of
certain blocks on the chip while the rest of the FPGA is operational. DPR has
sparked the interest of researchers to explore new computational platforms where
computational tasks are off-loaded from a main CPU to be executed using dedicated
reconfigurable hardware accelerators configured on demand at run-time. By having a
battery of custom accelerators which can be swapped in and out of the FPGA at runtime,
a higher computational density can be achieved compared to static systems
where the accelerators are bound to fixed locations within the chip. Furthermore, the
ability of relocating these accelerators across several locations on the chip allows for
the implementation of adaptive systems which can mitigate emerging faults in the
FPGA chip when operating in harsh environments. By porting the appropriate fault
mitigation techniques in such computational platforms, the advantages of FPGAs can
be harnessed in different applications in space and military electronics where FPGAs
are usually seen as unreliable devices due to their sensitivity to radiation and extreme
environmental conditions.
In light of the above, this thesis investigates the deployment of DPR as: 1) a method
for enhancing performance by efficient exploitation of the FPGA resources, and 2) a
method for enhancing the reliability of systems intended to operate in harsh
environments. Achieving optimal performance in such systems requires an efficient
internal configuration management system to manage the reconfiguration and
execution of the reconfigurable modules in the FPGA. In addition, the system needs
to support “fault-resilience” features by integrating parameterisable fault detection
and recovery capabilities to meet the reliability standard of fault-tolerant
applications. This thesis addresses all the design and implementation aspects of an
Internal Configuration Manger (ICM) which supports a novel bitstream relocation
model to enable the placement of relocatable accelerators across several locations on
the FPGA chip. In addition to supporting all the configuration capabilities required to
implement a Reconfigurable Operating System (ROS), the proposed ICM also
supports the novel multiple-clone configuration technique which allows for cloning
several instances of the same hardware accelerator at the same time resulting in much
shorter configuration time compared to traditional configuration techniques. A faulttolerant
(FT) version of the proposed ICM which supports a comprehensive faultrecovery
scheme is also introduced in this thesis. The proposed FT-ICM is designed
with a much smaller area footprint compared to Triple Modular Redundancy (TMR)
hardening techniques while keeping a comparable level of fault-resilience.
The capabilities of the proposed ICM system are demonstrated with two novel
applications. The first application demonstrates a proof-of-concept reliable FPGA
server solution used for executing encryption/decryption queries. The proposed
server deploys bitstream relocation and modular redundancy to mitigate both
permanent and transient faults in the device. It also deploys a novel Built-In Self-
Test (BIST) diagnosis scheme, specifically designed to detect emerging permanent
faults in the system at run-time. The second application is a data mining application
where DPR is used to increase the computational density of a system used to
implement the Frequent Itemset Mining (FIM) problem
- …