44 research outputs found
Analysis of runtime re-configuration systems
In recent years Programmable Logic Devices (PLD) and in particular Field Programmable Gate Arrays (FPGAs) have seen a tremendous increase in sales and applications in the area of embedded systems. The main advantage of FPGAs is the flexibility that they offer a designer in reconfiguring the hardware. The flexibility achieved through re-configuration of FPGAs usually incurs an overhead of extra execution time, data memory and also power dissipation; FPGAs provide an ideal template for run-time reconfigurable (RTR) designs. Only recently have RTR enabling design tools that bypass the traditional synthesis and bitstream generation process for FPGAs become available, JBits is one of them. With run-time reconfiguration of FPGAs, we can perform partial reconfiguration, which allows reconfiguration of a part of an FPGA while the other part is executing some functional computation. The partial reconfiguration of a function can be performed earlier than the time when the function is really needed. Such configuration pre-fetch can hide the reconfiguration overhead more effectively; This thesis will implement a reconfigurable system and study the effect of runtime reconfiguration using VERILOG and a new Java based tool JBITS. This work will provide pointers to high level synthesis tools targeting runtime re-configuration
Dynamic Scheduling, Allocation, and Compaction Scheme for Real-Time Tasks on FPGAs
Run-time reconfiguration (RTR) is a method of computing on reconfigurable logic, typically FPGAs, changing hardware configurations from phase to phase of a computation at run-time. Recent research has expanded from a focus on a single application at a time to encompass a view of the reconfigurable logic as a resource shared among multiple applications or users. In real-time system design, task deadlines play an important role. Real-time multi-tasking systems not only need to support sharing of the resources in space, but also need to guarantee execution of the tasks. At the operating system level, sharing logic gates, wires, and I/O pins among multiple tasks needs to be managed. From the high level standpoint, access to the resources needs to be scheduled according to task deadlines. This thesis describes a task allocator for scheduling, placing, and compacting tasks on a shared FPGA under real-time constraints. Our consideration of task deadlines is novel in the setting of handling multiple simultaneous tasks in RTR. Software simulations have been conducted to evaluate the performance of the proposed scheme. The results indicate significant improvement by decreasing the number of tasks rejected
Design and resource management of reconfigurable multiprocessors for data-parallel applications
FPGA (Field-Programmable Gate Array)-based custom reconfigurable computing machines have established themselves as low-cost and low-risk alternatives to ASIC (Application-Specific Integrated Circuit) implementations and general-purpose microprocessors in accelerating a wide range of computation-intensive applications. Most often they are Application Specific Programmable Circuiits (ASPCs), which are developer programmable instead of user programmable. The major disadvantages of ASPCs are minimal programmability, and significant time and energy overheads caused by required hardware reconfiguration when the problem size outnumbers the available reconfigurable resources; these problems are expected to become more serious with increases in the FPGA chip size. On the other hand, dominant high-performance computing systems, such as PC clusters and SMPs (Symmetric Multiprocessors), suffer from high communication latencies and/or scalability problems.
This research introduces low-cost, user-programmable and reconfigurable MultiProcessor-on-a-Programmable-Chip (MPoPC) systems for high-performance, low-cost computing. It also proposes a relevant resource management framework that deals with performance, power consumption and energy issues. These semi-customized systems reduce significantly runtime device reconfiguration by employing userprogrammable processing elements that are reusable for different tasks in large, complex applications. For the sake of illustration, two different types of MPoPCs with hardware FPUs (floating-point units) are designed and implemented for credible performance evaluation and modeling: the coarse-grain MIMD (Multiple-Instruction, Multiple-Data) CG-MPoPC machine based on a processor IP (Intellectual Property) core and the mixed-mode (MIMD, SIMD or M-SIMD) variant-grain HERA (HEterogeneous Reconfigurable Architecture) machine. In addition to alleviating the above difficulties, MPoPCs can offer several performance and energy advantages to our data-parallel applications when compared to ASPCs; they are simpler and more scalable, and have less verification time and cost. Various common computation-intensive benchmark algorithms, such as matrix-matrix multiplication (MMM) and LU factorization, are studied and their parallel solutions are shown for the two MPoPCs. The performance is evaluated with large sparse real-world matrices primarily from power engineering. We expect even further performance gains on MPoPCs in the near future by employing ever improving FPGAs. The innovative nature of this work has the potential to guide research in this arising field of high-performance, low-cost reconfigurable computing.
The largest advantage of reconfigurable logic lies in its large degree of hardware customization and reconfiguration which allows reusing the resources to match the computation and communication needs of applications. Therefore, a major effort in the presented design methodology for mixed-mode MPoPCs, like HERA, is devoted to effective resource management. A two-phase approach is applied. A mixed-mode weighted Task Flow Graph (w-TFG) is first constructed for any given application, where tasks are classified according to their most appropriate computing mode (e.g., SIMD or MIMD). At compile time, an architecture is customized and synthesized for the TFG using an Integer Linear Programming (ILP) formulation and a parameterized hardware component library. Various run-time scheduling schemes with different performanceenergy objectives are proposed. A system-level energy model for HERA, which is based on low-level implementation data and run-time statistics, is proposed to guide performance-energy trade-off decisions. A parallel power flow analysis technique based on Newton\u27s method is proposed and employed to verify the methodology
Modeling and automated synthesis of reconfigurable interfaces
Stefan IhmorPaderborn, Univ., Diss., 200
Efficient reconfigurable architectures for 3D medical image compression
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Recently, the more widespread use of three-dimensional (3-D) imaging modalities,
such as magnetic resonance imaging (MRI), computed tomography (CT), positron
emission tomography (PET), and ultrasound (US) have generated a massive amount
of volumetric data. These have provided an impetus to the development of other
applications, in particular telemedicine and teleradiology. In these fields, medical
image compression is important since both efficient storage and transmission of data
through high-bandwidth digital communication lines are of crucial importance.
Despite their advantages, most 3-D medical imaging algorithms are computationally intensive with matrix transformation as the most fundamental operation involved in the transform-based methods. Therefore, there is a real need for high-performance systems, whilst keeping architectures exible to allow
for quick upgradeability with real-time applications. Moreover, in order to obtain
efficient solutions for large medical volumes data, an efficient implementation of
these operations is of significant importance. Reconfigurable hardware, in the form of field programmable gate arrays (FPGAs) has been proposed as viable system
building block in the construction of high-performance systems at an economical price.
Consequently, FPGAs seem an ideal candidate to harness and exploit their inherent
advantages such as massive parallelism capabilities, multimillion gate counts, and
special low-power packages. The key achievements of the work presented in this thesis are summarised as follows. Two architectures for 3-D Haar wavelet transform (HWT) have been proposed based on transpose-based computation and partial reconfiguration suitable for 3-D medical imaging applications. These applications require continuous hardware servicing, and as a result dynamic partial reconfiguration (DPR) has been introduced. Comparative study for both non-partial and partial reconfiguration implementation has shown that DPR offers many advantages and leads to a compelling solution for implementing computationally intensive applications such as 3-D medical image compression. Using DPR, several large systems are mapped to small hardware resources, and the area, power consumption as well as maximum frequency are
optimised and improved. Moreover, an FPGA-based architecture of the finite Radon transform (FRAT)with three design strategies has been proposed: direct implementation of pseudo-code with a sequential or pipelined description, and block random access memory (BRAM)- based method. An analysis with various medical imaging modalities has been carried out. Results obtained for image de-noising implementation using FRAT exhibits
promising results in reducing Gaussian white noise in medical images. In terms of
hardware implementation, promising trade-offs on maximum frequency, throughput
and area are also achieved. Furthermore, a novel hardware implementation of 3-D medical image compression system with context-based adaptive variable length coding (CAVLC)
has been proposed. An evaluation of the 3-D integer transform (IT) and the discrete
wavelet transform (DWT) with lifting scheme (LS) for transform blocks reveal that
3-D IT demonstrates better computational complexity than the 3-D DWT, whilst
the 3-D DWT with LS exhibits a lossless compression that is significantly useful for
medical image compression. Additionally, an architecture of CAVLC that is capable
of compressing high-definition (HD) images in real-time without any buffer between
the quantiser and the entropy coder is proposed. Through a judicious parallelisation, promising results have been obtained with limited resources. In summary, this research is tackling the issues of massive 3-D medical volumes data that requires compression as well as hardware implementation to accelerate the
slowest operations in the system. Results obtained also reveal a significant achievement in terms of the architecture efficiency and applications performance.Ministry of Higher Education Malaysia (MOHE),
Universiti Tun Hussein Onn Malaysia (UTHM) and the British Counci
RTRLIB : a high-level modeling tool for dynamically partially reconfigurable systems
Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2020.Reconfiguração dinâmica parcial é considerada uma interessante técnica a ser aplicada para o
aumento da flexibilidade de sistemas implementados em FPGA, em função da implementação
dinâmica de módulos de hardware enquanto o restante do circuito permanece em operação. Trata-
se de uma técnica utilizada em sistemas com requisitos muito restritos, como adaptabilidade,
robustez, consumo de potência, custo e tolerância à falhas. Entretanto, a complexidade de desen-
volvimento de sistemas com reconfiguração dinâmica parcial é consideravelmente alta quando
comparada à de sistemas com lógica totalmente estática. Nesse sentido, novas metodologias e
ferramentas de desenvolvimento são necessárias para reduzir a complexidade de implementação
desse tipo de sistema.
Nesse contexto, esse trabalho apresenta o RTRLib, uma ferramenta de modelagem em alto
nível para o desenvolvimento de sistemas com reconfiguração dinâmica parcial em dispositivos
Xilinx Zynq a partir da especificação e parametrização de alguns blocos. Sob condições específi-
cas, o RTRLib automaticamante produz os scripts de hardware e software para implementação da
solução utilizando o Vivado Design Suite e o SDK. Tais scripts são compostos pelos comandos
necessários para a implementação do sistema desde a criação do projeto de hardware até a criação
do arquivo de boot. Uma vez que o RTRLib é composto por IP-Cores previamente caracterizados,
a ferramenta também pode ser utilizada para a análise, em fase de modelagem, do sistema a ser
implementado, por meio da estimação de características importantes do sistema, como o consumo
de recursos e latência.
O presente trabalho também inclui novas funcionalidades implementadas no RTRLib no con-
texto do design de hardware e de software, como: generalização do script de hardware, mapea-
mento de IO, floorplanning por meio de uma GUI, criação de um gerador de script de software,
gerador de template de aplicação standalone que faz uso do partial reconfiguration controller
(PRC) e implementação de uma biblioteca para aplicações FreeRTOS.
Por fim, quatro estudos de casos foram implementados para demonstrar as funcionalidades da
ferramenta: um sistema de classificação de terrenos baseado em redes neurais, um sistema com
regressores lineares utilizado para controle de uma prótese miocinética de mão e, por último, uma
aplicação hipotética de um sistema com requisitos de tempo real.Partial dynamic reconfiguration is considered an interesting technique to increase flexibility in
FPGA designs due to the dynamic replacement of hardware modules while the remainder of the
circuit remains in operation. It is used in systems with hard requirements such as adaptability,
robustness, power consumption, cost, and fault-tolerance. However, the complexity to develop
dynamically partially reconfigurable systems in considerably higher comparing with static de-
signs. Therefore, new design methodologies and tools have been required to reduce the design
complexity of such systems.
In this context, this work presents the RTRLib, a high-level modeling tool for the development
of dynamically reconfigurable systems on Xilinx Zynq devices by a simple system specification
and parametrization of some blocks. Under specific conditions, RTRLib automatically generates
the hardware and software scripts to implement the solution using Vivado and SDK. These scripts
are composed by the sequential design steps from hardware project creation to the boot image
elaboration. Since RTRLib is composed of pre-characterized IP-Cores, the tool also can be used
to analyze the system behavior during the design process by the early estimation of essential
characteristics of the system such as resource consumption and latency.
The present work also includes the new functionalities implemented on RTRLib in the context
of the hardware and the software design, such as: hardware script generalization, IO mapping,
floorplanning by a GUI, software script creation, generator of a standalone template application
that uses PRC, and implementation of a FreeRTOS library application.
Finally, four case studies were implemented to demonstrate the tool capability: a system
for terrain classification based on neuron networks, a linear regressor system used to control a
myokinetic-based prosthetic hand, and a hypothetical real-time application