A Functionality-Based Runtime Relocation System for Circuits on Heterogeneous FPGAs
Runtime relocation of circuits on field-programmable gate arrays (FPGAs) has been proposed for achieving many desirable features including fault tolerance, defragmentation, and system load balancing. However, changes in the architectural composition of FPGAs have made relocation more challenging, mainly because FPGAs have become more heterogeneous. Previous and state-of-the-art circuit relocation systems on FPGAs have relied only on direct bitstream relocation, which requires the source and destination resource layouts to be the same, as well as access to the design bitstream for manipulation. Hence, their efficiency is greatly reduced on modern heterogeneous chips, and they mostly cannot be applied to encrypted bitstreams of intellectual property blocks. In this brief, we present a circuit relocator which augments direct bitstream relocation with a functionality-based relocation scheme. We demonstrate the feasibility of the proposed technique using a CORDIC application and show that the number of relocations increases by over 2.6-fold on average compared to direct bitstream relocation alone, at the expense of a small memory overhead and a manageable relocation time for this case study.
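The layout constraint described above can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the paper: the fabric is a one-dimensional list of column resource types, a circuit footprint is a list of the column types it needs, and "functionality-based" relocation is approximated by a set of hypothetical, functionally equivalent footprints. The names FABRIC, direct_sites and functional_sites are illustrative only, and the counts say nothing about the 2.6-fold figure reported for the CORDIC case study.

```python
from typing import List

# Hypothetical fabric: one resource type per column (an assumption for illustration).
FABRIC: List[str] = ["CLB", "CLB", "BRAM", "CLB", "CLB", "DSP", "CLB", "CLB", "BRAM", "CLB"]


def direct_sites(fabric: List[str], footprint: List[str]) -> List[int]:
    """Start columns where the fabric layout matches the footprint exactly.

    This mirrors the constraint of direct bitstream relocation: the source
    and destination resource layouts must be the same.
    """
    width = len(footprint)
    return [
        start
        for start in range(len(fabric) - width + 1)
        if fabric[start:start + width] == footprint
    ]


def functional_sites(fabric: List[str], variants: List[List[str]]) -> List[int]:
    """Start columns where at least one functionally equivalent variant fits."""
    sites = set()
    for variant in variants:
        sites.update(direct_sites(fabric, variant))
    return sorted(sites)


# Original implementation uses two CLB columns followed by a BRAM column.
original = ["CLB", "CLB", "BRAM"]
# Hypothetical alternative implementation of the same function using a DSP column.
variants = [original, ["CLB", "CLB", "DSP"]]

print(direct_sites(FABRIC, original))      # candidate sites for direct relocation
print(functional_sites(FABRIC, variants))  # candidate sites with equivalent variants
```

In this toy model, the extra candidate sites come only from allowing a different resource mix for the same function; a real relocator would also have to account for routing, clocking and configuration-frame constraints, which are omitted here.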
Interconnect architectures for dynamically partially reconfigurable systems
Dynamically partially reconfigurable FPGAs (Field-Programmable Gate Arrays) allow
hardware modules to be placed and removed at runtime while other parts of the system
keep working. With their potential benefits, they have been the topic of a great
deal of research over the last decade. To exploit the partial reconfiguration capability of
FPGAs, there is a need for efficient, dynamically adaptive communication infrastructure
that automatically adapts as modules are added to and removed from the system.
Many bus and network-on-chip (NoC) architectures have been proposed to exploit this
capability on FPGA technology. However, few realizations have been reported in the
public literature to demonstrate or compare their performance in real world applications.
While partial reconfiguration can offer many benefits, it is still rarely exploited in practical
applications. Few full realizations of partially reconfigurable systems in current
FPGA technologies have been published. More application experiments are required to
understand the benefits and limitations of implementing partially reconfigurable systems
and to guide their further development. The motivation of this thesis is to fill this
research gap by providing empirical evidence of the cost and benefits of different interconnect
architectures. The results will provide a baseline for future research and will
be directly useful for circuit designers who must make a well-reasoned choice between
the alternatives.
This thesis contains the results of experiments to compare different NoC and bus interconnect
architectures for FPGA-based designs in general and dynamically partially
reconfigurable systems. These two interconnect schemes are implemented and evaluated
in terms of performance, area and power consumption using FFT (Fast Fourier
Transform) and ANN (Artificial Neural Network) systems as benchmarks. Conclusions
drawn from these results include recommendations concerning the interconnect approach
for different kinds of applications. It is found that a NoC provides much better
performance than a single channel bus and similar performance to a multi-channel bus
in both parallel and parallel-pipelined FFT systems. This suggests that a NoC is a better choice for systems with multiple simultaneous communications like the FFT. Bus-based
interconnect achieves better performance and consumes less area and power than the NoC-based
scheme for the fully-connected feed-forward NN system. This suggests buses
are a better choice for systems that do not require many simultaneous communications
or systems with broadcast communications like a fully-connected feed-forward NN.
Results from the experiments with dynamic partial reconfiguration demonstrate that
buses have the advantages of better resource utilization and smaller reconfiguration
time and memory than NoCs. However, NoCs are more flexible and expansible. They
have the advantage of placing almost all of the communication infrastructure in the
dynamic reconfiguration region. This means that different applications running on the
FPGA can use different interconnection strategies without the overhead of fixed bus
resources in the static region.
Another objective of the research is to examine the partial reconfiguration process and
reconfiguration overhead with current FPGA technologies. Partial reconfiguration allows
users to efficiently change the number of running PEs to choose an optimal power-performance
operating point at the minimum cost of reconfiguration. However, this
brings drawbacks including resource utilization inefficiency, power consumption overhead
and a decrease in system operating frequency. The experimental results report a
resource utilization inefficiency of 50%, with a power consumption overhead of less
than 5% and a decrease in frequency of up to 32% compared to a static implementation.
The results also show that most of the drawbacks of partial reconfiguration implementation
come from the restrictions and limitations of partial reconfiguration design flow.
If these limitations can be addressed, partial reconfiguration should still be considered
for its potential benefits.
Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
Design Optimizations for Tiled Partially Reconfigurable Systems
Koester M, Luk W, Hagemeyer J, Porrmann M, Rückert U. Design Optimizations for Tiled Partially Reconfigurable Systems. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2010;19(6):1048-1061.
In partially reconfigurable architectures, system components can be dynamically loaded and unloaded, allowing resources to be shared over time. Dynamic system components are represented by partial reconfiguration (PR) modules. In comparison to a static system, the design of a partially reconfigurable system requires additional design steps, such as partitioning the device resources into static and dynamic regions. We present the concept of tiled PR regions, which enables flexible online placement of PR modules. Dynamic reconfiguration requires a suitable communication infrastructure to interconnect the static and dynamic system components. We present an embedded communication macro, a communication infrastructure that interconnects PR modules in a tiled PR region. Efficient online placement of PR modules depends not only on the placement algorithm, but also on design-time aspects such as the chosen synthesis regions of the PR modules. We propose a design method for selecting suitable synthesis regions for the PR modules, aiming to optimize their placement at run-time.
Efficient runtime placement management for high performance and reliability in COTS FPGAs
Designing high-performance, fault-tolerant multisensory electronic systems for
hostile environments such as nuclear plants and outer space within the constraints of
cost, power and flexibility is challenging. Issues such as ionizing radiation, extreme
temperature and ageing can lead to faults in the electronics of these systems. In
addition, the remote nature of these environments demands a level of flexibility and
autonomy in their operations. The standard practice of using specially hardened
electronic devices for such systems is not only very expensive but also has limited
flexibility.
This thesis proposes novel techniques that promote the use of Commercial Off-The-
Shelf (COTS) reconfigurable devices to meet the challenges of high-performance
systems for hostile environments. Reconfigurable hardware such as Field
Programmable Gate Arrays (FPGAs) has a unique combination of flexibility and
high performance. The flexibility offered through features such as dynamic partial
reconfiguration (DPR) can be harnessed not only to achieve cost-effective designs as
a smaller area can be used to execute multiple tasks, but also to improve the
reliability of a system as a circuit on one portion of the device can be physically
relocated to another portion in the case of fault occurrence. However, to harness
these potentials for high performance and reliability in a cost-effective manner, novel
runtime management tools are required. Most runtime support tools for
reconfigurable devices are based on ideal models which do not adequately consider
the limitations of realistic FPGAs, in particular modern FPGAs which are
increasingly heterogeneous. Specifically, these tools lack efficient mechanisms for
ensuring a high utilization of FPGA resources, including the FPGA area and the
configuration port and clocking resources, in a reliable manner.
To ensure high utilization of reconfigurable device area, placement management is a
key aspect of these tools. This thesis presents novel techniques for the management
of hardware task placement on COTS reconfigurable devices for high performance
and reliability. To this end, it addresses design-time issues that affect efficient
hardware task placement, with a focus on reliability. It also presents techniques to
maximize the utilization of the FPGA area at runtime, including techniques to
minimize fragmentation. Fragmentation leads to the creation of unusable areas due to
dynamic placement of tasks and the heterogeneity of the resources on the chip.
Moreover, this thesis also presents an efficient task reuse mechanism to improve the
availability of the internal configuration infrastructure of the FPGA for critical
responsibilities like error mitigation. The task reuse scheme, unlike previous
approaches, also improves the utilization of the chip area by offering
defragmentation.
Task relocation, which involves changing the physical location of circuits, is a
technique for error mitigation and high performance. Hence, this thesis also provides
a functionality-based relocation mechanism for improving the number of locations to
which tasks can be relocated on heterogeneous FPGAs. As tasks are relocated, clock
networks need to be routed to them. As such, a reliability-aware technique of clock
network routing to tasks after placement is also proposed.
Finally, this thesis offers a prototype implementation and characterization of a
placement management system (PMS) which is an integration of the aforementioned
techniques. The performance of most of the proposed techniques is tested using
data processing tasks of a NASA JPL spectrometer application. The results show that
the proposed techniques have the potential to improve the reliability and performance of
applications in hostile environments compared to state-of-the-art techniques. The task
optimization technique presented leads to better capacity to circumvent permanent
faults on COTS FPGAs compared to state-of-the-art approaches (48.6% more errors
were circumvented for the JPL spectrometer application). The proposed task reuse
scheme leads to approximately 29% saving in the amount of configuration time. This
frees up the internal configuration interface for more error mitigation operations. In
addition, the proposed PMS has a worst-case latency of less than 50% of that of state-of-
the-art runtime placement systems, while maintaining the same level of placement
quality and resource overhead.