2,164 research outputs found
A Micro Power Hardware Fabric for Embedded Computing
Field Programmable Gate Arrays (FPGAs) mitigate many of the problemsencountered with the development of ASICs by offering flexibility, faster time-to-market, and amortized NRE costs, among other benefits. While FPGAs are increasingly being used for complex computational applications such as signal and image processing, networking, and cryptology, they are far from ideal for these tasks due to relatively high power consumption and silicon usage overheads compared to direct ASIC implementation. A reconfigurable device that exhibits ASIC-like power characteristics and FPGA-like costs and tool support is desirable to fill this void. In this research, a parameterized, reconfigurable fabric model named as domain specific fabric (DSF) is developed that exhibits ASIC-like power characteristics for Digital Signal Processing (DSP) style applications. Using this model, the impact of varying different design parameters on power and performance has been studied. Different optimization techniques like local search and simulated annealing are used to determine the appropriate interconnect for a specific set of applications. A design space exploration tool has been developed to automate and generate a tailored architectural instance of the fabric.The fabric has been synthesized on 160 nm cell-based ASIC fabrication process from OKI and 130 nm from IBM. A detailed power-performance analysis has been completed using signal and image processing benchmarks from the MediaBench benchmark suite and elsewhere with comparisons to other hardware and software implementations. The optimized fabric implemented using the 130 nm process yields energy within 3X of a direct ASIC implementation, 330X better than a Virtex-II Pro FPGA and 2016X better than an Intel XScale processor
Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
Coarse-grained reconfigurable array architectures
Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code
Interconnect architectures for dynamically partially reconfigurable systems
Dynamically partially reconfigurable FPGAs (Field-Programmable Gate Arrays) allow
hardware modules to be placed and removed at runtime while other parts of the system
keep working. With their potential benefits, they have been the topic of a great
deal of research over the last decade. To exploit the partial reconfiguration capability of
FPGAs, there is a need for efficient, dynamically adaptive communication infrastructure
that automatically adapts as modules are added to and removed from the system.
Many bus and network-on-chip (NoC) architectures have been proposed to exploit this
capability on FPGA technology. However, few realizations have been reported in the
public literature to demonstrate or compare their performance in real world applications.
While partial reconfiguration can offer many benefits, it is still rarely exploited in practical
applications. Few full realizations of partially reconfigurable systems in current
FPGA technologies have been published. More application experiments are required to
understand the benefits and limitations of implementing partially reconfigurable systems
and to guide their further development. The motivation of this thesis is to fill this
research gap by providing empirical evidence of the cost and benefits of different interconnect
architectures. The results will provide a baseline for future research and will
be directly useful for circuit designers who must make a well-reasoned choice between
the alternatives.
This thesis contains the results of experiments to compare different NoC and bus interconnect
architectures for FPGA-based designs in general and dynamically partially
reconfigurable systems. These two interconnect schemes are implemented and evaluated
in terms of performance, area and power consumption using FFT (Fast Fourier
Transform) andANN(Artificial Neural Network) systems as benchmarks. Conclusions
drawn from these results include recommendations concerning the interconnect approach
for different kinds of applications. It is found that a NoC provides much better
performance than a single channel bus and similar performance to a multi-channel bus
in both parallel and parallel-pipelined FFT systems. This suggests that a NoC is a better choice for systems with multiple simultaneous communications like the FFT. Bus-based
interconnect achieves better performance and consume less area and power than NoCbased
scheme for the fully-connected feed-forward NN system. This suggests buses
are a better choice for systems that do not require many simultaneous communications
or systems with broadcast communications like a fully-connected feed-forward NN.
Results from the experiments with dynamic partial reconfiguration demonstrate that
buses have the advantages of better resource utilization and smaller reconfiguration
time and memory than NoCs. However, NoCs are more flexible and expansible. They
have the advantage of placing almost all of the communication infrastructure in the
dynamic reconfiguration region. This means that different applications running on the
FPGA can use different interconnection strategies without the overhead of fixed bus
resources in the static region.
Another objective of the research is to examine the partial reconfiguration process and
reconfiguration overhead with current FPGA technologies. Partial reconfiguration allows
users to efficiently change the number of running PEs to choose an optimal powerperformance
operating point at the minimum cost of reconfiguration. However, this
brings drawbacks including resource utilization inefficiency, power consumption overhead
and decrease in system operating frequency. The experimental results report a
50% of resource utilization inefficiency with a power consumption overhead of less
than 5% and a decrease in frequency of up to 32% compared to a static implementation.
The results also show that most of the drawbacks of partial reconfiguration implementation
come from the restrictions and limitations of partial reconfiguration design flow.
If these limitations can be addressed, partial reconfiguration should still be considered
with its potential benefits.Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
Pixie: A heterogeneous Virtual Coarse-Grained Reconfigurable Array for high performance image processing applications
Coarse-Grained Reconfigurable Arrays (CGRAs) enable ease of programmability
and result in low development costs. They enable the ease of use specifically
in reconfigurable computing applications. The smaller cost of compilation and
reduced reconfiguration overhead enables them to become attractive platforms
for accelerating high-performance computing applications such as image
processing. The CGRAs are ASICs and therefore, expensive to produce. However,
Field Programmable Gate Arrays (FPGAs) are relatively cheaper for low volume
products but they are not so easily programmable. We combine best of both
worlds by implementing a Virtual Coarse-Grained Reconfigurable Array (VCGRA) on
FPGA. VCGRAs are a trade off between FPGA with large routing overheads and
ASICs. In this perspective we present a novel heterogeneous Virtual
Coarse-Grained Reconfigurable Array (VCGRA) called "Pixie" which is suitable
for implementing high performance image processing applications. The proposed
VCGRA contains generic processing elements and virtual channels that are
described using the Hardware Description Language VHDL. Both elements have been
optimized by using the parameterized configuration tool flow and result in a
resource reduction of 24% for each processing elements and 82% for each virtual
channels respectively.Comment: Presented at 3rd International Workshop on Overlay Architectures for
FPGAs (OLAF 2017) arXiv:1704.0880
FPGA dynamic and partial reconfiguration : a survey of architectures, methods, and applications
Dynamic and partial reconfiguration are key differentiating capabilities of field programmable gate arrays (FPGAs). While they have been studied extensively in academic literature, they find limited use in deployed systems. We review FPGA reconfiguration, looking at architectures built for the purpose, and the properties of modern commercial architectures. We then investigate design flows, and identify the key challenges in making reconfigurable FPGA systems easier to design. Finally, we look at applications where reconfiguration has found use, as well as proposing new areas where this capability places FPGAs in a unique position for adoption
- …