2,831 research outputs found
The Chameleon Architecture for Streaming DSP Applications
We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool
Reconfigurable Mobile Multimedia Systems
This paper discusses reconfigurability issues in lowpower hand-held multimedia systems, with particular emphasis on energy conservation. We claim that a radical new approach has to be taken in order to fulfill the requirements - in terms of processing power and energy consumption - of future mobile applications. A reconfigurable systems-architecture in combination with a QoS driven operating system is introduced that can deal with the inherent dynamics of a mobile system. We present the preliminary results of studies we have done on reconfiguration in hand-held mobile computers: by having reconfigurable media streams, by using reconfigurable processing modules and by migrating functions
Intrinsically Evolvable Artificial Neural Networks
Dedicated hardware implementations of neural networks promise to provide faster, lower power operation when compared to software implementations executing on processors. Unfortunately, most custom hardware implementations do not support intrinsic training of these networks on-chip. The training is typically done using offline software simulations and the obtained network is synthesized and targeted to the hardware offline. The FPGA design presented here facilitates on-chip intrinsic training of artificial neural networks. Block-based neural networks (BbNN), the type of artificial neural networks implemented here, are grid-based networks neuron blocks. These networks are trained using genetic algorithms to simultaneously optimize the network structure and the internal synaptic parameters. The design supports online structure and parameter updates, and is an intrinsically evolvable BbNN platform supporting functional-level hardware evolution. Functional-level evolvable hardware (EHW) uses evolutionary algorithms to evolve interconnections and internal parameters of functional modules in reconfigurable computing systems such as FPGAs. Functional modules can be any hardware modules such as multipliers, adders, and trigonometric functions. In the implementation presented, the functional module is a neuron block. The designed platform is suitable for applications in dynamic environments, and can be adapted and retrained online. The online training capability has been demonstrated using a case study. A performance characterization model for RC implementations of BbNNs has also been presented
Interconnect architectures for dynamically partially reconfigurable systems
Dynamically partially reconfigurable FPGAs (Field-Programmable Gate Arrays) allow
hardware modules to be placed and removed at runtime while other parts of the system
keep working. With their potential benefits, they have been the topic of a great
deal of research over the last decade. To exploit the partial reconfiguration capability of
FPGAs, there is a need for efficient, dynamically adaptive communication infrastructure
that automatically adapts as modules are added to and removed from the system.
Many bus and network-on-chip (NoC) architectures have been proposed to exploit this
capability on FPGA technology. However, few realizations have been reported in the
public literature to demonstrate or compare their performance in real world applications.
While partial reconfiguration can offer many benefits, it is still rarely exploited in practical
applications. Few full realizations of partially reconfigurable systems in current
FPGA technologies have been published. More application experiments are required to
understand the benefits and limitations of implementing partially reconfigurable systems
and to guide their further development. The motivation of this thesis is to fill this
research gap by providing empirical evidence of the cost and benefits of different interconnect
architectures. The results will provide a baseline for future research and will
be directly useful for circuit designers who must make a well-reasoned choice between
the alternatives.
This thesis contains the results of experiments to compare different NoC and bus interconnect
architectures for FPGA-based designs in general and dynamically partially
reconfigurable systems. These two interconnect schemes are implemented and evaluated
in terms of performance, area and power consumption using FFT (Fast Fourier
Transform) andANN(Artificial Neural Network) systems as benchmarks. Conclusions
drawn from these results include recommendations concerning the interconnect approach
for different kinds of applications. It is found that a NoC provides much better
performance than a single channel bus and similar performance to a multi-channel bus
in both parallel and parallel-pipelined FFT systems. This suggests that a NoC is a better choice for systems with multiple simultaneous communications like the FFT. Bus-based
interconnect achieves better performance and consume less area and power than NoCbased
scheme for the fully-connected feed-forward NN system. This suggests buses
are a better choice for systems that do not require many simultaneous communications
or systems with broadcast communications like a fully-connected feed-forward NN.
Results from the experiments with dynamic partial reconfiguration demonstrate that
buses have the advantages of better resource utilization and smaller reconfiguration
time and memory than NoCs. However, NoCs are more flexible and expansible. They
have the advantage of placing almost all of the communication infrastructure in the
dynamic reconfiguration region. This means that different applications running on the
FPGA can use different interconnection strategies without the overhead of fixed bus
resources in the static region.
Another objective of the research is to examine the partial reconfiguration process and
reconfiguration overhead with current FPGA technologies. Partial reconfiguration allows
users to efficiently change the number of running PEs to choose an optimal powerperformance
operating point at the minimum cost of reconfiguration. However, this
brings drawbacks including resource utilization inefficiency, power consumption overhead
and decrease in system operating frequency. The experimental results report a
50% of resource utilization inefficiency with a power consumption overhead of less
than 5% and a decrease in frequency of up to 32% compared to a static implementation.
The results also show that most of the drawbacks of partial reconfiguration implementation
come from the restrictions and limitations of partial reconfiguration design flow.
If these limitations can be addressed, partial reconfiguration should still be considered
with its potential benefits.Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
Lessons learned from the design of a mobile multimedia system in the Moby Dick project
Recent advances in wireless networking technology and the exponential development of semiconductor technology have engendered a new paradigm of computing, called personal mobile computing or ubiquitous computing. This offers a vision of the future with a much richer and more exciting set of architecture research challenges than extrapolations of the current desktop architectures. In particular, these devices will have limited battery resources, will handle diverse data types, and will operate in environments that are insecure, dynamic and which vary significantly in time and location. The research performed in the MOBY DICK project is about designing such a mobile multimedia system. This paper discusses the approach made in the MOBY DICK project to solve some of these problems, discusses its contributions, and accesses what was learned from the project
Smart technologies for effective reconfiguration: the FASTER approach
Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy the customers needs and market and technology trends. Many contemporary products along with the software part incorporate hardware accelerators for reasons of performance and power efficiency. While adaptivity of software is straightforward, adaptation of the hardware to changing requirements constitutes a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology to allow designers to easily implement a system specification on a platform which includes a general purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems providing additional novel verification features that are not available in existing tool flows
- âŠ