06141 Abstracts Collection
Dynamically Reconﬁgurable Architectures
 Dagstuhl Seminar 
Peter M. Athanas1, Jürgen Becker2, Gordon Brebner3 and Jürgen Teich4
1 Virginia Polytechnic Institute, US
athanas@vt.edu
2 Universität Karlsruhe, DE
becker@itiv.uni-karlsruhe.de
3 Xilinx - San José, US
Gordon.Brebner@xilinx.com
4 Universität Erlangen, DE
teich@informatik.uni-erlangen.de
Abstract. From 02.04.06 to 07.04.06, the Dagstuhl Seminar 06141 Dy-
namically Reconﬁgurable Architectures was held in the International
Conference and Research Center (IBFI), Schloss Dagstuhl. During the
seminar, several participants presented their current research, and on-
going work and open problems were discussed. Abstracts of the presen-
tations given during the seminar as well as abstracts of seminar results
and ideas are put together in this paper. The ﬁrst section describes the
seminar topics and goals in general. Links to extended abstracts or full
papers are provided, if available.
Keywords. Dynamically run-time reconﬁgurable computing architec-
tures, adaptive systems, computational models, circuit technologies, sys-
tem architecture, CAD tool support
06141 Executive Summary -- Dynamically Reconfigurable
Architectures
Dynamic and partial reconﬁguration of hardware architectures such as FPGAs
and XPPs brings an additional level of ﬂexibility in the design of electronic
systems by exploiting the possibility of conﬁguring functions on-demand during
run-time. This has led to many new ways of approaching existing research topics
in the area of hardware design and optimization techniques. For example, the
possibility of performing adaptation during run-time raises questions in the areas
of dynamic control, real-time response, on-line power management and design
complexity, since the reconﬁgurability increases the design space towards inﬁnity.
This Dagstuhl Seminar on Reconﬁgurable Architectures has aimed at raising a
few of these topics e.g. on-line placement, pre-routing/on-line routing trade-oﬀ,
power minimization etc., and also at presenting novel ideas on how to overcome
the diﬃculties introduced in dynamic reconﬁgurable systems.
Dagstuhl Seminar Proceedings 06141
Dynamically Reconﬁgurable Architectures
http://drops.dagstuhl.de/opus/volltexte/2006/838
Keywords: Reconﬁgurable Computing, Reconﬁgurable Supercomputing, Or-
ganic Computing, Dynamic Reconﬁguration, Reconﬁgurable Hardware
Joint work of: Becker, Jürgen; Teich, Jürgen; Brebner, Gordon; Athanas, Peter
M.
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/837
IP Cores Protection in FPGA Environment, Cryptographic
Identiﬁcation Primitives
Wael Adi (TU Braunschweig, D)
With the advent of multi-million gate chips, Field Programmable Gate Arrays
(FPGAs) have achieved high usability for design veriﬁcation, exchange, test and
even production. Adding to this is the possibility of reusing readily available
licensed IP to shorten the design cycle. A major concern for IP owners is the
possible over-deployment of the IP into more devices than originally licensed.
In this presentation, two systems based on both public and secret-key
cryptography, embedded in a secured design exchange protocol for protecting
the rights of the IP owner, are introduced. The systems consist of
hardware-supported design encryption and secured device authentication
protocols. Design encryption
based on secured device identiﬁcation ensures that the IP can only be deployed
into explicitly identiﬁed and agreed upon devices. The system is devised for an
uncomplicated trustable design exchange scenario. The public-key functions use
modular squaring (Rabin Lock) on the FPGA chip instead of exponentiation to
reduce the hardware complexity.
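The saving from squaring can be illustrated with a minimal sketch. It compares the Rabin public operation c = m^2 mod n with the number of modular multiplications that a plain square-and-multiply exponentiation would need. The toy modulus, the exponent 65537 and the function names are illustrative assumptions and are not taken from the talk or the cited paper.

```python
def rabin_lock(m: int, n: int) -> int:
    """Rabin public operation: one modular squaring."""
    return (m * m) % n


def modmul_count(e: int) -> int:
    """Modular multiplications used by left-to-right square-and-multiply
    when raising a value to the exponent e (e >= 1)."""
    squarings = e.bit_length() - 1
    multiplies = bin(e).count("1") - 1
    return squarings + multiplies


# Toy parameters (hypothetical and far too small for real use).
p, q = 7, 11
n = p * q
m = 20

c = rabin_lock(m, n)
print("ciphertext:", c)
print("multiplications for squaring (e = 2):", modmul_count(2))
print("multiplications for e = 65537:", modmul_count(65537))
```

Under this simple cost model the squaring needs a single modular multiplier pass, whereas a commonly used public exponent such as 65537 needs seventeen, which is the kind of hardware saving the abstract alludes to.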
Keywords: IP core protection, FPGA design protection, combined secret and
Public-Key Identiﬁcation, Electronic mutation, Write-only-memory, provable se-
cret identity
See also: Wael Adi, R. Ernst, Bassel Soudan, A. Hanoun, "VLSI Design
Exchange with Intellectual Property Protection in FPGA Environment Using both
Secret and Public-Key Cryptography". IEEE Computer Society Annual Symposium
on VLSI, ISVLSI 2006, Karlsruhe, Germany, March 2006.
The (empty?) Promise of FPGA Supercomputing
Peter M. Athanas (Virginia Polytechnic Institute, USA)
There have been some notable success stories in the past that give merit to
the viability of creating an FPGA-based supercomputer. When examining the
computing potential of these devices, they appear to offer computational
characteristics that are highly competitive with contemporary
high-performance processors. Recently, supercomputer-class processing
blades have been offered by the leading high-performance computing
specialist, yet the
sales of these nodes have been less than spectacular. This talk examines why this
may be the case, and explores the viability and cost-performance of FPGA-based
supercomputers.
Keywords: FPGA supercomputing
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/732
The ADRES coarse-grained reconﬁgurable Array
Processor
Mladen Berekovic (Delft University of Technology, NL)
ADRES is a new coarse-grained reconfigurable processor that is fully
programmable in C. ADRES supports design-time and run-time reconfigurability
for a broad range of multi-mode embedded applications, such as MPEG-2,
MPEG-4, AVC/H.264 or Scalable Video Coding.
To address the challenges of multimode operations, ADRES provides im-
proved power eﬃciency and performance within acceptable area constraints.
ADRES targets a power eﬃciency of at least 40 MOPS/mW while being able
to handle a peak performance of 20 GOPS, requirements that clearly go beyond
the specs of any state-of-the-art core.
The ADRES array processor is a ﬂexible template instead of a concrete in-
stance. An architecture description language is developed to specify diﬀerent
ADRES instances with full compiler support. A script-based technique allows a
designer to easily generate diﬀerent instances by specifying diﬀerent values for
the communication topology, supported operation set, resource allocation and
timing of the target architecture. Together with a retargetable simulator and
compiler, this toolchain allows for architecture exploration and development of
application domain speciﬁc processors.
ADRES supports a VLIW-like programming model with a pure VLIW mode
for legacy code, and an array mode with very high dataﬂow-parallelism for the
processing of compute-intensive loops. In parallel to ADRES, a C compiler
based on IMPACT was developed that can create seamless code for both modes
using modulo scheduling techniques. The compiler automatically takes care
of all necessary data transfers between the two modes. Several applications from
the wireless and multimedia domains have been mapped onto ADRES.
As an example, an ADRES-based system can perform AVC decoding in CIF
resolution at less than 50 MHz on a 4x4 array from compiled C code. MPEG-2
CIF decoding needs only 27 MHz. Several key benchmark kernel loops have
been mapped onto ADRES, and the results show that ADRES can extend the
performance of state-of-the-art VLIW DSPs by a factor of 7, which makes
ADRES an attractive alternative to multi-core DSP solutions. Synthesis results
show that such an ADRES core occupies less than 2 mm² in 90 nm technology
and can run at more than 500 MHz. Considering that the array size can be
further increased to 8x8 or beyond, ADRES oﬀers signiﬁcant room for further
speedup.
ADRES significantly beats a state-of-the-art DSP, like the TI C64x, with
fewer resources and at comparable power consumption. At the same time ADRES
oﬀers performance and power scalability by using more resources and larger array
sizes. Therefore, ADRES based designs will reduce the need for multi-processor
solutions.
Keywords: Reconﬁgurable arrays, Tensilica, IMPACT, VLIW, DSP
HW/SW Codesign for Reconﬁgurable System-on-Chip
using a Process Model
Neil W. Bergmann (The University of Queensland, AU)
Abstract: Reconﬁgurable System-on-Chip is a powerful method to harness the
power of FPGA technology. However, there is a very limited pool of designers
who can build hardware-software designs. This paper describes a way to simplify
rSoC design, by making hardware coprocessors appear like software processes in
a Linux development environment. We explain both the ways that hardware
processes are controlled by software ghost processes, and also how hardware and
software processes communicate.
Keywords: Reconﬁgurable System-on-Chip, FPGA, Embedded System
Full Paper:
http://www.itee.uq.edu.au/~bergmann
Adaptive On-Chip Multiprocessing
Christophe Bobda (TU Kaiserslautern, D)
The last decades have experienced a continuous growth in the performance of
multiprocessors in accordance with Moore's law. This increase in performance
has two main causes: steadily growing clock frequencies and the efficient
use of Instruction Level Parallelism (ILP). As it becomes difficult to keep
improving both the clock frequency and the ILP, chip multiprocessors have
begun to appear as an alternative that increases performance through
processor-level parallelism. Most of the solutions proposed and developed
are SMP-based. Furthermore, the architecture is fixed, which limits
architectural flexibility. With the growing capacity of FPGAs, it is
increasingly possible to place several soft or hard-core processors working
in parallel on a given FPGA. Moreover, the flexible logic allows for the
run-time adaptivity of applications by exchanging hardware accelerators.
In this talk, we present the ongoing work on adaptive on-chip
multiprocessing at the University of Kaiserslautern.
Keywords: Multiprocessing on Chip, FPGA
Full Paper:
http://soes.informatik.uni-kl.de/people/bobda
Dynamically Adaptable Behaviours (?)
Gordon Brebner (Xilinx - San José, USA)
This talk introduced the general topic of new thought models for programmabil-
ity and conﬁgurability. A main suggestion was to focus on application behaviours
rather than architectures, and thus on adaptability rather than programmabil-
ity. This would build on the success of research into reconﬁgurable architectures,
to move to a focus on the needs of particular problem instances within broad
application domains. Using hyper-programmability, or some alternative mecha-
nism, each instance can be mapped to its own tailored architecture. Downstream
from this, further future research areas were discussed. These included: making
a big push towards a "third way" of system design that is neither hardware nor
software like; devising apt computational models and theories of programming;
systematised application and environment speciﬁcity; and harnessing program-
mability within future technologies in a natural manner. A major means of fa-
cilitating this future vision will be through education, especially of young people
as yet unversed in ﬁrst and/or second way thinking.
Reconﬁgurable Architectures and Instruction Sets:
Programmability, Code Generation, and Program
Execution
Rainer Buchty (Universität Karlsruhe, D)
Within self-reconfiguring systems, two basic problems arise. Firstly, at the
instruction level, reconfigurable instruction sets make program generation
and execution inherently difficult. Secondly, reconfiguration must not
violate certain restrictions vital for the running application.
We describe a combined low-overhead approach which targets both prob-
lems by instrumenting an attributed low-overhead run-time environment which
is able to dynamically map application-speciﬁc instructions to a variety of im-
plementation alternatives while strictly adhering to given application demands.
Our approach can be used independently of the application and is suitable for use within
the adaptive planning stage of a Self-X system as demonstrated by a reference
implementation.
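The idea of mapping an application-specific instruction to one of several implementation alternatives under given demands can be sketched as follows. This is only an illustration under stated assumptions: the attribute names (latency, power, availability), the selection rule (lowest power among feasible alternatives) and all example values are hypothetical and do not describe the reference implementation mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    name: str
    latency_us: float      # hypothetical attribute
    power_mw: float        # hypothetical attribute
    available: bool = True


def select_alternative(alternatives, max_latency_us, max_power_mw):
    """Return the lowest-power available alternative that meets both limits,
    or None if no alternative satisfies the application demands."""
    feasible = [
        a for a in alternatives
        if a.available
        and a.latency_us <= max_latency_us
        and a.power_mw <= max_power_mw
    ]
    return min(feasible, key=lambda a: a.power_mw, default=None)


# Hypothetical alternatives for one application-specific instruction.
crc32 = [
    Alternative("software_emulation", latency_us=50.0, power_mw=5.0),
    Alternative("fpga_unit", latency_us=2.0, power_mw=40.0),
    Alternative("dsp_unit", latency_us=10.0, power_mw=20.0, available=False),
]

choice = select_alternative(crc32, max_latency_us=20.0, max_power_mw=100.0)
print(choice.name if choice else "no feasible alternative")
```

Returning None rather than a fallback keeps the constraint strict: an instruction is never mapped to an alternative that would violate the stated demands.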
Keywords: Self-X, Instruction Set Reconﬁguration, Run-time Environment,
Code Generation, Programming
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/733
Enabling RTR for industry
Oliver Diessel (Univ. of New South Wales, AU)
This talk explores the promise of run-time reconfigurable (RTR) technology and
makes an attempt to identify critical support elements that need to be put in
place in order to overcome barriers to enhanced RTR uptake in industry.
We outline a research project underway at the University of New South Wales
to develop a positioning satellite receiver that exploits the diversity in satellite
signals to mitigate the eﬀects of interference. This project is examined as a case
study to motivate the discovery of challenges an industrial organisation faces
engineering a dynamically reconﬁgurable product.
Our progress towards the development of a methodology for providing com-
munications infrastructure for module-based applications illustrates one of the
eﬀorts necessary to develop useful synthesis tools for RTR applications develop-
ment.
We conclude with suggestions for how the academic community can better
assist the commercial development of real applications.
Keywords: Run-time reconﬁguration, industry support, design tools, module-
based design, communications
Joint work of: Diessel, Oliver; Koh, Shannon
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/734
Reconﬁguration Time Aware Processing on FPGAs
Florian Dittmann (Universität Paderborn, D)
The possibility of partial reconﬁguration of FPGAs during run-time can be used
to implement systems that adapt their execution area over time. Two things are
presented in this context:
1) For detailed investigations of partial reconfiguration, both the modeling
and the practical realization of reconfigurable systems must be rooted in the
design process. We have developed a tool that meets this requirement. It eases
the design of partial bitstreams for Xilinx FPGAs for research purposes. The
tool hides the obstacles of partial bitstream generation, which encourages
people new to this field. Moreover, the backend of the tool, a single UML
class diagram that abstractly represents all characteristics of the
reconfigurable system under development, allows reconfigurable systems to be
modeled comprehensively at a high level of abstraction. The UML diagram is
filled in during the design process until enough information for the
generation of bitstreams is available.
2) In the single-machine environment, several scheduling algorithms exist that
allow schedules to be assessed with respect to feasibility, optimality, etc. In contrast,
reconﬁgurable devices execute tasks in parallel, which intentionally collides with
the single machine principle and seems to require new methods and evaluation
strategies for scheduling. However, the reconﬁguration phases of adaptable ar-
chitectures usually take place sequentially. Run-time adaptation is realized using
an exclusive port, which is occupied for some reasonable time during reconﬁg-
uration. Thus, we can ﬁnd an analogy to the single machine environment. We
investigate the application of single-processor scheduling algorithms to task
reconfiguration on reconfigurable systems. We determine necessary adaptations and
propose methods to evaluate the scheduling algorithms.
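The single-machine analogy can be sketched by treating the exclusive configuration port as the machine and each reconfiguration as a job. The example below uses the classic earliest-due-date rule (Jackson's rule), which minimizes maximum lateness on a single machine when all jobs are available at time zero. It is an illustration only, not the authors' method: the task model, the task names and the assumption that every task gets its own area so that executions never conflict are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    reconf: int      # time the configuration port is occupied
    exec_time: int   # run time on the fabric once configured
    deadline: int    # absolute deadline of the task


def edd_port_schedule(tasks):
    """Serialize reconfigurations on the exclusive port, ordered by the
    latest time at which each reconfiguration may finish."""
    order = sorted(tasks, key=lambda t: t.deadline - t.exec_time)
    port_free = 0
    schedule = []
    for t in order:
        start = port_free
        port_free = start + t.reconf        # port is strictly sequential
        finish = port_free + t.exec_time    # execution overlaps later loads
        schedule.append((t.name, start, finish, finish - t.deadline))
    return schedule


def max_lateness(schedule):
    return max(lateness for *_, lateness in schedule)


tasks = [
    Task("fir", reconf=4, exec_time=10, deadline=30),
    Task("fft", reconf=6, exec_time=5, deadline=12),
    Task("aes", reconf=3, exec_time=8, deadline=25),
]
sched = edd_port_schedule(tasks)
for name, start, finish, lateness in sched:
    print(name, start, finish, lateness)
print("feasible:", max_lateness(sched) <= 0)
```

Subtracting the execution time from the deadline turns the problem into a plain single-machine instance with processing times equal to the reconfiguration times, which is what makes the classic rule applicable here.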
Keywords: Real-Time, Partial Reconﬁguration, Reconﬁguration Time Schedul-
ing
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/735
Technological Aspects for 3D FPGA architectures with
optoelectronic interconnects
Dietmar Fey (Universität Jena, D)
The talk reviews proposals concerning the use of optoelectronic interconnects
for FPGAs. The perspectives and benefits of using optics for dynamic optical
interconnects as well as for the optical reconfiguration of architectures are
critically compared with the possibilities of electronic 3D chip stack
technology. The author sees a chance for optics in the optical reconfiguration
of context memories in smart optical sensors, since in this application an
optical interface is available anyway, or rather is necessary for the
detection of images. This interface could also be used for dynamic
reconfiguration in combination with modern 3D chip mounting and assembly
technologies to realise smart and very compact CMOS camera chip stacks for
embedded systems.
Keywords: Optoelectronic interconnects, 3D FPGAs, multi-context
architectures, smart optical sensors
Bridging the Gap between Relocatability and Available
Technology: The Erlangen Slot Machine
Diana Göhringer (Universität Erlangen, D)
We present an FPGA-based reconﬁgurable platform called Erlangen Slot Ma-
chine (ESM). The main advantages of this platform are: First, the possibility
for each module to access peripherals independent from its location through
a programmable crossbar, and local SRAM banks for individual modules. This
physical design eases the implementation of run-time reconﬁgurable partial mod-
ules and enables an unrestricted relocation of modules on the device. We present
our two-board ESM implementation and demonstrate a partially reconﬁgurable
video ﬁlter application as well as a relocatable computer game including a ded-
icated inter-module communication scheme.
Keywords: FPGA-based reconﬁgurable platform, inter-module communication,
crossbar, video ﬁlter demo
Joint work of: Göhringer, Diana; Majer, Mateusz; Teich, Jürgen
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/736
Reconﬁgurable Supercomputing: What are the Problems?
What are the Solutions?
Reiner Hartenstein (TU Kaiserslautern, D)
The dominant paradox of supercomputing is the contrast between decades of
excellent technology development (e. g. see the Gordon Moore curve) versus the
almost stalled progress in sustained performance in many application areas. The
counterparts to this supercomputing paradox are the 3 paradoxes of Reconﬁg-
urable Computing: the low power paradox, the high performance paradox, as
well as the education paradox. Despite the fact that FPGAs are very
power-hungry, software to configware migrations have been reported which
reduce electricity bills of tens or hundreds of thousands of dollars by an
order of magnitude.
Despite the really awful technological parameters of FPGAs, brilliant
results with speedups of up to 4 orders of magnitude have been reported from
software to FPGA migrations.
The presentation discusses the reasons for these paradoxes. A very important
aspect is the way in which data and processors are brought together. From this
point of view, supercomputing has for decades followed the wrong road map,
based on the wrong machine paradigm, which is extremely memory-cycle-hungry.
The presentation illustrates why the basic machine paradigm of Reconfigurable
Computing, which is not instruction-stream-centered, provides the right road
map to new horizons of supercomputing.
The Reconfigurable Computing education paradox is that its extremely high
pervasiveness in applications across practically all disciplines of embedded
systems as well as scientific computing has been possible although
computing-related curriculum recommendations completely ignore these subject
areas. This leads to the conclusion that these achievements have been made
primarily by experts with backgrounds other than computing sciences. CS
departments are the best possible institutions to overcome the methodology
fragmentation between the many FPGA application disciplines, each using its
own domain-specific tool trick boxes, and to develop models covering all
aspects which the application disciplines have in common. By refusing to take
on this responsibility, undergraduate CS-related curricula also miss the most
important job markets for their graduates.
Keywords: FPGA, reconﬁgurable computing, supercomputing, reconﬁgurable
computing education, curricula, low power, high performance, conﬁgware
FlexFilm - an Image Processor for Digital Film Processing
Sven Heithecker (TU Braunschweig, D)
Digital ﬁlm processing is characterized by a resolution of at least 2K (2048x1536
pixels per frame at 30 bit/pixel and 24 pictures/s, data rate of 2.2 GBit/s); higher
resolutions of 4K (8.8 GBit/s) and even 8K (35.2 GBit/s) are on their way. Real-
time processing at this data rate is beyond the scope of today's standard and
DSP processors, and ASICs are not economically viable due to the small mar-
ket volume. Therefore, an FPGA-based approach was followed in the FlexFilm
project. Diﬀerent applications are supported on a single hardware platform by
using diﬀerent FPGA conﬁgurations.
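The quoted data rates follow from frame size, pixel depth and frame rate, as the following sketch shows. The 2K figures are taken from the abstract; the 4K and 8K frame sizes are assumptions obtained by doubling each dimension, and the function name is illustrative.

```python
def stream_rate_bits_per_s(width, height, bits_per_pixel=30, frames_per_s=24):
    """Raw video data rate in bit/s for an uncompressed stream."""
    return width * height * bits_per_pixel * frames_per_s


# 2K size is from the abstract; 4K and 8K sizes are assumed (each side doubled).
formats = {"2K": (2048, 1536), "4K": (4096, 3072), "8K": (8192, 6144)}

for name, (w, h) in formats.items():
    rate = stream_rate_bits_per_s(w, h)
    print(f"{name}: {rate / 1e9:.2f} Gbit/s")
```

With 10^9 bits per Gbit the 2K stream comes out at about 2.26 Gbit/s, slightly above the rounded 2.2 GBit/s quoted above. Each doubling of both dimensions quadruples the rate, which matches the 4x steps between the 2K, 4K and 8K figures in the abstract.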
The multi-board, multi-FPGA hardware/software architecture is based on
Xilinx Virtex-II Pro FPGAs, which contain the reconfigurable image stream
processing data path, together with large SDRAM memories for multiple frame
storage and a PCI Express communication backbone network. The FPGA-embedded
CPU is used for control and less computation-intensive tasks.
This paper will focus on three key aspects: a) the design methodology used,
which combines macro component configuration and macro-level floorplanning
with weak programmability using distributed microcoding, b) the global
communication framework with communication scheduling and c) the configurable,
multi-stream scheduling SDRAM controller with QoS support by access
prioritization and traffic shaping.
As an example, a complex noise reduction algorithm including a
2.5-dimensional DWT and a full 16x16 motion estimation at 24 fps, requiring a
total of 203 Gops/s net computing performance and a total of 28 Gbit/s
DDR-SDRAM frame memory bandwidth, will be shown.
Keywords: Digital film, FPGA, reconfigurable, stream-based architecture,
weak programming, SDRAM-controller, QoS, communication centric,
communication scheduling, PCI-Express
Joint work of: Heithecker, Sven; do Carmo Lucas, Amilcar; Ernst, Rolf
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/737
Reconﬁgurable Processing Units vs. Reconﬁgurable
Interconnects
Andreas Herkersdorf (TU München, D)
The question we proposed to explore with the seminar participants is whether the
dynamic reconﬁgurable computing community is paying suﬃcient attention to
the subject of dynamic reconﬁgurable SoC interconnects. By SoC interconnect,
we refer to architecture- or system-level building blocks such as on-chip buses,
crossbars, add-drop rings or meshed NoCs.
Our motivation to systematically investigate this question originates from
conceptual and architectural challenges in the FlexPath project. FlexPath is
a new Network Processor architecture that ﬂexibly maps networking functions
onto both SW programmable CPU resources and (re-)conﬁgurable HW building
blocks in a way that diﬀerent packet ﬂows are forwarded via diﬀerent, optimized
processing paths. Packets with well deﬁned processing requirements may even
bypass the central CPU complex (AutoRoute). In consequence, CPU processing
resources are more eﬀectively used and the overall NP throughput is improved
compared to conventional NPU architectures.
The following requirements apply with respect to the dynamic adaptation of
the processing paths: the rule basis for NPU-internal processing path lookup
is updated on the order of every 100 µs, while the packet inter-arrival time
is on the order of 100 ns. Partial reconfiguration of the rule basis (and/or
interconnect structure) with state-of-the-art techniques would take several
ms, resulting in a continuously blocked system. However, performing path
selection with conventional lookup table search and updates (and a statically
configured on-chip bus) takes considerably less than 100 ns. Hence, is there a
need for new conceptual approaches with respect to dynamic SoC interconnect
reconfiguration, or is this a "no issue" as conventional techniques are
sufficient?
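The timing argument can be made concrete with a small back-of-the-envelope sketch. The update period and packet inter-arrival time are the orders of magnitude given in the abstract; the 2 ms reconfiguration cost and the 50 ns table-update cost are assumed stand-ins for "several ms" and "considerably less than 100 ns", and the function names are illustrative.

```python
# All times in nanoseconds, kept as integers to avoid rounding noise.
RULE_UPDATE_PERIOD_NS = 100_000    # order of 100 us, from the abstract
PACKET_INTER_ARRIVAL_NS = 100      # order of 100 ns, from the abstract
PARTIAL_RECONF_NS = 2_000_000      # ASSUMPTION: "several ms" taken as 2 ms
TABLE_UPDATE_NS = 50               # ASSUMPTION: "considerably less than 100 ns"


def update_load(update_cost_ns, period_ns=RULE_UPDATE_PERIOD_NS):
    """Fraction of each update period spent applying the update.
    A value of 1 or more means updates arrive faster than they complete."""
    return update_cost_ns / period_ns


def packets_during(update_cost_ns, inter_arrival_ns=PACKET_INTER_ARRIVAL_NS):
    """Number of packets that arrive while one update is in progress."""
    return update_cost_ns // inter_arrival_ns


for label, cost in [("partial reconfiguration", PARTIAL_RECONF_NS),
                    ("lookup-table update", TABLE_UPDATE_NS)]:
    load = update_load(cost)
    state = "continuously blocked" if load >= 1 else "keeps up"
    print(f"{label}: load {load:g}, {packets_during(cost)} packets per update, {state}")
```

Under these assumptions a single partial reconfiguration spans many update periods and thousands of packet arrivals, while a table update fits within one packet slot, which is the contrast the question in the abstract rests on.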
Keywords: Reconﬁgurable SoC interconnect
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/779
AMIDAR: A new Model for Adaptive Processors
Christian Hochberger (TU Dresden, D)
The AMIDAR model is a new model for processor construction. It borrows
concepts from microprogramming, dataflow computers and other well-known
techniques from computer architecture. The central aim of the AMIDAR model
is to provide a processor that can be adapted easily to the running application.
Adaptation can occur in different ways: the communication infrastructure can
be adapted, the type and number of functional units can be adapted and,
finally, specialized functional units can be synthesized and integrated into
the processor.
The talk explains the basic ideas of AMIDAR, the achievements we have
already made (simulator, various profiling techniques), ongoing work (hardware
implementation, synthesis) and finally shows the open challenges (synthesis
algorithms, hardware architectures).
Keywords: Adaptive Processor, Java Bytecode Processor
Full Paper:
www.amidar.de
Physical 2D Morphware and Power Reduction Methods
for Everyone
Michael Hübner (Universität Karlsruhe, D)
The flexibility required of future embedded systems, including dynamically
and partially reconfigurable hardware used to optimize system status, is the
challenge for current and future research. Exploiting adaptive hardware for
on-line placement and routing provides the degree of freedom needed for
run-time multi-adaptive system integration. The presentation introduces novel
methods and tools for today's and future technology.
Keywords: 2D Placement, Run-time Reconﬁguration
Joint work of: Becker, Jürgen; Paulsson, Katarina; Hübner, Michael
Physical 2D Morphware and Power Reduction Methods
for Everyone
Michael Hübner (Universität Karlsruhe, D)
Dynamic and partial reconfiguration is increasingly moving into the focus of
academic and industrial research. Modern systems in, e.g., avionic and
automotive applications exploit the parallelism of hardware in order to reduce
power consumption and to increase performance. State-of-the-art reconfigurable
FPGA devices allow parts of their architecture to be reconfigured while the
rest of the configured architecture continues to operate undisturbed. Dynamic
and partial reconfiguration therefore allows the architecture to be adapted to
the requirements of the application at run-time. The difference from
traditional software and its related sequential architecture is the
possibility of changing the paradigm of bringing the data to the respective
processing elements. Dynamic and partial reconfiguration makes it possible to
bring the processing elements to the data and is therefore a new paradigm. The
shift from traditional microprocessor approaches with sequential processing of
data to parallel-processing reconfigurable architectures forces the
introduction of new paradigms with a focus on computing in time and space.
Keywords: 2D online placement and routing, Reconﬁgurable Computing
Joint work of: Becker, Jürgen; Paulsson, Katarina; Hübner, Michael
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/739
Managing power amongst a group of networked embedded
FPGAs using dynamic reconfiguration and task migration
David Kearney (Univ. of South Australia, AU)
Small unpiloted aircraft (UAVs) each have limited power budgets. If a group
(swarm) of small UAVs is organized to perform a common task such as
geo-location, then it is possible to share the total power across the group by
introducing task mobility inside the group, supported by an ad hoc wireless
network (where the communication encoding/decoding is also done on FPGAs). In
this presentation I will describe research into the construction of a
distributed operating system where partial dynamic reconfiguration and network
mobility are combined so that FPGA tasks can be moved to make the best use of
the total power available in a swarm of UAVs.
Keywords: Dynamic reconfiguration, unpiloted aircraft, operating system
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/740
Superscalar Technology for Reconﬁgurable Processors
Bernd Klauer (Helmut-Schmidt-Universität - Hamburg, D)
The first superscalar architectures were designed to allow instruction-level
parallelism without involving the programmer or the compiler in the
parallelization process. Within an instruction window, the issue logic
schedules instructions by the dataflow principle and maps them onto unemployed
functional units which are suitable to perform the calculation indicated by
the opcode. The maximum number of concurrently executable instructions is then
restricted only by the size of the instruction window and by the number of
functional units. With this execution scheme it is possible to build
processors with identical instruction sets but differently equipped execution
stages.
As the execution stages can be diﬀerently equipped with functional units
without aﬀecting the executability of the code they are an interesting subject
for reconﬁguration.
In the presentation a superscalar processor is proposed that contains such
a reconﬁgurable execution stage together with an extended issue logic and a
conﬁguration management unit. This unit controls the reconﬁguration of the
execution stage. It decides whether the current conﬁguration of the execution
stage is suitable for the upcoming computations or not. If it rates another
configuration better than the current configuration of the execution stage, it
can trigger the reconfiguration procedure. Together with the processor
architecture, some simulation and benchmarking results are shown.
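How a configuration management unit might rate configurations can be sketched as follows. The rating function, the reconfiguration penalty and the example configurations are hypothetical and are not taken from the presentation; the estimate deliberately ignores data dependencies and only counts how many issue cycles each functional unit type would need for the instructions in the window.

```python
import math
from collections import Counter


def estimated_cycles(window, config):
    """Lower bound on issue cycles for the opcode classes in the window,
    given the number of functional units per class in the configuration.
    Data dependencies are ignored (simplifying assumption)."""
    cycles = 0
    for unit, ops in Counter(window).items():
        units = config.get(unit, 0)
        if units == 0:
            return math.inf          # configuration cannot execute this opcode
        cycles = max(cycles, math.ceil(ops / units))
    return cycles


def choose_configuration(window, current, candidates, reconf_penalty):
    """Keep the current configuration unless another one is estimated to be
    faster even after paying the reconfiguration penalty (in cycles)."""
    best_name = current
    best_cost = estimated_cycles(window, candidates[current])
    for name, config in candidates.items():
        penalty = 0 if name == current else reconf_penalty
        cost = estimated_cycles(window, config) + penalty
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name


# Hypothetical execution-stage configurations: functional units per class.
candidates = {
    "balanced": {"alu": 2, "mul": 1, "fpu": 1},
    "mul_heavy": {"alu": 1, "mul": 3},
}
window = ["mul"] * 12 + ["alu"] * 3

print(choose_configuration(window, "balanced", candidates, reconf_penalty=5))
```

Including the penalty in the comparison prevents reconfiguration from being triggered when the expected gain is smaller than the time the reconfiguration itself would cost.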
Keywords: Superscalar reconﬁgurable architecture
Joint work of: Klauer, Bernd; Niyonkuru, Adronis
Back-End Issues in Hardware/Software-Compilation
Andreas Koch (TU Darmstadt, D)
This presentation will give an overview of some of the issues that need to be
addressed in the design and implementation of the hardware-generating back-
end of a hardware/software compiler. COMRADE is a compile ﬂow that aims to
map programs formulated in standard ANSI C to an adaptive computer, such
that compute-intensive parts are accelerated on a reconﬁgurable compute unit,
while less critical or unsuitable parts are executed on a standard processor. The
talk shows some of the techniques that are used in the COMRADE back-end.
Keywords: Reconﬁgurable, hardware/software, compilation
Enabling RTR for industry
Shannon Koh (Univ. of New South Wales, AU)
This talk explores the promise of run-time reconfigurable (RTR) technology and
makes an attempt to identify critical support elements that need to be put in
place in order to overcome barriers to enhanced RTR uptake in industry.
We outline a research project underway at the University of New South Wales
to develop a positioning satellite receiver that exploits the diversity in satellite
signals to mitigate the eﬀects of interference. This project is examined as a case
study to motivate the discovery of challenges an industrial organisation faces
engineering a dynamically reconﬁgurable product.
Our progress towards the development of a methodology for providing com-
munications infrastructure for module-based applications illustrates one of the
eﬀorts necessary to develop useful synthesis tools for RTR applications develop-
ment. We conclude with suggestions for how the academic community can better
assist the commercial development of real applications.
Floating Point FPGAs
Philip Leong (Imperial College London, GB)
In this talk, a case for developing an FPGA speciﬁcally optimised for ﬂoating
point applications is given. Applications include signal processing, embedded
systems and high performance computing, and such a device is likely to have
speed and power consumption advantages over conventional FPGA and micro-
processor technology. The performance improvement obtained by adding ﬂoating
point units (FPUs) to an existing ﬁne grain FPGA is estimated to be 2-10x on a
number of benchmark applications. Diﬀerent architectures involving ﬂash based
dynamic reconﬁguration and the sharing of conﬁguration bits are also discussed.
Keywords: FPGAs, ﬂoating point, embedded blocks
FlexFilm - an Image Processor for Digital Film Processing
Amilcar Lucas (TU Braunschweig, D)
This presentation describes a multi-board, multi-FPGA hardware/software
architecture for computation-intensive, high-resolution (2048x2048 pixels),
real-time (24 frames per second) digital film processing. It is based on
Xilinx Virtex-II Pro FPGAs, large SDRAM memories for multiple frame storage
and a PCI Express communication network. The architecture reaches record
performance running a complex noise reduction algorithm, including a
2.5-dimensional DWT and a full 16x16 motion estimation at 24 fps, which
requires a total of 203 Gops/s net computing performance and a total of
28 Gbit/s DDR-SDRAM frame memory bandwidth.
To increase design productivity and yet achieve high clock rates (125 MHz),
the architecture combines macro component configuration and macro level
floorplanning with weak programmability using distributed microcoding.
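The throughput figures quoted above can be put on a per-pixel basis with some simple arithmetic. The following is only an illustrative sketch: it assumes that "Gops/s" and "Gbit/s" mean 10^9 operations and bits per second, and the function names are hypothetical. The per-pixel numbers are derived here and are not stated in the abstract.

```python
def pixel_rate(width, height, frames_per_second):
    """Number of pixels that must be processed per second."""
    return width * height * frames_per_second


def per_pixel(rate_per_second, width, height, frames_per_second):
    """Spread a per-second budget (operations or bits) over the pixel rate."""
    return rate_per_second / pixel_rate(width, height, frames_per_second)


# Figures taken from the abstract: 2048x2048 pixels at 24 frames per second,
# 203 Gops/s computing performance, 28 Gbit/s frame memory bandwidth.
# Interpreting "G" as 10**9 is an assumption.
ops_per_pixel = per_pixel(203e9, 2048, 2048, 24)
bits_per_pixel = per_pixel(28e9, 2048, 2048, 24)
print(f"{ops_per_pixel:.0f} operations and {bits_per_pixel:.0f} memory bits per pixel")
```

Under these assumptions the algorithm spends roughly two thousand operations on every pixel of every frame, which indicates why a multi-FPGA platform is used.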
Keywords: Weak-programming, stream-based architecture, digital film,
reconfigurable, FPGA, SDRAM-controller, QoS
Joint work of: Lucas, Amilcar; Heithecker, Sven
Full Paper:
www.ﬂexﬁlm.org
Bridging the Gap between Relocatability and Available
Technology: The Erlangen Slot Machine
Mateusz Majer (Universität Erlangen, D)
We present a new concept as well as the implementation of an FPGA-based
reconfigurable platform, the Erlangen Slot Machine (ESM). The main advantages
of this platform are, first, the possibility for each module to access its
peripherals independently of its location through a programmable crossbar and,
second, local SRAM banks for individual modules. This support eases the design
and implementation of run-time reconfigurable partial modules and enables an
unrestricted relocation of modules on the device.
We present our two board ESM implementation and demonstrate a partially
reconﬁgurable video ﬁlter application.
Keywords: FPGA-based reconﬁgurable platform, inter-module communication,
crossbar, video ﬁlter demo
Joint work of: Majer, Mateusz; Göhringer, Diana; Teich, Jürgen
Pre-Routed FPGA Cores for Rapid System Construction
in a Dynamic Reconﬁgurable System
Douglas Maskell (Nanyang Technological University - SGP)
Joint work of: Maskell, Douglas ; Oliver, Timothy F.
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/741
Multi-level Reconﬁgurable Architectures - The Switch
Model
Martin Middendorf (Universität Leipzig, D)
Reconfigurable hardware has been successfully deployed for accelerating
computationally demanding applications. While providing enormous flexibility,
dynamically reconfiguring applications suffer from the large reconfiguration
overhead imposed by contemporary architectures. We propose to design
multi-level reconfigurable architectures that can help to reduce this
reconfiguration overhead by introducing different levels of reconfiguration,
each streamlining the capabilities of its lower levels.
In this talk we present formal models for multi-level reconfiguration. For the
switch model, where reconfigurable units are seen as switches, a cost model
that measures the reconfiguration costs is defined. Based on this cost model we
study the complexity of several optimization problems. One problem is to find,
for a given algorithm, the optimal time steps at which reconfiguration
operations should be done on the different levels. Other problems are to find
the optimal number of reconfiguration levels and the best granularity for
reconfiguration on the different levels. We present algorithms to solve these
problems and present results for several applications.
Low Level Compiler for XMonarch
Vincent J. Mooney (Georgia Institute of Technology, USA)
Ongoing research at the Center for Research on Embedded Systems and Tech-
nology (CREST) at Georgia Tech presents a low level compiler for the XMonarch
chip which includes a novel Field Programmable Compute Array (FPCA). The
FPCA is a coarse-grained reconfigurable logic device, and the compiler
infrastructure used is a modified version of Trimaran.
Keywords: Compiler, FPGA
Design of a Hardware/Software RTOS for FPGAs with
Processors
Vincent J. Mooney (Georgia Institute of Technology, USA)
Moore's prediction (commonly known as Moore's "Law", but not a scientific law
in the strict sense) indicates that in the next few years we will have digital
circuits with ten billion transistors (we already have a billion+ transistor
processor chip from Intel).
Clearly, a portion of the billion-transistor integrated circuit market will con-
sist of traditional ASICs, e.g., for super-high volume devices such as cell phones.
Another portion of the billion-transistor integrated circuit market will be domi-
nated by processor designs such as Intel's Merced/Itanium architecture.
The rest of the picture is less clear; however, some percentage will likely
be dominated by customizable heterogeneous multiprocessor chips with a rea-
sonable (say, 30-60) percent of the chip consisting of reconﬁgurable and custom
digital logic. For lack of a better term, we will refer to such Customizable Hetero-
geneous Multiprocessor chips as CHM chips. One example of a CHM chip is the
Virtex-4. Standard argumentation in favor of RISC indicates that a processor's
compiler and architecture must be designed together or codesigned. Similarly,
we will argue that CHM chips require codesign of the architecture and the RTOS
to run on the architecture.
The Hardware/Software Codesign Group at Georgia Tech is working on some
ideas in this domain. Speciﬁcally, we will give a brief overview of three recent
projects:
 (i) design of a System-on-a-Chip Lock Cache (SoCLC) where lock variables
are placed in a special lock cache in a CHM chip; a client-server example
using SoCLC shows a reduction in lock latency by a factor of up to 3.65X
resulting in an overall speedup of 31% for the application,
 (ii) a specialized hardware structure and associated algorithm which speeds
up deadlock detection by two to three orders of magnitude in reconﬁgurable
logic when compared with software algorithms, resulting in a 38% overall
speedup in a practical deadlock scenario, and
 (iii) a SoC Dynamic Memory Management Unit (SoCDMMU) integrated
with a software RTOS and able to provide worst-case second-level memory
allocation in 16 cycles in a four-processor SoC example, resulting in an exam-
ple where average case application transition time is 4.4X faster and worst
case application transition time is over 10X faster using the SoCDMMU
versus the traditional software approach.
The talk will end with a brief description of a hardware/software RTOS genera-
tion framework able to integrate any mix of the three hardware RTOS units (i, ii
or iii above) together with a software RTOS. This talk was given as the keynote
at the opening of the FPGAworld 2004 Conference.
Keywords: Hardware/software codesign, RTOS, real-time
Full Paper:
http://codesign.ece.gatech.edu/papers/papers.html
See also: FPGAworld Conference 2004, Vasteras, Sweden
Center for Research on Embedded Systems and
Technology
Vincent J. Mooney (Georgia Institute of Technology, USA)
The Center for Research on Embedded Systems and Technology (CREST) at
Georgia Tech is a small research center within the School of Electrical and Com-
puter Engineering. A brief overview of CREST's major goals in education, re-
search and commercialization is provided.
Keywords: Embedded
Hardware/Software Codesign of a Real-Time Operating
System
Vincent J. Mooney (Georgia Institute of Technology, USA)
The hardware/software codesign group at Georgia Tech has carried out a number
of projects in the broad area of codesign of a Real-Time Operating System. A
brief description of some interesting results, plus additional comments, is
provided.
Keywords: Codesign
DynaCORE: An Adaptive System-on-Chip Architecture
for Deep Packet Processing in Network Applications
Thilo Pionteck (Universität Lübeck, D)
Current network devices have to keep up with increasing bandwidth, growing
complexity and rapid changes in network protocols and applications.
Conventional hardware systems cannot provide the required flexibility.
Software-based solutions or even hybrid systems such as network processors that
combine hardware and software solutions do not achieve the required performance.
Hence, other architectural solutions have to be found in order to cope with the
increasing data rates of network applications.
We address this problem with DynaCORE, an application-specific coprocessor
for offloading computationally intensive tasks from a network processor. The
system-on-chip architecture is based on an adaptable network-on-chip which al-
lows the dynamic replacement of hardware modules as well as the adaptation
of the on-chip communication structure. The exploitation of partial dynamic
reconﬁguration allows a rapid adaptation towards modiﬁed system behaviors.
The design of DynaCORE, its performance requirements as well as its hardware
structure are introduced in this presentation.
Keywords: Network-on-Chip, Network Processor
Mapping Periodic Realtime Tasks to Reconﬁgurable
Hardware
Marco Platzner (Universität Paderborn, D)
The increasing densities of FPGAs and the availability of dynamic reconﬁgu-
ration modes enable hardware multitasking. Circuits are turned into hardware
tasks that are scheduled, loaded, and executed on the reconﬁgurable resource
during runtime. In recent years, the feasibility of hardware multitasking
has been demonstrated by several prototypes. For application domains that
combine high performance demands with dynamic task sets, a multitasking
environment is even essential.
In this talk, we concentrate on the execution of periodic real-time tasks
in a hardware multitasking environment, a problem that has not yet received
sufficient attention. We first present three scheduling approaches: global EDF,
MSDL, and partitioned EDF. Global and partitioned EDF are techniques adopted
from multiprocessor scheduling; MSDL is a server-based scheduling technique
that tries to minimize the FPGA reconfiguration overhead. We discuss the
construction of the schedules and efficient schedulability tests, and evaluate
the scheduling performance by means of a simulation experiment. Then we turn to
implementation-oriented issues and compare the scheduling approaches with
respect to suitable FPGA execution models and the number of required device
reconfigurations.
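The abstract does not give the details of the scheduling approaches. As a rough illustration of the partitioned EDF idea only, the sketch below assumes that the FPGA is divided into a fixed number of equally sized slots, that each slot executes one hardware task at a time like a processor, that tasks have implicit deadlines (deadline equals period), and that reconfiguration overhead is ignored. Tasks are assigned by first-fit in order of decreasing utilization, and each slot is checked with the classic uniprocessor EDF utilization bound. All names are hypothetical, and this is not claimed to be the authors' algorithm.

```python
from fractions import Fraction


def utilization(task):
    """Utilization C/T of a periodic task given as (exec_time, period) integers."""
    exec_time, period = task
    return Fraction(exec_time, period)


def partition_first_fit(tasks, num_slots):
    """Assign tasks to slots by first-fit in order of decreasing utilization.

    Each slot is scheduled by EDF on its own. With implicit deadlines, a slot
    is schedulable if and only if its total utilization does not exceed 1.
    Returns one task list per slot, or None if some task fits in no slot.
    """
    slots = [[] for _ in range(num_slots)]
    load = [Fraction(0)] * num_slots
    for task in sorted(tasks, key=utilization, reverse=True):
        for i in range(num_slots):
            if load[i] + utilization(task) <= 1:
                slots[i].append(task)
                load[i] += utilization(task)
                break
        else:
            # No slot can take this task without exceeding utilization 1.
            return None
    return slots


# Four example tasks (execution time, period) on two slots.
print(partition_first_fit([(2, 4), (3, 6), (1, 4), (2, 8)], num_slots=2))
```

A failed partition (a return value of None) does not prove that the task set is infeasible, since first-fit is only a heuristic; global EDF and a server-based scheme such as MSDL would need different tests.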
Dynamically Reconﬁgurable Processor-Like Architectures
Wolfgang Rosenstiel (Universität Tübingen, D)
Traditionally, FPGAs are deployed for their flexibility to change the
application over time. Newly developed architectures can be reconfigured within
one clock cycle, so that components of a device can be re-used within a single
application, promising a better price-performance ratio. The reconfiguration keeping
pace with the execution yields an additional degree of freedom that constitutes
a new principle of reconﬁguration. We name this principle processor-like recon-
ﬁguration. A silicon-proven processor-like reconﬁgurable architecture is NEC's
Dynamically Reconﬁgurable Processor Architecture (DRP) which we use to val-
idate parts of our research.
In our current work, it is evaluated how processor-like reconﬁguration can
be exploited by a high-level compiler and which architectural resources are
needed for an eﬃcient mapping of applications. To accomplish this, the CRC
model (Conﬁgurable Reconﬁgurable Core) was developed as a general model for
processor-like reconﬁgurable architectures. The features of the CRC model are
modiﬁed according to the requirements imposed by mapping applications onto
it. For this application mapping, well known techniques from C-based hardware
synthesis and from compilers for VLIW processors are deployed. Instances of
the CRC model can be synthesized and analyzed at the gate-level for a detailed
assessment including a comparison to FPGAs. First results on mapping a real-
world example from visual computing have shown considerable advantages of
processor-like reconﬁgurable architectures compared to FPGAs.
Besides the fast reconﬁguration mechanism for the functionality, we extend
the concept of processor-like reconﬁguration for voltage sources. The power dissi-
pation in each time step, the total energy consumption as well as the energy-delay
product can be reduced enormously by temporal-spatial voltage assignment. In
contrast to other voltage scaling approaches no adaptation of the clock frequency
is required.
In particular for coarse-grained reconﬁgurable architectures, a designer must
consider how each application makes use of the provided architectural resources.
Therefore, application-domain speciﬁc architectures are developed taking into
account various techniques that may be used by the compiler.
Keywords: Reconﬁgurable computing, synthesis, compiler, power optimization
Implications of Organic Computing for Reconﬁgurable
Computing
Hartmut Schmeck (Universität Karlsruhe, D)
The new research program of Organic Computing leads to a number of chal-
lenges for system design and architecture. Major requirements are properties
like robustness, adaptivity, and flexibility. Adaptivity can be seen as a
prerequisite for achieving robust behaviour in spite of disturbing external
influences; flexibility refers to the capability of showing different types of
behaviour invoked by dynamically changing requirements of the execution
environment. Runtime
Reconﬁguration should be a promising technique for providing this robust, adap-
tive, and ﬂexible behaviour. The key challenge, though, is the adequate design of
observer-controller architectures which are suggested for achieving the necessary
control of self-organized behaviour of networked collections of intelligent items.
Self-organization will be indispensable because of the infeasibility to manage
these systems explicitly and individually, but control is also necessary to pre-
vent undesired behaviour of the system, either locally or globally. Reconﬁgurable
Computing has the potential to provide key components of these architectures
and to allow for true organic behaviour.
QUKU: A Coarse Grained Paradigm for FPGAs
Sunil Shukla (Universität Karlsruhe, D)
To fill the gap between the increasing demand for reconfigurability and
performance efficiency, CGRAs are seen as an emerging platform. Their advantage
lies in quick dynamic reconfiguration and power efficiency. Despite these
advantages, they have failed to make their mark. This paper describes the QUKU
architecture, which uses a coarse-grained dynamically reconﬁgurable PE array
(CGRA) overlaid on an FPGA. The low-speed reconﬁgurability of the FPGA
is used to optimize the CGRA for diﬀerent applications, whilst the high-speed
CGRA reconﬁguration is used within an application for operator re-use.
Keywords: FPGA, CGRA, Reconﬁguration
Joint work of: Shukla, Sunil; Bergmann, Neil W.; Becker, Jürgen
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/742
Full Paper:
http://www.itee.uq.edu.au/~sunil/research.htm
Eﬃcient architectures for streaming applications
Gerard Smit (University of Twente, NL)
This presentation will focus on algorithms and reconﬁgurable tiled architectures
for streaming DSP applications. The tile concept will not only be applied on chip
level but also on board-level and system-level. The tile concept has a number of
advantages:
1. depending on the requirements, more or fewer tiles can be switched on/off,
2. the tile structure fits well with future IC process technologies: more tiles
will be available in advanced process technologies, but the complexity per
tile stays the same,
3. the tile concept is fault tolerant: faulty tiles can be discarded, and
4. tiles can be configured in parallel.
Because processing and memory are combined in the tiles, tasks can be executed
efficiently on tiles (locality of reference).
There are a number of application domains that can be considered as stream-
ing DSP applications: for example wireless baseband processing (for Hiper-
LAN/2, WiMax, DAB, DRM, DVB), multimedia processing (e.g. MPEG, MP3
coding/decoding), medical image processing, color image processing, sensor process-
ing (e.g. remote surveillance cameras) and phased array radar systems. In this
presentation the key characteristics of streaming DSP applications are high-
lighted, and the characteristics of the processing architectures to eﬃciently sup-
port these types of applications are addressed.
Keywords: Reconﬁgurable streaming eﬃcient
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/743
Optoelectronic methods to beat Moore's Law
John Snowdon (Heriot-Watt University Edinburgh, GB)
It has been obvious for some time that a new technology will be required if
Moore's law is to be sustained. It can readily be seen that optoelectronics
offers a higher bandwidth than copper, and indeed that latency can be traded off
against bandwidth when interfacing is involved. One possibility is to build the
compute structure around the optics.
Dynamically Reconﬁgurable Systems-on-Chip
Walter Stechele (TU München, D)
The design space for dynamically reconﬁgurable SoCs can be seen in three di-
mensions:
1. the system architecture for computation and communication, ranging from
dataﬂow-oriented dedicated logic blocks to instruction ﬂow-oriented micro-
processor cores, from dedicated point-to-point connections to Networks-on-
Chip.
2. the granularity of reconﬁgurable elements, ranging from simple logic Look-
Up-Tables to complex hardware accelerator engines and reconﬁgurable in-
terconnect structures.
3. the conﬁguration life cycle, ranging from application changes (in the order of
seconds) to instruction-based reconﬁguration (in the order of nanoseconds).
We propose to use dynamically reconﬁgurable computing for video processing in
driver assistance applications. In future automotive systems, video-based driver
assistance will improve security. Video processing for driver assistance requires
real time implementation of complex algorithms. A pure software implemen-
tation, based on low cost embedded CPUs in automotive environments, does
not oﬀer the required real time processing. Therefore hardware acceleration is
necessary. Dedicated hardware circuits (ASICs) can oﬀer the required real time
processing, but they do not oﬀer the necessary ﬂexibility. Speciﬁc driving condi-
tions, e.g. highway, country side, urban traﬃc, tunnel, require speciﬁc optimized
algorithms. Reconﬁgurable hardware oﬀers high potential for real time video
processing and adaptability to various driving conditions.
Our system architecture consists of embedded CPU cores for high-level appli-
cation code, dedicated hardware accelerator engines for low level pixel process-
ing, and an application-speciﬁc memory system. The hardware accelerators and
the memory system are dynamically reconﬁgurable, i.e. hardware accelerator en-
gines can be exchanged during runtime, controlled by the application code on
the CPU. The life cycle of a conﬁguration depends on the change of driving
conditions. A requirement on the reconﬁguration time is given by the frame rate
of the video signal, e.g. 40 msec for the exchange and relocation of new engines.
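The 40 msec figure corresponds to the frame period of a 25 frames-per-second video signal. A minimal sketch of this timing check follows, assuming that the reconfiguration time is simply the partial bitstream size divided by the throughput of the configuration port. The function names and the example sizes are hypothetical and do not come from the abstract.

```python
def frame_period_ms(frames_per_second):
    """Time available per video frame, in milliseconds."""
    return 1000.0 / frames_per_second


def reconfiguration_time_ms(bitstream_bytes, port_bytes_per_second):
    """Time to load a partial bitstream, assuming a constant port throughput."""
    return 1000.0 * bitstream_bytes / port_bytes_per_second


def fits_in_frame(bitstream_bytes, port_bytes_per_second, frames_per_second):
    """True if an engine can be exchanged within one frame period."""
    needed = reconfiguration_time_ms(bitstream_bytes, port_bytes_per_second)
    return needed <= frame_period_ms(frames_per_second)


# Hypothetical example: a 1 MB partial bitstream over a 50 MB/s port at 25 fps.
print(frame_period_ms(25))                        # frame period in ms
print(fits_in_frame(1_000_000, 50_000_000, 25))   # 20 ms needed, 40 ms available
```

In a real system, part of the frame period is also needed for the processing itself, so the usable reconfiguration budget would be smaller than the full frame period.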
Keywords: Dynamic reconﬁguration, design space, video processing
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/744
Libraries for Reconﬁgurable Computing
Jürgen Teich (Universität Erlangen, D)
We would like to present ideas for the subsequent Monday evening breakout
session on the concept of libraries for dynamically reconﬁgurable computers.
In particular, we would like to address the following questions:
(1) What are the application domains where such libraries could be useful?
(2) What are the module types that could be collected?
Finally, the questions of exchange formats and of the moderation of
such libraries will also be addressed.
Keywords: Library of Modules for Dynamically Reconﬁgurable Computers
Joint work of: Teich, Jürgen; van der Veen, Jan
Mojette Transform Implementation on Reconfigurable
Hardware
József Vásárhelyi (University of Miskolc, H)
Inscribing invisible marks (watermarking) into an image has different
applications such as copyright, steganography or data integrity checking. Many
different techniques have been employed in recent years in different spaces
(Fourier, wavelet, Mojette domains, etc.). The presentation outlines the
development work on a functional block scheme of the Mojette transform and
inverse Mojette transform using reconfigurable hardware.
Keywords: Image processing, Mojette transformation
Joint work of: Vásárhelyi, József; Serfözö, Péter
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/746
PISC: Polymorphic Instruction Set Computers
Stamatis Vassiliadis (Delft University of Technology, NL)
We introduce a new paradigm in computer architecture referred to as
Polymorphic Instruction Set Computers (PISC). This new paradigm, in contrast
to RISC/CISC, introduces hardware-extended functionality on demand without
the need for ISA extensions. We motivate the necessity of PISCs through an
example that raises several research problems unsolvable by traditional
architectures and fixed hardware designs. More specifically, we address a new
framework for tools supporting reconfigurability; new architectural and
microarchitectural concepts; a new programming paradigm allowing hardware
and software to coexist in a program; and new spatial compilation techniques.
The paper illustrates the theoretical performance boundaries and efficiency
of the proposed paradigm utilizing established evaluation metrics such as
potential zero execution (PZE) and Amdahl's law. Overall, the PISC paradigm
allows designers to ride Amdahl's curve easily by considering the specific
features of the reconfigurable technology and the general purpose processors
in the context of application-specific execution scenarios.
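For reference, Amdahl's law in its standard form can be sketched as follows. The parameter names are assumptions chosen for illustration: accelerated_fraction is the share of the original execution time that is moved to reconfigurable hardware, and acceleration is the speedup of that share alone. The sketch does not model the potential zero execution (PZE) metric or any PISC-specific overheads.

```python
def amdahl_speedup(accelerated_fraction, acceleration):
    """Overall speedup when a fraction of the run time is sped up by a factor.

    accelerated_fraction: share of the original execution time, in [0, 1].
    acceleration: speedup of that share alone (must be greater than 0).
    """
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / acceleration)


# Accelerating half of the run time by a factor of 2.
print(amdahl_speedup(0.5, 2.0))
# Upper bound when 90% of the run time takes (almost) no time at all.
print(amdahl_speedup(0.9, float("inf")))
```

The second call illustrates the bound that the formula implies: even if the accelerated part took no time, the overall speedup could not exceed 1 / (1 - accelerated_fraction).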
Keywords: Polymorphic Processors, Polymorphic Instruction Set, Reconﬁg-
urable Computing, microcode, PISC
Joint work of: Vassiliadis, Stamatis; Kuzmanov, Georgi; Wong, Stephan;
Moscu-Panainte, Elena; Gaydadjiev, Georgi; Bertels, Koen; Cheresiz, Dmitry
Full Paper:
http://ce.et.tudelft.nl/publications.php
See also: S. Vassiliadis, G.K. Kuzmanov, S. Wong, E. Moscu Panainte, G.
N. Gaydadjiev, K. Bertels, D. Cheresiz, PISC: Polymorphic Instruction Set
Computers, Proceedings of the International Workshop on Applied Reconﬁg-
urable Computing (ARC 2006), pp. 274-286, Delft, The Netherlands, March
2006, LNCS 3985
Reliability-Aware Power Management of Multi-Systems
(CMPSoCs)
Klaus Waldschmidt (Universität Frankfurt, D)
Long-term reliability of processors in SoCs and NoCs is experiencing growing
attention lately, since decreasing feature sizes and increasing temperatures have
a negative inﬂuence on the lifespan. Recent work suggests an interplay between
power management and reliability, since power management strategies aﬀect the
temperature of processors. Power management strategies are examined that
aim to actively influence the long-term reliability of a multi-core processor
in integrated systems such as SoCs and NoCs.
The approach shows that dynamic parallelism can improve the reliability
of multi-core systems signiﬁcantly. First results were achieved by simulating a
multi-processor using the Self Distributing Virtual Machine (SDVM) as a basis.
Keywords: Organic Computing, Adaptivity, Power Management, Power Con-
sumption, Reliability, SDVM
Joint work of: Waldschmidt, Klaus; Haase, Jan; Hofmann, Andreas; Damm,
Markus; Hauser, Dennis
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/745
A Reconﬁgurable Outer Modem Platform for Future
Communications Systems
Norbert Wehn (TU Kaiserslautern, D)
Future mobile and wireless communications networks require ﬂexible modem
architectures with high performance.
Eﬃcient utilization of application speciﬁc ﬂexibility is key to fulﬁll these
requirements.
For high throughput, a single processor cannot provide the necessary
computational power. Hence multi-processor architectures become necessary. This
paper presents a multi-processor platform based on a new dynamically reconﬁg-
urable application speciﬁc instruction set processor (dr-ASIP) for the application
domain of channel decoding. Inherently parallel decoding tasks can be mapped
onto individual processing nodes. The implied challenging inter-processor com-
munication is eﬃciently handled by a Network-on-Chip (NoC) such that the
throughput of each node is not degraded. The dr-ASIP features Viterbi and
Log-MAP decoding for support of convolutional and turbo codes of more than
10 currently speciﬁed mobile and wireless standards.
Furthermore, its ﬂexibility allows for adaptation to future systems.
Keywords: Domain-speciﬁc reconﬁgurable platform, channel coding, outer-
modem
Joint work of: Wehn, Norbert; Vogt, Timo; Neeb, Christian
Full Paper: http://drops.dagstuhl.de/opus/volltexte/2006/730
Implementing High Performance DSP Systems on
Heterogeneous Programmable Platforms
Roger Woods (Queen's University of Belfast, GB)
The talk will look at the issues of implementing complex DSP systems on hetero-
geneous platforms comprising GPPs, DSP processors and FPGAs. The emphasis
is to develop a high level design ﬂow that allows optimisation to be carried out
in a top-down manner but which can eﬃciently exploit IP cores that will have
pre-determined features e.g. latency. The talk will describe how dataﬂow has
been used and then modiﬁed to allow this to happen. A design example of a nor-
malised lattice ﬁlter will be presented. Some conclusions and future work will be
outlined along with application to reconﬁgurable systems.
Keywords: DSP systems, high level, design, synchronous dataflow, IP cores,
system level design
Joint work of: Woods, Roger; McAllister, John
Towards an Automated Design of Application-speciﬁc
Reconﬁgurable Logic
Peter Zipf (TU Darmstadt, D)
Reconﬁgurable logic is known to have the potential to provide better solutions
than direct ASIC implementations or processors in some situations.
A necessary prerequisite for area advantages compared to ASICs or a better
energy eﬃciency than processors is an application speciﬁc design of the reconﬁg-
urable unit. Adapting it to the speciﬁc requirements of an application helps to
compensate for the area and speed penalty introduced by reconﬁgurability. The
data paths of reconﬁgurable units are best suited for data ﬂow oriented tasks,
but for many applications, both control flow and data flow must be handled, so
an integration of the reconfigurable unit into a processor environment is an
appropriate choice. By analysing the existing design flow and integration possibilities
for reconﬁgurable units, a basis for discussing possible automation schemes and
a standardised interface is deﬁned.
Possible future research could investigate an automated design support for
the building blocks of reconﬁgurable units and the deﬁnition of a standard
processor interface for some classes of reconﬁgurable units.
Keywords: Application-speciﬁc reconﬁgurable units, processor integration, de-
sign automation
Joint work of: Zipf, Peter; Glesner, Manfred
Extended Abstract: http://drops.dagstuhl.de/opus/volltexte/2006/731
