A High-Level Reconfigurable Computing Platform Software Frameworks by Nathan, Darran et al.
arXiv:cs/0405015v1 [cs.AR] 5 May 2004
PROJECT PROTEUS 1
A High-Level Reconfigurable Computing Platform
Software Frameworks
Darran Nathan, Kelvin Lim Mun Kit, Kelly Choo Hon Min,
Philip Wong Jit Chin, Andreas Weisensee
Abstract— Reconfigurable computing refers to the use of
processors, such as Field Programmable Gate Arrays (FPGAs),
that can be modified at the hardware level to take on
different processing tasks. A reconfigurable computing
platform describes the hardware and software base on top of
which modular extensions can be created, depending on the
desired application. Such platforms can take on varied
designs and implementations, according to the constraints
imposed and features desired by the scope of applications.
This paper introduces a PC-based reconfigurable computing
platform software framework that is flexible and extensible
enough to abstract the different hardware types and
functionality that different PCs may have. The requirements
of the software platform, the architectural issues
addressed, the rationale behind the decisions made, and the
framework design implemented are discussed.
Keywords— reconfigurable computing, software platform,
project proteus
I. Introduction
Computer processors have for many years been designed
based on the von Neumann or Harvard architectures. Software
to be run on these processors is compiled into a set of
processor-specific instructions, which are loaded at run
time and executed sequentially. Such sequential processing
of an instruction every few clock cycles works well enough
for typical PC applications such as text editors, which
have low data processing requirements.
However, PCs are also often used for computationally
intensive, high-throughput data processing, especially in
scientific research work. The sequential nature of the
typical PC processor, such as the Intel Pentium, becomes a
major processing bottleneck in such situations. The usual
solution has been to use processors with higher clock
speeds, or to network several PCs together into a cluster
or computational grid [1].
More recently, there has been increasing interest in the
use of reconfigurable hardware chips for such
computationally and data intensive processing. These chips,
such as Field Programmable Gate Arrays (FPGAs), possess a
fundamentally different architecture from the typical von
Neumann or Harvard type processors. The algorithms to be
executed are normally defined in a hardware description
language and compiled into a bitstream, which is downloaded
to the FPGA as and when use of the algorithm is desired.
This bitstream download reconfigures the hardware logic on
the FPGA accordingly, allowing data passed into the FPGA to
be processed in hardware, in parallel.
The authors are with the DSP Technology Centre, School of
Engineering, NgeeAnn Polytechnic, Singapore. (e-mail:
darran@projectproteus.org [Darran Nathan]).
Several reconfigurable computing research projects [2] [3]
[4] focus on developing new, improved designs of reconfig-
urable chips. Other groups [5] [6] [7] utilize off-the-shelf
FPGAs, such as those from Xilinx [8], and work on is-
sues such as logic placement and routing optimization [9].
Project Proteus [10] was initiated by the DSP Technology
Centre of NgeeAnn Polytechnic (Singapore) to develop a
low-cost FPGA-based reconfigurable computing platform
for typical PCs, with off-the-shelf hardware components
and a portable software platform layer, that is flexible and
extensible enough to abstract the different hardware types
and functionality that different PCs may have. This pa-
per discusses the requirements and design of this software
platform.
Section II describes the requirements of the Proteus Soft-
ware Platform, Section III discusses the architectural issues
addressed and the design of the software platform, Section
IV explains how the software platform deploys algorithms
to available hardware, and finally Section V concludes this
paper.
II. Requirements of the Proteus Software Platform
To understand the architectural design of the software
platform, it will be useful to first discuss the requirements
imposed by the desired use and level of flexibility of the
platform.
Firstly, the goal of the project has been to develop a
PC-based reconfigurable computing platform. PCs run a
variety of operating systems (OS), such as Microsoft Win-
dows and Linux. It is therefore desirable for the software
platform to be portable across various OS environments.
Secondly, being PC-based also brings the advantage of
being able to utilize the various PC resources, such as
plentiful RAM, hard disk storage space, and network
connectivity. The software platform must be able to
abstract access to these resources as data sources and
sinks. In addition, it must be possible to use several FPGA
chips concurrently (which may reside on several different
PCI boards).
Thirdly, the high level of variability of available numbers
and types of PC resources as well as reconfigurable proces-
sors means that the software platform has to be highly
modular, with hardware abstraction modules that can be
dynamically loaded according to the available resources.
Fourthly, this wide resource variation also has an
implication on the deployability of algorithms: certain
algorithm implementations may be suitable for execution
only on certain processor types. For example, a
reconfigurable hardware bitstream compiled for a Xilinx
Virtex FPGA cannot be downloaded to an Altera [11] Stratix
FPGA, though both chips may exist in the same PC. The
software platform will therefore have to match the
available hardware types with the available compatible
algorithm implementations.
Finally, because the software platform must be able to load
different hardware abstraction and algorithm implementation
modules, such modules should be easy to create in a
high-level language that most programmers are familiar and
comfortable with.
III. Architecture of the Software Platform
Considering the requirements set out in Section II, a
high-level and modular software platform framework was
designed.
The requirements for portability across OS environ-
ments, modularity of extensions, and ease of programma-
bility, led to the Java language being selected for imple-
mentation of the software platform. This allows the soft-
ware platform to be run on any computer that has a Java
Virtual Machine (JVM) installed, while the high-level and
object-oriented nature of the language satisfies the require-
ments of dynamically loadable modules that can be easily
programmed in a widely-adopted language.
To modularize its functionality, the software platform
has been divided into four main component blocks: the
Proteus Software Platform (PSP) core, which holds the
common set of interfaces and functionality, and three other
components: the Proteus Application, Hardware Abstrac-
tion Modules (HAMs), and Software Modules, that are de-
ployed according to the available functionality on the PC
and the desired application. The use of Java allows each of
these component modules to be distributed as individual
JAR files. This segmentation is illustrated in Figure 1 and
described in greater detail below.
[Figure 1: block diagram. Blocks shown: Proteus Application,
Software Modules, Hardware Abstraction Modules, PSP Core.]
Fig. 1
Components of the Proteus Software Platform
A. Software Modules
An Algorithm block defines a unit of operations that
receives data at an input, processes it, and sends the results
out through an output. This is commonly represented by
a block as shown in Figure 2.
[Figure 2: a single Algorithm block with an AlgorithmInput
and an AlgorithmOutput.]
Fig. 2
A typical algorithm block representation
Each of these blocks is usually of a processor-specific
implementation, such as a compiled Java class, or an FPGA
hardware implementation bitstream.
However, the Proteus Software Platform is intended to
be run in environments where the available processor types
are variable and determined only during run-time, and
where Algorithms may have a number of implementations
for different processor types.
Hence there is a need for a different Algorithm structure,
one which allows for a high level description of the connec-
tivity between Algorithm blocks, while allowing each block
to have multiple implementations for the various processor
types.
The resulting design takes on a 'shell/implementation'
architecture, as shown in Figure 3. In this structure, the
Algorithm 'shells' are connected up to one another, and
define the input/output data types. A 'shell' can be
associated with multiple 'implementations', each of which
is compatible with a different processor type (such as an
FPGA or JVM).
[Figure 3: an Algorithm block with an AlgorithmInput and an
AlgorithmOutput. Its Shell is associated with three
Implementations: Jvm Implementation, FPGA Implementation,
ASIC Implementation.]
Fig. 3
The Algorithm 'shell/implementation' structure
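The 'shell/implementation' split can be pictured with a minimal Java sketch. It is an illustration under stated assumptions, not the actual PSP API: the names AlgorithmShell, AlgorithmImplementation, and implementationFor are hypothetical, and a java.util.function.Function stands in for whatever a real implementation would wrap (a compiled Java class or an FPGA bitstream download).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** One processor-specific realisation of an Algorithm (hypothetical type). */
final class AlgorithmImplementation<I, O> {
    final String processorType; // e.g. "jvm" or "fpga"
    final Function<I, O> body;  // stand-in for a Java class or a bitstream download

    AlgorithmImplementation(String processorType, Function<I, O> body) {
        this.processorType = processorType;
        this.body = body;
    }
}

/** The 'shell': fixes the input/output types and lists candidate implementations. */
final class AlgorithmShell<I, O> {
    final String name;
    final Class<I> inputType;
    final Class<O> outputType;
    private final List<AlgorithmImplementation<I, O>> implementations = new ArrayList<>();

    AlgorithmShell(String name, Class<I> inputType, Class<O> outputType) {
        this.name = name;
        this.inputType = inputType;
        this.outputType = outputType;
    }

    /** Registers one more implementation and returns this shell for chaining. */
    AlgorithmShell<I, O> add(AlgorithmImplementation<I, O> impl) {
        implementations.add(impl);
        return this;
    }

    /** Returns the first implementation for the given processor type, or null if none. */
    AlgorithmImplementation<I, O> implementationFor(String processorType) {
        for (AlgorithmImplementation<I, O> impl : implementations) {
            if (impl.processorType.equals(processorType)) {
                return impl;
            }
        }
        return null;
    }
}

final class ShellDemo {
    public static void main(String[] args) {
        AlgorithmShell<Integer, Integer> doubler =
            new AlgorithmShell<Integer, Integer>("Doubler", Integer.class, Integer.class)
                .add(new AlgorithmImplementation<Integer, Integer>("jvm", x -> x * 2))
                // Simulated hardware: a real FPGA implementation would load a bitstream.
                .add(new AlgorithmImplementation<Integer, Integer>("fpga", x -> x << 1));
        System.out.println(doubler.implementationFor("fpga").body.apply(21)); // prints 42
    }
}
```

The shell carries only what other Algorithms need to know (name and data types), so the choice of implementation can be deferred until the available processor types are known at run-time.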
Connecting up a number of 'shells' will therefore create a
high level data flow graph, ensuring that data will be
passed correctly from one algorithm to the next,
independent of where the associated 'implementations' are
deployed. This is illustrated in Figure 4.
[Figure 4: an Aggregator Algorithm containing Algorithm X
and Algorithm Y. Each has a Shell and Implementations (a Jvm
Implementation is shown for each).]
Fig. 4
Connecting up a number of Algorithm 'shells' to form a
data flow graph
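As a rough illustration of how connecting shells can guarantee that data passes correctly between Algorithms, the sketch below checks data types when an edge is added to the graph. The class ShellNode and its connectTo method are hypothetical names introduced here; the paper does not specify how the platform performs this check.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal stand-in for an Algorithm 'shell' (hypothetical names throughout). */
final class ShellNode {
    final String name;
    final Class<?> inputType;
    final Class<?> outputType;
    final List<ShellNode> downstream = new ArrayList<>();

    ShellNode(String name, Class<?> inputType, Class<?> outputType) {
        this.name = name;
        this.inputType = inputType;
        this.outputType = outputType;
    }

    /** Adds an edge this -> next, rejecting mismatched data types. */
    ShellNode connectTo(ShellNode next) {
        if (!next.inputType.isAssignableFrom(outputType)) {
            throw new IllegalArgumentException(
                name + " outputs " + outputType.getSimpleName()
                + " but " + next.name + " expects " + next.inputType.getSimpleName());
        }
        downstream.add(next);
        return next; // allows chaining: x.connectTo(y).connectTo(z)
    }
}

final class GraphDemo {
    public static void main(String[] args) {
        ShellNode x = new ShellNode("Algorithm X", byte[].class, double[].class);
        ShellNode y = new ShellNode("Algorithm Y", double[].class, Double.class);
        x.connectTo(y); // accepted: X's output type matches Y's input type
        System.out.println(x.downstream.get(0).name);
    }
}
```

Because the check involves only the shells, the graph stays valid whichever implementation is later deployed for each node.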
B. Hardware Abstraction Modules (HAMs)
For the software platform to utilize various processor
types and other PC resources, a common layer of abstraction
over all these resources must be defined. This abstraction
layer must provide information on the types of Algorithm
implementations that are compatible with the corresponding
physical hardware, as well as whether a compatible
Algorithm implementation can currently be deployed to that
hardware (e.g., whether the processor is not already
overloaded).
The abstraction layer designed to satisfy the above
requirements models the desired properties of one or more
physical hardware resources as one or more 'virtual
processor' entities, as shown in Figure 5.
[Figure 5: three layers. Physical Hardware: FPGA
daughtercard. Abstracted Hardware: Virtual Processor (Type =
'FPGA') and Virtual Processor (Type = 'JVM'). Software
Module: Algorithms.]
Fig. 5
Abstraction of physical hardware via 'Virtual Processors'
These 'virtual processors' will be queried by the software
platform to determine the compatibility and deployability
of a particular Algorithm implementation, as described in
detail in Section IV.
For a particular physical hardware resource (such as an
FPGA processor board, or a storage medium), the 'virtual
processor' is part of a larger package called the 'Hardware
Abstraction Module' (HAM), which is a distribution JAR of
all the hardware-specific components (such as the
interfaces to the software platform, and the OS-specific
device drivers).
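The two questions a 'virtual processor' has to answer, compatibility and deployability, might be captured in an interface along the following lines. This is a sketch under assumptions: the interface and class names are hypothetical, the capacity model (a fixed budget of logic cells) is invented for illustration, and compatibility is reduced to a dotted-tag prefix test in the spirit of the tagging scheme of Section IV.

```java
/** Hypothetical description of an implementation: its type tag and resource need. */
final class ImplementationInfo {
    final String typeTag;    // e.g. "fpga.xilinx.virtex.xcv100"
    final int requiredCells; // illustrative capacity measure (an assumption)

    ImplementationInfo(String typeTag, int requiredCells) {
        this.typeTag = typeTag;
        this.requiredCells = requiredCells;
    }
}

/** Hypothetical interface; the names are illustrative, not the actual PSP API. */
interface VirtualProcessor {
    String typeTag();
    boolean isCompatible(ImplementationInfo impl); // right kind of hardware?
    boolean isDeployable(ImplementationInfo impl); // enough capacity right now?
}

/** A virtual processor modelling one FPGA with a fixed budget of logic cells. */
final class FpgaVirtualProcessor implements VirtualProcessor {
    private final String typeTag;
    private int freeCells;

    FpgaVirtualProcessor(String typeTag, int totalCells) {
        this.typeTag = typeTag;
        this.freeCells = totalCells;
    }

    @Override
    public String typeTag() {
        return typeTag;
    }

    @Override
    public boolean isCompatible(ImplementationInfo impl) {
        // Same tag, or the implementation tag extends this one with further descriptors.
        return impl.typeTag.equals(typeTag) || impl.typeTag.startsWith(typeTag + ".");
    }

    @Override
    public boolean isDeployable(ImplementationInfo impl) {
        return isCompatible(impl) && impl.requiredCells <= freeCells;
    }

    /** Reserves capacity; returns false if the implementation cannot be placed. */
    boolean deploy(ImplementationInfo impl) {
        if (!isDeployable(impl)) {
            return false;
        }
        freeCells -= impl.requiredCells;
        return true;
    }
}

final class VirtualProcessorDemo {
    public static void main(String[] args) {
        FpgaVirtualProcessor fpga = new FpgaVirtualProcessor("fpga.xilinx.virtex", 100);
        ImplementationInfo filter = new ImplementationInfo("fpga.xilinx.virtex.xcv100", 60);
        System.out.println(fpga.isCompatible(filter) + " " + fpga.deploy(filter));
        System.out.println(fpga.isDeployable(filter)); // a second copy no longer fits
    }
}
```

Keeping the two queries separate lets the platform distinguish hardware that can never run an implementation from hardware that is merely busy.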
C. Proteus Application
The Proteus Application serves two purposes - it pro-
vides an administrative interface to the end-user, and de-
fines the mechanism by which data is passed from one Al-
gorithm to another.
The administrative interface allows the end-user to per-
form such operations as starting / stopping the platform
or selecting the desired algorithm for download.
The data passing mechanism is defined in the Proteus
Application because various techniques exist, such as Com-
municating Sequential Processes (CSP) [12] and Dataflow
Process Networks (PN) [13], and utilization of a particular
mechanism is application-dependent.
Figure 6 shows the set-up of Processors and Algo-
rithms, with the AlgorithmImplementations deployed to
corresponding Processors. The portions concerning data-
exchange, which have to be implemented by the Proteus
Application, are marked accordingly.
[Figure 6: two layers. Software Module: Algorithm X with an
"X Jvm Algorithm Implementation" (type="jvm") and Algorithm Y
with a "Y FPGA Algorithm Implementation"
(type="fpga.xilinx.virtex"). Abstracted Hardware: a JVM
Virtual Processor (type="jvm") and an FPGA Virtual Processor
(type="fpga.xilinx.virtex"). Legend: Data Path, Algorithm
Output, Algorithm Input, Implemented by Proteus Application.]
Fig. 6
Data exchange mechanism implemented by Proteus
Application
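One way a Proteus Application could keep the data passing mechanism pluggable is to hide it behind a small link interface, sketched below. The names DataLink and QueueLink are hypothetical. The mapping onto standard Java queues is an approximation chosen for illustration: a java.util.concurrent.SynchronousQueue gives CSP-like rendezvous, while an unbounded LinkedBlockingQueue gives FIFO buffering with blocking reads, loosely in the style of a process network.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

/** Hypothetical link from one AlgorithmOutput to the next AlgorithmInput. */
interface DataLink<T> {
    void send(T item);
    T receive();
}

/** Wraps a BlockingQueue; the queue chosen determines the passing semantics. */
final class QueueLink<T> implements DataLink<T> {
    private final BlockingQueue<T> queue;

    private QueueLink(BlockingQueue<T> queue) {
        this.queue = queue;
    }

    /** CSP-like: every send waits until a receiver takes the item. */
    static <T> DataLink<T> rendezvous() {
        return new QueueLink<T>(new SynchronousQueue<T>());
    }

    /** Process-network-like: FIFO buffer, so sends do not wait for the receiver. */
    static <T> DataLink<T> buffered() {
        return new QueueLink<T>(new LinkedBlockingQueue<T>());
    }

    @Override
    public void send(T item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while sending", e);
        }
    }

    @Override
    public T receive() {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while receiving", e);
        }
    }
}

final class DataLinkDemo {
    public static void main(String[] args) {
        DataLink<Integer> xToY = QueueLink.buffered();
        // Algorithm X runs in its own thread and pushes results downstream.
        Thread algorithmX = new Thread(() -> {
            for (int i = 1; i <= 3; i++) {
                xToY.send(i * i);
            }
        });
        algorithmX.start();
        // Algorithm Y consumes the three results in FIFO order.
        for (int i = 0; i < 3; i++) {
            System.out.println(xToY.receive());
        }
    }
}
```

Swapping QueueLink.buffered() for QueueLink.rendezvous() changes the synchronisation behaviour without touching the Algorithms themselves, which is the kind of application-dependent choice the text describes.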
IV. Deployment of Algorithms
For the software platform to perform the tasks of match-
ing Algorithm implementations with virtual processors, a
technique of tagging both of these with some common form
of type compatibility identification is needed. This tagging
should offer the ability to define different levels of com-
patibility, such as that at a specific chip model or at a
higher family level. For example, an FPGA Algorithm im-
plementation may be compatible with only the Xilinx Vir-
tex XCV100 chip, or may be compatible with all chips in
the Virtex family, and should be allowed to be tagged as
such.
The tagging mechanism designed consists of a string in the
general form "type.make.family.model.otherInfo", which can
have any number of descriptors separated by dots ("."),
depending on the level at which an Algorithm implementation
or virtual processor is specific. For example, an Algorithm
implementation that can be downloaded to a Xilinx Virtex
series XCV100 chip may be tagged
"fpga.xilinx.virtex.xcv100", while a virtual processor that
accepts all Xilinx Virtex Algorithm implementations may be
tagged "fpga.xilinx.virtex". The specificity of a tag
increases with the number of descriptors. Such a scheme can
be illustrated as a tree graph, as shown in Figure 7.
[Figure 7: tree graph, more specific towards the bottom.
type: FPGA
make: Altera, Xilinx
family: Virtex, XC
model: XCV100, XCV150]
Fig. 7
Algorithm implementation / virtual processor tagging
scheme
In this tree graph, the least specific descriptor is at the
top: in this case the 'type' level, with the descriptor
value "FPGA". Moving down one level introduces the next
more specific 'make' descriptors, so a tag here may be
"FPGA.Xilinx". When the Proteus Software Platform tests
whether an Algorithm implementation is compatible with a
virtual processor, it need only ensure that the tag of the
virtual processor is located at the same point on the tree
as, or is an ancestor of, the tag of the Algorithm
implementation. That is, an Algorithm implementation can
only be deployed to a virtual processor whose tag is
equally or less specific (equal or higher in the tree) and
lies on the same branch. For example, an Algorithm
implementation tagged "FPGA.Xilinx.Virtex.XCV100" is
compatible with a virtual processor of type
"FPGA.Xilinx.Virtex", but not with one of type
"FPGA.Xilinx.Virtex.revB".
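The ancestor test described above amounts to a prefix comparison on the dot-separated descriptors. The following sketch shows one way to write it; the class TypeTag and its accepts method are hypothetical names, and comparing descriptors case-insensitively is an assumption made here because the paper writes tags both as "fpga.xilinx.virtex" and as "FPGA.Xilinx.Virtex".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper illustrating the dotted-tag compatibility rule. */
final class TypeTag {
    private final List<String> descriptors;

    TypeTag(String tag) {
        // Split on literal dots; lower-casing is an assumption (see lead-in).
        this.descriptors = Arrays.asList(tag.toLowerCase(Locale.ROOT).split("\\."));
    }

    /**
     * True when this (virtual processor) tag is at the same tree node as, or is
     * an ancestor of, the implementation tag, i.e. its descriptors form a prefix.
     */
    boolean accepts(TypeTag implementation) {
        List<String> impl = implementation.descriptors;
        if (descriptors.size() > impl.size()) {
            return false; // processor tag is more specific than the implementation
        }
        return impl.subList(0, descriptors.size()).equals(descriptors);
    }
}

final class TypeTagDemo {
    public static void main(String[] args) {
        TypeTag impl = new TypeTag("FPGA.Xilinx.Virtex.XCV100");
        System.out.println(new TypeTag("FPGA.Xilinx.Virtex").accepts(impl));      // true
        System.out.println(new TypeTag("FPGA.Xilinx.Virtex.revB").accepts(impl)); // false
    }
}
```

Comparing whole descriptors rather than raw string prefixes avoids false matches between sibling tags that merely share leading characters, such as "fpga.xilinx.virtex" and a hypothetical "fpga.xilinx.virtex2".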
For each Algorithm to be deployed, the software platform
runs through the list of available Algorithm
implementations and virtual processors to identify those
that are compatible. Once a match is found, the virtual
processor is queried as to whether the matching Algorithm
implementation can be deployed to it. This deployability
step is necessary to test whether the associated hardware
has the capacity to run the compatible Algorithm
implementation, e.g. whether an FPGA has sufficient
available space. If not, the process is repeated until a
match that is both compatible and deployable is found. This
flow for Algorithm deployment is illustrated in Figure 8.
[Figure 8: flowchart. Start -> Select next Algorithm to deploy
(no more available: End Successful) -> Select next available
Algorithm Implementation (no more available: End Failed) ->
Select next compatible virtual processor (no more available:
back to selecting the next Algorithm Implementation) -> Test if
Algorithm Implementation is deployable to virtual processor
(No: select next compatible virtual processor; Yes: Algorithm
deployed, back to selecting the next Algorithm).]
Fig. 8
Algorithm deployment flow
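The deployment flow can be read as three nested loops: over Algorithms, over their implementations, and over virtual processors. The sketch below follows that reading. All names (Impl, Proc, Deployer) are hypothetical, the integer cost is an invented stand-in for whatever capacity measure a real virtual processor would use, and the compatibility test is a simple dotted-tag prefix check.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical implementation descriptor: type tag plus an abstract resource cost. */
final class Impl {
    final String tag;
    final int cost;

    Impl(String tag, int cost) {
        this.tag = tag;
        this.cost = cost;
    }
}

/** Hypothetical virtual processor with a simple free-capacity counter. */
final class Proc {
    final String tag;
    int free;

    Proc(String tag, int free) {
        this.tag = tag;
        this.free = free;
    }

    boolean isCompatible(Impl impl) {
        return impl.tag.equals(tag) || impl.tag.startsWith(tag + ".");
    }

    boolean isDeployable(Impl impl) {
        return impl.cost <= free;
    }

    void deploy(Impl impl) {
        free -= impl.cost;
    }
}

final class Deployer {
    /**
     * Places every algorithm on some processor and returns the placement.
     * Throws IllegalStateException as soon as one algorithm cannot be placed,
     * which corresponds to the "End Failed" exit of the flowchart.
     */
    static Map<String, Proc> deployAll(Map<String, List<Impl>> algorithms,
                                       List<Proc> processors) {
        Map<String, Proc> placement = new LinkedHashMap<>();
        for (Map.Entry<String, List<Impl>> algorithm : algorithms.entrySet()) {
            Proc chosen = null;
            search:
            for (Impl impl : algorithm.getValue()) {      // next available implementation
                for (Proc proc : processors) {            // next virtual processor
                    if (proc.isCompatible(impl) && proc.isDeployable(impl)) {
                        proc.deploy(impl);
                        chosen = proc;
                        break search;                     // Algorithm deployed
                    }
                }
            }
            if (chosen == null) {
                throw new IllegalStateException(
                    "No deployable implementation for " + algorithm.getKey());
            }
            placement.put(algorithm.getKey(), chosen);
        }
        return placement;
    }
}

final class DeployDemo {
    public static void main(String[] args) {
        Proc fpga = new Proc("fpga.xilinx.virtex", 100);
        Proc jvm = new Proc("jvm", 1000);
        Map<String, List<Impl>> algorithms = new LinkedHashMap<>();
        algorithms.put("X", Arrays.asList(new Impl("fpga.xilinx.virtex.xcv100", 80), new Impl("jvm", 1)));
        algorithms.put("Y", Arrays.asList(new Impl("fpga.xilinx.virtex", 50), new Impl("jvm", 1)));
        Map<String, Proc> placement = Deployer.deployAll(algorithms, Arrays.asList(fpga, jvm));
        // X takes 80 of the FPGA's 100 units, so Y's FPGA variant no longer fits.
        System.out.println(placement.get("X").tag + " " + placement.get("Y").tag);
    }
}
```

In this sketch the order of the implementation list acts as a preference order: the first implementation that finds a compatible, deployable processor wins, so listing the FPGA variant first makes the JVM variant a fallback.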
V. Conclusion
The software platform designed satisfies the requirements
for a high-level, portable reconfigurable computing
platform framework that is highly modular and flexible
enough to utilize the varied resources available on
different PCs.
Acknowledgments
We gratefully acknowledge the funding support provided
by the NgeeAnn Kongsi (Singapore) and NgeeAnn Poly-
technic’s Innovation & Enterprise Office.
References
[1] Globus Alliance http://www.globus.org
[2] RAW Project, Laboratory for Computer Science, MIT
http://cag-www.lcs.mit.edu/raw/
[3] PipeRench Project, Carnegie Mellon University
http://www.ece.cmu.edu/research/piperench/
[4] Garp Project, BRASS Research Group, UC Berkeley
http://brass.cs.berkeley.edu/garp.html
[5] Reconfigurable Computing Group, University of Massachusetts
http://www.ecs.umass.edu/ece/tessier/rcg/
[6] FPGA Research Group, University of Toronto
http://www.eecg.toronto.edu/EECG/RESEARCH/FPGA.html
[7] BYU Configurable Computing Laboratory
http://splish.ee.byu.edu/projects/projects.html
[8] Xilinx Inc. http://www.xilinx.com
[9] VPR: Versatile Packing, Placement and Routing for FPGAs
http://www.eecg.toronto.edu/~vaughn/vpr/vpr.html
[10] Project Proteus, DSP Technology Centre, NgeeAnn Polytechnic,
Singapore http://www.projectproteus.org
[11] Altera Inc. http://www.altera.com
[12] C.A.R. Hoare, Communicating Sequential Processes, Prentice
Hall, 1986
[13] Edward A. Lee, Thomas M. Parks, Dataflow Process Networks,
Proceedings of the IEEE, vol. 83, no. 5, pp. 773-801, May 1995
