Operating system support for interface virtualisation of reconfigurable coprocessors by Vuletić, Miljan et al.
Operating System Support for Interface Virtualisation
of Reconfigurable Coprocessors
Miljan Vuletić, Ludovic Righetti, Laura Pozzi, and Paolo Ienne
Swiss Federal Institute of Technology Lausanne
Processor Architecture Laboratory
IN-F Ecublens, 1015 Lausanne, Switzerland
{Miljan.Vuletic, Ludovic.Righetti, Laura.Pozzi, Paolo.Ienne}@epfl.ch
Abstract
Reconfigurable Systems-on-Chip (SoC) consist of large
Field-Programmable Gate-Arrays (FPGAs) and standard
processors. The reconfigurable logic can be used for
application-specific coprocessors that speed up the execution of
applications. Their widespread use is limited by the complexity
of interfacing software applications with coprocessors.
We present a virtualisation layer that lowers the interfacing
complexity and improves the portability. The layer shifts
the burden of moving data between processor and copro-
cessor from the programmer to the Operating System (OS).
A reconfigurable SoC running Linux is used to prove the
concept.
1. Introduction and Goals
When interfacing application-specific coprocessors with
the rest of the system, designers must respect the specific
interface between the processor and the FPGA. In addition,
programmers must explicitly take into account the availability
and size of the memories shared between processor and FPGA.
As a result, any new host platform requires redesigning both
software and hardware. Our contribution significantly reduces
the complexity of the programming and design paradigms
and improves the portability of codesigned applications.
The programmer of a traditional system equipped with
an OS is shielded from the characteristics of the memory
system: he/she generates virtual memory addresses without
knowing whether they physically exist. This illusion results in
programming simplicity and code portability. The drawback
is that the automatic allocation of pages by the OS is,
in general, suboptimal. Figure 1 shows the software (Virtual
Memory Manager, VMM) and hardware (Memory Management
Unit, MMU) components of a virtual memory system.
[Figure 1: two block diagrams. a) Virtual Memory System: a processor
sees an (infinite) virtual memory, backed by physical memory and disk
through the OS (Virtual Memory Manager) and the Memory Management Unit.
b) Virtualised Interface: an FPGA coprocessor sees an (infinite) virtual
DP RAM, backed by the physical DP RAM and by the processor and memory
through the OS (Virtual Interface Manager) and the Interface Management
Unit.]
Figure 1. Virtualisation.
Our goal is to describe applications in high-level lan-
guages and corresponding coprocessors in hardware de-
scription languages independently of the target hardware.
An augmented OS, a compiler, and a synthesiser must be
sufficient to port the accelerated application across different
systems.
Virtual Interface Management. The programmer of a
reconfigurable computer should design data exchanges between
the processor and the coprocessor independently of
the physical system. The coprocessor designer should generate
abstract addresses rather than physical ones. To support
this abstraction, two components are added to the basic
system: (1) hardware support for the translation from
abstract to real addresses (Interface Management Unit,
IMU); (2) software support in the OS to dynamically place
data objects on the interface (Virtual Interface Manager,
VIM). Figure 1(b) shows a practical instance of a virtualised
interface.
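The translation step performed by the IMU can be pictured with a small software model. The sketch below is only an illustration under stated assumptions: the paper does not give the page size, the table organisation, or any names used here (`PAGE_SIZE`, `ToyIMU`, `TranslationFault`, `map_page`, `translate` are all hypothetical), and the real IMU is a hardware block, not Python code.

```python
PAGE_SIZE = 2048  # bytes per interface page (assumed value, not from the paper)


class TranslationFault(Exception):
    """Raised when an abstract address has no mapping on the interface."""


class ToyIMU:
    """Hypothetical software model of abstract-to-real address translation."""

    def __init__(self):
        # abstract page number -> physical page number in the interface memory
        self.table = {}

    def map_page(self, abstract_page, physical_page):
        # In the paper's scheme, mappings are installed by the OS (the VIM).
        self.table[abstract_page] = physical_page

    def translate(self, abstract_addr):
        # Split the abstract address into a page number and an offset.
        page, offset = divmod(abstract_addr, PAGE_SIZE)
        if page not in self.table:
            # No translation available: the OS would be asked to resolve this.
            raise TranslationFault(page)
        return self.table[page] * PAGE_SIZE + offset


imu = ToyIMU()
imu.map_page(abstract_page=3, physical_page=0)
physical = imu.translate(3 * PAGE_SIZE + 5)  # lands at offset 5 of physical page 0
```

The coprocessor only ever issues abstract addresses; whether a given abstract page currently resides in the interface memory is invisible to it, which is the property the virtualisation relies on.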
1530-1591/04 $20.00 (c) 2004 IEEE 
[Figure 2: bar chart of execution time (ms) versus IDEA input data size
(4KB, 8KB, 16KB, 32KB) for the pure SW version and for the coprocessor
versions (normal coprocessor; coprocessor with IMU, split into HW,
SW (DP), and SW (IMU) components). Labels recoverable from the chart:
26ms, 53ms, 105ms, 211ms; 14ms; speedups between 15x and 20x; and
"exceeds available memory" (twice).]
Figure 2. Measurements on IDEA kernel.
2. Interface Virtualisation Components
Three components implement the virtualisation:
(1) OS Services. Three system calls are provided to
software designers. Firstly, FPGA_LOAD loads a copro-
cessor definition in the reconfigurable hardware. Secondly,
FPGA_MAP_OBJECT identifies the data used by the copro-
cessor. Finally, FPGA_EXECUTE performs the data map-
ping, initialises the IMU, and launches the coprocessor.
(2) Hardware Interface. All coprocessor memory accesses
go through the IMU, which translates them to real
addresses. If no translation is possible, the OS is requested
to handle the translation fault and dynamically place the
missing data in the interfacing resources (e.g., shared or
dual-ported memory) between processor and coprocessor.
(3) Interface Management. When interrupted by the IMU,
the OS rearranges the current mapping to the on-chip
memory (logically organised in pages) to resolve the page
fault. Once the interrupt is resolved, the coprocessor exits
from the stalled state and continues. During operation
(if needed) and at task completion, the interface manager
copies the produced data back to the user space.
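The interplay of the three components can be sketched as a toy simulation. Everything below is an assumption made for illustration: the paper names the three services (FPGA_LOAD, FPGA_MAP_OBJECT, FPGA_EXECUTE) but does not give their signatures, the page size, or the replacement policy, so the method names, the tiny page size, and the FIFO eviction are hypothetical stand-ins rather than the authors' implementation.

```python
PAGE_SIZE = 4    # elements per page (tiny, assumed, for illustration only)
NUM_FRAMES = 2   # pages that fit in the dual-port memory (assumed)


class ToyVIM:
    """Hypothetical model of the interface manager; not the real kernel module."""

    def __init__(self):
        self.coprocessor = None
        self.objects = {}  # object id -> user-space buffer (a Python list)
        self.frames = {}   # (object id, page) -> copy held in dual-port memory
        self.order = []    # FIFO eviction order (assumed policy)
        self.faults = 0

    def fpga_load(self, coprocessor):
        # Stands in for FPGA_LOAD: here the "coprocessor" is just a callable.
        self.coprocessor = coprocessor

    def fpga_map_object(self, obj_id, data):
        # Stands in for FPGA_MAP_OBJECT: declare data the coprocessor will use.
        self.objects[obj_id] = data

    def _write_back(self, key):
        # Copy a page from the dual-port memory back to user space.
        obj_id, page = key
        start = page * PAGE_SIZE
        frame = self.frames.pop(key)
        self.objects[obj_id][start:start + len(frame)] = frame

    def _handle_fault(self, key):
        # Translation fault: evict the oldest page if full, then load the missing one.
        self.faults += 1
        if len(self.frames) >= NUM_FRAMES:
            self._write_back(self.order.pop(0))
        obj_id, page = key
        start = page * PAGE_SIZE
        self.frames[key] = self.objects[obj_id][start:start + PAGE_SIZE]
        self.order.append(key)

    def _frame_for(self, obj_id, index):
        key = (obj_id, index // PAGE_SIZE)
        if key not in self.frames:
            self._handle_fault(key)  # the coprocessor would stall here
        return self.frames[key], index % PAGE_SIZE

    def read(self, obj_id, index):
        frame, offset = self._frame_for(obj_id, index)
        return frame[offset]

    def write(self, obj_id, index, value):
        frame, offset = self._frame_for(obj_id, index)
        frame[offset] = value

    def fpga_execute(self):
        # Stands in for FPGA_EXECUTE: run, then copy remaining pages back.
        self.coprocessor(self)
        while self.order:
            self._write_back(self.order.pop(0))


def make_incrementer(n):
    """Build a toy 'coprocessor' that writes src[i] + 1 into dst[i]."""
    def run(port):
        for i in range(n):
            port.write("dst", i, port.read("src", i) + 1)
    return run


src = list(range(10))
dst = [0] * 10
vim = ToyVIM()
vim.fpga_load(make_incrementer(len(src)))
vim.fpga_map_object("src", src)
vim.fpga_map_object("dst", dst)
vim.fpga_execute()
```

In this model the two buffers together need six pages while only two fit in the dual-port memory, yet neither the application code nor the toy coprocessor refers to the memory size; only `NUM_FRAMES` inside the manager does.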
3. Experimental Setup and Demonstration
A VIM system was implemented using a board based on
the Altera Excalibur device (with an ARM processor and
FPGA) and running the Linux OS. The IMU is designed in
synthesisable VHDL and interfaces the coprocessor with a
dual-port memory. The VIM is realised as a kernel module.
The IDEA cryptographic algorithm is implemented for
demonstration. The software uses the VIM services to call
the coprocessor interfaced with the IMU.
Figure 2 shows execution times for three versions of
the application: pure software, a classic coprocessor, and a
VIM-based coprocessor. For the VIM-based version, three
components of the execution time are measured: (1) hardware
time, (2) software time for the dual-port memory management,
and (3) software time for the IMU management.
Even with virtualisation, the coprocessor provides a significant
advantage over pure software. The overhead of the IMU management
is acceptable (5–7% of the total execution time). There is a
hardware execution overhead of up to 20% compared to the classic
coprocessor. This is due to the FPGA implementation
of the IMU and could be lowered. The experiments for
the VIM-based version are performed by simply changing
the input data size, with no need to modify either the
application code or the coprocessor design, even when the data
sets exceed the capacity of the available dual-port memory.
4. Related Work
Standardised buses [3] and memory wrappers [2] make
the details of the underlying memory interface transparent
to the designer. Our contribution lies not in standardising
the interface details but in dynamically allocating the
interfacing memory. Some researchers have considered managing
the reconfigurable lattice across different tasks [4] and
virtualising reconfigurable hardware [1] to provide the
illusion of infinite resources. The type of virtualisation we
introduce is orthogonal and complementary to these.
5. Conclusions
In this work, we add a Virtual Interface Manager to a
reconfigurable computing platform in order to achieve a
straightforward programming paradigm and ease the portability
of applications. The approach is tested on a real
system by running a complex cryptographic algorithm
enhanced with an application-specific coprocessor. A significant
speedup is achieved while the virtualisation overhead
is shown to be acceptable.
References
[1] M. Dales. Managing a reconfigurable processor in a general
purpose workstation environment. In Proceedings of the Design,
Automation and Test in Europe Conference and Exhibition,
Munich, Mar. 2003.
[2] F. Gharsalli, S. Meftali, F. Rousseau, and A. A. Jerraya.
Automatic generation of embedded memory wrapper for
multiprocessor SoC. In Proceedings of the 39th Design Automation
Conference, New Orleans, La., June 2002.
[3] C. K. Lennard, P. Schaumont, G. De Jong, A. Haverinen, and
P. Hardee. Standards for system-level design: Practical reality
or solution in search of a question? In Proceedings of
the Design, Automation and Test in Europe Conference and
Exhibition, pages 576–583, Paris, Mar. 2000.
[4] H. Walder and M. Platzner. Online scheduling for
block-partitioned reconfigurable devices. In Proceedings of the
Design, Automation and Test in Europe Conference and Exhibition,
Munich, Mar. 2003.
