Self Reconfiguration of Clock Networks on FPGA: Methodology for partial reconfiguration of synchronous modules at run-time by Hansen, Sindre
Master of Science in Electronics
Submission date: June 2011
Supervisor: Kjetil Svarstad, IET
Norwegian University of Science and Technology
Department of Electronics and Telecommunications

Problem Description
With the project as basis, the objectives are:
- To further develop the methodology for clocked, dynamic, reconfigurable
modules.
- To integrate the methodology for clocking with the HWOS-methodologies
previously developed.
The methods should be tested by implementing a chosen system on an FPGA.
The results should be compared with published results if such exist.
Assignment given: 17. January 2011
Supervisor: Kjetil Svarstad, IET
Summary
In this thesis, methodology for partial self-reconfiguration of synchronous mod-
ules has been developed. A simple software-based scheduler has been built for
scheduling synchronous modules on the FPGA. The motivation behind this
was that partial reconfiguration of synchronous modules at run-time had not
been performed earlier in the AHEAD-project. Also, the project report writ-
ten by the same author as this thesis has shown that a synchronous module
can be replaced in a bitfile. However, the project report did not perform this
reconfiguration at run-time.
Based on the project report, the problem has been decomposed and simple
tests using clocked flip-flop designs have been performed on the FPGA. These
tests form a proof of concept for partial self-reconfiguration of synchronous
modules on the Virtex-4 FPGA. However, the tests also showed that the
reconfiguration time was quite high: it took several seconds to write one
partial bitstream to the configuration memory.
Vegard Endresen has previously made a backend module for data transfer
between the HWOS and a reconfigurable module. Experiments were performed
in this thesis to see if the clocking methodology could be integrated into this
backend module. The module could be built with the methodology, but a
running solution on the FPGA was not shown.
The software part of the HWOS was rewritten from scratch as the previous
version was not thoroughly analyzed. A round-robin scheduler using priority
queues has been implemented. A test-driven development technique has been
used for development, hopefully making the system more robust. The sched-
uler is a part of a daemon running on the embedded system, where a message
server handles requests for new processes and a placer places new tasks on the
FPGA. The complete system was initially based on ideas and code developed
by Sverre Hamre and Vegard Endresen in previous AHEAD-projects.
Foreword
This is the final report on my master thesis in the programme option Design
of Digital Systems under the study programme Electronics at the Norwegian
University of Science and Technology (NTNU). The work has been done during
20 weeks in the spring semester of 2011.
For me personally the work has been challenging, but the learning outcome
has been large. It is exciting to walk through all the phases of building an
embedded system, from planning and researching to the final implementation.
I definitely think this experience will help me as an engineer out in the industry.
The challenges when building a complex system like this are many. It is a
lot of work to read up on the existing work in the field, especially since partial
self-reconfiguration on FPGA is a rather new concept. The practical solutions
may be few and there may not be a “de facto standard” way of doing things.
There are few textbooks on the subject, only articles and research
projects with suggested implementations. Another challenge is working with
an embedded system with parts in both hardware and software. In such a
complex system, many parts have to work together, and several tools
have to be learnt to reach the final solution.
I see the learning outcome of this thesis as threefold. Firstly, I have gained
a lot of practical experience with the tools and the technology. I have learned
a lot about C programming, automated testing and working with the Xilinx tools.
Secondly, I have gained new knowledge on partial self-reconfiguration
and how a hardware operative system can be built in practice. Thirdly, I have
learned how to carry out a fairly large project. It is interesting to see how
important a good working methodology is (the methodology is presented in
section 1.2) and how I should relate to work done by others. Equally important
is how I can document and structure my work so that others can build on it.
I would like to thank professor Kjetil Svarstad for guidance and help dur-
ing the semester. I also would like to thank Magnus Namork, who has been
writing a master thesis on the Network On Chip-part of the AHEAD-project,
for tool-specific discussions.
Trondheim, June 2011
Sindre Hansen
Acronyms and expressions
FPGA
Field Programmable Gate Array. An array of programmable logic de-
signed to be reconfigured several times by the designer or end-user.
BRAM
Block Random Access Memory. Small memory blocks integrated on the
FPGA.
LUT
Lookup-table. In digital logic, a LUT is typically implemented using
multiplexers.
CLB
Configurable Logic Block. One unit of the programmable logic on the
FPGA. Typically has LUTs, multiplexers, flip-flops, latches and more.
Slice
The CLB on Xilinx Virtex FPGAs.
FF
Flip-flop. One bit storage element (register) on the FPGA. There are
typically one or more of these in each CLB.
Bitfile
In this thesis, this is typically a file containing a complete or partial
configuration for the programmable logic on the FPGA.
HWOS
Hardware Operative System. In this thesis, HWOS is an operative system
for controlling execution of tasks on an FPGA.
AHEAD
Ambient Hardware, Embedded Architectures on Demand.
Contents
1 Introduction
1.1 What has been done
1.2 Methodology
1.3 Contribution from this thesis
1.4 Outline of this report
2 Related work
2.1 Work done in the AHEAD-project
2.1.1 Sverre Hamre’s master thesis (June 2009): Framework for self reconfigurable system on Xilinx FPGA
2.1.2 Vegard Endresen’s master thesis (June 2010): Hardware-software intercommunication in reconfigurable systems
2.1.3 Sindre Hansen’s project report (December 2010): Self Reconfiguration of Clock Networks on FPGA
2.2 Related work on partial self-reconfiguration and HWOS
2.2.1 Klaus Danne (2004): Memory Management to Support Multitasking on FPGA Based Systems
3 Theory
3.1 Development tools
3.2 Base platform
3.2.1 Objective
3.2.2 The development board
3.2.3 The FPGA
3.2.4 Base VHDL-design for the Suzaku-V
3.2.5 Internal Configuration Access Port (ICAP)
3.2.6 ATMARK-dist and uClinux-dist
3.2.7 uClibc
3.3 Addressing the bitstream for Virtex-4
3.3.1 Objective
3.3.2 Definitions
3.3.3 Addressing of frames
3.3.4 The term frame in CLBRead
3.3.5 The term frame in icap write
3.3.6 The term frame in the documentation from Xilinx
3.4 The existing framework for partial self-reconfiguration
3.4.1 Objective
3.4.2 CLBRead
3.4.3 icap write
3.4.4 Bus macros
3.5 Synchronous design
3.5.1 Objective
3.5.2 Definitions
3.5.3 Motivation for using synchronous design
3.5.4 Problems in asynchronous designs
3.5.5 Timing requirements for a reconfigurable module
3.6 Defining an interface for clock signals
3.6.1 Objective
3.6.2 Definitions
3.6.3 Concept
3.6.4 Global clock buffer, BUFGCE
3.6.5 Clock root spines
3.6.6 User Constraints File (UCF)
3.6.7 Directed Routing (DIRT)
3.6.8 Setting up a base design
3.6.9 Setting up a reconfigurable module
3.6.10 Making base design compatible with synchronous modules
3.6.11 Making a scalable solution
3.7 Scheduling on FPGA
3.7.1 Objective
3.7.2 Motivation
3.7.3 Type of scheduling decisions for reconfigurable hardware
3.7.4 Process
3.7.5 Queue structure
3.7.6 Interrupter
3.7.7 Types of scheduling
3.8 Automated testing
3.8.1 Objective
3.8.2 Definitions
3.8.3 Motivation
3.8.4 Test-driven development (TDD)
3.8.5 Check: A unit testing framework for C
4 Implementation: Partial reconfiguration of synchronous modules at run-time
4.1 Objective
4.2 Definitions
4.3 Concept
4.4 Requirements and design
4.4.1 Setting up the base designs on the FPGA
4.4.2 Building reconfigurable modules in ISE
4.5 Development of methodology through test suites
4.5.1 Analysis of first test suite: Simple flip-flop designs
4.5.2 Analysis of second test suite: Instruction- and data cache backend
5 Implementation: Scheduler and HWOS
5.1 Objective
5.2 Structure of code and compiling
5.3 Documentation of HWOS in Doxygen
5.4 Work done by Vegard Endresen
5.5 General structure of the HWOS
5.6 General structure of the message server
5.7 General structure of the placer
5.8 General structure of the timer
5.9 General structure of the scheduler
5.10 List of scheduler queues: hsqlist
5.11 Process structure: hprocess
5.12 Scheduler queue structure: hsqueue
5.13 Rewritten version of icap write: hicap
6 Verification and results: Partial reconfiguration of synchronous modules at run-time
6.1 Definitions
6.2 First test suite: Simple flip-flop designs
6.2.1 Test case 1: Cut the reconfigurable modules from the base designs
6.2.2 Test case 2: Build the base design and the reconfigurable module separately
6.2.3 Test case 3: Build the base design with DIRT
6.3 Second test suite: Instruction- and data cache backend
6.3.1 Test case 1: Make the backend compatible with synchronous modules
6.3.2 Test case 2: Make the backend compatible with synchronous modules using dummy module
7 Verification and results: Scheduler
7.1 Portability
7.2 Test strategy
7.3 Description of test suites
7.4 Test results
7.4.1 The HWOS-daemon
7.4.2 The HWOS-library
8 Discussion
8.1 Partial self-reconfiguration of synchronous modules
8.2 Scheduler
9 Conclusion
10 Further work
10.1 Better testing of each part of the complete framework
10.2 Further development
10.2.1 Partial self-reconfiguration
Bibliography
A Tutorial for uClinux
A.1 Objective
A.2 Prerequisites
A.3 Download and compile ATMARK-dist
A.4 Setting up Network File System (NFS) on Suzaku and development machine
A.5 Creating kernel modules
A.6 Compiling HWICAP-driver for uClinux
B Compiling the HWOS-code
C VHDL-code
D HWOS
D.1 The HWOS daemon
D.2 HWOS-library
Chapter 1
Introduction
1.1 What has been done
For partial reconfiguration of synchronous modules, several experiments have
been performed on the FPGA using the Suzaku development board. A proof of
concept has been developed and it has been shown that synchronous modules
can be placed on the FPGA by writing them to ICAP (Internal Configuration
Access Port).
For integration with the previously developed HWOS-methodologies, several
experiments have been done with the instruction- and data cache backend
made in [End10]. The software part of the HWOS-daemon has also been
rewritten from scratch.
For a real-world system, a software-based scheduler has been integrated in
the HWOS-daemon. This scheduler works, but it has not yet been used to
schedule FPGA-based modules.
1.2 Methodology
The following is the methodology and working philosophy adopted in this
thesis. Much of the work done in this thesis is experimental and based on an
existing framework. Parts of the system have been designed for, or somehow
depend on, the proprietary, Xilinx-based FPGA platform. This is in contrast
to a completely constructive approach where you build your own system from
scratch. Some of the listed points are extra precautions that should be taken
because of this fact.
Isolating the problem should be done at an early stage. The reconfigurable
framework consists of many parts that have to work together, and even if
only a simple run-time reconfigurable module is to be tested, a complete system
has to be set up. As discussed in [Han10], this process takes quite
some time and the documentation of the FPGA-platform from Xilinx does not
always contain the level of detail needed.
Making good assumptions is another aspect that should be given extra
thought in this kind of thesis. The framework has been built over several
master's theses and the experimental approach is central in most of them. Because
of this and because the field is rather new and unexplored, one should be
especially critical of claims and hypotheses in earlier work and available articles
on the subject. Vague formulations must of course be avoided. It should be
made clear if some parts are uncertain or not thoroughly tested.
A broad literature search should be done. It is really helpful to see how
other people have solved the same or a similar problem. Their experiences
and thoughts can be important guidelines when choosing an implementation or
making design decisions. If other implementations are easily accessible, they
can be integrated into the project so that “reinventing the wheel” can
be avoided.
The portability of the framework should be taken into consideration.
Targeting a completely abstract FPGA is probably too limiting, but the technology
and the components on the test-platform should be compared to available
technology on other FPGAs. This is especially important for the work done on the
given framework, as one of the goals is to be more independent of proprietary
tools. The embedded software in this thesis has been written in ANSI-C. It can
easily be compiled both for a standard PC (x86) architecture and for the embedded
PowerPC-processor that is present on the Suzaku development board. This
can be valuable if the code must be ported to another microprocessor.
The code should be well structured. The code in this thesis has been written
in the low-level languages C and VHDL. In contrast to higher level languages
like Java or Python, these languages do not provide the same amount of
abstraction from implementation-specific details. Poorly written code can
therefore be more difficult to read, and isolating any problems may become hard if
the code is not modularized well enough. Several object-oriented principles
like data abstraction, modularity, polymorphism and encapsulation have been
utilized when writing the C-code for this thesis. This has proven crucial when
debugging and testing the system.
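As an illustration of how such object-oriented principles can be expressed in
plain C, the following is a minimal sketch. All names in it (task, task_ops,
hw_ops, sw_ops and the functions) are invented for this example and are not
taken from the HWOS sources. Encapsulation is obtained by letting callers use
only the accessor functions, and polymorphism by dispatching through a table of
function pointers. In a real project the struct definition would typically be
kept in the .c file, with only the typedef visible in the header.

```c
#include <stdlib.h>

/* Hypothetical example type; not taken from the HWOS sources. */
typedef struct task task;

/* Table of "virtual" operations: each kind of task supplies its own. */
typedef struct {
    const char *(*name)(const task *t);
    int (*cost)(const task *t);
} task_ops;

struct task {
    const task_ops *ops;  /* polymorphism: calls dispatch through this table */
    int size;             /* data reached only through the functions below */
};

static const char *hw_name(const task *t) { (void)t; return "hardware"; }
static int hw_cost(const task *t) { return 10 * t->size; }
static const char *sw_name(const task *t) { (void)t; return "software"; }
static int sw_cost(const task *t) { return t->size; }

/* One operation table per kind of task. */
const task_ops hw_ops = { hw_name, hw_cost };
const task_ops sw_ops = { sw_name, sw_cost };

/* Constructor: keeps allocation and initialization in one place. */
task *task_new(const task_ops *ops, int size)
{
    task *t = malloc(sizeof *t);
    if (t != NULL) {
        t->ops = ops;
        t->size = size;
    }
    return t;
}

/* Destructor: accepts NULL, like free(). */
void task_free(task *t) { free(t); }

/* Public interface: callers never touch the struct fields directly. */
const char *task_name(const task *t) { return t->ops->name(t); }
int task_cost(const task *t) { return t->ops->cost(t); }
```

A caller creates a task with task_new(&hw_ops, 4) or task_new(&sw_ops, 4) and
calls task_cost() without knowing which kind of task it holds. With the
invented cost functions above, the hardware variant returns 40 and the software
variant returns 4.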
The developed systems should be well tested. This can be a weakness in
some of the previous master's theses on self reconfiguration. Because proofs
of concept in the field take so much time to develop, it must be taken into
consideration that some of the previous work has not been rigorously tested.
Testing and documentation have been given great emphasis in this thesis.
Code has been documented using the documentation system Doxygen and
the test framework Check has been used for testing the different parts of the
embedded software.
An iterative approach has in general been used for developing the
methodologies and systems in this thesis. Experiments have been set up in test
suites and test cases.
The implemented system, the decisions taken and the results should be
discussed. All of these can also be compared to available literature in the field.
Even if it may be hard to do so, weaknesses in the implemented system and
any bad decisions in the design should be revealed and discussed. These
experiences are highly valuable to people who want to do further development or
others doing research in the same field.
Most important of all, the implemented system should function determinis-
tically. A product that has low power consumption, high performance and low
space requirements may be completely useless if it fails now and then. This
is especially true for embedded and real-time systems. For a hard real-time
system, missing a task’s deadline means the task has failed completely and
there is no reason to finish it. For an embedded system, it might be impossible
to do any further development or bug-fixing after it has been installed in the
field. It’s even worse if the system has been mass-produced. In any case, if the
system is non-deterministic and suddenly fails, this could become very costly,
ruin the company’s reputation or even threaten human lives.
1.3 Contribution from this thesis
The most important contributions from this thesis are:
• A more robust software-part of the HWOS and a well-tested library for
code used in this HWOS.
• A simple round-robin scheduler made specifically for partial reconfigura-
tion of FPGA-based tasks.
• A proof of concept showing that partial self-reconfiguration of synchronous
modules can be performed on an FPGA.
• A methodology for constraining and making a static interface for clock
signals.
1.4 Outline of this report
The next chapters are divided into Related work, Theory, Implementation and
Verification and results.
Related work presents some of the earlier work done in the AHEAD-project
and related work done worldwide.
Theory is a chapter for introducing the reader to the most relevant
background theory for this thesis. Some of the theory, like parts of the scheduling
theory, is quite general, but the chapter discusses how these parts relate
to the problem in this thesis. Most of the references to the bibliography
come from the Theory chapter.
Implementation shows how the final system has been implemented, which
choices have been made and why they were made. This chapter makes heavy
use of information and discussions from the Theory chapter.
Verification and results documents which parts of the implemented system
were tested and the results from the verification of the system.
Inside the Theory- and Implementation-sections there are also Objective-
sections. For the Theory, these small sections will answer the question “why is
this information provided?”, while the question “why has this been done?” is
answered in the case of the Implementation-part. The Objective-sections are
marked with the symbol in figure 1.1 to make it extra clear that they should
answer these questions and nothing else.
O
Figure 1.1: Symbol for Objective-sections.
Chapter 2
Related work
The most relevant earlier and related work is presented in this chapter. The
work is briefly evaluated and some criticism is given.
This chapter has two purposes. The first is to give the reader a better
overview of the work done in the AHEAD-project and discuss some of the
challenges the earlier work reveals. The second purpose is to introduce the most
relevant research done on partial self-reconfiguration and hardware operative
systems for FPGA.
2.1 Work done in the AHEAD-project
2.1.1 Sverre Hamre’s master thesis (June 2009): Framework
for self reconfigurable system on Xilinx FPGA
The most important work done by Sverre Hamre [Ham09] (seen from the per-
spective of this thesis) is probably the software icap write. This work has been
integrated in the scheduler implemented in this thesis and makes it possible
to replace parts of the logic on the FPGA (icap write is further discussed in
section 3.4.3).
Sverre Hamre seems to have tested icap write on only a few different partial
configurations. Because the program is such an important part of the HWOS,
it should be tested much more thoroughly.
2.1.2 Vegard Endresen’s master thesis (June 2010):
Hardware-software intercommunication in reconfigurable systems
Essentially, Vegard Endresen’s work [End10] is an implementation of a
backend/interrupter for FPGA tasks. The backend is built in VHDL and is able
to extract and load data from/into a task running on the FPGA. This
functionality can be controlled through an instruction set that is used by software
applications to communicate with the backend. A state loading/saving
module of this kind is of great importance and a critical part of HWOS. The fact
that it is built in hardware makes it even more interesting when it comes to
performance and parallel execution with the rest of the system.
There were some challenges when trying to understand and integrate
Vegard’s system in the HWOS. This could perhaps have been easier if there
had been more documentation on the backend’s internal mode of operation. The
software-part of his work could also have been better structured. Vegard
Endresen himself says the following in [End10, page 57]:
Implementation on a Suzaku-sz410 board has confirmed that the
system and its parts are working. That said some components of
the communication framework has been designed without a very
thorough pre-analysis. Primarily this is the case for the HWOS
where there might be much to gain by proper analysis and design.
2.1.3 Sindre Hansen’s project report (December 2010):
Self Reconfiguration of Clock Networks on FPGA
The work done in the project report [Han10] functions as a base for the work
done in this thesis. The results in the project have shown that a clocked module
in an FPGA-bitstream can be pulled out from a source bitstream and
successfully put into a target bitstream. This operation was done on a PC and not
while the FPGA was running. The target bitstream was later uploaded to the
FPGA and it was verified that it worked as expected.
The project report showed that partial reconfiguration of clocked modules
could be performed, but not why it was possible or how this could be done
at run-time. These are problems that have to be solved if a working
solution on the FPGA is to be realized.
2.2 Related work on partial self-reconfiguration
and HWOS
2.2.1 Klaus Danne (2004): Memory Management to
Support Multitasking on FPGA Based Systems
This article (found in [Dan04]) discusses some of the challenges when sharing
an FPGA between multiple processes. A simple run-time system is introduced
and special focus is on the Memory Management Unit (MMU), which uses the
concept of Virtual Memory Management to share distributed memory between
several hardware tasks.
Chapter 3
Theory
3.1 Development tools
The development tool for FPGA development was the Xilinx Design Suite
10.1. Included in this suite were:
• Embedded Development Kit 10.1 for building user logic base designs.
• ISE 10.1 for building standalone reconfigurable modules.
• FPGA-Editor for inspecting placed and routed designs. It was also used
for rerouting wires for logic.
• PlanAhead for setting constraints in both the base design and
the reconfigurable module.
For developing the software in this thesis, the following was used:
• The GCC-compiler for development on the development computer and
for cross-compiling for the Suzaku.
• Valgrind for debugging memory leakage problems during the develop-
ment process.
• The Check testing framework (see section 3.8.5) for testing the system.
3.2 Base platform
The development platform in this thesis is a development board from the
Japanese company Atmark-Techno. The board has an FPGA from Xilinx and
an embedded microprocessor for implementing the reconfigurable systems. The
website for Atmark-Techno has a download section containing both the base
VHDL-design for the FPGA and the Linux-distribution for the processor. The
board also has a LAN-connection for connecting it to a standard network. This
makes it easy to connect to the board from a development computer and upload
new FPGA- or software-designs.
3.2.1 Objective
O
The objective of this section is to present and give a quick
overview of the basic setup for the development platform. This
should make it easier to understand the results and discussions
in the thesis. A new developer in the AHEAD-project will need
to understand how the different parts of the platform can be set
up and what kind of tools should be used.
3.2.2 The development board
The development board used in this thesis has the specifications shown in table
3.1 and a similar model is shown in figure 3.1. The illustrated model
(SZ310-U00) has somewhat different specifications, among them the type of FPGA,
the CPU and the configuration unit.
A photograph of the Suzaku board is shown in figure 3.2.
Model                 SZ410-U00
FPGA-device           Xilinx Virtex-4 FX XC4VFX12-SF363
CPU Core in FPGA      PowerPC 405 (32-bit RISC core)
CPU Clock             350 MHz
Crystal Oscillator    100 MHz
DRAM                  2*32MB DDR2
Flash Memory          8MB (SPI)
Ethernet              10BASE-T/100BASE-TX (half-duplex not supported)
User I/O Pins         86
Serial Port           1ch (RS232C)
Configuration         SPI Flash
Board Size            72x47 [mm]
Power Input           DC3.3V
Linux kernel version  2.6
Table 3.1: Specifications of Suzaku-V (SZ410-U00)
Figure 3.1: Illustration of the Suzaku-V (SZ310-U00). SZ410-U00 is used in
this thesis. Picture taken from [AT11].
Figure 3.2: The Suzaku-V development board from Atmark-Techno.
3.2.3 The FPGA
A photograph of the FPGA is shown in figure 3.3 and its specifications are
listed in table 3.2.
Manufacturer                                        Xilinx
Model                                               Virtex-4 FX XC4VFX12-SF363
Speed grade                                         -10
CLB Array: Row x Column                             64 x 24
Number of slices                                    5 472
Number of LUTs                                      10 944
Maximum Distributed RAM or Shift Registers (Kb)     86
Number of flip-flops                                10 944
Table 3.2: Specifications of the FPGA used in the thesis.
Figure 3.3: Photograph of the FPGA on the development board.
3.2.4 Base VHDL-design for the Suzaku-V
The base design from Atmark-Techno contains the basic setup for communi-
cating with the parts on the Suzaku-V board.
3.2.5 Internal Configuration Access Port (ICAP)
The Internal Configuration Access Port (ICAP) allows for internal access to
the FPGA bitstream at runtime. The embedded processor can communicate
with ICAP through the Processor Local Bus (PLB) (as described in [Han10]
and [Xil08b]).
3.2.6 ATMARK-dist and uClinux-dist
ATMARK-dist is a Linux-distribution made specifically for the MicroBlaze-
and PowerPC-processors on the Suzaku-boards. The distribution is based on
uClinux-dist and is built around the standard Linux 2.6-kernel [Atm06].
The original uClinux was a derivative of the Linux 2.0 kernel and was
intended for microcontrollers without Memory Management Units (MMU)
[Inc10]. It was later integrated in the mainline Linux kernel sources, starting
from Linux-2.5.46 [Ung02]. uClinux-dist is a collection of libraries, applications
and tools, where the most important parts probably are the uClinux-kernel and
the C standard library called uClibc.
A tutorial for setting up ATMARK-dist and NFS is provided in chapter A
in the appendix.
3.2.7 uClibc
uClibc is a C library for the Linux platform. It is intended for embedded
systems and is much smaller than glibc, which is the C library typically used
when developing applications in Linux.
uClibc supports many of the same applications as the heavier glibc and often
it is possible to just switch libraries and recompile the source code [ucl11]. This
assumption has been important when writing the C code for this thesis. The
code has been written and tested on the development computer, recompiled
for the PowerPC-architecture and tested more extensively on the embedded
platform.
3.3 Addressing the bitstream for Virtex-4
The addressing in the bitstream for the Virtex-4 has been discussed in [Ham09,
page 19]. The term frame and what one such frame may contain can be rather
confusing. This subsection makes the meaning of a frame clear by evaluating
the existing framework, articles on the subject and the documentation from
Xilinx.
3.3.1 Objective
The most important goal for this section is to define a frame
and what meanings one frame or a series of frames can have in
the bitstream or in the FPGA configuration memory. Another
objective of this section is to give a brief overview of how frames
can be addressed in the bitstream file.
Understanding what a frame is and how frames can be addressed is very
important if the programs CLBRead or icap_write are to be used in an
application or developed further.
3.3.2 Definitions
Bitstream file
A file made by the FPGA development tools. It contains the information
needed to reconfigure the configuration memory on the FPGA, as well
as the operations and overhead data for doing so.
Configuration memory
The configuration memory on the FPGA. Virtex-4 has SRAM-based con-
figuration memory. These memory cells define the LUT equations, signal
routing, IOB voltage standards and all other aspects of the user design
[Xil09, page 87].
Frame
One frame is 41 32-bit words = 1312 bits [Xil09, page 92]. In the
bitstream, 22 frames configure 8 CLBs + 1 HCLK + 8 CLBs. In the
configuration memory, one frame covers 8 CLBs + 1 HCLK + 8
CLBs [TBC07].
Major column of CLBs
One full, vertical column of CLBs on the FPGA. Each column is one
CLB wide.
Row of CLBs
A row spans the entire FPGA horizontally and its height is 8
CLBs + 1 HCLK + 8 CLBs.
Partial reconfiguration
Reconfiguration of only a part of the configuration memory.
Self reconfiguration
Defined here as the embedded system on the FPGA performing the
reconfiguration of that FPGA. A stricter definition could be that the
FPGA reconfigures itself.
3.3.3 Addressing of frames
The FPGA used in this thesis is 16*4 = 64 CLBs in full height. A major
column of CLBs is one full height of CLBs, one CLB wide. This FPGA has
four rows, where each row spans the entire FPGA horizontally and is 16 CLBs
in height. These definitions correspond with the definitions in [Xil09, page 92].
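The frame arithmetic used in this section can be summarised in a few lines of
C. This is only a sketch: the constant and function names below are chosen
for illustration and are not taken from CLBRead or icap_write. The values
are the ones stated in the definitions above (41 words per frame, 22 frames
per column of 16 CLBs, four rows on this FPGA).

```c
#include <assert.h>
#include <stddef.h>

/* Values from the definitions in this section; the names are illustrative. */
enum {
    WORDS_PER_FRAME   = 41,  /* 32-bit words in one frame                 */
    BYTES_PER_WORD    = 4,
    FRAMES_PER_COLUMN = 22,  /* frames per column of 16 CLBs (one row)    */
    CLBS_PER_ROW      = 16,  /* CLBs covered vertically by one row        */
    ROWS_ON_DEVICE    = 4    /* rows on the FPGA used in this thesis      */
};

/* Size of one frame in bits: 41 * 32 = 1312. */
size_t frame_bits(void)
{
    return (size_t)WORDS_PER_FRAME * BYTES_PER_WORD * 8;
}

/* Size of one frame in bytes: 41 * 4 = 164. */
size_t frame_bytes(void)
{
    return (size_t)WORDS_PER_FRAME * BYTES_PER_WORD;
}

/* Bytes of frame data for one column of 16 CLBs: 22 frames of 164 bytes. */
size_t column_bytes(void)
{
    return (size_t)FRAMES_PER_COLUMN * frame_bytes();
}

/* Full height of the device in CLBs: 4 rows of 16 CLBs. */
unsigned device_height_clbs(void)
{
    return ROWS_ON_DEVICE * CLBS_PER_ROW;
}
```

With these numbers, one column of 16 CLBs corresponds to 22 * 164 = 3608
bytes of frame data, not counting any commands or overhead in the bitstream
file.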
In [Ham09, page 10], Sverre Hamre lists several articles that describe
the composition of the bitstream for Xilinx FPGAs. The article in [TBC07]
is especially interesting as it describes many of the bitstream details for
Virtex-4. One can imagine that these articles have been important when he
wrote the program icap_write.
Figure 3.5 is from Sverre Hamre’s master thesis [Ham09]. As shown in
this figure, there should be 22 frames per column of CLBs on the FPGA. It is
assumed that Sverre means there are 22 frames per column of 16 CLBs. Figure
3.6 is also from his master thesis and Sverre describes this as the organization
of frames in the bitfile. It is assumed that this shows the organization of frames
per row. Both assumptions correspond to the functionality of CLBRead and
icap_write, as well as the article in [TBC07].
Figure 3.4 is from [Han10] and shows what a frame may constitute in the
configuration memory.
Figure 3.4: Number of words in a frame and example of what frames in the
configuration memory may contain (picture taken from [Han10]).
In [Ham09, page 20], Sverre Hamre writes:
In the bitstream the CLBs consist of 22 frames, 20 of these are for
the routing and 2 are for the logic.
As discussed above, it is assumed that he means 22 frames per column of
16 CLBs. However, later on page 21 he says:
A frame in the bitstream is 41 words, 1312 bits, this is equal to 16
CLBs plus one word in the center of the bitstream for global logic.
As the discussion in this section reveals, the last statement seems to be
wrong as one frame in the bitstream does not alone configure so many CLBs.
Sverre probably meant the configuration memory.
Finally, figure 3.7 illustrates what a frame may mean in the bitstream and
in the configuration memory.
Figure 3.5: Virtex-4fx12 FPGA logic modules (picture taken from [Ham09,
page 20]).
Figure 3.6: Virtex-4fx12 bitfile frame organization (picture taken from [Ham09,
page 20]).
Figure 3.7: Conceptual drawing of the meaning of a frame on Virtex-4.
Examples from the bitstream and the configuration memory.
3.3.4 The term frame in CLBRead
A short extract from the defines in the CLBRead header file is shown in listing
3.1. The comments have been added here to show what role each define plays
in the actual program. The term frame is used in a rather confusing way in
this file as well. However, if only the defines used for addressing are
considered and FRAMES_PR_CLB is interpreted as frames per column of 16 CLBs,
everything appears to be correct and consistent with the previous discussion
in this section.
(...)
// Used for addressing in CLBRead.c
#define CLB_PR_ROW 24
// Only used for debugging in CLBRead.c
#define CLB_PR_FRAME 16

// Used for addressing in CLBRead.c
#define FRAMES_PR_CLB 22
// Never used in CLBRead.c
#define BIT_PR_CLB 80
// Only used for debugging in CLBRead.c
#define BYTES_PR_CLB BIT_PR_CLB/8
(...)
// Used for addressing in CLBRead.c
#define FRAME_LENGTH_32BIT_WORDS 41
// Used for addressing in CLBRead.c
#define FRAME_LENGTHBYTES FRAME_LENGTH_32BIT_WORDS*4
(...)
// Never used anywhere
typedef struct{
    uint8_t iCLB[BYTES_PR_CLB*FRAMES_PR_CLB];
} clb_t;
(...)
Listing 3.1: Short extract from CLBRead.h. The omitted parts are marked
(. . . ).
3.3.5 The term frame in icap_write
The header file for icap_write has the line #define WORDS_PR_FRAME 41 and
the number of frames to write has to be specified. The program takes 22
frames per column of 16 CLBs. This seems to be the correct behaviour and has
been verified by testing the program.
3.3.6 The term frame in the documentation from Xilinx
The Configuration User Guide in [Xil09, page 92] states the following:
All Frames in Virtex-4 have a fixed, identical length of 1312 bits
(41 32-bit words). One Frame configures one HCLK with either 4
block RAMS, 32 IOBs or 4 DSPs.
This presumably means that one frame in the configuration memory constitutes
one HCLK with either 4 block RAMs, 32 IOBs or 4 DSPs. Note that the
documentation from Xilinx gives little information on the composition of
the bitstream. For example, no statement could be found that 22 frames in
the bitstream configure a column of 16 CLBs.
3.4 The existing framework for partial self-
reconfiguration
In this section, the most important parts of the existing system for partial
reconfiguration will be introduced. Some terms have been discussed in [Han10]
and in earlier reports from the AHEAD-project. For these, only a brief
definition and a reference to the relevant report will be given.
3.4.1 Objective
The goal of this section is to give the reader an overview of the
existing framework for self-reconfiguration on the FPGA, before
the contributions from this thesis are introduced. This is impor-
tant, because much of the work done in this thesis builds upon
the earlier work in the AHEAD-project.
3.4.2 CLBRead
As discussed in [Han10, page 16], CLBRead is a program for reading out a
partial CLB structure from a bitfile. It can also write a partial bitstream into
another bitfile. In this case, the reconfiguration is done by modifying a bit-
stream that has not been uploaded to an FPGA. An important finding from
[Han10] is that a partial bitstream containing routing information for both
CLBs and clocks can be extracted and inserted into another bitstream. It
was also shown that a flip-flop and its corresponding clock signal can be
relocated to another placement in the bitstream. This indicates that
synchronous modules can be inserted into and relocated in a target bitstream
file.
The greatest limitation of these results is that they have not been
obtained at run-time. One source of problems that has to be considered when
performing reconfiguration at run-time is oscillating signals, especially
when reconfiguring a high-frequency clock signal. A clock buffer (see section
3.6.4) can be used for disabling the clock signal during reconfiguration and
synchronous bus macros can be used for disabling normal signals into the
module (see section 3.4.4). These considerations are important for practical
designs, but are probably not necessary for showing a proof of concept for
partial reconfiguration.
Another source of problems is that proprietary protocols from Xilinx have
to be used for writing the bitstream to the FPGA at run-time. The CLBRead
program is open-source, which makes it easier to isolate problems when doing
research in this area. Here, the analysis of the run-time reconfiguration
process done by Sverre Hamre in [Ham09] serves as supporting documentation
for any further development.
3.4.3 icap_write
Concept
CLBRead’s ability to write a partial bitstream to another file has not been
used in this thesis. Instead, the partial bitstream is written to a running
configuration on a physical FPGA through the Internal Configuration Access
Port (ICAP). Essentially, this means that the file containing the partial
bitstream and a software program for writing to ICAP have to be present on
the Suzaku platform. Sverre Hamre has in his thesis [Ham09] written the
C-program icap_write. This program runs on the PowerPC-processor on the
Suzaku, takes a bitfile description of a partial bitstream as input and writes
it to the ICAP port. It is very important to note that this program has not
been tested thoroughly, or as Sverre Hamre states (from [Ham09]):
The icap_write and icap_test programs are written to utilize the
Linux icap driver. These are just test programs, especially since
they have hard coded the adressing and has a lot of printf() for
debugging.
Although icap_write is a test program, it is well written and easy to
understand. Sverre Hamre has provided good documentation of the process of
writing to ICAP in his master thesis.
Practical use
./icap_write -h
icap_write -i [filename] -f [frames]
This program has the address hardcoded inn.
Frames will be written to CLB 21 and on, on top row.
Listing 3.2: icap_write help
As seen in listing 3.2, the program takes two inputs. The first is the filename
of the bitfile and the second is the number of frames to be written.
Because one vertical series of 16 CLBs is the smallest possible configurable
unit on the Virtex-4 and because the original program by Sverre Hamre has
the addressing hardcoded, icap_write has been rewritten, as described in
section 5.13. It is now a part of the HWOS-library and the library module is
called hicap. Refer to that section to see how this functionality is used.
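Since one column of 16 CLBs is the smallest configurable unit and takes 22
frames, a frame count passed to a program like icap_write should be a whole
multiple of 22. The sketch below illustrates such a check. The helper names
are hypothetical and are not part of icap_write or the hicap module.

```c
#include <assert.h>
#include <stdbool.h>

#define FRAMES_PER_COLUMN 22u  /* frames per column of 16 CLBs, as stated above */

/* Number of frames to write for a module that is `columns` CLB columns wide
 * and one row (16 CLBs) high. Hypothetical helper, not part of hicap. */
unsigned frames_for_columns(unsigned columns)
{
    return columns * FRAMES_PER_COLUMN;
}

/* True if `frames` describes at least one column and a whole number of
 * columns, i.e. if it is a non-zero multiple of 22. */
bool is_whole_columns(unsigned frames)
{
    return frames != 0 && frames % FRAMES_PER_COLUMN == 0;
}
```

A module that is three columns wide would, under these assumptions, need
3 * 22 = 66 frames.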
3.4.4 Bus macros
A bus macro is essentially just a static interface between the module and the
static design (figure 3.8). One simple analogy to this is a standard power
outlet in a house wall where different devices can be connected, always using
the same type of plug.
Figure 3.8: Conceptual drawing of a bus macro.
Normally the wiring between two adjacent CLBs is dynamically routed
during the place-and-route phase of the implementation process. The problem
is that the module and the static design are implemented separately, so the
routing between them will not be consistent from one implementation to the
next.
How a bus macro is made in FPGA-editor is described in a tutorial by
Sverre Hamre [Ham08b]. The simplest bus macro is made of two adjacent
CLBs and a static, predefined routing between them that does not change
between implementations. The number of wires between the two CLBs is
technology dependent, but a wider bus macro can of course be made by
setting up several pairs of CLBs.
As shown in figure 3.8, one half of the bus macro is integrated in the
reconfigurable module and the other half is integrated in the static design.
To achieve this, the reconfigurable module and one complete bus macro are
built together in an empty FPGA-design. Using the program CLBRead (section
3.4.2), the reconfigurable module and one half of the bus macro are cut out
and stored in a partial bit file.
3.5 Synchronous design
3.5.1 Objective
Because the thesis focuses on methodology for partial reconfiguration
of synchronous modules, the reader should understand what
a synchronous module is and why synchronous design is impor-
tant on an FPGA. Essentially, this section seeks to answer why
it is so important that a clock signal can be routed to the recon-
figurable module.
3.5.2 Definitions
Synchronous design
A clock signal triggers all events [Ste05, page 4].
Synchronous reconfigurable module
Is in this thesis defined as a module using the same or a derived clock
signal from the static base design.
3.5.3 Motivation for using synchronous design
In [Ste05], Jennifer Stephenson from Altera has discussed some fundamental
design practices for synchronous designs on FPGA. On page 4, she writes:
The basic principle of synchronous design is that a clock signal
triggers all events. As long as all of the registers’ timing require-
ments are met, a synchronous design behaves in a predictable and
reliable manner for all process, voltage, and temperature (PVT)
conditions. Typically, designers can easily target synchronous de-
signs to different device families or speed grades.
Her words say a lot about the motivation for using synchronous modules
in a reconfigurable framework. Making the module synchronous to the clock
signal in the base design can be seen as a way of letting the module “inherit”
the timing of the framework. Because the interface for control signals and
data transfer is the same for all the reconfigurable modules, it is important
that each module has predictable timing behaviour.
3.5.4 Problems in asynchronous designs
There has been a lot of research on asynchronous design, both for ASICs and
FPGAs. One of the largest problems is that the mainstream FPGAs and
their tools are not made for this type of design practice. The analysis tools
for FPGA designs are tailored for synchronous designs and it is easy to verify
that flip-flop setup and hold times are met under worst case conditions for
such designs [Eri00]. For asynchronous designs, the designer would have to
manually check the worst case timing of the signal paths, a process that would
be time-consuming or maybe impossible for complex designs. Even worse is
that this process must be redone for small changes in the design or if the design
is migrated to another FPGA.
Another problem is handling asynchronous input values to flip-flops. The
registers in FPGAs have a defined interval where the input value must be stable
to ensure that a reliable output signal is achieved. Specifically, the input value
must be stable for a minimum time of t_su (register setup time) before the
positive clock edge and a minimum time t_H (register hold time) after the
edge. The output value is then available after some time t_co. A problem
arises if the signal transition violates the t_su or the t_H requirement. In
this case, the register may go into a metastable state, and its output may
be delayed and not be available within the required time t_co [Alt09].
For synchronous designs, the input signal to the registers must always meet
the requirements, so the problem with metastability does not occur [Alt09].
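The setup and hold requirement can be expressed as a simple window check.
The following is a minimal sketch with illustrative names and time units; it
is not taken from any timing analysis tool. A data transition is treated as
a violation if it falls strictly inside the window from t_su before the
clock edge to t_H after it.

```c
#include <assert.h>
#include <stdbool.h>

/* Register timing parameters, here in picoseconds. The struct and its field
 * names are illustrative; real values come from the device data sheet. */
struct reg_timing {
    long t_su;  /* setup time: data must be stable this long before the edge */
    long t_h;   /* hold time: data must be stable this long after the edge   */
};

/* Returns true if a data transition at time `t_data` falls strictly inside
 * the window (t_edge - t_su, t_edge + t_h) around the clock edge at
 * `t_edge`, i.e. if it violates the setup or hold requirement and the
 * register may become metastable. */
bool violates_setup_hold(const struct reg_timing *r, long t_edge, long t_data)
{
    assert(r != 0);
    return t_data > t_edge - r->t_su && t_data < t_edge + r->t_h;
}
```

A static timing analysis tool performs this kind of check for every register
in a synchronous design, using worst-case path delays instead of a single
transition time.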
3.5.5 Timing requirements for a reconfigurable module
The types of communication between the framework and the reconfigurable
module can be divided into control signals and data transfers.
Control signals
Signalling that the module should start or that a data transfer is finished are
typical control operations. If it is impossible to know when the operation
has been performed (such as when the module has been started), hand-shake
signals should be used. Signals for initiating operations and for signalling
that operations have been done must probably be present for both synchronous
and asynchronous modules. As discussed earlier, the problem for asynchronous
modules would be performing the actual operation without a clock signal to
relate to.
Data transfers
Data transfers are more complex. The framework would typically implement
some sort of serial communication to transfer data to and from the module.
The state of the module is an example of data that must be transferred this
way. In the implementation by Vegard Endresen [End10], one or more parallel
shift-registers are used for this transfer. In this case, the framework assumes
that the module shifts out one bit of data each clock cycle. This is a safe
and deterministic way of doing high-performance data transfers if the clock
signal in the module has low skew and is synchronous to the clock signal in
the framework.
On the other hand, if the module was asynchronous, some sort of hand-
shaking mechanism would have to be used. An example implementation would
function this way:
1. The framework signals to the module that one bit should be shifted out
by setting SHIFT high.
2. The module shifts out one bit and sets a hand-shake signal, SHIFT_DONE,
high.
3. The framework sets SHIFT low and goes back to the first step.
The problem with this is that many control signals would have to be used for
the hand-shaking. Because of this, naive implementations would probably
have poor performance.
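The cost of the hand-shake can be illustrated with a small software model of
the three steps above. This is only a sketch under stated assumptions: the
function and type names are hypothetical, the model counts signal transitions
instead of real time, and a fourth phase in which the module releases
SHIFT_DONE has been added as an assumption so that the next bit can be
acknowledged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of a transfer: the received word and the number of signal
 * transitions (SHIFT and SHIFT_DONE edges) that were needed. */
struct transfer_result {
    uint32_t data;
    unsigned transitions;
};

/* Shift `nbits` bits (LSB first) out of `module_reg` using the hand-shake. */
struct transfer_result handshake_transfer(uint32_t module_reg, unsigned nbits)
{
    struct transfer_result res = {0, 0};
    bool shift = false, shift_done = false;

    assert(nbits <= 32);
    for (unsigned i = 0; i < nbits; i++) {
        /* Step 1: the framework requests one bit by setting SHIFT high. */
        shift = true;
        res.transitions++;

        /* Step 2: the module shifts out one bit and sets SHIFT_DONE high. */
        if (shift) {
            res.data |= (uint32_t)(module_reg & 1u) << i;
            module_reg >>= 1;
            shift_done = true;
            res.transitions++;
        }

        /* Step 3: the framework sees SHIFT_DONE and sets SHIFT low. */
        if (shift_done) {
            shift = false;
            res.transitions++;
        }

        /* Assumed fourth phase: the module sets SHIFT_DONE low again. */
        if (!shift) {
            shift_done = false;
            res.transitions++;
        }
    }
    return res;
}
```

In this model every bit costs four signal transitions, each of which would
have to propagate between the framework and the module, whereas the
synchronous shift register described above delivers one bit per clock cycle.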
3.6 Defining an interface for clock signals
3.6.1 Objective
The goal of this section is to provide background theory for the
development of a methodology for creating partially reconfigurable,
clocked modules. It discusses how to prepare an initial base
configuration to put on the FPGA and how to make
reconfigurable modules.
3.6.2 Definitions
Synchronous module
In this thesis, this is defined as a reconfigurable module with the same
or a derived version of the clock signal from the base design.
3.6.3 Concept
In this thesis, Directed Routing (see section 3.6.7) is a key concept for making
it possible to perform partial reconfiguration of synchronous modules. The
method can be used to put constraints on the routed clock signal for the
modules. To do this, a dummy design is first built in ISE. This design has
one global clock buffer and one flip-flop as shown in figure 3.9. Note that
the use of the global clock buffer (BUFGCE) must be clearly expressed in the
VHDL-code. This is to make sure that at least one separate clock buffer is
used to route the clock signal into one or more of the reconfigurable regions.
If the same global clock buffer was to be used for both the static and the
reconfigurable part of the design, this would mean the clock signal would be
routed to a lot of different logic and the DIRT would be harder to maintain as
explained in section 3.6.7.
The simple dummy design is synthesized, placed and routed, and the resulting
NCD-file is opened in FPGA-editor. This tool can then extract a short and
concise user constraint for the global clock buffer, the clock signal and the
flip-flop.
Figure 3.9: Dummy design containing one flip-flop and one global clock buffer
(BUFGCE)
A few of the global clock root spines for the specific region are also drawn
in figure 3.9. The purpose is to show that the clock signal will be routed
along the lowest and first available global clock wire. The routing information
from this dummy design is put into the complete, static base-design, thereby
guaranteeing that the same clock buffer and, most importantly, the same clock
wire always will be used for any reconfigurable module. The dummy design
actually defines an interface for clock signals into the reconfigurable module. In
this simple example it is defined that any reconfigurable module placed in the
upper, right corner of the FPGA is able to use the global clock buffer specified
in the dummy design and that this clock signal will be routed along the lowest
global clock wire. Any reconfigurable module built in ISE will always use the
lowest clock wire as this is the first one that is available. Because the base-
design has been built with the clock-routing-information from the dummy-
design, any synchronous, reconfigurable module will be compatible with this
design.
3.6.4 Global clock buffer, BUFGCE
The BUFGCE primitive is one of the available global clock buffers on the
Virtex-4 FPGA. It has one input and one output port for the clock signal
and the input signal CE for enabling/disabling the output clock signal. All the
global clock buffers are derived configurations from the BUFGCTRL primitive
as described in [Han10].
3.6.5 Clock root spines
The different clock resources on the Virtex-4 FPGA were introduced and
discussed in [Han10]. One important point was the distribution of the clock
signals to the different clock regions on the FPGA. Both global and regional
clock signals are routed along horizontal root spines as shown in figure 3.10.
Note that the Virtex-4 has 8 horizontal root spines, while Virtex-5 has 10.
Figure 3.10: Distribution of global clock signals into a clock region (Virtex-5).
Picture taken from [QWA09].
The connection of global clock buffers to the global clock lines on the
Virtex-4 FPGA is shown in figure 3.11. The picture shows one global clock
buffer configured as a BUFGCE (see section 3.6.4).
Figure 3.11: Connection of global clock buffers to global clock lines on the
Virtex-4. Screenshot from FPGA-editor.
The clock signals are routed along vertical root spines as shown in figure
3.12. There are several switchboxes along the vertical spine, making it possible
for the routed clock signal to turn left or right into a specific clock region. In
this figure, the clock signal is routed into the upper right region of the FPGA.
Figure 3.12: Distribution of global clock signals in the middle of the Virtex-4.
The routed clock signal is marked red. Screenshot from FPGA-editor.
Figure 3.13 shows a screenshot of the horizontal root spine on the Virtex-4
FPGA. The clock signal is routed horizontally from the middle of the FPGA
and then switched up to the CLBs. Note that there are 8 global clock wires.
The two lower wires in figure 3.13 come from regional clock buffers (see [Han10]
for more information on these).
Figure 3.13: Distribution of global clock signals in the upper right corner of
the Virtex-4. The routed clock signal is marked red. Screenshot from FPGA-
editor.
3.6.6 User Constraints File (UCF)
The User Constraints File (UCF) is used during place and route of the
FPGA logic. Examples of constraints are placement of logic blocks, usage
of resources like BRAM and timing constraints. Constraints can be written
directly into the UCF, but this is hard for complex constraints. The tool
PlanAhead is typically used for location constraints.
An interesting type of constraint for this thesis is Directed Routing as dis-
cussed in section 3.6.7.
3.6.7 Directed Routing (DIRT)
Directed Routing (DIRT) is meant to be a method for defining repeatable,
locked routing functionality for a limited number of critical signals in a design
via UCF constraints [Xil08a]. Generally, this can be useful for maintaining
determinable timing of nets in the design or as a potential workaround for
routing limitations introduced by the tools. The routing information for one
routed signal can be extracted and reused in the same or in another design.
This kind of constraint is typically extracted from a design that already has
been placed and routed. This can be done by opening the NCD-file in FPGA-
editor and choosing the menus Tools → Directed Routing Constraints.
This will open a dialog box with several configuration options, which
essentially will export a Directed Routing constraint to raw text that can be
put in a User Constraints File (UCF). Directed Routing specifies the exact
routing between two or more components on the FPGA. This means that
constraints for these components must also be extracted if the DIRT is to
function correctly. This can pose a problem when the same signal is routed to
many components, as the DIRT will be both large and strongly dependent
on the original design. Also, the extracted text constraint is not really
human-readable, which means it is not easy to maintain manually. In practice
one has to use the tools from Xilinx to generate such a constraint, but it
might be possible to reverse-engineer the format to understand how it is
built up.
3.6.8 Setting up a base design
Before tests on swapping reconfigurable modules can be performed, a VHDL-
design must be set up on the FPGA. A VHDL-project containing a minimum
configuration for the FPGA can be downloaded from Atmark-Techno’s web
pages (elaborated in 3.2.4).
The base design can be built with an initial reconfigurable module or at
least open room for a module. The reconfigurable module must be connected
to the rest of the design through so-called bus macros (described earlier in
section 3.4.4).
3.6.9 Setting up a reconfigurable module
There are two ways of creating a reconfigurable module in VHDL. The first
one is to include the reconfigurable module in the process when designing the
base design. This is typically done in EDK. The other way is to design only
the reconfigurable module in ISE (description of tools is in section 3.1).
Either way, it is important to put constraints in the design to make sure
the bus macros and the reconfigurable module are placed on deterministic
locations. Constraints are put in a User Constraints File (UCF), which is
a plain text file that can be altered manually or by the program PlanAhead
(see section 3.6.6). How to put constraints on logic is discussed in [Han10]
and in [End09a] and will not be described in detail here. The most important
logic constraints for a reconfigurable module are summarized in the following
subsections. Figure 3.14 works as a reference for these sections.
Constraints in the base design
For the base design it is important to define a reconfigurable region where one
or more reconfigurable modules can be placed.
Figure 3.14: Conceptual drawing of a reconfigurable module built on an empty
FPGA in ISE.
Constrain logic blocks for a reconfigurable module
In most reconfigurable systems, it is convenient to group the logic blocks for
a module as close together as possible. The idea is that communication will
be faster when the logic is in the same group. In this thesis, the internal
communication for a module goes through direct wiring. This should work as
a quite general solution, but it is worth mentioning that some systems use a
dedicated Network on Chip (NoC) both for communication between modules
and for the internal communication in a module.
Constrain routing for a reconfigurable module
When building a standalone module in ISE, the internal wiring in the module
should of course not extend outside the region defined for that module (see
“Illegal routing” in figure 3.14). It makes sense to keep the boundaries of this
region as tight and close to the logic as possible, but this is also dependent on
the routing resources available inside the module.
One significant problem is that routing cannot easily be constrained in
the implementation process. In general, this means that the Xilinx tools will
consider the whole FPGA when making routing choices, and wires must be
rerouted manually to make sure that routing is kept inside the module’s
region. The rerouting is done in FPGA-editor before generating a bitfile and
can be quite time-consuming for complex modules. If the module is changed
and must be synthesized again, the rerouting must be redone and even more
time is spent. Note that this is not the case for clock signals. A methodology
for constraining the clock will be presented in section 3.6.10.
One possible workaround for normal signals could be to create a hard macro
or use Directed Routing (section 3.6.7) to occupy all or most of the routing
resources on the FPGA, except the resources in the reconfigurable region.
This method has not been tried out, but could potentially force ISE to only
use routing resources in that region.
Another smaller problem is the routing from the bus macros to the IO-banks
(see figure 3.14).
3.6.10 Making base design compatible with synchronous
modules
As described in Concept (section 3.6.3), some steps must be taken to make sure
the base design is compatible with reconfigurable modules.
A dummy module can be set up using just one clocked flip-flop and one
global clock buffer. A VHDL design for this is shown in listing C.2. As
described earlier, this design is opened in FPGA-editor and the menu item
Tools → Directed Routing Constraints is chosen. As shown in figure 3.15, a
window is opened and all the routing resources in the module are listed. In
this example, the output signal from the clock buffer, clk_buffer_out, has
been selected.
Note that both a relative and an absolute constraint can be generated. To
make sure the routing is followed exactly, the latter option is chosen. After
clicking “OK”, the UCF-constraint shown in listing 3.3 will be written to a
file. As can be seen, this constraint covers the global clock buffer, the
flip-flop and the routing between them.
Listing 3.3: DIRT constraint for dummy module built in ISE
NET "clk_buffer_out"
ROUTE="{3;1;4vfx12sf363;fd3a8469!-1;1112;-11800;S!0;376;11925!1;31897;"
"82119!2;23663;-787!3;2853;26551!4;683;-496;L!}";
INST "BUFGCE_inst" LOC=BUFGCTRL_X0Y13;
INST "R/out_16_0" LOC=SLICE_X46Y127;
INST "R/out_16_0" BEL="FFY";
This constraint is put in the UCF-file of the base design. It should not
matter what kind of static design this is as long as the initial reconfigurable
module is the module with one flip-flop. It must be the exact same one as
described here.
Figure 3.15: Screenshot from the Directed Routing Constraints menu in
FPGA-editor.
3.6.11 Making a scalable solution
It should be possible to use several clock buffers in the complete design. It
could for example be useful to have different clock frequencies on different
modules. The general problem is to route a set of clock buffers BUFFER0,
BUFFER1, BUFFER2 and so on to a set of modules MODULE0, MODULE1, MODULE2
and so on. A brief description of methodology for the reconfigurable module
and the static design follows.
The static design
For the static design, this is quite easy. Just build a dummy module with all
the clock buffers and one flip-flop per clock buffer (FF0, FF1, FF2 and so on).
Iterate the design flow as follows:
1. Build a dummy module with only BUFFER0 and FF0 in ISE.
2. Extract the DIRT for this clock signal and place the constraint in the
UCF-file of the same ISE-project.
3. Rebuild the dummy module with the same buffer and flip-flop as before,
but also BUFFER1 and FF1. FF1 is placed in a different reconfigurable
region than FF0. The two regions will now represent different clock-
domains. A module placed in the first region will use the output of
BUFFER0 and a module placed in the second region will use the output
of BUFFER1.
4. Extract the DIRT for the clock output of BUFFER1 and place it in the
same project.
5. Rebuild the project again and do the same operations for BUFFER2 and
any other buffers.
6. Finally, place all the DIRT-constraints in the UCF-file of the static
design. Build the static design with the same set of initial, reconfigurable
modules.
After this is done, the static design should be compatible with the rest of
the design. The FPGA should be divided into several reconfigurable regions,
each region potentially using the output from a different clock buffer.
The reconfigurable module
It is assumed that each reconfigurable module uses only one clock signal.
In this case there is no need to constrain the clock signal, as the first clock
wire into the module will always be used.
3.7 Scheduling on FPGA
3.7.1 Objective
The objective is to provide background information on how
scheduling is done for computer systems in general, for embedded
systems and for partially reconfigurable systems on FPGA. In this
thesis, this theory forms a background for the choices made when
implementing the scheduler in chapter 5.
3.7.2 Motivation
In this thesis, an FPGA is used as a reconfigurable hardware accelerator and
the goal is to increase the overall speed of the system. Processes that
run on the FPGA can be interruptible, meaning that after they have been
placed on the FPGA, they can be stopped. The process state can be saved to
for example a file on disk or to some random access memory. The FPGA area
can then be utilized by some other process that needs run-time.
[Wol05] discusses hardware accelerators and scheduling on hardware. As
said in this article, moving functionality from software to a hardware accelera-
tor must of course only be done if the task can run faster on the hardware than
on the CPU. This can be a challenge for a high-performance CPU, but the fact
that hardware modules can run in parallel with other modules and the CPU
increases the chance of speed-up. Furthermore, the system should be analyzed
and the bottlenecks should be identified. Speeding up one part of the design
in hardware when in fact another part is the bottleneck may not add any
net gain to the complete system. Yet another challenge is communication and
synchronization between the accelerated module and the rest of the system.
The communication cost between two processes on the same processing unit
is in general less than the cost between processes on different units. Even if
a module runs faster in hardware than in software, the communication costs
may be too large to gain any real speed-up.
3.7.3 Types of scheduling decisions for reconfigurable
hardware
For software programs running on a single CPU, a scheduler is in charge
of determining when each process should run on the CPU. This becomes more
complex if there are multiple CPUs on which the process can run. It depends on
the definition of a scheduler whether or not it is also the scheduler’s
responsibility to decide where the process should be executed. In this thesis,
the following terms are defined.
Partitioning module
The partitioning module is responsible for partitioning the application between
hardware and software.
Scheduler
In this thesis, scheduling is defined as deciding when a process should run.
Placer
The placer is defined as the part of the HWOS that decides where the process
should be placed on the FPGA. This can be done in a 1-dimensional or
2-dimensional manner. For 1D-placers, many of the same concepts as for virtual
memory can be used.
If a reconfigurable system, like an FPGA and a coprocessor, has maximum
flexibility, a process can run in both hardware and software, at an arbitrary
place on the FPGA and at any given time. For such a system, the decision
taken by the scheduler may of course depend on the results from the partitioning
module and the placer. For example, it may not be a good idea to schedule a
process for run-time on the FPGA if it must transfer a lot of data to another
module that is running in software. Also, it may not be a good idea to place a
module on the FPGA if it is too large to fit within the existing configuration.
3.7.4 Process
In this thesis, a process is defined as an entity owned by some module, either
a software or a hardware module, that is running on the FPGA for hardware
acceleration. For example, a software application can ask the hardware
scheduler to perform multiplication operations on the FPGA. If a multiplier is
present on the FPGA, the process can utilize the multiplier logic to perform
its operations in cooperation with the larger software application. In a
certain way, one can see the multiplier logic on the FPGA as a class and
the process as an object of that class. A process has state data and input and
output values bound to it, while a reconfigurable module is logic acting upon
these input and state values.
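The class/object analogy can be sketched in C. This is only an illustration of the idea: the names hw_module, hw_process, multiply_logic and process_step are hypothetical and are not taken from the HWOS source, and the "logic" is an ordinary C function standing in for the hardware.

```c
#include <stdint.h>

/* Hypothetical illustration: the reconfigurable module plays the role of a
 * "class" (pure logic), the process plays the role of an "object" (data). */
struct hw_module {
    const char *name;                               /* e.g. "multiplier" */
    uint32_t (*apply)(uint32_t state, uint32_t in); /* logic acting on state and input */
};

struct hw_process {
    const struct hw_module *module; /* logic this process is an instance of */
    uint32_t state;                 /* state data bound to the process */
    uint32_t input;                 /* current input value */
    uint32_t output;                /* last output value */
};

/* Software stand-in for multiplier logic: keeps a running product. */
static uint32_t multiply_logic(uint32_t state, uint32_t in)
{
    return state * in;
}

/* One step: the module logic acts on the state and input of the process. */
static void process_step(struct hw_process *p)
{
    p->state = p->module->apply(p->state, p->input);
    p->output = p->state;
}
```

Two processes can share the same hw_module while keeping separate state, which mirrors several processes reusing one multiplier on the FPGA.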
Whether a process is I/O-bound or processor-bound is an important classi-
fication for any type of scheduler.
I/O-bound processes
This type of process makes heavy use of I/O-resources. This can for example
be a process that is awaiting keystrokes from a user. Such a process will spend
a relatively long time waiting for input, and will therefore often be preempted.
However, it is important that the process is woken up as fast as possible,
otherwise the user will experience the system as unresponsive. In a typical
UNIX-scheduler, this kind of process will be given high priority so that it can be
resumed quickly [BC00].
In the AHEAD-project, the FPGA is meant to function as a hardware
accelerator on a platform separate from the client. Because the communication
cost between the tag and the client will be relatively large, it is probably
not a good idea to place user-interactive processes on the FPGA. It is more
appropriate that such processes are separated from the rest of the application
and placed in software. In the case of a smartphone client, the process can run
on the mobile OS.
However, there will be processes running on the FPGA that will have more
need for communication than others. For example, the bottleneck of an
MPEG-decoder could be running in hardware, but still need new input data quite
often.
Processor-bound processes
This kind of process typically spends much time performing heavy
calculations that demand a lot of the processor. A process can for example
be busy running a cryptography algorithm that takes a few inputs, but where
the actual algorithm takes a lot of time to run on the processor. Because these
processes will spend a lot of time on the CPU and not be preempted often by
external events, the scheduler should penalize them so they do not starve other
processes. In Linux, or any other UNIX-based kernel, such processes will be
assigned a less favorable priority [BC00].
When building the scheduler for this thesis, it was assumed that all processes
in the system are mostly processor-bound. It is assumed that the decision
about which processes should run in software and which processes should be
accelerated on the FPGA has already been taken by a partitioning module.
In any case, there will be some processes on the FPGA that will have less
need for communication with the client or the other modules on the FPGA.
Ideally, these processes should be given lower priority so that they do not occupy
the FPGA too much. A valid question to ask is how the scheduler should know
how much the process is bound to I/O. This information can, if possible, be
given by the module that partitions the application between hardware and
software and should in fact be a major consideration when performing this
partitioning.
Static priority
In Linux, static priority ranges from 1 to 99 and can be given to real-time
processes by the users [BC00]. Real-time processes will always have higher
priority than conventional processes. This concept is interesting when
considering scheduling in the AHEAD-project. For example, a pool of hardware
modules could be available for the AHEAD-tag, each assigned a static priority
relative to the other modules. It would then be quite easy for the scheduler
to assign priorities to the processes running on the FPGA.
3.7.5 Queue structure
The scheduler developed in this thesis makes heavy use of queue-structures for
the processes. The queue structure is central in many schedulers, including
the Linux scheduler [BC00]. A general queuing diagram for scheduling is
shown in figure 3.16, taken from [Sta05, page 396].
As seen in the figure, the scheduler typically has an incoming queue for
jobs and a READY-queue for jobs that are ready to be placed on the processor.
Figure 3.16: A generic overview of queues used in scheduling. Picture taken
from [Sta05, page 396].
A process that has to wait for I/O is typically interrupted and placed in the
BLOCKED-queue. When the I/O-resource is ready this process moves to the
READY-queue as seen in the figure. The two suspend-queues are used when a
process is swapped out from main memory to disk. The concept of swapping
is complex and is not studied in this thesis.
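The movement of a process between the queues can be sketched as a small state machine in C. The sketch is an assumption-laden illustration: the state and event names are hypothetical, the two suspend-queues are left out (swapping is not studied here), and the transitions only loosely follow figure 3.16.

```c
#include <stdbool.h>

/* Hypothetical state names, loosely following the queues in figure 3.16. */
enum proc_state { PS_NEW, PS_READY, PS_RUNNING, PS_BLOCKED, PS_EXIT };

/* Hypothetical events that move a process from one queue to another. */
enum proc_event {
    EV_ADMIT,    /* job accepted from the incoming queue */
    EV_DISPATCH, /* process placed on the processor */
    EV_TIMEOUT,  /* time slice elapsed */
    EV_IO_WAIT,  /* process has to wait for I/O */
    EV_IO_READY, /* the I/O resource is ready */
    EV_RELEASE   /* process finished */
};

/* Apply an event; returns true and updates *s if the transition is legal. */
static bool proc_transition(enum proc_state *s, enum proc_event ev)
{
    switch (*s) {
    case PS_NEW:
        if (ev == EV_ADMIT) { *s = PS_READY; return true; }
        break;
    case PS_READY:
        if (ev == EV_DISPATCH) { *s = PS_RUNNING; return true; }
        break;
    case PS_RUNNING:
        if (ev == EV_TIMEOUT) { *s = PS_READY; return true; }
        if (ev == EV_IO_WAIT) { *s = PS_BLOCKED; return true; }
        if (ev == EV_RELEASE) { *s = PS_EXIT; return true; }
        break;
    case PS_BLOCKED:
        if (ev == EV_IO_READY) { *s = PS_READY; return true; }
        break;
    case PS_EXIT:
        break;
    }
    return false; /* illegal transition: state is left unchanged */
}
```

A BLOCKED process cannot be dispatched directly; it first has to return to the READY-queue when its I/O resource becomes available.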
3.7.6 Interrupter
Vegard Endresen has in his thesis worked on an interrupter built in VHDL
and implemented on the FPGA [End09b]. The purpose of this module is to
ask the running process to stop and then act as a buffer mechanism between
the process and the CPU when the process sends its state data to the CPU.
The interrupter takes commands from the CPU, which in turn are interpreted
as actions such as shifting data in or out of the module or starting or
stopping the module.
The interrupter is a critical part of the scheduler, as much depends on whether
the state of the module can be extracted and reinserted fast enough. Equation
3.1 (taken from [Wol05]) summarizes this. In the worst-case scenario
for a partially reconfigurable system like the one in this thesis, t_{in} is the
time it takes to place the process on the FPGA and insert the state data, while
t_{out} is the time it takes to save the state data. It is assumed that the
module’s functionality is static and saved in a bitfile on disk, and that there
is never any need to read back the logic of a module from the FPGA.
The functional description of a process does not always need to be removed
from the FPGA. In a larger system, there may be several reconfigurable modules
on the FPGA at the same time. For example, an adder and a
multiplier could both be present on the FPGA the whole time and be reused by
several processes. In this case, t_{in} and t_{out} would only be the time to
stream data in and out of the module.
t_{accel} = t_{in} + t_x + t_{out}    (3.1)
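Equation 3.1 can be turned into a small check of whether offloading a task pays off. The sketch below assumes that t_x is the execution time on the accelerator, as in [Wol05], and that all times are given in the same unit. The function names and the comparison against a CPU execution time t_cpu are illustrative assumptions and are not part of the HWOS.

```c
#include <stdbool.h>

/* Total accelerated time, equation 3.1: t_accel = t_in + t_x + t_out. */
static double accel_time(double t_in, double t_x, double t_out)
{
    return t_in + t_x + t_out;
}

/* Offloading only pays off if the accelerated time beats the CPU time. */
static bool worth_accelerating(double t_cpu, double t_in, double t_x, double t_out)
{
    return accel_time(t_in, t_x, t_out) < t_cpu;
}
```

With a task that takes 2.0 time units on the CPU and 0.5 on the FPGA, a t_in of 2.0 (for example dominated by writing a partial bitstream) cancels the gain, while a t_in of 0.25 (module already present, only state streamed in) keeps it.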
Another way of saving and restoring state data is using the Internal Config-
uration Access Port (ICAP) (see section 3.2.5). This method has been briefly
discussed as possible in Roman Plessl’s master thesis in [Ple04, page 72]. The
downside with this approach is that one would have to read back and save
both the state data and the logic of the module. Most of the data would then
describe the functionality of the module and not the state.
Round-Robin scheduling policy and time slices
A time slice is defined as how long a process is allowed to run before it is
preempted by the system. For a scheduling policy like Round-Robin (RR),
each process is allowed to run for a certain time and is then preempted by
a clock interrupt at periodic intervals [Sta05, page 405]. When the currently
running process is interrupted, it is placed in the READY-queue.
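A minimal sketch of this policy in C is shown here. It is not the scheduler from this thesis: the ring-buffer READY-queue, its capacity and all function names are assumptions made for the illustration, and processes are represented only by integer ids.

```c
#include <stdbool.h>
#include <stddef.h>

#define RQ_CAPACITY 8 /* arbitrary size chosen for the sketch */

/* FIFO READY-queue of process ids, stored as a ring buffer. */
struct ready_queue {
    int ids[RQ_CAPACITY];
    size_t head;  /* index of the next process to run */
    size_t count; /* number of queued processes */
};

static bool rq_push(struct ready_queue *q, int id)
{
    if (q->count == RQ_CAPACITY)
        return false;
    q->ids[(q->head + q->count) % RQ_CAPACITY] = id;
    q->count++;
    return true;
}

static bool rq_pop(struct ready_queue *q, int *id)
{
    if (q->count == 0)
        return false;
    *id = q->ids[q->head];
    q->head = (q->head + 1) % RQ_CAPACITY;
    q->count--;
    return true;
}

/* Called when the time slice expires: the running process goes to the back
 * of the READY-queue and the process at the front becomes the running one.
 * Returns the id of the process that runs next. */
static int rr_time_slice_expired(struct ready_queue *q, int running)
{
    int next = running;

    if (!rq_push(q, running))
        return running; /* queue full: keep running instead of losing the process */
    rq_pop(q, &next);   /* cannot fail: the queue now holds at least one entry */
    return next;
}
```

With process 1 running and processes 2 and 3 queued, three consecutive time-slice expirations run 2, then 3, then 1 again.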
3.7.7 Types of scheduling
The goal of a scheduler is to meet system objectives, such as response time,
throughput and processor efficiency [Sta05, page 393]. The different types of
scheduling are summarized in table 3.3, which is taken from
[Sta05].
Type                     Description
Long-term scheduling     The decision to add to the pool of processes to be executed.
Medium-term scheduling   The decision to add to the number of processes that are partially or fully in main memory.
Short-term scheduling    The decision as to which available process will be executed by the processor.
I/O scheduling           The decision as to which process’ pending I/O request shall be handled by an available I/O device.
Table 3.3: Types of scheduling (from [Sta05]).
3.8 Automated testing
3.8.1 Objective
The objective of this section is to introduce the reader to methodology
for testing embedded systems. Since partially reconfigurable
systems have conceptual similarities to software-based operating
systems, it seems natural to look at some of the most promising
testing methodologies from software development. Test-driven
development (TDD) has gained a lot of popularity in recent years and
may also be applicable for many embedded systems.
The theory in this section is used as background for the testing of the
implemented systems in this thesis.
3.8.2 Definitions
Test-driven development (TDD)
Software development process.
Unit test
Smallest testable unit in a system. Could for example be a function of a
module.
3.8.3 Motivation
The embedded software developed in this thesis runs on the PowerPC-processor
on the Suzaku-development board. The code was first developed for the
x86-architecture using Linux and the glibc-library and then recompiled to run on
the PowerPC using the uClibc-library. As explained in section 3.2.7, porting a
C-application from glibc to uClibc is often no harder than writing proper
Makefiles and recompiling the software. This fact makes it interesting to
look into more sophisticated methods to perform testing. Test-driven develop-
ment, along with automatic testing, is an interesting approach to performing
HW/SW-coverification of embedded systems.
Verification of HW/SW-systems is in general harder than for pure software
systems. A typical methodology is to divide the development into phases.
First the hardware is developed in one phase, then the software is built on top
of the hardware in the next phase. This means the software developers have
to wait for the hardware developers to finish or vice versa. It also means that
the interfaces and structure of the HW and SW have to be built independently,
when they in fact depend heavily on each other. More time can then be
spent redefining the interfaces or restructuring the system in one phase. If
the software phase detects any bugs in the hardware, this can be very costly
to repair and the reiterations in the development cycle may become large.
Variations of this methodology will of course exist, but the general problem is
that time-to-market is crucial and it may be hard to build robust systems this
way.
3.8.4 Test-driven development (TDD)
Test-driven development is a rather old technique (originating in the 1980s),
but has later gained more attention as development processes such as eXtreme
Programming (XP), SCRUM and the Unified Process (UP) have become popular
[CS10].
Figure 3.17 is from the article in [CS10] from January 2010. As discussed
in that article, test-driven development is based on the red-bar/green-bar
mantra, where the colors describe different steps in the development process.
The TDD-approach, as most other development approaches, starts with a
list of requirements. One specific requirement is then converted to an exe-
cutable test as shown in top of the figure. This could for example be a small
unit test. To at least make the test compile successfully, some minimal sys-
tem code is implemented, thereby moving down to the red bar as depicted in
the illustration. The test is run and reports that the given system does not
respond as expected. Now the developer knows two things:
• The test is working. It is reporting failure as expected.
• The system contains unimplemented features or features not working.
Now the developer starts working with the system to make the test pass
successfully. When this is done, the process has moved to green bar (see figure
3.17). When making the test pass, the developer’s focus could have been quite
narrow and in the worst case some poorly written code was developed just to
make the test pass. Some reasons for this are listed below.
• The developer did not see the tested unit as part of a complete system
and did not consider the reuse potential.
• The development involved a lot of low-level or system-dependent consid-
erations. The language used could for example involve a lot of implementation-
specific details. For an embedded system, low-level optimizations can be
a necessary part of the requirements.
To make the system better, the tested implementation is refactored several
times as seen in the bottom of figure 3.17. Ideally, the refactoring should
not introduce any new functionality. The smallest iteration makes the system
implementation better structured, more robust and more reusable. The largest
iteration is needed to improve the structure of the tests and improve the interface
to the implemented system. For each change, the test is rerun to make sure
everything is working as expected.
A large gain from TDD is that focus is shifted from implementation to the
interface and observable behaviour of the system [CS10]. Also, focusing on
verification early in the development process helps reveal misunderstandings
in the specifications and poorly designed interfaces. The motivation is simply:
Errors that are revealed later in the development process are more costly to
repair. Errors that are revealed after production can be very expensive to
repair, as the system may be mass-produced and/or in use by customers.
Another advantage is that TDD focuses on automated testing. These tests
can be run several times and the final results are also easily reproducible.
For some systems, a small modification of one part of the system may cause
another part to malfunction. The malfunctioning part may have been tested
earlier, but without rerunning the test the error may not be revealed before
much later. Well structured automatic tests, for example written using a test
framework, may also function as documentation of the system’s behaviour.
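The red-bar/green-bar cycle can be illustrated with a very small C example. It is a sketch only: the requirement (an 8-bit shift-and-add multiplication), the function names and the use of plain assert from the C standard library instead of a test framework are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Red bar: a stub such as `return 0;` lets the test compile but fail.
 * Green bar: the implementation below makes the test pass. */
static uint16_t mul_shift_add(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    uint16_t addend = a;

    while (b != 0) {
        if (b & 1u)
            product += addend; /* add the shifted multiplicand for each set bit */
        addend <<= 1;
        b >>= 1;
    }
    return product;
}

/* The requirement expressed as an executable test, written before the code. */
static void test_mul_shift_add(void)
{
    assert(mul_shift_add(0, 7) == 0);
    assert(mul_shift_add(3, 5) == 15);
    assert(mul_shift_add(255, 255) == 65025);
}
```

During refactoring the test is rerun after every change, so a restructured mul_shift_add that no longer handles the boundary case 255 * 255 is detected immediately.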
Figure 3.17: Test-driven development cycles. Picture taken from [CS10].
3.8.5 Check: A unit testing framework for C
Most of the implemented parts in this thesis were tested using the test framework
Check [che09]. Check is simple, written in ANSI-C and used by several open-
source projects (see their website for more information). It is easy to compile
the framework for both the x86- and the PowerPC-architecture, meaning many
of the same tests can be run on both the development computer and the Suzaku
board. For this thesis, the tests involving an FPGA could obviously not be
run on the development computer, but all other tests also run on this
system.
Using a test framework should also make it easier for future developers to
reuse the work. The tests are easy to read and it is quite clear which
parts of the system have been tested and which have not.
Chapter 4
Implementation: Partial
reconfiguration of synchronous
modules at run-time
4.1 Objective
The purpose was to create a methodology for making reconfig-
urable modules with an incoming clock signal. This is an impor-
tant part of the AHEAD-project and a key for making modules
interruptible in a HWOS. A module typically makes use of several
flip-flops for saving its state, and clocked shift-registers are used
in Vegard Endresen’s master thesis [End10] for shifting this data in and out.
Performing partial reconfiguration of asynchronous modules has been studied
and performed at run-time in [Ham09]. However, reconfiguring synchronous
modules has not been done earlier in the AHEAD-project and has also been shown
to be problematic in [End10].
4.2 Definitions
DIRT
Directed Routing (see section 3.6.7). Used to constrain the clock routing
for the reconfigurable framework.
EDK
Embedded Development Kit from Xilinx (see section 3.1). Used to build
complete FPGA configurations.
ISE
ISE Design Suite from Xilinx (see section 3.1). Used to build partial
FPGA modules.
Base design in VHDL from Atmark Techno
Used as basis for the FPGA on the Suzaku-V development board (pre-
sented in section 3.2.4). Built in EDK.
User logic base design in VHDL
Built by the developer on top of the base design from Atmark Techno.
This is the framework and the interface for the reconfigurable modules.
Must contain bus macros and is built in EDK.
Reconfigurable module in VHDL
The module to be swapped in and out of the FPGA. It is built in ISE and
cropped by CLBRead afterwards. Must contain bus macros for signal
interface.
BUFGCE
Global clock buffer configured with Clock Enable (CE) port (see section
3.6.4).
NFS
Network File System (NFS). Described in chapter A.
4.3 Concept
In [Han10], reconfiguration of synchronous modules was performed. The
experiments and results from the project report are important and show that the
existing framework supports reconfiguration of the routing information for a
clock signal. The problem is that these experiments were not performed while
the FPGA was running. The partial reconfiguration of the bitfile was done
on the development computer and this file was afterwards verified by uploading
it to an FPGA. Also, no thorough analysis was made of why partial
reconfiguration of synchronous modules worked in this case.
The simple clocked flip-flop designs were found in [Han10]. A drawing of
the base design and the reconfigurable modules is shown in figure 4.1. The
setup contains one base design that is initially uploaded to the FPGA and four
different reconfigurable modules. Each module has a different placement of the
clocked flip-flops.
The ultimate goal was to perform partial reconfiguration of these syn-
chronous modules by writing them to ICAP at run-time.
[Figure 4.1 labels: base design with BUFGCE, bus macros and the initial reconfigurable module; reconfigurable module 1 (one flip-flop), module 2 (one flip-flop shifted left), module 3 (three flip-flops) and module 4 (three flip-flops shifted left). Legend: clock signal (constrained by Directed Routing), part of clock routing, flip-flop (one-bit register).]
Figure 4.1: Base design and reconfigurable modules for testing reconfiguration
of flip-flops.
4.4 Requirements and design
4.4.1 Setting up the base designs on the FPGA
The base VHDL-design for the Suzaku-V from Atmark Techno was set up in
EDK.
Several different user logic base designs were set up for the tests. These are
briefly discussed here.
Base design 1: One flip-flop
This user logic base design is appended in listing C.1 and was found in [Han10].
A global clock buffer (BUFGCE) was added to enable/disable the clock signal
to the reconfigurable module. The Clock Enable (CE) input port of this buffer is
connected to one of the software accessible registers, thereby making it possible
to enable/disable the clock signal in the reconfigurable module from software.
As described in section 3.4.4, bus macros must be included in the base design
to make it compatible with reconfigurable modules.
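From software, gating the clock then amounts to setting or clearing one bit in that register. The sketch below is an assumption-based illustration: the bit position, the macro name and the function names are hypothetical, and the register is passed in as a pointer because the actual address and access path (for example through the kernel module from [Han10]) depend on the user logic.

```c
#include <stdint.h>

/* Assumption: bit 0 of the software accessible register drives the CE port
 * of the BUFGCE. The real bit position depends on the user logic. */
#define CLK_ENABLE_BIT (1u << 0)

/* Enable the clock signal into the reconfigurable module. */
static void module_clock_enable(volatile uint32_t *reg)
{
    *reg |= CLK_ENABLE_BIT;
}

/* Gate the clock so the flip-flops in the module stop being clocked. */
static void module_clock_disable(volatile uint32_t *reg)
{
    *reg &= ~CLK_ENABLE_BIT;
}
```

Using read-modify-write keeps the other bits of the register unchanged, which matters if the same register also carries data or control bits for the module.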
This base design was built with the initial module shown in listing C.3
found in [Han10]. The values of the flip-flops can be read by using the kernel
module and the small C-program found in [Han10]. How this design is placed
and routed on the FPGA is shown in figure 4.2.
Figure 4.2: Base design with one flip-flop (from FPGA-editor).
Base design 2: Three flip-flops
This design is the same as base design 1, but the initial reconfigurable module
has three flip-flops as shown in listing C.4 (from [Han10]). A picture of the
design is shown in figure 4.3.
Figure 4.3: Base design with three flip-flops (from FPGA-editor).
Base design 3: One flip-flop with DIRT constraint on clock
This design is built using base design 1 as basis, but the routed clock signal
was constrained by a DIRT constraint. How this is done is discussed in section
3.6.
Base design 4: Vegard Endresen’s instruction- and data cache back-
end
As described in [End10] and in section 3.7.6, this is the most promising design
from Endresen’s thesis for handling state loading/saving. The design is built with a
sequential multiplier as the initial reconfigurable module. As in all the other
base designs, bus macros are used for communicating with the reconfigurable
module.
4.4.2 Building reconfigurable modules in ISE
All the reconfigurable modules shown in figure 4.1 were built without the base
design. This was done in Xilinx ISE. These modules and how they were built
are described briefly here.
Reconfigurable module 1: One flip-flop
The simple code for this module is shown in listing C.3. By generating a User
Constraints file (see section 3.6.6) in PlanAhead, the module is forced to be
placed at the upper left corner of the FPGA. This is important to make sure
the module is placed in the reconfigurable area when writing it to the base
configuration on the FPGA. The module will look like the module in figure
4.2 without the static design.
Reconfigurable module 2: One flip-flop shifted left
This is the same module as the first module, except that a constraint in the
UCF-file is used to move the flip-flop left.
Reconfigurable module 3: Three flip-flops
The code for this module is shown in listing C.4. The same type of constraints
is used as for the first module.
Reconfigurable module 4: Three flip-flops shifted left
This is the same module as the third module, except that a constraint in the
UCF-file is used to move the flip-flops left.
Reconfigurable module 5: A multiplier
The multiplier from base design 4 was also built as its own module in ISE.
4.5 Development of methodology through test
suites
The methodology for self-reconfiguration of synchronous modules was developed
by running a series of test suites. These test suites, their objectives, results
and summaries are described in chapter 6.
4.5.1 Analysis of first test suite: Simple flip-flop designs
The first test suite is described in section 6.2. The results from this suite
verified that the experiments from [Han10] could be performed at run-time by
writing the reconfigurable module to ICAP, but does not explain why Vegard
Endresen could not perform run-time reconfiguration in [End10].
Vegard Endresen only used EDK for building the base design, while the
reconfigurable modules were built without the base design in ISE. To find out
whether there was any difference when building the modules alone in ISE compared
to building them together with the static design in EDK, Vegard’s system was
rebuilt.
Figure 4.4 shows a close-up screenshot of the clock routing in Vegard’s static
backend module. The system was built with a multiplier module as the
initial reconfigurable module and a global clock buffer for controlling the clock
signal into the module. As seen in this picture, the clock signal is routed along
the second clock wire.
For the simple flip-flop design (base design 1), figure 4.5 shows that
the clock signal is routed incorrectly along the second clock wire. All the
reconfigurable modules built in ISE had the clock routed along the first clock
wire.
The mismatch of the clock wires is seen as a likely reason as to why his
experiments and the experiments with the simple flip-flop designs did not
succeed.
Figure 4.4: Screenshot of clock routing for Vegard’s backend module (from
FPGA-editor).
Different ways of solving this problem were investigated. One potential
solution was to use a bus macro solution for the clock signal. Bus macros have been
used for the data signals into the module (see section 3.4.4). Sverre Hamre’s
project report [Ham08a] and the attached tutorial show how a bus macro
can be built. However, because of the large amount of work needed to create one
such bus macro, and because it was uncertain whether this solution would work
for a clock signal, this idea was discarded.
A more promising solution was the use of Directed Routing. Section 3.6.3
in the theory discusses how this concept works and how it can be applied to
a complete static design. This was done in the third test case (section 6.2.3). This test
case is a proof-of-concept showing that a static interface for the clock routing
can be defined, much in the same way as bus macros define a static interface
for data and control signals.
4.5.2 Analysis of second test suite: Instruction- and
data cache backend
The test cases in the second test suite (see section 6.3) were performed to check
if the clocking methodology could be used for the backend module written by
Vegard Endresen. As can be seen from the results, both tests failed. It was
possible to build the backend and constrain the clock routing if the backend
was built with a simple flip-flop as its initial reconfigurable module.
Figure 4.5: Screenshot of clock routing for base design 1 (from FPGA-editor).
Chapter 5
Implementation: Scheduler and
HWOS
5.1 Objective
The goal was to develop and implement a simple scheduler for
FPGA-based tasks. This scheduler should be able to schedule
processes in time. A good structure should be realized to make
room for future optimizations on the critical parts of the sched-
uler.
5.2 Structure of code and compiling
The HWOS can be compiled for the standard Linux x86-architecture or the
embedded uClinux using the PowerPC microprocessor. A description of the
directory structure for the HWOS and how to compile the code is found in appendix B.
5.3 Documentation of HWOS in Doxygen
All the C-code for the HWOS was documented in the documentation system
Doxygen. An HTML version of this documentation is included in the appended
ZIP-file. The HTML-document is probably the best place to start when trying
to understand the source code.
The documentation can be found in the directory hwos_sw/docs.
The code is also in the appendix (see section D) for convenience.
5.4 Work done by Vegard Endresen
The software part of the HWOS was started from the implementation made by
Vegard Endresen in [End10]. However, as stated by Vegard Endresen in section
2.1.2, this implementation was rather rough and done without a thorough
pre-analysis. Most of the implementation has been replaced or completely
rewritten. The parts that are based on Vegard’s implementation are hdev (not
used in this thesis) and hmqueue (queue of System V message queues). Also
the concept of having a message server and a daemon has been taken from
him.
It should be clearly defined what is original work from this thesis in the
.h-files or in the Doxygen-documentation.
5.5 General structure of the HWOS
The HWOS has been implemented as a daemon running as a background
process in uClinux on the Suzaku. As shown in figure 5.1, the daemon was
implemented as four threads: The scheduler thread, the placer thread, the
message server thread and the timer thread.
[Figure 5.1 labels: HWOS daemon (hdaemon) containing the scheduler thread (hdsched), the placer thread (hdplacer), the timer thread (hdtimer) and the message server thread (labelled hdtimer in the figure).]
Figure 5.1: General structure of the HWOS-daemon.
5.6 General structure of the message server
The message server waits for new events from clients. The module makes heavy
use of the hmsg message interface module in the HWOS-library. When a client
registers a new process, this is signalled to the scheduler thread.
5.7 General structure of the placer
The placer thread is very simple. It waits for a timer interrupt from
the timer thread. When this is received, the placer replaces the currently
running process on the FPGA and puts it in the queue for rescheduling. This
is signalled to the scheduler thread.
5.8 General structure of the timer
The timer simply waits for one predefined time slice. Each time a time slice
has been reached, the timer interrupts the placer.
5.9 General structure of the scheduler
The concept of processes in priority queues (as discussed in 3.7.5) was imple-
mented. This means that each process of the same priority and state belongs
to the same queue. There is for example a list of queues in the READY-state.
Each queue in this list has processes with the same priority and the scheduler
will always choose the process with the highest priority (see figure 5.2 and
implementation below).
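The selection rule described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the types proc and queue are simplified, hypothetical stand-ins for hprocess and hsqueue, and the sketch assumes that a larger priority value means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the thesis structures (hypothetical layout). */
struct proc  { int pid; struct proc *next; };
struct queue { int priority; struct proc *first; struct queue *next; };

/* Return the first process of the non-empty queue with the highest
 * priority value, or NULL if every queue is empty.
 * Assumption: a larger number means a higher priority. */
static struct proc *pick_highest(const struct queue *list)
{
    const struct queue *best = NULL;
    for (const struct queue *q = list; q != NULL; q = q->next) {
        if (q->first == NULL)
            continue;                       /* skip empty queues */
        if (best == NULL || q->priority > best->priority)
            best = q;
    }
    return best ? best->first : NULL;
}

/* Small demonstration: two READY queues, the one with priority 1 wins. */
static int demo_pick(void)
{
    struct proc a = { 10, NULL }, b = { 20, NULL };
    struct queue low  = { 0, &a, NULL };
    struct queue high = { 1, &b, &low };
    struct proc *p = pick_highest(&high);
    assert(p != NULL);
    return p->pid;
}
```

Scanning the whole list keeps the sketch independent of the order in which the queues are stored. If the list is kept sorted by priority, the first non-empty queue could be returned directly.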
The scheduler uses the module hevent for handling events from the message
server and for sending messages to the hplacer module. The scheduler performs
long-term scheduling, short-term scheduling or rescheduling of a process
depending on what message it receives. These are listed below:
Long-term scheduling The message server has added a new process to the
queue of new processes. The new process is fetched by the scheduler. If
the process is admitted, it will be added to the correct READY-queue
or a new READY-queue will be created for the process.
Short-term scheduling A new process has been added to the READY-
queue. The scheduler gets the process and makes it ready for the placer
module for placement on the FPGA.
Rescheduling The time slice for the running process has elapsed. The pro-
cess is added to the queue for rescheduled processes and is fetched by
the scheduler for rescheduling.
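The three cases above amount to a dispatch on the received message. The following is a minimal sketch of such a dispatch. The event and action names are hypothetical, since the actual message identifiers used by hevent are not listed here.

```c
#include <assert.h>

/* Hypothetical event names; one per message described in the list above. */
enum sched_event {
    EV_NEW_PROCESS,     /* message server added a process to the NEW queue */
    EV_READY_PROCESS,   /* a process was added to a READY queue            */
    EV_SLICE_ELAPSED    /* the time slice of the running process elapsed   */
};

/* Hypothetical action names for the three kinds of scheduling. */
enum sched_action { ACT_LONG_TERM, ACT_SHORT_TERM, ACT_RESCHEDULE, ACT_NONE };

/* Map a received event to the kind of scheduling to perform. */
static enum sched_action dispatch(enum sched_event ev)
{
    switch (ev) {
    case EV_NEW_PROCESS:   return ACT_LONG_TERM;
    case EV_READY_PROCESS: return ACT_SHORT_TERM;
    case EV_SLICE_ELAPSED: return ACT_RESCHEDULE;
    }
    return ACT_NONE;    /* unknown event: nothing to schedule */
}
```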
5.10 List of scheduler queues: hsqlist
The structure for hsqlist is shown in listing 5.1. Each instance of this structure
keeps a list of scheduler queues. Each queue in this list is supposed to
keep processes of the same state. Because of this, the system only contains a
fixed number of queue lists, kept in an array as shown in listing 5.2.
Listing 5.1: Structure for list of scheduler queues (from libhwos/hsqlist.c)
/*! A simple structure for a list of queues.
 * One instance of this structure points to one list of queues.
 *
 * There are only as many instances of this structure as there are
 * different process states (defined by the constant HPS_NUMBER).
 */
struct hsqlist {
    //! Number of queues in this list.
    int queues_num;
    //! First queue element in this list.
    struct hsqueue* first_queue;
    //! Last queue element in this list.
    struct hsqueue* last_queue;
};
Listing 5.2: Array of queue lists (from libhwos/hsqlist.c)
/*! Array of queue lists.
 * There is one list for each possible state in the scheduler.
 * Note: The number of NULLs on the right side of the expression should
 * match HPS_NUMBER. Missing initializers are implicitly NULL; excess
 * initializers are diagnosed by the compiler.
 */
static struct hsqlist* queue_lists[HPS_NUMBER] = {NULL, NULL, NULL, NULL, NULL};
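Listing 5.2 depends on the number of initializers matching HPS_NUMBER. In standard C, an initializer list that is too short is accepted silently and the remaining elements become NULL. One way to enforce the match at compile time is a static assertion. The sketch below assumes a C11 compiler. The value 5 for HPS_NUMBER is inferred from the five NULLs in listing 5.2, and count_empty_lists is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct hsqlist;                 /* opaque here; see listing 5.1 */

#define HPS_NUMBER 5            /* assumed: one entry per process state */

/* Let the compiler count the initializers instead of fixing the size. */
static struct hsqlist *queue_lists[] = { NULL, NULL, NULL, NULL, NULL };

/* Fails to compile if the number of initializers differs from HPS_NUMBER. */
_Static_assert(sizeof(queue_lists) / sizeof(queue_lists[0]) == HPS_NUMBER,
               "queue_lists must have exactly HPS_NUMBER entries");

/* Hypothetical helper: number of lists that have not been created yet. */
static int count_empty_lists(void)
{
    int empty = 0;
    for (size_t i = 0; i < HPS_NUMBER; i++) {
        if (queue_lists[i] == NULL)
            empty++;
    }
    return empty;
}
```

With this variant, adding a process state without extending the initializer list stops the build instead of leaving an implicitly initialized entry.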
A conceptual drawing of the structure of an hsqlist is shown in figure 5.2.
A UML class diagram of the hsqlist, hsqueue and hprocess is shown in figure
5.3.
Figure 5.2: Conceptual drawing of hsqlist, hsqueue and hprocess. An hsqlist
(struct hsqlist*) holds a list of hsqueue elements (struct hsqueue*). Each
hsqueue holds hprocess elements (struct hprocess*) that share the same state
and priority; the drawing shows two queues in state HPS_READY with
priority base_priority+1 and base_priority+0.
Figure 5.3: UML 2 Class Diagram of hsqlist, hsqueue and hprocess. Arguments
to functions are omitted.
5.11 Process structure: hprocess
A structure for a process was defined as in listing 5.3. This struct is not visible
to the rest of the code, but the interface to the module is shown in D.19.
Listing 5.3: Structure for process (from libhwos/hprocess.c)
struct hprocess {
    //! The process id.
    int pid;
    //! The state of the process.
    int state;
    //! Value to manipulate priority (set by user application).
    int nice;
    //! Actual priority for process.
    int priority;
    //! File where the FPGA-bitstream for the process is.
    char* bitfilename;
    //! File where the state data for the process is.
    char* statefilename;
    struct hprocess* next;
    struct hprocess* prev;
    struct hsqueue* parent_queue;
};
This structure holds the basic information about the process such as ID,
state, priority and filenames for the bitfile and the state data. The fields next,
prev and parent_queue are for keeping the process in a doubly linked list (as
explained in section 5.12).
5.12 Scheduler queue structure: hsqueue
As described in section 3.7.5, each process belongs to a queue. Each hsqueue
contains a queue of processes of the same state and priority.
Listing 5.4: Structure for process queue (libhwos/hsqueue.c)
//! A double linked list structure for a process queue.
//! One instance of this structure points to one queue of processes.
//!
//! Several instances of this structure forms a list of queues.
//! All queues in the same list will have processes with the same state.
struct hsqueue {
    //! Number of processes in the queue.
    int processes_num;
    //! Points to the parent list, if any.
    int parent_list;
    //! First element in the queue.
    struct hprocess* first_process;
    //! Last element in the queue.
    struct hprocess* last_process;
    //! Points to the next queue in this list.
    struct hsqueue* next;
    //! Points to the previous queue in this list.
    struct hsqueue* prev;
};
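To illustrate how the next, prev and parent_queue fields of hprocess work together with first_process and last_process of hsqueue, the following sketch shows one possible enqueue and dequeue for a doubly linked process queue. The structures are reduced copies of listings 5.3 and 5.4, and the function names queue_enqueue and queue_dequeue are hypothetical; the actual interface of the hsqueue module may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced copies of the structures in listings 5.3 and 5.4. */
struct hsqueue;
struct hprocess {
    int pid;
    struct hprocess *next;
    struct hprocess *prev;
    struct hsqueue *parent_queue;
};
struct hsqueue {
    int processes_num;
    struct hprocess *first_process;
    struct hprocess *last_process;
};

/* Append a process at the tail of the queue (hypothetical helper). */
static void queue_enqueue(struct hsqueue *q, struct hprocess *p)
{
    assert(q != NULL && p != NULL);
    p->next = NULL;
    p->prev = q->last_process;
    p->parent_queue = q;
    if (q->last_process != NULL)
        q->last_process->next = p;
    else
        q->first_process = p;       /* queue was empty */
    q->last_process = p;
    q->processes_num++;
}

/* Remove and return the process at the head, or NULL if the queue is empty. */
static struct hprocess *queue_dequeue(struct hsqueue *q)
{
    assert(q != NULL);
    struct hprocess *p = q->first_process;
    if (p == NULL)
        return NULL;
    q->first_process = p->next;
    if (q->first_process != NULL)
        q->first_process->prev = NULL;
    else
        q->last_process = NULL;     /* queue became empty */
    p->next = NULL;
    p->prev = NULL;
    p->parent_queue = NULL;
    q->processes_num--;
    return p;
}
```

Keeping a back pointer to the parent queue in each process allows a process to be unlinked without searching for the queue it currently belongs to.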
5.13 Rewritten version of icap_write: hicap
The implementation of icap_write by Sverre Hamre (see section 3.4.3) was
rewritten as a library module in the HWOS library. The interface was changed
and documented in Doxygen. Whereas the previous version of icap_write had
the addressing of frames hard coded in the program, the new implementation
does this conversion automatically.
The newest hicap was not tested thoroughly. This should be done before
using it in real systems. The header file defining the interface for hicap can be
found in D.13.
Chapter 6
Verification and results: Partial
reconfiguration of synchronous
modules at run-time
The verification of the clocking methodology is divided into test suites. Each
suite contains a set of test cases for a particular system.
Many of the tests were done manually: a system was set up and tested
by manually running a program and providing input on the command line.
For the most important test suites, the tests were written as automated tests
in the Check framework (see section 3.8.5).
6.1 Definitions
The definitions for this chapter are the same as for chapter 4.
6.2 First test suite: Simple flip-flop designs
6.2.1 Test case 1: Cut the reconfigurable modules from
the base designs
Setup
In this test case, base design 1 was initially built and placed on the FPGA. Base
design 2 was also built and bitfiles were generated for both designs. Building
the modules this way takes a lot of time (over 30 minutes for each complete
bitfile). CLBRead from section 3.4.2 was used to cut out the modules from
both designs. Listing 6.1 shows how this was done for both the base designs.
The same syntax was used for both bitfiles.
The two resulting partial bitfiles correspond to reconfigurable modules 1
and 3 in figure 4.1. These bitfiles were uploaded to the FPGA using NFS.
$ ./CLBRead -i onedff_base.bit -o onedff_part.bit -fmR -sc 21 -ec 23 -verbose
Input file name: onedff_base.bit
Output file name: onedff_part.bit
Frame mode, reading out frames
read_frame: acessing
read_frame: Memory alocated
locateFrameStart: Synch word = aa995566
locateFrameStart: FDRI word = 30004000
locateFrameStart: word = 50024090
locateFrameStart: Type2 header detected
locateFrameStart: Word count: 147600
locateFrameStart: Number of configuration bits: 4723200
locateFrameStart: Configuration bytes = 4801
locateFrameFromCLBnr: frame: 644
locateFrameFromCLBnr: CLBcol: 21
locateFrameFromCLBnr: frame: 1190
read_frame: frames to read out: 66
read_frame: frameStart: 199961
read_frame: Read out CLBs startCLB: 21 endCLB: 23
read_frame: startCLB >= 16 startCLB < 24
warning BRAM/DSP connections read out not implemented
read_frame: Closing files
Listing 6.1: Using CLBRead to cut out partial reconfigurable modules from
base design 1 and 2.
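The header words printed by CLBRead in listing 6.1 can be decoded by hand. The sketch below assumes the packet layout described in the Xilinx Virtex-4 configuration guide: header type in bits [31:29], register address of a Type 1 packet in bits [26:13], word count of a Type 2 packet in bits [26:0], and 41 words per configuration frame. The function names are hypothetical and are not taken from CLBRead.

```c
#include <assert.h>
#include <stdint.h>

#define V4_FRAME_WORDS 41u   /* 32-bit words per Virtex-4 configuration frame */

/* Header type is stored in bits [31:29]: 1 = Type 1, 2 = Type 2. */
static unsigned packet_type(uint32_t header)
{
    return (unsigned)(header >> 29) & 0x7u;
}

/* Type 1 packets carry the register address in bits [26:13]. */
static unsigned type1_register(uint32_t header)
{
    assert(packet_type(header) == 1);
    return (unsigned)(header >> 13) & 0x3FFFu;
}

/* Type 2 packets carry the payload word count in bits [26:0]. */
static uint32_t type2_word_count(uint32_t header)
{
    assert(packet_type(header) == 2);
    return header & 0x07FFFFFFu;
}
```

Applied to the trace, 0x30004000 is a Type 1 header addressing register 2 (FDRI), and 0x50024090 is a Type 2 header with a word count of 0x24090 = 147600. This gives 147600 × 32 = 4723200 configuration bits, as reported by CLBRead, and 147600 / 41 = 3600 frame lengths.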
The program icap_write (as described in section 3.4.3) was modified (around
line 237) as in listing 6.2.
// Address hardcoded in, row 2(=1) from center, CLB 21(=25), frame 0.
if (write_header(handlemem, 1, 25, 0, frames) < 0) {
Listing 6.2: Modification of icap_write for flip-flop tests
Because the old version of icap_write had hardcoded the addressing of
frames, this program had to be recompiled after the change. The new version
of icap_write, called hicap, includes an easier interface for addressing
(see section 5.13).
A kernel driver and a software program (from [Han10]) were used to write to
and read from the flip-flops on the FPGA. This program could also control the
clock enable port (CE) of the global clock buffer.
Objective
A setup similar to the one in this test case was set up in [Han10]. In that
project, a partial bitfile was written to another bitfile containing the complete
FPGA-configuration, but the experiment was not done while the FPGA was
running. The resulting bitfile was later verified by uploading it to the FPGA.
In this test case, the partial bitfiles are written to a running FPGA using
icap_write. The running configuration is verified before and after the writing.
The objective is to verify that the experiments from [Han10] can be performed
at run-time also.
Testing and results
The partial bitfiles are in this case written to the running configuration by
issuing a command of the form shown in listing 6.3.
$ ./icap_write -i partialbitfile.bit -f 22
Listing 6.3: Usage of icap_write for DFF-tests.
The running configuration was tested by setting each flip-flop’s input value
HIGH, one at a time. The same tests were performed when the clock enable
(CE) signal to the clock buffer was turned off.
All flip-flops did respond to the input values when CE was high. No change
happened when it was low.
Summary
The tests showed that clocked, reconfigurable modules can be extracted from
one bitfile and inserted into another at run-time. This verifies the results from
[Han10]. Note that this test suite did not test if the flip-flops and the clock
routing could be moved left when performing reconfiguration. It should also be
noted that building the reconfigurable module together with the base design
takes a lot of time.
6.2.2 Test case 2: Build the base design and the
reconfigurable module separately
Setup
In this test case, base design 1 was initially placed on the FPGA. Reconfigurable
modules 1 and 3 were built separately in ISE as described in section
4.4.2.
For cutting out the modules from the ISE-designs, CLBRead was used as
in listing 6.1. This produced partial bitfiles for each module.
icap_write was modified as in listing 6.2 and used as in listing 6.3.
A kernel driver and software program (from [Han10]) were used for writing
to and reading from the flip-flops.
Objective
The objective was to verify the suspicion that modules built separately in ISE
may not be compatible with a static design built in EDK.
Testing and results
The partial bitfile was written to the configuration memory using icap_write.
Several test values were written to the registers, but no response was seen.
Summary
The test case showed that building the base design in EDK and the reconfig-
urable modules in ISE made the reconfigurable modules not compatible with
the base design. After writing the partial bitfile through ICAP, the resulting
configuration on the FPGA did not work as expected and the flip-flops were
not updated when the input signals were changed.
6.2.3 Test case 3: Build the base design with DIRT
Setup
In this test case, base design 3 was initially placed on the FPGA. Reconfigurable
modules 1, 2, 3 and 4 were built separately in ISE as described in section
4.4.2.
For cutting out the modules from the ISE-designs, CLBRead was used as
in listing 6.1. This produced partial bitfiles for each module.
Because these tests are so important, automated tests were written for
them in the Check framework. These tests use the HWOS-library and the new
hicap module for writing each partial bitfile to ICAP.
The directory dff_tests in the appendix contains the following subdirec-
tories:
bitfiles
Pregenerated bitfiles for the base design and the reconfigurable modules.
check-powerpc
Precompiled version of the Check library for the PowerPC-processor on
the Suzaku board.
drivers
Precompiled kernel drivers for ICAP-access and access to the HW/SW-
accessible registers.
swreg_driver
Source code for the HW/SW-access-driver. This driver is used to read/write
to the flip-flops on the FPGA.
tests
Source code for actual tests.
The test program performs several tests on all the reconfigurable modules.
The inputs of all the flip-flops are automatically toggled and the response is
registered both when the Clock Enable signal (CE) is HIGH and when it is
LOW.
The tests are compiled the same way as the HWOS as described in section
B. Some testing was also done manually to make sure the automated tests
worked correctly.
Objective
The objective was to show that partial run-time reconfiguration of clocked
modules could be performed if the method using the DIRT constraint (see section
3.6.3) was used.
Testing and results
The output from the tests is shown in listing 6.4. The tests were conducted
on two different Suzaku development boards.
Listing 6.4: Automatic tests for DFFs with DIRT
Self-reconfiguration of synchronous modules on FPGA
Prerequisites:
Development board: Suzaku-V.
Processor: Power-PC.
FPGA: Virtex-4 XC4VFX12.
OS: atmark-dist-20090318 (linux-2.6.18-at11).
Base FPGA-design: bitfiles/dff_with_bufgce_with_dirt.bit (must be uploaded with netflash).

These tests will perform partial reconfiguration of synchronous modules on the FPGA.
For each time the FPGA is reconfigured, the new design will be thoroughly tested.
===
Inserting module: One flip-flop.
===
Running suite(s): onedff
100%: Checks: 3, Failures: 0, Errors: 0
test_onedff.c:45:P:Core:test_create:0: Passed
test_onedff.c:91:P:Core:test_clock_enabled:0: Passed
test_onedff.c:143:P:Core:test_clock_disabled:0: Passed
===
Inserting module: Three flip-flops.
===
Running suite(s): threedffs
100%: Checks: 3, Failures: 0, Errors: 0
test_threedffs.c:44:P:Core:test_create:0: Passed
test_threedffs.c:120:P:Core:test_clock_enabled:0: Passed
test_threedffs.c:191:P:Core:test_clock_disabled:0: Passed
===
Inserting module: One flip-flop (shifted one column left).
===
Running suite(s): onedff
100%: Checks: 3, Failures: 0, Errors: 0
test_onedff.c:45:P:Core:test_create:0: Passed
test_onedff.c:91:P:Core:test_clock_enabled:0: Passed
test_onedff.c:143:P:Core:test_clock_disabled:0: Passed
===
Inserting module: Three flip-flops (shifted one column left).
===
Running suite(s): threedffs
100%: Checks: 3, Failures: 0, Errors: 0
test_threedffs.c:44:P:Core:test_create:0: Passed
test_threedffs.c:120:P:Core:test_clock_enabled:0: Passed
test_threedffs.c:191:P:Core:test_clock_disabled:0: Passed

Summary:
All tests passed!
One significant problem was that each partial reconfiguration took several
seconds.
Summary
All tests worked as expected. These tests showed that a static interface could
be defined for routing of clock signals. This is done using Directed Routing
and makes it possible to perform partial self-reconfiguration of synchronous
modules on the FPGA.
One problem was the long reconfiguration time.
6.3 Second test suite: Instruction- and data
cache backend
6.3.1 Test case 1: Make the backend compatible with
synchronous modules
Setup
In this test, base design 4 (backend) was built in EDK. Reconfigurable module
5 (the multiplier) was built in ISE.
After this was done, the DIRT constraint for the ISE-design was extracted
similar to the last test case in test suite 1. Because all flip-flops must be
defined for the UCF, this constraint was very long. It was also seen that the
relative paths to the components were different in the base design and in the
reconfigurable module. This had to be changed before the constraint was put
in the base design.
The last thing that was done in the setup was placing the DIRT-constraint
in the UCF-file of the base design. This is the same methodology as for the
last successful test case in test suite 1.
Objective
The objective of this test was to show that constraints could be put on the
clock signal for the backend module created by Vegard Endresen in [End10].
The backend was built with the same module as in his thesis.
Testing and results
This test could not be performed on the FPGA, because the Xilinx tools
reported that the Directed Routing constraint was not followed during the
build process. The output from Xilinx EDK is shown in listing 6.5.
Listing 6.5: Output from EDK after adding DIRT to Vegard’s backend with
multiplier
INFO:Route - One or more directed routing (DIRT) constraints generated for a specific device have been found. Note that
DIRT strings are guaranteed to work only on the same device they were created for. If the DIRT constraints fail,
verify that the same connectivity is available in the target device for this implementation.

# of EXACT MODE DIRECTED ROUTING found:1, SUCCESS:0, FAILED:1
This was the only information provided on why it failed.
Summary
The test did not succeed. Xilinx EDK refused to follow the extracted DIRT
constraint.
6.3.2 Test case 2: Make the backend compatible with
synchronous modules using dummy module
Setup
In this test, base design 4 (backend) was modified so it used reconfigurable
module 1 (one flip-flop) as the initial reconfigurable module. The complete
design was built in EDK. Reconfigurable module 1 (one flip-flop) was modified
to make use of the same bus macros as base design 4 (backend).
Similar to the previous test case, the DIRT constraint for the ISE-design
was extracted and placed into the UCF-file of the base design. Similar to the
test cases in test suite 1, the extracted DIRT was very short and only specified
the global clock buffer, the flip-flop and the routing between them.
Objective
The objective of this test was to show that constraints could be put on the
clock signal for the backend module created by Vegard Endresen in [End10]. By
building his backend with a very simple flip-flop-design as initial reconfigurable
module, Xilinx EDK would hopefully follow the constraint.
Testing and results
When building this design, Xilinx EDK reported the constraint to be followed
successfully. The output from Xilinx EDK is shown in listing 6.6.
Listing 6.6: Output from EDK after adding DIRT to Vegard’s backend with a
dummy module
INFO:Route - One or more directed routing (DIRT) constraints generated for a specific device have been found. Note that
DIRT strings are guaranteed to work only on the same device they were created for. If the DIRT constraints fail,
verify that the same connectivity is available in the target device for this implementation.

# of EXACT MODE DIRECTED ROUTING found:1, SUCCESS:1, FAILED:0
However, after uploading the base design to the FPGA, the Suzaku would
not boot anymore. This happened despite the fact that the design was built
with CRC-checking.
Summary
The test did not succeed. Xilinx EDK did follow the DIRT constraint, but the
Suzaku deadlocked when rebooting the new base design.
Chapter 7
Verification and results:
Scheduler
The verification of the HWOS and the scheduler is divided into test suites.
Each suite contains a set of test cases for a particular module.
All tests were written as automated tests in the Check framework (see
section 3.8.5). This should make it easy for a developer to see in detail what
has been tested and quickly reproduce the test results. The actual source files
for the tests are pretty large and are therefore only included in the appended
ZIP-file.
7.1 Portability
The tests and the Check framework are written in ANSI-C. They can be
compiled for both the x86-architecture on the development computer and the
PowerPC-architecture on the Suzaku. How to build the tests for each archi-
tecture is described in section B.
7.2 Test strategy
The actual tests are written in C and are divided into test suites, test cases
and unit tests. Each test suite tests a particular module in the system. Each
test case tests a typical scenario that the module should handle through its
interface. For each test case, there is a number of unit tests. Some initial
conditions and setup must be done for each test case, but the test cases are
meant to run independently of each other, and each unit test should run
independently of other unit tests. The Check framework helps to structure the
tests this way.
7.3 Description of test suites
hprocess
Tests that the process module can be created, removed and that fields
in the process image can be set and read.
hsqueue
Tests basic functionality for a scheduler queue. Tests that a large amount
of processes can be enqueued and dequeued.
hsqlist
Tests basic functionality for a list of scheduler queues. Tests that a large
amount of scheduler queues can be inserted and removed from the list.
hmqueue
Tests basic functionality for a queue of System V message queues. Tests
that message queues can be added, removed, enqueued and dequeued.
hdev
Tests basic functionality for a list of kernel devices. Tests that devices
can be added, removed, enqueued and dequeued.
hlist
Tests basic functionality for a general list structure. Tests that list
elements can be added, removed, enqueued and dequeued.
hvmem
Tests allocation and deallocation of virtual memory. Note that this func-
tionality was just started and is not done.
hwos_daemon
Tests basic message passing between a client and the HWOS message
server. Also tests that a process can be registered and scheduled by a
client.
7.4 Test results
7.4.1 The HWOS-daemon
It was verified that the message server worked as expected. It was also verified
that the scheduler could receive new processes and reschedule processes in a
round-robin manner. The placer worked as expected and printf statements
showed that it potentially could place FPGA-tasks. However, because the
instruction- and data cache backend was not working, placement of real
FPGA-tasks was not performed.
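The round-robin behaviour verified here can be sketched with a small FIFO: the process at the head runs for one time slice and is then put back at the tail. The ring buffer below is a hypothetical stand-in for the scheduler queues; the names rr_queue, rr_push, rr_pop and rr_run are not part of the HWOS-library.

```c
#include <assert.h>
#include <stddef.h>

#define RR_MAX 8   /* capacity of the illustrative ready ring */

/* Fixed-size FIFO of process ids (hypothetical, not the hsqueue type). */
struct rr_queue {
    int pids[RR_MAX];
    size_t head;
    size_t count;
};

/* Put a process id at the tail of the ring. */
static void rr_push(struct rr_queue *q, int pid)
{
    assert(q->count < RR_MAX);
    q->pids[(q->head + q->count) % RR_MAX] = pid;
    q->count++;
}

/* Take the process id at the head of the ring. */
static int rr_pop(struct rr_queue *q)
{
    assert(q->count > 0);
    int pid = q->pids[q->head];
    q->head = (q->head + 1) % RR_MAX;
    q->count--;
    return pid;
}

/* Simulate `slices` time slices: the head process runs for one slice and
 * is then put back at the tail. The pid that ran in each slice is stored
 * in `trace`, which must have room for `slices` entries. */
static void rr_run(struct rr_queue *q, int *trace, size_t slices)
{
    for (size_t i = 0; i < slices; i++) {
        int pid = rr_pop(q);
        trace[i] = pid;
        rr_push(q, pid);
    }
}
```

With three registered processes 1, 2 and 3, five time slices would run them in the order 1, 2, 3, 1, 2.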
7.4.2 The HWOS-library
All the tests were run successfully both on the development computer and the
Suzaku board. An output trace from the test program is shown in listing 7.1.
1=yes, 0=no
Perform test on daemon (daemon must be started)?
1
===
Running suite(s): hprocess
100%: Checks: 1, Failures: 0, Errors: 0
test_hprocess.c:55:P:Core:test_process_create:0: Passed
===
Running suite(s): hsqueue
100%: Checks: 3, Failures: 0, Errors: 0
test_hsqueue.c:83:P:Core:test_queue_create:0: Passed
test_hsqueue.c:110:P:Core:test_enqueue:0: Passed
test_hsqueue.c:148:P:Queue:test_dequeue:0: Passed
===
Running suite(s): hsqlist
100%: Checks: 4, Failures: 0, Errors: 0
test_hsqlist.c:78:P:Core:test_list_create:0: Passed
test_hsqlist.c:105:P:Core:test_add_queues:0: Passed
test_hsqlist.c:136:P:Core:test_insert_queues_before:0: Passed
test_hsqlist.c:175:P:Remove:test_remove_queues:0: Passed
===
Running suite(s): hmqueue
100%: Checks: 3, Failures: 0, Errors: 0
test_hmqueue.c:54:P:Core:test_queue_create:0: Passed
test_hmqueue.c:81:P:Enqueue:test_enqueue:0: Passed
test_hmqueue.c:119:P:Dequeue:test_dequeue:0: Passed
===
Running suite(s): hdev
100%: Checks: 3, Failures: 0, Errors: 0
test_hdev.c:58:P:Core:test_queue_create:0: Passed
test_hdev.c:89:P:Enqueue:test_enqueue:0: Passed
test_hdev.c:127:P:Dequeue:test_dequeue:0: Passed
===
Running suite(s): hlist
100%: Checks: 4, Failures: 0, Errors: 0
test_hlist.c:77:P:Core:test_queue_create:0: Passed
test_hlist.c:106:P:Core:test_enqueue:0: Passed
test_hlist.c:169:P:Core:test_insert_inbetween:0: Passed
test_hlist.c:203:P:Queue:test_dequeue:0: Passed
===
Running suite(s): hvmem
100%: Checks: 1, Failures: 0, Errors: 0
test_hvmem.c:56:P:Core:test_hvmem_allocate:0: Passed
===
Running suite(s): hwos_daemon
100%: Checks: 4, Failures: 0, Errors: 0
test_hwos_daemon.c:70:P:Core:test_setup:0: Passed
test_hwos_daemon.c:70:P:Register message queue:test_setup:0: Passed
test_hwos_daemon.c:78:P:Register message queue:test_rmqueue:0: Passed
test_hwos_daemon.c:94:P:Process:test_register_process:0: Passed

Summary:
All tests passed!
Listing 7.1: Output trace from testing of the HWOS-library.
Chapter 8
Discussion
8.1 Partial self-reconfiguration of synchronous
modules
The results from this thesis have shown that partial self-reconfiguration of
clocked reconfigurable modules can be performed on a running FPGA. This
was the most important goal of the thesis. All the tests with the simple,
clocked flip-flop designs worked when constraining the clock signal with
Directed Routing. This solution should in theory be compatible with any
reconfigurable system using bus macros. This is because the clock signal is the
only signal not piped through these macros and because the bus macros make
it possible to build the static system with any kind of reconfigurable module.
In theory, any such reconfigurable system can be built with a simple flip-flop
and a corresponding global clock buffer.
One significant problem did occur: The reconfiguration time for each partial
design was several seconds long. This occurred for both the old and the new
version of icap_write. This is in stark contrast to the results by Vegard Endresen
in his project report. These results show that partial reconfiguration of quite
large, asynchronous modules can be performed in milliseconds. It is possible
that the reconfiguration time gets larger when using a clock signal, but several
seconds seems to be quite high. The problem could be due to some hardware
failure, but this is rather unlikely as the tests were conducted on two different
Suzaku development boards.
The results show some challenges when integrating the clocking methodology
in the instruction- and data cache backend by Vegard Endresen [End10].
The backend can be built successfully with a constrained clock signal using
a dummy module, but the Suzaku fails to boot after uploading the complete
design to the FPGA. This is quite strange, as the dummy module and the
backend both worked fine when not building them together. Since the recon-
figurable dummy module is built together with the backend using bus macros,
the backend should presumably be built the same way as when building it
together with the multiplier. There is no reason that signals would be opti-
mized away by the tools, because the bus macros function as a black box in the
design. Working with this system was quite challenging as a lot of time was
spent on understanding the system made by Vegard Endresen. One problem
was that no testbenches could be found for each part of the system and that
the documentation on the actual implementation was a bit sparse. There is
also some uncertainty whether it works well with any reconfigurable module.
It also takes a lot of time to perform these tests. Each time a small change
is made in the base design, the complete system has to be built in EDK (a
process that takes over 30 minutes). Each time the Suzaku deadlocks, it has
to be reflashed with an initial configuration through JTAG using a dedicated
development computer.
The results from [Han10] have been the largest motivation for finding a way
to perform reconfiguration of synchronous modules at run-time. The general
motivation for using synchronous modules in the framework has been discussed
in section 3.5. This shows that synchronous design is the most reasonable
design technique for an FPGA, at least if no special framework for asynchronous
development is used.
When searching for available literature on reconfiguration of synchronous
modules, no articles with a specific solution to the problem could be found.
One reason could be that many of them are using the partial reconfiguration
flow from Xilinx and that this makes the reconfiguration of clock
signals transparent to the user. For example, the article in [JIA06] (“The
GAPLA: a globally asynchronous locally synchronous FPGA architecture”)
states that static conditions between modules must be done through bus macros
and that global clock signals must be used for reconfigurable modules according
to Xilinx. The author also writes that using asynchronous modules will
remove the need for clock distribution.
8.2 Scheduler
A lot of time was spent on developing the scheduler for the HWOS. The
ultimate goal was to perform scheduling of FPGA-based tasks. The state loading
facility made by Vegard Endresen in [End10] could be used when interrupting
modules. However, because the modified version of his backend did not work
on the FPGA, this was not done.
The scheduler did work, however, when registering processes from a test
client. It was shown that the multithreaded software program could schedule
several processes in a round-robin manner. This scheduler can be further
optimized by placing some parts of it in hardware. The placement process
would typically be very complex, as modules can be placed in many different
places on the FPGA. The article in [QDG] states the following:
The on-line scheduling of HW tasks on PRTR FPGA is much more
complicated than SW scheduling. SW tasks only share computing
resources in the time dimension, while HW tasks share computing
resources in not only the time but also the space dimension. The
on-line HW task scheduling algorithms are usually very
time-consuming.
The focus for the scheduler in this thesis has been to schedule processes in
time. The article in [HSM] describes multitasking on the FPGA both through
parallel execution on the FPGA and through interrupting processes and
replacing them on the FPGA. The main advantage described is that sharing
of IO resources becomes easier when only one process is running on the FPGA.
One could also imagine that the system would benefit from removing processes
that are waiting for IO resources from the FPGA.
However, as described in section 3.7.6 and equation 3.1, a critical
requirement for an interrupter is the time needed to interrupt a process, save
its state and reconfigure the region. Obviously, if the reconfiguration is
very slow, the gain from performing partial reconfiguration is reduced. In this
case it could be easier to reconfigure the complete FPGA and perform parallel
execution of a set of processes on the FPGA. To make the system in this thesis
usable, the cause of the long reconfiguration time must be found.
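As an illustration of scheduling in time, the round-robin policy discussed in
this section can be sketched in a few lines of shell. This is a minimal sketch
and not the HWOS scheduler itself: the function name round_robin and the fixed
number of time slices are assumptions made for the example, and interrupting,
state saving and reconfiguration are left out.

```shell
#!/bin/sh
# Illustrative sketch only (hypothetical helper, not part of the HWOS).
# Print which process owns each of the first N time slices.
# Usage: round_robin N process...
round_robin() {
    slices=$1
    shift
    # Nothing to schedule if no processes are registered.
    [ "$#" -gt 0 ] || return 0
    n=0
    while [ "$n" -lt "$slices" ]; do
        current=$1
        echo "slice $n: $current"
        # Move the process that just ran to the back of the ready queue.
        shift
        set -- "$@" "$current"
        n=$((n + 1))
    done
}

# Example: round_robin 4 A B C
```

Each process gets one slice in turn, and a process that has just run is placed
last in the queue, so no process can starve the others.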
Chapter 9
Conclusion
The results from this thesis have shown that partial self-reconfiguration of
synchronous modules is possible on a running FPGA. It has been shown that
clock routing for a reconfigurable module can be constrained using Directed
Routing. This is similar to using bus macros for normal signals. The proof of
concept is an important result from this thesis. In theory, this solution should
be compatible with any reconfigurable system built using bus macros.
A significant problem was the partial reconfiguration time when using this
solution. The delay was several seconds, in contrast to earlier studies in
the AHEAD-project, which have shown partial reconfiguration times as low as
a few milliseconds.
Experiments on partial self-reconfiguration were performed on the instruction
and data cache backend made by Vegard Endresen. This backend is able to
shift data in and out of a reconfigurable module. The results show that this
static design can be built successfully with a Directed Routing constraint on
the clock signal, but when the design was uploaded to the FPGA, the Suzaku
deadlocked and further testing could not be conducted.
The software parts of the HWOS have been restructured and rewritten
from scratch. A software-based solution for simple scheduling of
reconfigurable modules has been made. This scheduler works correctly using
a round-robin scheduling policy.
Chapter 10
Further work
The work with the complete reconfigurable framework reveals several
challenges ahead. Some of them are briefly discussed in this chapter,
together with some suggestions on how to solve them.
10.1 Better testing of each part of the complete framework
Several parts of the reconfigurable framework could have been better tested.
Since the complete framework has become quite large, testing each part
separately could make it less unwieldy and easier to maintain.
CLBRead and icap_write work in most cases, but it would be very
useful to set up a larger number of test cases for these programs. For example,
automated tests could run on the Suzaku and reconfigure many parts of
the FPGA. Naturally, to test the area where the static design resides,
the static design would have to be moved by reconfiguring the complete
configuration. This is a bit problematic since the Suzaku must reboot each time
the static design has been reconfigured.
More unit tests could have been set up for Vegard Endresen's backend. This
would probably make further development of the interrupt procedure for modules
much easier.
Some more testing can be done on the scheduler and the version of the
HWOS developed in this thesis, especially on the top-level functionality.
It is highly recommended to make use of the Check framework and a
test-driven development strategy (see section 3.8.4). Check has been set up
for the Suzaku-V and works well on the development platform.
10.2 Further development
10.2.1 Partial self-reconfiguration
The program CLBRead cannot read out all possible sections of the FPGA. This
should be developed further.
Performing partial reconfiguration on the FPGA is a bit like “working in the
dark”. If something fails, it is hard to determine what went wrong. An
interesting idea is to read back the modified configuration from the FPGA and
use the work done by Ingar Hauge [Hau06] to analyze it graphically. Reading
back configurations from the FPGA should be possible according to [Xil09].
However, Ingar Hauge's analysis software must first be rewritten to support
the Virtex-4 bitstream architecture. It is uncertain how much work this
requires and whether it is possible at all, but it could potentially increase
the knowledge of the partial reconfiguration process performed by icap_write
and make debugging much easier.
Bibliography
[Alt09] Altera. Understanding metastability in FPGAs, 2009.
[AT11] Atmark-Techno. Suzaku developers site.
http://suzaku-en.atmark-techno.com (Retrieved 6. June 2011), 2011.
[Atm06] Atmark-Techno. Suzaku Software Manual, 1.3.1 edition, August
2006.
[Atm07] Atmark-Techno. Suzaku-V SZ310-U00 Hardware Manual, 1.0.0.1
edition, June 2007.
[Atm08] Atmark-Techno. Using NFS for the Root File System, 2008.
http://suzaku-en.atmark-techno.com/dev/howtos/nfs (Retrieved 6.
February 2011).
[BC00] Daniel P. Bovet and Marco Cesati. Understanding the Linux kernel
- chapter 10: Process scheduling.
http://oreilly.com/catalog/linuxkernel/chapter/ch10.html (Retrieved 19.
May 2011), 2000.
[che09] Check: A unit testing framework for C.
http://check.sourceforge.net (Retrieved 22. May 2011), 2009.
[CS10] Piet Cordemans, Jeroen Boydens and Eric Steegmans. Test-driven
development of embedded software, January 2010.
[Dan04] Klaus Danne. Memory management to support multitasking on FPGA
based systems. In Proceedings of the International Conference
on Reconfigurable Computing and FPGAs (ReCon), page 21, 2004.
[End09a] Vegard Endresen. Creating a reconfigurable FPGA system. NTNU,
2009. Tutorial.
[End09b] Vegard Endresen. Hardware task execution in reconfigurable
systems. NTNU, 2009. Project report.
[End10] Vegard Endresen. Hardware-software intercommunication in
reconfigurable systems. Master’s thesis, NTNU, 2010.
[Eri00] Ken Erickson. Asynchronous FPGA risks. In 2000 MAPLD
International Conference, 2000.
[Eto07] Emi Eto. Difference-based partial reconfiguration (Xilinx, XAPP290),
2007.
[Ham08a] Sverre Hamre. Self reconfigurable system on a Xilinx Spartan3 FPGA
by using bus macros, 2008. Project report.
[Ham08b] Sverre Hamre. Tutorial for creating a hard/bus macro to the
Spartan3 FPGA, December 2008. Tutorial.
[Ham08c] Sverre Hamre. Tutorial for using Atmark Techno FPGA development
environment, 2008. Tutorial.
[Ham09] Sverre Hamre. Framework for self reconfigurable system on a Xilinx
FPGA. Master’s thesis, NTNU, 2009.
[Han10] Sindre Hansen. Self reconfiguration of clock networks on FPGA.
NTNU, 2010. Project report.
[Hau06] Ingar Hauge. Analyse, dekomponering og rekonstruksjon av
FPGA-konfigurasjoner for AHEAD. Master’s thesis, NTNU, 2006.
[HSM] H. Simmler, L. Levinson and R. Manner. Multitasking on FPGA
coprocessors.
[Inc10] Arcturus Networks Inc. uClinux™ – embedded Linux microcontroller
project – home page. http://www.uclinux.org/ (Retrieved
14. February 2011), 2010.
[JIA06] Xin Jia. GAPLA: A globally asynchronous locally synchronous FPGA
architecture, 2006.
[Joh06] Mikael Johansson. How to use NFS on SUZAKU-V.
http://staff.aist.go.jp/hirano-s/mikael/Notes/nfs.htm (Retrieved
9. February 2011), 2006.
[KR06] Ian Kuon and Jonathan Rose. Measuring the gap between FPGAs
and ASICs. FPGA’06, February 22–24, 2006, Monterey, California,
USA, 2006.
[PJC09] Peter Jamieson, Wayne Luk, Steve J.E. Wilton and George A.
Constantinides. An energy and power consumption analysis of FPGA
routing architectures. Field-Programmable Technology, 2009.
[Ple04] Roman Plessl. Embedded machine on FPGA. Master’s thesis, Swiss
Federal Institute of Technology Zurich, 2004.
[QDG] Qingxu Deng, Yi Zhang, Nan Guan and Zonghua Gu. A unified
HW/SW operating system for partially runtime reconfigurable
FPGA based computer systems. Northeastern University, Shenyang,
China.
[QWA09] Qiang Wang, Subodh Gupta and Jason Anderson. Clock power
reduction for Virtex-5 FPGAs. FPGA’09, February 22–24, Monterey,
California, USA, 2009.
[Sal07] Peter Jay Salzman. The Linux kernel module programming
guide. http://www.tldp.org/LDP/lkmpg/2.6/html/index.html
(Retrieved 9. February 2011), May 2007.
[SH09] Safeen Huda, Muntasir Mallick and Jason H. Anderson. Clock gating
architectures for FPGA power reduction. IEEE International
Conference on Field-Programmable Logic and Applications (FPL), pp.
112-118, Prague, Czech Republic, 2009.
[SKB02] Li Shang, Alireza Kaviani, and Kusuma Bathala. Dynamic power
consumption in Virtex-II FPGA family. In FPGA, pages 157–164,
2002.
[Sta05] William Stallings. Operating systems: Internals and design
principles. Pearson Prentice Hall, 5th edition, 2005.
[Ste05] Jennifer Stephenson. Design guidelines for optimal results in FPGAs
(Altera), 2005.
[suz06] Suzaku-V tutorials. http://ramsites.net/~wcsleeman/
(Retrieved 10. February 2011), February 2006.
[TBC07] Tobias Becker, Wayne Luk and Peter Y.K. Cheung. Enhancing
relocatability of partial bitstreams for run-time reconfiguration, 2007.
[ucl11] uClibc. http://uclibc.org (Retrieved 22. May 2011), 2011.
[Ung02] Greg Ungerer. uCdot | uClinux merged into main line Linux kernel
sources. http://www.ucdot.org/article.pl?sid=02/11/05/0324207
(Retrieved 14. February 2011), November 2002.
[Wol05] W. Wolf. Computers as components: principles of embedded
computing system design, chapter 7 hardware accelerators. Elsevier,
2005.
[Xil05] Xilinx. Power management solution guide, July 2005.
[Xil08a] Xilinx. Constraints guide (10.1), 2008.
[Xil08b] Xilinx. Virtex-4 FPGA user guide (UG070), 2008.
[Xil09] Xilinx. Virtex-4 FPGA configuration user guide (UG071), 2009.
Appendix A
Tutorial for uClinux
This tutorial describes the process of compiling and configuring ATMARK-dist.
ATMARK-dist is the operating system for the Suzaku boards (using a
PowerPC or MicroBlaze processor) and is built upon uClinux. uClinux is
a lightweight operating system based on the Linux kernel.
Compiling and configuring ATMARK-dist is a relatively easy task once the
correct cross-development tools are installed. This tutorial aims to give a quick
overview of, and help with, setting up ATMARK-dist on the Suzaku-V sz410 board.
This board has a PowerPC processor and a Virtex-4 FPGA. The operating
system used for development in this tutorial was Debian Squeeze (6.0).
Figure A.1: Concept for using NFS on the Suzaku-V. (Diagram labels:
Suzaku-V (embedded platform with PowerPC and FPGA); external storage;
NFS-server (sharing parts of the development computer's harddrive);
development computer (laptop running Ubuntu Linux).)
A brief description of how to set up the Network File System (NFS) on
the Suzaku will be given. This makes it possible to share files between
the development computer and the Suzaku board. The Suzaku-V has 8 MB
of non-volatile flash memory, but files can only be added permanently by
adding them to the system image. This means that the
complete filesystem has to be prepared on the development computer and
later uploaded to the Suzaku. A much better alternative is to set up the
uClinux distribution to use NFS as an external storage device, as shown in
figure A.1.
A.1 Objective
The most important goal of this tutorial is to enable the reader to set up
the cross-development tools needed for compiling the source code in this
thesis. A developer in the AHEAD-project should also be able to recompile and
configure the Linux distribution for the embedded platform, for example to
include tools or kernel modules. To make development more efficient, NFS can
be used as an external storage device for the Suzaku.
Note that one of the objectives is to show how the development environment
was set up in practice on the given platform, not to document every single part
of the process. If there are any problems, please refer to the documentation
from Atmark-Techno.
A.2 Prerequisites
The prerequisites for this tutorial are listed in table A.1.
OS on development computer   Debian Squeeze 6.0 (kernel: linux-2.6.32-5-686)
OS on embedded platform      atmark-dist-20090318 (kernel: linux-2.6.18-at11)
Embedded platform            Suzaku sz410
Embedded processor           PowerPC
Embedded software manual     Suzaku Software Manual 1.3.1 [Atm06]
Embedded hardware manual     Suzaku-V SZ310-U00 Hardware Manual 1.0.0.1 [Atm07]
Table A.1: Prerequisites for this tutorial.
A.3 Download and compile ATMARK-dist
It is assumed that the reader is using a Linux-based operating system for
compiling ATMARK-dist. It may be possible to do the compiling on newer
versions of, for example, Ubuntu, but this can cause problems, as Ubuntu
contains libraries and configurations that may not be compatible with the
cross-development tools used in this tutorial. The safest choice is to use the
virtual machine from Atmark Techno, which is based on a stable version of Debian
Linux. The newest version of the virtual machine can be downloaded from:
http://download.atmark-techno.com/suzaku/atde/
Sverre Hamre has written a tutorial on how to change the default language
and add extra hard-drive storage in [Ham08c].
After this is done, download an ISO image containing the cross-development
tools, ATMARK-dist and uClinux. This ISO image can be downloaded at:
http://download.atmark-techno.com/suzaku/iso/
sv_20100924.iso was used and was the latest version when this tutorial was
written. As shown in listing A.1, the file sv_20100924.iso should be mounted
and the distribution files must be unpacked.
Listing A.1: Downloading and unpacking ATMARK-dist
    # Download the ISO file.
    wget http://download.atmark-techno.com/suzaku/iso/sv_20100924.iso

    # Make a directory and mount the ISO file on it.
    mkdir isomount
    chmod 777 isomount
    sudo mount -o loop sv_20100924.iso isomount/

    # Change directory into suzaku/dist and copy the files.
    cd isomount/suzaku/dist
    cp atmark-dist-20090318.tar.gz ../../../
    cp linux-2.6.18-at11.tar.gz ../../../

    # Go back to the root directory and unpack the files.
    cd ../../../
    tar xvf atmark-dist-20090318.tar.gz
    tar xvf linux-2.6.18-at11.tar.gz
The cross-development tools are in the folder
isomount/suzaku/cross-dev/powerpc/deb. Use the command
sudo dpkg -i [deb-file] to install the Debian package files. It is probably
enough to install the package atde-essential-powerpc_9_all.deb and
its many dependencies. If two packages depend on each other, install both
and leave them unconfigured. Then issue sudo aptitude upgrade to resolve the
dependencies. After this is done, run the code in listing A.2.
Listing A.2: Compile ATMARK-dist
    # Link the Linux kernel to atmark-dist and configure the kernel.
    cd atmark-dist-20090318
    ln -s ../linux-2.6.18-at11 ./linux-2.6.x
    make menuconfig

    # Make choices here as described in the tutorial
    # ...

    # Run make when finished
    make
It is possible to configure the kernel after executing make menuconfig. A
minimal configuration is described below.
• Choose “Vendor/Product Selection”.
• Choose “AtmarkTechno” as Vendor and “SUZAKU-V.SZ410” as Product.
• Go to “Exit”.
• Choose “Kernel/Library/Defaults Selection”.
• Choose “powerpc” as Cross-dev and “uClibc” as Libc Version.
• Mark “Customize Kernel Settings (NEW)” and “Customize Vendor/User
Settings (NEW)”.
• Go to “Exit” and “Exit”.
This is the minimum configuration needed for ATMARK-dist to
run on the Suzaku-V. If the cross-development tools are installed, it should
be possible to just run make after the rest of the configuration is done. It is
recommended to keep the default network settings if facilities such as telnet,
FTP and the Network File System (NFS) are needed. Other useful programs can
also be selected (for example dmesg among the Busybox programs).
The image resides in images/image.bin. If adding files to the file system
is needed, put them in one of the folders in romfs/ and run the command
make image in the root directory afterwards.
It should be possible to log in to the Suzaku by running telnet [IP-address].
If the Suzaku is connected to the local network, the IP address can be found by
running the script in listing A.3. This will only work if your router assigns
IP addresses of the form 192.168.0.x. If not, change the script accordingly.
Listing A.3: Get all IP-addresses on LAN
    for ip in $(perl -e '$,="\n"; print 1..254;'); do
        ping -t 1 -c 1 192.168.0.$ip >/dev/null
        [ $? -eq 0 ] && echo "192.168.0.$ip UP" || :
    done
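Because the address range depends on the router, the sweep can also be wrapped
in a function that takes the network prefix as a parameter. The following is a
minimal sketch and not part of the original tutorial: the function names
scan_subnet and probe_host are made up for the example, and it is assumed that
ping accepts -c (number of requests) and -W (timeout in seconds), as the ping
from iputils on Debian does.

```shell
#!/bin/sh
# Illustrative sketch only; scan_subnet and probe_host are hypothetical names.

# Return success if the host answers one echo request within one second.
# This is the only part that touches the network.
probe_host() {
    ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# Print every responding host in the /24 network given by its first three
# octets, for example "192.168.0".
scan_subnet() {
    prefix=$1
    i=1
    while [ "$i" -le 254 ]; do
        if probe_host "$prefix.$i"; then
            echo "$prefix.$i UP"
        fi
        i=$((i + 1))
    done
}

# Example: scan_subnet 192.168.0
```

Keeping the probe in its own function makes it easy to swap ping for another
check without touching the loop.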
Upload the new image to a local or external server and run the commands
in listing A.4 to reflash the uClinux image.
Listing A.4: Telnet into the Suzaku and upload a new uClinux image
    telnet [IP-address]
    netflash -r image http://local.server.name/suzaku/image.bin
A.4 Setting up Network File System (NFS) on Suzaku and development machine
Since files can only be added permanently to the Suzaku through the system
image, everything that a user creates in the file system is lost when the
device is rebooted. It is possible to upload the files every time using FTP
and/or automatic scripts, but this can be quite tedious. Also, the amount of
memory can be limiting. The Suzaku-V has 8 MB of flash memory, and a large
part of it is needed for ATMARK-dist.
To solve these problems, the development computer can be set up as an NFS
server. The Suzaku will then have access to a shared folder on
the development computer, and the files will only be loaded into the Suzaku’s
memory when they are executed or opened on the Suzaku.
Note that there are two ways of using NFS on the Suzaku. These are:
• Sharing a single folder between the Suzaku and the development
computer.
• Mounting the complete Suzaku filesystem in a folder on the development
computer and configuring the Suzaku to boot from this folder across the
network.
The first option is discussed in this tutorial. The second option is probably
a bit more complex and is discussed in the tutorial from Atmark-Techno [Atm08].
Refer to that tutorial if you need to mount the complete filesystem across the
network.
The process of setting up NFS for a single folder is briefly discussed here.
The steps included are listed below and will be discussed in the subsequent
sections.
• Set up the development-PC to share a folder on the network.
• Set up the Suzaku-V as a client. This essentially means mounting a
folder.
On Debian-based systems, the NFS server can be installed using the code
in listing A.5.
Listing A.5: Install and configure NFS in Debian
    # Install the NFS server (run as root)
    apt-get install nfs-kernel-server

    # Create a shared folder
    mkdir -p /path-of-your-home-folder/suzaku_shared

    # Specify the folder as shared.
    # As root, open the file /etc/exports and add:
    /path-of-your-home-folder/suzaku_shared 192.168.0.0/255.255.255.0(rw,sync)
    # The IP-address 192.168.0.0 must be adjusted for your router.

    # After you have done this, run the following command to restart the
    # NFS server (as root):
    /etc/init.d/nfs-kernel-server restart
Once the server is configured, make sure ATMARK-dist is compiled with
support for NFS (it should be enabled by default). To mount the shared folder
on the Suzaku, type the commands in listing A.6. Without the option nolock,
the mount point will be locked for the first 5 minutes. The options rsize and
wsize specify the buffer sizes and make it possible to load and copy larger
files [Joh06].
Listing A.6: Mount a NFS folder on the Suzaku
    # Change to a directory where you have write access
    cd /var

    # Create the directory where you want to mount the NFS folder
    mkdir -p suzaku_shared
    chmod 777 suzaku_shared

    # Mount the shared NFS folder
    # Replace 192.168.0.133 with the IP address of your development computer
    mount -o nolock,rsize=4096,wsize=4096 -t nfs 192.168.0.133:/address-to-your-home-folder/suzaku_shared /var/suzaku_shared
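Whether the mount succeeded can be checked by looking for the mount point in
the kernel's mount table. The following is a minimal sketch under stated
assumptions: it assumes that procfs is mounted on the Suzaku so that
/proc/mounts is available, and the function name is_nfs_mounted and its
optional second argument (an alternative mount table file, useful for trying
the function on the development computer) are made up for the example.

```shell
#!/bin/sh
# Illustrative sketch only; is_nfs_mounted is a hypothetical helper.
# Return success if the given directory is listed as an NFS mount point.
# Usage: is_nfs_mounted MOUNTPOINT [MOUNTS_FILE]
is_nfs_mounted() {
    target=$1
    mounts_file=${2:-/proc/mounts}
    # Each line has the form: device mountpoint fstype options dump pass
    while read -r device mountpoint fstype rest; do
        if [ "$mountpoint" = "$target" ] && [ "$fstype" = "nfs" ]; then
            return 0
        fi
    done < "$mounts_file"
    return 1
}

# Example:
#   is_nfs_mounted /var/suzaku_shared && echo "NFS folder is mounted"
```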
The mount can be made permanent by creating a folder and a file in the
romfs folder of the directory where ATMARK-dist is compiled (see listing A.7).
Listing A.7: Adding an entry to fstab to permanently mount shared folder
    # Current directory is where ATMARK-dist is compiled
    # (atmark-dist-20090318 in this example).
    # Create a directory in romfs/mnt.
    cd romfs/mnt
    mkdir -p suzaku_shared
    chmod 777 suzaku_shared

    # Add the following text to romfs/etc/fstab (create the file)
    # Replace 192.168.0.165 with the IP address of the development computer
    192.168.0.165:/path-to-your-home-folder/suzaku_shared /var/suzaku_shared nfs nolock,rsize=4096,wsize=4096 0 0

    # At the end of the file romfs/etc/init.d/rc.local, add:
    mount -a
    # This will mount all filesystems when starting uClinux.

    # Change directory to the root directory (atmark-dist-20090318 in this
    # example).
    # Run the following command to update the file system.
    make image
This will automount the NFS-folder each time uClinux starts up.
A.5 Creating kernel modules
There are numerous sources documenting how to write your own kernel
module. For generic Linux modules, “The Linux Kernel Module Programming
Guide” from The Linux Documentation Project [Sal07] gives
a quick introduction and is a good place to start. Some extra consideration
must be taken when writing kernel modules for the Suzaku-V. In particular,
the uClinux kernel must be compiled with support for loadable modules.
A.6 Compiling HWICAP-driver for uClinux
The process of compiling the driver for the Hardware Internal Configuration
Access Port (HWICAP) is described in [Ham09, page 27]. Be careful when reading
the source code on these pages, as the output from the command diff can be
misunderstood.
Appendix B
Compiling the HWOS-code
The software code of the HWOS is in the directory hwos_sw. The directory
structure of this source code and strongly related parts is as follows.
libhwos/
Forms a library (libhwos.a) to be used when building the top-level entities
of the HWOS. Functionality that (potentially) has to be reused is
typically placed in this library.
hwos_daemon/
Code for the top-level daemon (server) to run in the background on the
Suzaku-board. Contains top-level entities for the message server, the
scheduler, a simple placer and a timer. These run in parallel threads.
tests/
Contains automated tests for the library and the daemon. Because the
library will be reused a lot, the most elaborate and important tests are
for the library.
docs/
Documentation for the library and the daemon. The interfaces (.h-files)
are the parts that have been documented the most.
There are several other directories that are not really a part of the HWOS
source. These are listed below.
check-0.9.8/
Original source code for the Check framework (found in [che09]).
Installation instructions for the HWOS-framework can be found in
INSTALL_CHECK.txt, but precompiled versions have also been made,
as described below. Check was compiled for the x86-architecture using
glibc (the standard C-library in desktop Linux) and for the
PowerPC-architecture using uClibc (see section 3.2.7).
check-x86/
A precompiled version of the Check library for the x86-architecture and
glibc-version of the HWOS.
check-powerpc/
A precompiled version of the Check library for the PowerPC-architecture
and uClibc-version of the HWOS.
bitfiles/
Files containing the FPGA-bitstream description for the processes in the
system.
logs/
Log files for the daemon and the tests.
The HWOS can be compiled for the standard Linux x86-architecture or
for embedded uClinux on the PowerPC microprocessor. To configure the
makefiles for the x86-architecture using the standard C-library glibc, do
the following:
• Go to the root directory hwos_sw. Run ./configure
– The configuration files will now be automatically written to compile
for the x86-architecture using the C-library glibc.
To compile it for uClinux and the PowerPC using uClibc, do the following:
• Go to the root directory hwos_sw. Run ./configure uclinux
– The configuration files will now be automatically written to compile
for the PowerPC-architecture using the C-library uClibc.
If all the standard build tools for the Linux distribution are installed, this
should be successful. The configuration process for the uClinux distribution
assumes that a cross-development environment has been set up.
Now, for both architectures:
• Run make
– The library, the daemon and the tests will now be compiled. If
you do not want to make all of them, run make <entity>, where
<entity> can be library, daemon or test. This will compile the
library, the daemon or the tests respectively. Run make clean to
remove all executables and other files generated by make.
• The daemon can be started by changing directory to hwos_daemon/ and
running ./hwos
– This will start the daemon in the background. The message server
will be ready to receive messages from a client application.
• The tests can be started by changing directory to tests/ and running
./hwos_test
– First of all this will test the HWOS-library, but the program will
also provide options for running tests on the daemon and the FPGA-
dependent parts of the library.
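The configure and make steps above can be collected in a small helper function.
The following is a minimal sketch and not part of the HWOS source: the function
name build_hwos and the RUN variable (set RUN=echo to print the commands
instead of running them) are assumptions made for the example. It only uses
the ./configure, ./configure uclinux and make invocations described in this
appendix and must be called from the root directory of the HWOS source.

```shell
#!/bin/sh
# Illustrative sketch only; build_hwos and RUN are hypothetical names.
# Configure and build the HWOS for one target:
#   x86      standard Linux with glibc (default)
#   uclinux  PowerPC with uClibc (needs the cross-development tools)
# Usage: build_hwos [x86|uclinux]
build_hwos() {
    target=${1:-x86}
    case "$target" in
        x86)     ${RUN:-} ./configure ;;
        uclinux) ${RUN:-} ./configure uclinux ;;
        *)       echo "unknown target: $target" >&2; return 1 ;;
    esac && ${RUN:-} make
}

# Example (dry run): RUN=echo; build_hwos uclinux
```

Because make is only started when the configure step succeeds, a missing
cross-development environment shows up as a configure error rather than as a
failed build.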
Appendix C
VHDL-code
Listing C.1: Top-level entity for base design (one flip-flop) in EDK
-- Reconfigurable top-level module for the tests with one flip-flop.
-- From my project report.
-- Some (not very relevant) parts are omitted and marked (...).
-- See the appended ZIP-file for complete files.
--
-- Sindre Hansen (2010/2011).

(...)

signal bm_l2r_in  : std_logic_vector(0 to 15);
signal bm_l2r_out : std_logic_vector(0 to 15);

signal bm_r2l_in  : std_logic_vector(0 to 15);
signal bm_r2l_out : std_logic_vector(0 to 15);

signal clk_buffer_out : std_logic;
signal clk_buffer_ce  : std_logic;

(...)

-- The user logic basically maps the input and output of
-- the bus macros to HW/SW-accessible registers.
USER_LOGIC_I : entity reconf_v1_00_a.user_logic
  (...)
  port map
  (
    in_16  => bm_r2l_out,
    out_16 => bm_l2r_in,
    (...)
  );

R : entity reconfigurable_module
  port map
  (
    clk_in => clk_buffer_out,
    in_16  => bm_l2r_out,
    out_16 => bm_r2l_in
  );


-- Connect one of the outputs from the HWSW-register to clk_buffer_ce.
-- This means the clock buffer's CE-input can be controlled by a Linux-app.
clk_buffer_ce <= bm_l2r_in(7);

-- Instantiate global clock buffer explicitly to enable/disable clock
BUFGCE_inst : BUFGCE
  port map (
    -- Clock buffer output (connected to reconf. module)
    O  => clk_buffer_out,
    -- Clock enable input
    CE => clk_buffer_ce,
    -- Clock buffer input (using same as to user logic/backend module)
    I  => ipif_Bus2IP_Clk
  );


bm_l2r_0 : component busmacro_xc4v_l2r_async_narrow
  port map(
    input0  => bm_l2r_in(0),
    input1  => bm_l2r_in(1),
    input2  => bm_l2r_in(2),
    input3  => bm_l2r_in(3),
    input4  => bm_l2r_in(4),
    input5  => bm_l2r_in(5),
    input6  => bm_l2r_in(6),
    input7  => bm_l2r_in(7),
    output0 => bm_l2r_out(0),
    output1 => bm_l2r_out(1),
    output2 => bm_l2r_out(2),
    output3 => bm_l2r_out(3),
    output4 => bm_l2r_out(4),
    output5 => bm_l2r_out(5),
    output6 => bm_l2r_out(6),
    output7 => bm_l2r_out(7)
  );

bm_l2r_1 : component busmacro_xc4v_l2r_async_narrow
  port map(
    input0  => bm_l2r_in(8),
    input1  => bm_l2r_in(9),
    input2  => bm_l2r_in(10),
    input3  => bm_l2r_in(11),
    input4  => bm_l2r_in(12),
    input5  => bm_l2r_in(13),
    input6  => bm_l2r_in(14),
    input7  => bm_l2r_in(15),
    output0 => bm_l2r_out(8),
    output1 => bm_l2r_out(9),
    output2 => bm_l2r_out(10),
    output3 => bm_l2r_out(11),
    output4 => bm_l2r_out(12),
    output5 => bm_l2r_out(13),
    output6 => bm_l2r_out(14),
    output7 => bm_l2r_out(15)
  );

bm_r2l_0 : component busmacro_xc4v_r2l_async_narrow
  port map(
    input0  => bm_r2l_in(0),
    input1  => bm_r2l_in(1),
    input2  => bm_r2l_in(2),
    input3  => bm_r2l_in(3),
    input4  => bm_r2l_in(4),
    input5  => bm_r2l_in(5),
    input6  => bm_r2l_in(6),
    input7  => bm_r2l_in(7),
    output0 => bm_r2l_out(0),
    output1 => bm_r2l_out(1),
    output2 => bm_r2l_out(2),
    output3 => bm_r2l_out(3),
    output4 => bm_r2l_out(4),
    output5 => bm_r2l_out(5),
    output6 => bm_r2l_out(6),
    output7 => bm_r2l_out(7)
  );

bm_r2l_1 : component busmacro_xc4v_r2l_async_narrow
  port map(
    input0  => bm_r2l_in(8),
    input1  => bm_r2l_in(9),
    input2  => bm_r2l_in(10),
    input3  => bm_r2l_in(11),
    input4  => bm_r2l_in(12),
    input5  => bm_r2l_in(13),
    input6  => bm_r2l_in(14),
    input7  => bm_r2l_in(15),
    output0 => bm_r2l_out(8),
    output1 => bm_r2l_out(9),
    output2 => bm_r2l_out(10),
    output3 => bm_r2l_out(11),
    output4 => bm_r2l_out(12),
    output5 => bm_r2l_out(13),
    output6 => bm_r2l_out(14),
    output7 => bm_r2l_out(15)
  );

(...)
Listing C.2: Top-level entity for a reconfigurable module (one flip-flop) in ISE
-- This is a top level entity for a reconfigurable
-- module built in ISE.
--
-- Sindre Hansen (2011).

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;

library unisim;
use unisim.vcomponents.all;

entity reconf is
  port (
    clk_in : in  std_logic;
    in_16  : in  std_logic_vector(0 to 15);
    out_16 : out std_logic_vector(0 to 15)
  );
end entity reconf;


architecture IMP of reconf is

  signal bm_l2r_in  : std_logic_vector(0 to 15);
  signal bm_l2r_out : std_logic_vector(0 to 15);

  signal bm_r2l_in  : std_logic_vector(0 to 15);
  signal bm_r2l_out : std_logic_vector(0 to 15);

  signal clk_buffer_out : std_logic;
  signal clk_buffer_ce  : std_logic;

  component busmacro_xc4v_l2r_async_narrow is
    port (
      input0  : in  std_logic;
      input1  : in  std_logic;
      input2  : in  std_logic;
      input3  : in  std_logic;
      input4  : in  std_logic;
      input5  : in  std_logic;
      input6  : in  std_logic;
      input7  : in  std_logic;
      output0 : out std_logic;
      output1 : out std_logic;
      output2 : out std_logic;
      output3 : out std_logic;
      output4 : out std_logic;
      output5 : out std_logic;
      output6 : out std_logic;
      output7 : out std_logic
    );
  end component;


  component busmacro_xc4v_r2l_async_narrow is
    port (
      input0  : in  std_logic;
      input1  : in  std_logic;
      input2  : in  std_logic;
      input3  : in  std_logic;
      input4  : in  std_logic;
      input5  : in  std_logic;
      input6  : in  std_logic;
      input7  : in  std_logic;
      output0 : out std_logic;
      output1 : out std_logic;
      output2 : out std_logic;
      output3 : out std_logic;
      output4 : out std_logic;
      output5 : out std_logic;
      output6 : out std_logic;
      output7 : out std_logic
    );
  end component;



begin

  -- Connect the outputs & inputs to the bus macro outputs & inputs
  out_16 <= bm_r2l_out;
  bm_l2r_in <= in_16;

  R : entity reconfigurable_module
    port map
    (
      clk_in => clk_buffer_out,
      in_16  => bm_l2r_out,
      out_16 => bm_r2l_in
    );

  -- Connect one of the outputs from the HWSW-register to clk_buffer_ce.
  -- This means the clock buffer's CE-input can be controlled by a Linux-app.
  clk_buffer_ce <= bm_l2r_in(2);

  -- Instantiate global clock buffer explicitly to enable/disable clock
  BUFGCE_inst : BUFGCE
    port map (
      O  => clk_buffer_out, -- Clock buffer output (connected to reconf. module)
      CE => clk_buffer_ce,  -- Clock enable input
      I  => clk_in          -- Clock buffer input
    );


  bm_l2r_0 : component busmacro_xc4v_l2r_async_narrow
    port map(
      --
      input0  => bm_l2r_in(0),
      input1  => bm_l2r_in(1),
      input2  => bm_l2r_in(2),
      input3  => bm_l2r_in(3),
      input4  => bm_l2r_in(4),
      input5  => bm_l2r_in(5),
      input6  => bm_l2r_in(6),
      input7  => bm_l2r_in(7),
      output0 => bm_l2r_out(0),
      output1 => bm_l2r_out(1),
      output2 => bm_l2r_out(2),
      output3 => bm_l2r_out(3),
      output4 => bm_l2r_out(4),
      output5 => bm_l2r_out(5),
      output6 => bm_l2r_out(6),
      output7 => bm_l2r_out(7)
    );

  bm_l2r_1 : component busmacro_xc4v_l2r_async_narrow
    port map(
      input0  => bm_l2r_in(8),
      input1  => bm_l2r_in(9),
      input2  => bm_l2r_in(10),
      input3  => bm_l2r_in(11),
      input4  => bm_l2r_in(12),
      input5  => bm_l2r_in(13),
      input6  => bm_l2r_in(14),
      input7  => bm_l2r_in(15),
      output0 => bm_l2r_out(8),
      output1 => bm_l2r_out(9),
      output2 => bm_l2r_out(10),
      output3 => bm_l2r_out(11),
      output4 => bm_l2r_out(12),
      output5 => bm_l2r_out(13),
      output6 => bm_l2r_out(14),
      output7 => bm_l2r_out(15)
    );

  bm_r2l_0 : component busmacro_xc4v_r2l_async_narrow
    port map(
      input0  => bm_r2l_in(0),
      input1  => bm_r2l_in(1),
      input2  => bm_r2l_in(2),
      input3  => bm_r2l_in(3),
      input4  => bm_r2l_in(4),
      input5  => bm_r2l_in(5),
      input6  => bm_r2l_in(6),
      input7  => bm_r2l_in(7),
      output0 => bm_r2l_out(0),
      output1 => bm_r2l_out(1),
      output2 => bm_r2l_out(2),
      output3 => bm_r2l_out(3),
      output4 => bm_r2l_out(4),
      output5 => bm_r2l_out(5),
      output6 => bm_r2l_out(6),
      output7 => bm_r2l_out(7)
    );

  bm_r2l_1 : component busmacro_xc4v_r2l_async_narrow
    port map(
      input0  => bm_r2l_in(8),
      input1  => bm_r2l_in(9),
      input2  => bm_r2l_in(10),
      input3  => bm_r2l_in(11),
      input4  => bm_r2l_in(12),
      input5  => bm_r2l_in(13),
      input6  => bm_r2l_in(14),
      input7  => bm_r2l_in(15),
      output0 => bm_r2l_out(8),
      output1 => bm_r2l_out(9),
      output2 => bm_r2l_out(10),
      output3 => bm_r2l_out(11),
      output4 => bm_r2l_out(12),
      output5 => bm_r2l_out(13),
      output6 => bm_r2l_out(14),
      output7 => bm_r2l_out(15)
    );

end IMP;
Listing C.3: Reconfigurable module (one flip-flop)
-- Reconfigurable module for the tests with one flip-flop.
-- From my project report.
-- Some (not very relevant) parts are omitted and marked (...).
-- See the appended ZIP-file for complete files.
--
-- Sindre Hansen (2010/2011).

library (...)

entity reconfigurable_module is
  port (
    clk_in : in  std_logic;
    in_16  : in  std_logic_vector(0 to 15);
    out_16 : out std_logic_vector(0 to 15)
  );
end entity reconfigurable_module;

architecture arch of reconfigurable_module is
begin

  -- D-flip-flop
  -- in_16
  --   bit 0    : d
  --   bit 1    : rst
  --   bit 2-15 : not used (but connected to output to avoid error messages)
  -- out_16
  --   bit 0    : q
  --   bit 1-14 : not used (but connected to input bits 2-15 anyway)
  --   bit 15   : not used (set to '0')

  out_16(1 to 14) <= in_16(2 to 15);
  out_16(15) <= '0';

  OUTPROC : process (clk_in, in_16(1))
  begin
    if (in_16(1) = '1') then
      out_16(0) <= '0';
    elsif (clk_in'EVENT and clk_in = '1') then
      out_16(0) <= in_16(0);
    end if;
  end process OUTPROC;
end arch;
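As a reading aid, the behaviour of the process in Listing C.3 can be mirrored in software. The sketch below is not part of the thesis code: the names dff_t and dff_step are hypothetical, and the model only reproduces the asynchronous, active-high reset and the rising-edge capture that the VHDL process describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical software model of the flip-flop in Listing C.3. */
typedef struct {
    bool q;        /* registered output, corresponds to out_16(0) */
    bool prev_clk; /* clock level seen at the previous evaluation */
} dff_t;

/* Evaluate the flip-flop for one set of input levels.
 * rst corresponds to in_16(1), d to in_16(0). */
static void dff_step(dff_t *ff, bool clk, bool rst, bool d)
{
    if (rst) {
        ff->q = false;                 /* asynchronous reset has priority */
    } else if (clk && !ff->prev_clk) {
        ff->q = d;                     /* capture d on the rising edge */
    }
    ff->prev_clk = clk;
}
```

Calling dff_step once per change of the input levels follows the same order of checks as the if/elsif in OUTPROC: the reset is tested first, and the data input is only sampled on a low-to-high transition of the clock.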
Listing C.4: Reconfigurable module (three flip-flops)
-- Reconfigurable top-level module for the tests with three flip-flops.
-- From my project report.
-- Some (not very relevant) parts are omitted and marked (...).
-- See the appended ZIP-file for complete files.
--
-- Sindre Hansen (2010/2011).

library (...)

entity reconfigurable_module is
  port (
    clk_in : in  std_logic;
    in_16  : in  std_logic_vector(0 to 15);
    out_16 : out std_logic_vector(0 to 15)
  );
end entity reconfigurable_module;

architecture arch of reconfigurable_module is
  signal clock_enable : std_logic;
  signal rst_in   : std_logic; -- Reset DFF
  signal reg0_in  : std_logic; -- DFF input
  signal reg0_out : std_logic; -- DFF output
  signal reg1_in  : std_logic; -- DFF input
  signal reg1_out : std_logic; -- DFF output
  signal reg2_in  : std_logic; -- DFF input
  signal reg2_out : std_logic; -- DFF output
begin

  -- D-flip-flop
  -- in_16
  --   bit 0-2  : d0-d2
  --   bit 3    : rst
  --   bit 4-15 : not used (but connected to output to avoid error messages)
  -- out_16
  --   bit 0-2  : q0-q2
  --   bit 3-14 : not used (but connected to input bits 4-15 anyway)
  --   bit 15   : not used (set to '0')

  reg0_in <= in_16(0);
  reg1_in <= in_16(1);
  reg2_in <= in_16(2);
  rst_in <= in_16(3);
  out_16(0) <= reg0_out;
  out_16(1) <= reg1_out;
  out_16(2) <= reg2_out;
  out_16(3 to 14) <= in_16(4 to 15);
  out_16(15) <= '0';

  OUTPROC : process (clk_in, rst_in)
  begin
    if (rst_in = '1') then
      reg0_out <= '0';
      reg1_out <= '0';
      reg2_out <= '0';
    elsif (clk_in'EVENT and clk_in = '1') then
      reg0_out <= reg0_in;
      reg1_out <= reg1_in;
      reg2_out <= reg2_in;
    end if;
  end process OUTPROC;

end arch;
Appendix D
HWOS
This is some of the code from the HWOS. It is highly recommended to read
the actual code in the ZIP-file or the Doxygen-documentation.
D.1 The HWOS daemon
These are the source files (*.c) and the header files for the top-level entities
of the HWOS daemon, the message server, the scheduler, the placer and the
timer.
Listing D.1: hdaemon.h
/*! \file hwos.h
 *  \brief [Hardware OS Interface] Top level daemon for HWOS.
 *
 *  This daemon communicates with a number of hardware modules through device drivers.
 *  The main goal of this software is to schedule run time on the FPGA for different modules.
 *  Another goal is to allocate/deallocate memory for the module.
 *
 *  Original author (2010): Vegard Endresen
 *
 *  Modified (2011) by: Sindre Hansen
 *   - Rewrote daemon.
 *   - Rewrote message interface and HWOS-library.
 *   - Added scheduler.
 *   - Added placer.
 *   - Added threads for message server and scheduler.
 */


#include "hstructures.h"

#ifndef HDAEMON_H
#define HDAEMON_H

#include "../platform_defines.h"
#if SUZAKU_BUILD
    #define DAEMONLOG "/var/hwos_daemon.log"
#else
    #define DAEMONLOG "../logs/hwos_daemon.log"
#endif

//! Device file root folder on Suzaku.
#define DEVICE_ROOT "/var/tmp/dev"

#endif
Listing D.2: hdmsg.h
/*! \file hwos.h
 *  \brief [Hardware OS Message Server] Top level daemon for HWOS message server.
 *
 *  Author (2011): Sindre Hansen
 */


#ifndef HDMSG_H
#define HDMSG_H

enum hdmevent {
    HDME_QUIT    //! Quit the message server thread.
};

/*! \brief Main function for message server thread.
 */
void* hdmsg_main();
int hdmsg_notify(enum hdmevent event);

/*! \brief Send a dummy/sync message to the message server.
 *
 *  @return 0 on success. Negative on failure.
 */
int hdmsg_synchronize();

#endif
Listing D.3: hdplacer.h
/*! \file hdplacer.h
 *  \brief [Hardware OS Placer] Top level module for placement on FPGA.
 *
 *  The responsibilities of this module are as follows:
 *
 *  - The module also takes care of periodically interrupting the process running on the FPGA and
 *    saving its state. It also writes the state back to a process running on the FPGA if it is
 *    to be resumed.
 *
 *  - This module takes care of placement (of a process chosen by the scheduler) on the FPGA.
 *    In this version of the HWOS, the module always places the reconfigurable module at
 *    the same place using ICAP.
 *
 *  Original author (2010): Sindre Hansen
 *
 */

#ifndef HDPLACER_H
#define HDPLACER_H


enum hdpevent {
    HDPE_QUIT     //! Quit the placer thread.
    ,HDPE_TIMER   //! The timeslice has expired.
};

void* hdplacer_main();
int hdplacer_notify(enum hdpevent event);

#endif
Listing D.4: hdsched.h
/*! \file hwos.h
 *  \brief [Hardware OS Scheduler] Top level interface for scheduler.
 *
 *  This defines the interface at top-level for the scheduler. This is
 *  typically only functions to notify the scheduler on events etc.
 *
 *  This module pretty much works as a server.
 *
 *  Original author (2011): Sindre Hansen
 */

#include <hprocess.h>

#ifndef HDSCHED_H
#define HDSCHED_H


enum hdsevent {
    HDSE_NEW_PROCESS        //! A new process has arrived at the HWOS.
    ,HDSE_QUIT              //! Quit the scheduler thread.
    ,HDSE_SHORT             //! Perform short-term scheduling.
    ,HDSE_RESCHED_TIMER     //! Reschedule process (was timer interrupted).
    ,HDSE_RESCHED_PRIORITY  //! Reschedule process (was replaced by higher priority process).
    ,HDSE_RESCHED_IO        //! Reschedule process (needed unavailable IO).
};

enum hdsreturn {
    HDSR_NO_BITFILE = -2    //! Bitfile does not exist.
    ,HDSR_TOOMANY = -1      //! Too many processes in system already.
    ,HDSR_READY = 0         //! Process has been added to a ready queue.
};

/*! \brief Main function for scheduler thread.
 */
void* hdsched_main();

/*! \brief Reschedule the given process.
 *
 *  This function is called when the given
 *  event has occurred for the given process.
 *
 *  @param process The process to be rescheduled.
 *  @param event The event that triggered the rescheduling.
 *  @return 0 on success. Negative on failure.
 */
int hdsched_reschedule(struct hprocess* process, enum hdsevent event);

/*! \brief Notify the scheduler on new event.
 *
 *  @param event The notified event.
 */
int hdsched_notify(enum hdsevent event);

/*! \brief Return the number of accepted processes in the scheduler.
 *
 *  This is the number of all processes that have been accepted.
 *  It does not include processes in the HPS_NEW-queues.
 *
 *  This number can be modified by the scheduler at the same
 *  time and is therefore secured with a mutex.
 *
 *  @return Number of processes.
 */
int hdsched_processes_number();

/*! \brief Add a new process to one of the HPS_NEW-queues.
 *
 *  The HPS_NEW-queues can be accessed by another thread at the same
 *  time and are therefore secured with a mutex.
 *
 *  @param process The process to be added.
 *  @return 0 on success. Negative on failure.
 */
int hdsched_add_new_process(struct hprocess* process);

/*! \brief Get process in front of the HPS_NEW-queue with highest priority.
 *
 *  The HPS_NEW-queues can be modified by another thread at the same
 *  time and are therefore secured with a mutex.
 *
 *  @return The process.
 */
struct hprocess* hdsched_get_new_process();


/*! \brief Get next process to be placed on the FPGA.
 *
 *  The process can be set by the scheduler at the same
 *  time and is therefore secured with a mutex.
 *
 *  @return The process.
 */
struct hprocess* hdsched_get_next_process();

#endif
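The header above states that the HPS_NEW-queues are shared between threads and secured with a mutex, and that the process at the front of the highest-priority queue is fetched first. The implementation of these queues is not shown in this appendix, so the following is only a minimal sketch of that idea under stated assumptions: struct proc is a stand-in for struct hprocess, the names new_queues_init, new_queues_push and new_queues_pop_highest are hypothetical, priority 0 is assumed to be the highest, and each queue is a small fixed-size ring buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define PRIORITY_LEVELS 4  /* assumption: 0 is the highest priority */
#define QUEUE_CAPACITY  8  /* assumption: fixed number of slots per level */

/* Stand-in for struct hprocess, whose definition is not shown. */
struct proc {
    int pid;
    int priority;
};

/* One FIFO ring buffer per priority level, all guarded by one mutex. */
struct new_queues {
    pthread_mutex_t lock;
    struct proc *slot[PRIORITY_LEVELS][QUEUE_CAPACITY];
    size_t head[PRIORITY_LEVELS];
    size_t count[PRIORITY_LEVELS];
};

static void new_queues_init(struct new_queues *q)
{
    pthread_mutex_init(&q->lock, NULL);
    for (int p = 0; p < PRIORITY_LEVELS; p++) {
        q->head[p] = 0;
        q->count[p] = 0;
    }
}

/* Append a process to the queue of its priority. Returns 0 on success,
 * -1 if the priority is out of range or the queue is full. */
static int new_queues_push(struct new_queues *q, struct proc *proc)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (proc != NULL && proc->priority >= 0 && proc->priority < PRIORITY_LEVELS
        && q->count[proc->priority] < QUEUE_CAPACITY) {
        int p = proc->priority;
        q->slot[p][(q->head[p] + q->count[p]) % QUEUE_CAPACITY] = proc;
        q->count[p]++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Remove and return the front process of the highest-priority non-empty
 * queue, or NULL if all queues are empty. */
static struct proc *new_queues_pop_highest(struct new_queues *q)
{
    struct proc *proc = NULL;
    pthread_mutex_lock(&q->lock);
    for (int p = 0; p < PRIORITY_LEVELS && proc == NULL; p++) {
        if (q->count[p] > 0) {
            proc = q->slot[p][q->head[p]];
            q->head[p] = (q->head[p] + 1) % QUEUE_CAPACITY;
            q->count[p]--;
        }
    }
    pthread_mutex_unlock(&q->lock);
    return proc;
}
```

Taking the lock in both the push and the pop function is what allows one thread to add processes while another thread removes them, which is the situation the comments in the header describe.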
Listing D.5: hdtimer.h
/*! \file hdtimer.h
 *  \brief [Hardware OS Timer] Timer for the HWOS Placer.
 *
 *  This module simply generates a timer interrupt to the placer module
 *  at the intervals specified by hdtimer_get_timeslice().
 *
 *  Original author (2010): Sindre Hansen
 *
 */


#ifndef HDTIMER_H
#define HDTIMER_H

enum hdtevent {
    HDTE_QUIT    //! Quit the timer thread.
};


int hdtimer_notify(enum hdtevent event);
const int hdtimer_get_timeslice();
void* hdtimer_main();

#endif
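The timer module is described as generating an interrupt to the placer at a fixed interval. The source file of the timer is not reproduced in this appendix, so the following is only a sketch of how such a periodic tick could be produced with nanosleep(). The names timer_run, tick_fn and count_tick are hypothetical, the time slice is assumed to be given in milliseconds, and the callback stands in for a call such as hdplacer_notify(HDPE_TIMER).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback type; ctx is passed through unchanged. */
typedef void (*tick_fn)(void *ctx);

/* Sleep for slice_ms milliseconds and then call on_tick, ticks times in a row.
 * Returns the number of ticks delivered, or -1 on invalid arguments. */
static int timer_run(long slice_ms, int ticks, tick_fn on_tick, void *ctx)
{
    if (slice_ms < 0 || ticks < 0 || on_tick == NULL)
        return -1;

    struct timespec slice;
    slice.tv_sec = slice_ms / 1000;
    slice.tv_nsec = (slice_ms % 1000) * 1000000L;

    int delivered = 0;
    for (int i = 0; i < ticks; i++) {
        struct timespec left = slice;
        /* Resume the sleep with the remaining time if a signal interrupts it. */
        while (nanosleep(&left, &left) != 0 && errno == EINTR) {
        }
        on_tick(ctx);
        delivered++;
    }
    return delivered;
}

/* Example callback: counts the ticks it receives. */
static void count_tick(void *ctx)
{
    (*(int *)ctx)++;
}
```

A thread function built around this loop would typically run until it is told to quit instead of for a fixed number of ticks; the bounded loop is used here only to keep the sketch easy to test.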
Listing D.6: hdaemon.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/stat.h>

#include <hlog.h>
#include <hmem.h>

#include "hdaemon.h"
#include "hdsched.h"
#include "hdmsg.h"
#include "hdplacer.h"
#include "hdtimer.h"


//! Initialize this process as a daemon (Linux actually has its own interface for this).
static int init_daemon();
static void signal_handler(int sig);


static void signal_handler(int sig)
{
    switch (sig) {
    case SIGINT:
    case SIGTERM:
        hdmsg_notify(HDME_QUIT);
        hdmsg_synchronize();
        break;
    }
}

static int init_daemon()
{
    // Fork off the parent process.
    // Exit the parent process if success.
    pid_t pid;
    pid = fork();
    if (pid < 0) {
        return -1;
    } else if (pid > 0)
        exit(EXIT_SUCCESS);

    // Change the file mode mask (should not inherit umask from parent).
    umask(0);

    // Create a new SID for the child process.
    pid_t sid;
    sid = setsid();
    if (sid < 0) {
        return -1;
    }

    // Change the current working directory.
    if ((chdir("/")) < 0) {
        return -1;
    }

    // Close out the standard file descriptors.
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    close(STDERR_FILENO);

    hlog_write(HLOG_NORMAL, "Daemon initialized.\n");

    return 0;
}


int main(int argc, const char* argv[])
{
    // Connect the different signals (that the daemon can receive) to signal_handler function.
    signal(SIGINT, signal_handler);
    signal(SIGTERM, signal_handler);

    if (hlog_init(DAEMONLOG) < 0) {
        printf("Error. Could not init log file. Daemon is stopping.\n");
        exit(EXIT_FAILURE);
    }

    // Init the daemon.
    //if (init_daemon() < 0)
    //    exit(EXIT_FAILURE);

    // Initialize a dynamic memory module with 256 lines.
    //dmemInit(256);

    static pthread_t scheduler_thread;
    static pthread_t placer_thread;
    static pthread_t message_server_thread;
    static pthread_t timer_thread;

    // Takes care of scheduling processes.
    pthread_create(&scheduler_thread, NULL, hdsched_main, (void*)NULL);
    // Takes care of placing processes on the hardware.
    pthread_create(&placer_thread, NULL, hdplacer_main, (void*)NULL);
    // Generates timer interrupts.
    pthread_create(&timer_thread, NULL, hdtimer_main, (void*)NULL);
    // Takes care of receiving and processing messages from client programs.
    // This thread is also the master thread and responsible for exiting all other threads.
    pthread_create(&message_server_thread, NULL, hdmsg_main, (void*)NULL);

    // Wait for each thread to finish before exiting the daemon.
    pthread_join(timer_thread, NULL);
    pthread_join(scheduler_thread, NULL);
    pthread_join(placer_thread, NULL);
    pthread_join(message_server_thread, NULL);

    hlog_close();

    exit(EXIT_SUCCESS);
}
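In Listing D.6 the signal handler calls hdmsg_notify() and hdmsg_synchronize() directly. Whether those calls are safe inside a signal handler depends on code that is not shown here. A commonly used alternative, which is not what the thesis code does, is to let the handler only set a flag and to act on that flag from a normal thread. The sketch below illustrates that alternative with sigaction(); the names quit_requested, on_quit_signal and install_quit_handlers are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set to 1 by the handler; sig_atomic_t can be written safely from a handler. */
static volatile sig_atomic_t quit_requested = 0;

/* The handler does nothing except record that a quit was requested. */
static void on_quit_signal(int sig)
{
    (void)sig;
    quit_requested = 1;
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success, -1 on failure. */
static int install_quit_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_quit_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGINT, &sa, NULL) < 0)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) < 0)
        return -1;
    return 0;
}
```

With this approach, a service loop would check quit_requested at a convenient point and then perform the shutdown sequence (for example the notify and synchronize calls) outside of signal context.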
Listing D.7: hdmsg.c
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <syslog.h>
#include <string.h>
#include <signal.h>
//! Message queue functions.
#include <sys/ipc.h>
#include <sys/msg.h>
//! Sleep, unused at the moment.
#include <unistd.h>
#include <pthread.h>

#include <hlog.h>
#include <hmsg.h>
#include <hmqueue.h>
#include <hmem.h>
// Not used.
//#include <devicelist.h>
#include <integer_functions.h>
#include <hstructures.h>
#include <hprocess.h>
#include <hevent.h>

#include "hdaemon.h"
#include "hdsched.h"
#include "hdmsg.h"
#include "hdplacer.h"
#include "hdtimer.h"

/*! \brief Process a received message.
 *
 *  @param req_msg The message to be processed.
 *  @return The message to be sent back to client on success. NULL on failure.
 */
static void* process_request(void* req_msg);
static void cleanup(int exit_code);
static int handle_event();

//! The event handler for this thread.
static struct hevent* event_handler = NULL;


static int handle_event()
{
    enum hdmevent incoming_event = (enum hdmevent) hevent_wait(event_handler);

    // An incoming event is handled here.
    switch (incoming_event) {
    case HDME_QUIT:
        // Quit the thread.
        printf("wait_event: Exiting message server thread.\n");
        cleanup(EXIT_SUCCESS);
        break;
    }

    return 0;
}


void* process_request(void* req_msg)
{
    hlog_write(HLOG_NORMAL, "process_request: Processing message.\n");

    // The response message to be returned.
    void* resp_msg = NULL;

    hlog_write(HLOG_DEBUG, "process_request: command=");
    hlog_write_integer(hmsg_get_command(req_msg));
    hlog_write_text(".\n");

    switch (hmsg_get_command(req_msg)) {
    // GCC compiler demands brackets on this case.
    case HMCALLOC: {
        // Commented away. Not ready to be used.
        /*
        hlog_write(HLOG_NORMAL, "process_request: HMCALLOC\n");
        resp_msg = hmsg_create(HMTCTRL);
        int* base_addr = calloc(1, sizeof(*base_addr));
        int alloc_ret = hmem_allocate(char_to_int(hmsg_get_data(req_msg), 0), base_addr);
        if (alloc_ret < 0) {
            hlog_write(HLOG_ERROR, "process_request: Failed to allocate memory for HMCALLOC.\n");
        } else {
            hlog_write(HLOG_DEBUG, "process_request: Allocated memory for HMCALLOC.\n");
            hmsg_set_return(resp_msg, *base_addr);
        }
        free(base_addr);
        base_addr = NULL;
        alloc_ret = -1;
        */
        break;
    }
    case HMCEXEC:
        hlog_write(HLOG_NORMAL, "process_request: HMCEXEC\n");
        resp_msg = hmsg_create(HMTCTRL);
        // Not ready yet.
        //hmem_instruction_write(0, hmsg_get_data(req_msg), hmsg_get_size(req_msg));
        hmsg_set_return(resp_msg, HMROK);
        break;
    case HMCFREE:
        hlog_write(HLOG_NORMAL, "process_request: HMCFREE\n");
        resp_msg = hmsg_create(HMTCTRL);
        // Not ready yet.
        //hmem_free(char_to_int(hmsg_get_data(req_msg), 0));
        hmsg_set_return(resp_msg, HMROK);
        break;
    case HMCLDDEV:
        hlog_write(HLOG_NORMAL, "process_request: HMCLDDEV\n");
        resp_msg = hmsg_create(HMTCTRL);
        hmsg_set_return(resp_msg, HMROK);
        break;
    case HMCRMDEV:
        hlog_write(HLOG_NORMAL, "process_request: HMCRMDEV\n");
        resp_msg = hmsg_create(HMTCTRL);
        hmsg_set_return(resp_msg, HMROK);
        break;
    case HMCRMQUE: {
        // Connect to an existing queue.
        // Note: If the queue is registered already, the old queue will be removed
        // and the new queue registered instead.
        hlog_write(HLOG_NORMAL, "process_request: HMCRMQUE\n");
        resp_msg = hmsg_create(HMTCTRL);
        // Connect to an existing queue. Use sender's identifier as key.
        struct hmqueue* reg_mqueue = hmqueue_add(hmsg_get_sender(req_msg), HMQCONNECT);
        if (reg_mqueue == NULL) {
            hlog_write(HLOG_ERROR, "process_request: Could not connect to message queue.\n");
            hmsg_set_return(resp_msg, HMRERROR);
        } else {
            hlog_write(HLOG_DEBUG, "process_request: Connected to message queue. msqid req=");
            hlog_write_integer(hmqueue_get_id(reg_mqueue));
            hlog_write_text(". key=");
            hlog_write_integer(hmqueue_get_key(reg_mqueue));
            hlog_write_text(".\n");
            hmsg_set_return(resp_msg, HMROK);
        }
        reg_mqueue = NULL;
        break;
    }
    case HMCREAD:
        hlog_write(HLOG_NORMAL, "process_request: HMCREAD\n");
        resp_msg = hmsg_create(HMTCTRLMEM);
        // Not ready yet.
        //hmem_data_read(0, hmsg_get_data(resp_msg), hmsg_get_address(req_msg), hmsg_get_size(req_msg));
        hmsg_set_return(resp_msg, HMROK);
        break;
    case HMCUMQUE:
        hlog_write(HLOG_NORMAL, "process_request: HMCUMQUE\n");
        hmqueue_remove_by_key(hmsg_get_sender(req_msg));
        break;
    case HMCWRITE:
        hlog_write(HLOG_NORMAL, "process_request: HMCWRITE\n");
        resp_msg = hmsg_create(HMTCTRL);
        // Not ready yet.
        //hmem_data_write(0, hmsg_get_data(req_msg), hmsg_get_address(req_msg), hmsg_get_size(req_msg));
        hmsg_set_return(resp_msg, HMROK);
        break;
    case HMCREGPROC:
        // Register a new process and wake up scheduler.
        hlog_write(HLOG_NORMAL, "process_request: HMCREGPROC\n");
        resp_msg = hmsg_create(HMTCTRL);
        struct hprocess* process_new = hprocess_create(hmsg_get_nice(req_msg));
        hlog_write(HLOG_DEBUG, "process_request: Filename in msg: ");
        hlog_write_text(hmsg_get_bitfilename(req_msg));
        hlog_write_text("\n");
        // The bitfilename is set below, after process_new has been checked for NULL.
        if (process_new == NULL) {
            hmsg_set_return(resp_msg, HMRNOPID);
        } else {
            hprocess_set_bitfilename(process_new, hmsg_get_bitfilename(req_msg));
            hmsg_set_return(resp_msg, HMROK);
            hdsched_add_new_process(process_new);
            hdsched_notify(HDSE_NEW_PROCESS);
        }
        break;
    default:
        break;
    }

    return resp_msg;
}


static void cleanup(int exit_code)
{
    if (exit_code == EXIT_FAILURE) {
        hlog_write(HLOG_ERROR, "HWOS message server terminated with error=");
        hlog_write_integer(errno);
        hlog_write_text(".\n");
    }
    else
        hlog_write(HLOG_NORMAL, "HWOS message server terminated normally.\n");

    // Quit the other threads also.
    hdtimer_notify(HDTE_QUIT);
    hdplacer_notify(HDPE_QUIT);
    hdsched_notify(HDSE_QUIT);

    hevent_remove(event_handler);
    pthread_exit(NULL);
}


int hdmsg_notify(enum hdmevent event)
{
    hevent_notify(event_handler, (int) event);

    return 0;
}


int hdmsg_synchronize()
{
    // Send dummy/sync message to message server
    // (to make sure it handles event).
    void* sync_msg = hmsg_create(HMTCTRL);
    hmsg_send(sync_msg, hmqueue_get(hmqueue_get_daemonkey()));
    free(sync_msg);

    return 0;
}


void* hdmsg_main()
{
    // Init the message queue and return the identifier.
    struct hmqueue* mqueue;
    mqueue = hmqueue_add(hmqueue_get_daemonkey(), HMQCREATE);

    // Message to be received.
    void* req_msg;

    // Create event queue.
    event_handler = hevent_create(HEVENTNONBLOCKING);
    if (event_handler == NULL) {
        cleanup(EXIT_FAILURE);
    }

    // Service loop for message server.
    while (1) {
        // Blocking wait for a message on queue msqid. Put message in req_msg.
        hlog_write(HLOG_NORMAL, "hdmsg_main: Waiting for message.\n");
        req_msg = hmsg_receive(mqueue);

        if (req_msg == NULL) {
            hlog_write(HLOG_ERROR, "hdmsg_main: req_msg is NULL.\n");
            cleanup(EXIT_FAILURE);
        }

        void* resp_msg;
        // Process the received message and make ready the response message.
        resp_msg = process_request(req_msg);

        if (resp_msg != NULL) {
            // Get the message queue for the sender and send the response.
            hmsg_send(resp_msg, hmqueue_get(hmsg_get_sender(req_msg)));
        }

        // Handle any events from the other threads.
        handle_event();
    }

    cleanup(EXIT_SUCCESS);
}
Listing D.8: hdplacer.c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <pthread.h>

#include <hlog.h>
#include <hstructures.h>
#include <hplacer.h>
#include <hprocess.h>
#include <hmsg.h>
#include <hmqueue.h>
#include <hevent.h>

#include "hdplacer.h"
#include "hdmsg.h"
#include "hdsched.h"
#include "../platform_defines.h"


static void cleanup(int exit_code);

/*! \brief Handle an incoming event.
24 ∗
25 ∗ This f unc t i on s l o c k s a mutex and wa i t s on a
26 ∗ cond i t i on v a r i a b l e .
27 ∗/
28 stat ic int wa i t even t ( ) ;
29 stat ic struct hproces s ∗ g e t c u r r e n t p r o c e s s ( ) ;
30 stat ic int s e t c u r r e n t p r o c e s s ( struct hproces s ∗ proce s s ) ;
31
32 stat ic struct hproces s ∗ cu r r en t p r o c e s s = NULL;
33 // ! The event hand ler f o r t h i s thread .
34 stat ic struct hevent∗ event hand le r = NULL;
35 // ! I f t h i s i s true , t h i s thread shou ld ask the s e r v e r to
q u i t a l l t h reads when e x i t i n g .
36 stat ic int n o t i f y s e r v e r q u i t = 1 ;
37 stat ic const char∗ b i t f i l e s p a t h = ” . . / b i t f i l e s ” ;
38 stat ic const char∗ i c apdev i c e = ”/var / i cap ” ;
39
static int wait_event()
{
    enum hdpevent incoming_event = (enum hdpevent) hevent_wait(event_handler);

    struct hprocess* next_process = hdsched_get_next_process();
    struct hprocess* current_process = get_current_process();
    int replace_current = 0;

    // An incoming event is handled here.
    switch (incoming_event) {
        case HDPE_QUIT:
            // Do not ask server to quit (the server asked this thread to quit).
            notify_server_quit = 0;
            // Quit the thread.
            cleanup(EXIT_SUCCESS);
            break;
        case HDPE_TIMER:
            // Timer interrupt.
            if (current_process != next_process)
                replace_current = 1;
            break;
    }

    // Replace process if current and next are unequal.
    if (replace_current) {
        printf("hdplacer_main: Replacing process. old-pid=%d. new-pid=%d\n",
               hprocess_get_pid(current_process), hprocess_get_pid(next_process));
        // Not ready yet.
        // hplacer_interrupt_process(current_process);
        // hplacer_load_process(next_process);
        set_current_process(next_process);
        hdsched_reschedule(current_process, HDSE_RESCHED_TIMER);
    } else
        printf("hdplacer_main: Keeping current process on FPGA. pid=%d\n",
               hprocess_get_pid(next_process));

    return 0;
}

static struct hprocess* get_current_process()
{
    return current_process;
}


static int set_current_process(struct hprocess* process)
{
    current_process = process;

    return 0;
}

static void cleanup(int exit_code)
{
    // Only notify server if not done already.
    if (notify_server_quit) {
        notify_server_quit = 0;
        hdmsg_notify(HDME_QUIT);
    }

    hdmsg_synchronize();
    pthread_exit(NULL);
}


int hdplacer_notify(enum hdpevent event)
{
    hevent_notify(event_handler, (int) event);

    return 0;
}

void* hdplacer_main()
{
    // Create event handler.
    event_handler = hevent_create(HEVENT_BLOCKING);
    if (event_handler == NULL) {
        notify_server_quit = 0;
        cleanup(EXIT_FAILURE);
    }

    hplacer_set_bitfiles_path(bitfiles_path);
    hplacer_set_icapdevice(icapdevice);

    // Sleep a bit to initialize scheduler.
    sleep(1);
    // Service loop for placer.
    while (1) {
        // Ask scheduler to find a new process to place.
        hdsched_notify(HDSE_SHORT);
        wait_event();
    }

    hevent_remove(event_handler);
    cleanup(EXIT_SUCCESS);
}
Listing D.9: hdsched.c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <pthread.h>

#include <hlog.h>
#include <hstructures.h>
#include <hprocess.h>
#include <hsqueue.h>
#include <hsqlist.h>
#include <hplacer.h>
#include <hlist.h>
#include <hevent.h>

#include "hdsched.h"
#include "hdmsg.h"

static void cleanup(int exit_code);

//! Suspend the thread until a new event occurs.
static int wait_event();

/*! \brief Performs the (long-term) scheduling of a new process.
 *
 * This function should decide if the given process should be accepted
 * by the system or not.
 *
 * @param process Points to the first process in the HPS_NEW-queue.
 *
 * @return Negative if process not accepted. 0 or positive value if it was.
 * @retval HDSR_TOOMANY Too many processes in system.
 * @retval HDSR_NO_BITFILE Bitfile does not exist.
 * @retval HDSR_READY Process added to a ready queue.
 */
static int schedule_long_term(struct hprocess* process);
static int schedule_short_term();
static int increment_process_number();
static int decrement_process_number();

/*! \brief Get first queue with given priority or create a queue and insert it at the correct place in the list.
 *
 * @param list The list of queues.
 * @param priority The priority to be compared.
 * @return The existing queue or a new queue (at the correct place in the list) on success. NULL on failure.
 */
static struct hsqueue* create_queue_by_priority(int list, int priority);

/*! \brief Set next process to be placed on the FPGA.
 *
 * The process can be fetched by another thread at the same
 * time and is therefore secured with a mutex.
 *
 * @param process The process.
 * @return 0 on success. Negative on failure.
 */
int set_next_process(struct hprocess* process);


//! Mutex for access to the queue of new processes.
static pthread_mutex_t new_hsqueue_mutex = PTHREAD_MUTEX_INITIALIZER;
//! Ask the server to quit all threads.
static int notify_server_quit = 1;
//! Number of (accepted) processes in the system and its mutex.
static int processes_num = 0;
static pthread_mutex_t processes_num_mutex = PTHREAD_MUTEX_INITIALIZER;
//! Next process to be placed on the FPGA and its mutex.
static struct hprocess* next_process = NULL;
static pthread_mutex_t next_process_mutex = PTHREAD_MUTEX_INITIALIZER;
//! Queue for processes that need rescheduling and its mutex.
static struct hlist* resched_queue = NULL;
static pthread_mutex_t resched_queue_mutex = PTHREAD_MUTEX_INITIALIZER;
//! Mutex for HPS_READY-queue.
static pthread_mutex_t ready_queue_mutex = PTHREAD_MUTEX_INITIALIZER;
//! The event handler for this thread.
static struct hevent* event_handler = NULL;


static int increment_process_number()
{
    int ret = -1;
    pthread_mutex_lock(&processes_num_mutex);
    processes_num += 1;
    ret = processes_num;
    pthread_mutex_unlock(&processes_num_mutex);

    return ret;
}


static int decrement_process_number()
{
    int ret = -1;
    pthread_mutex_lock(&processes_num_mutex);
    // Never let the counter drop below zero.
    if (processes_num < 1)
        processes_num = 0;
    else
        processes_num -= 1;

    ret = processes_num;
    // Release the mutex (locking it a second time here would deadlock).
    pthread_mutex_unlock(&processes_num_mutex);

    return ret;
}


static void cleanup(int exit_code)
{
    // Only notify server if not done already.
    if (notify_server_quit) {
        notify_server_quit = 0;
        hdmsg_notify(HDME_QUIT);
    }

    hdmsg_synchronize();
    pthread_exit(NULL);
}


static struct hsqueue* create_queue_by_priority(int list, int priority)
{
    struct hsqueue* queue = hsqlist_get_last_queue(list);

    // Walk through the queues from the last (lowest priority) to
    // the first (highest priority).
    // Return the first queue that has larger or equal priority to
    // the given priority.
    while (hsqueue_get_prev(queue) != NULL) {
        if (hprocess_get_priority(hsqueue_get_first_process(queue)) >= priority)
            return queue;
        queue = hsqueue_get_prev(queue);
    }

    if (hprocess_get_priority(hsqueue_get_first_process(queue)) < hprocess_base_priority()) {
        // All queues have less priority (or there are no queues).
        // Create new queue and put it in front of list.
        queue = hsqueue_create();
        hsqlist_insert_queue_before(HPS_READY, hsqlist_get_first_queue(HPS_READY), queue);
    } else if (hprocess_get_priority(hsqueue_get_first_process(queue)) > hprocess_base_priority()) {
        // The queue found has larger priority than the given priority.
        // Create new queue and put it behind that queue.
        struct hsqueue* new_queue = hsqueue_create();
        if (hsqueue_get_next(queue) == NULL)
            hsqlist_add_queue(HPS_READY, new_queue);
        else
            hsqlist_insert_queue_before(HPS_READY, hsqueue_get_next(queue), new_queue);
        queue = new_queue;
    } // Else: The queue found has the given priority.

    return queue;
}


static int schedule_short_term()
{
    // Get the HPS_READY-queue with highest priority.
    pthread_mutex_lock(&ready_queue_mutex);
    struct hsqueue* ready_queue = hsqlist_get_first_queue(HPS_READY);
    struct hprocess* process = hsqueue_dequeue_process(ready_queue);
    pthread_mutex_unlock(&ready_queue_mutex);

    if (process != NULL) {
        set_next_process(process);
    }

    return 0;
}


static int schedule_long_term(struct hprocess* process)
{
    if (hdsched_processes_number() + 1 > hprocess_max_processes())
        return HDSR_TOOMANY;

    // Check if bitfile is a regular file.
    struct stat stat_buffer;
    char* filename_full = hplacer_create_full_bitfilename(process);
    hlog_write(HLOG_DEBUG, "schedule_long_term: Filename for process: ");
    hlog_write_text(filename_full);
    hlog_write_text("\n");

    // stat() leaves the buffer undefined on failure, so check its return value too.
    int stat_result = stat(filename_full, &stat_buffer);
    free(filename_full);
    if (stat_result != 0 || !S_ISREG(stat_buffer.st_mode))
        return HDSR_NO_BITFILE;

    // Process can be added to the HPS_READY-queue with the base
    // priority.
    hprocess_set_priority(process, hprocess_base_priority());
    pthread_mutex_lock(&ready_queue_mutex);
    struct hsqueue* queue = create_queue_by_priority(HPS_READY, hprocess_base_priority());
    hsqueue_enqueue_process(queue, process);
    pthread_mutex_unlock(&ready_queue_mutex);
    increment_process_number();

    return HDSR_READY;
}


static int wait_event()
{
    enum hdsevent incoming_event = (enum hdsevent) hevent_wait(event_handler);

    // An incoming event is handled here.
    switch (incoming_event) {
        case HDSE_QUIT: {
            // Do not ask server to quit (the server asked this thread to quit).
            notify_server_quit = 0;
            cleanup(EXIT_FAILURE);
            break;
        }
        case HDSE_NEW_PROCESS: {
            struct hprocess* new_process = hdsched_get_new_process();
            int sched_long = schedule_long_term(new_process);
            hlog_write(HLOG_DEBUG, "wait_event: (scheduler) A new process has been added.\n");
            hlog_write(HLOG_DEBUG, "Return value from long-term scheduler: ");
            hlog_write_integer(sched_long);
            hlog_write_text(".\n");

            // Scheduler refuses the new process.
            // TODO: Should notify the message server and let it send a response to the client.
            if (sched_long < 0)
                hprocess_remove(new_process);

            break;
        }
        case HDSE_SHORT:
            schedule_short_term();
            break;
        case HDSE_RESCHED_TIMER: {
            // The placer thread enqueues under this mutex, so dequeue under it as well.
            pthread_mutex_lock(&resched_queue_mutex);
            struct hprocess* resched_process = hlist_dequeue(resched_queue);
            pthread_mutex_unlock(&resched_queue_mutex);
            if (resched_process != NULL) {
                pthread_mutex_lock(&ready_queue_mutex);
                struct hsqueue* ready_queue =
                    create_queue_by_priority(HPS_READY, hprocess_get_priority(resched_process));
                hsqueue_enqueue_process(ready_queue, resched_process);
                pthread_mutex_unlock(&ready_queue_mutex);
            }
            break;
        }
        default:
            break;
    }

    return 0;
}


int set_next_process(struct hprocess* process)
{
    // Lock the mutex to get a stable value of the process.
    pthread_mutex_lock(&next_process_mutex);
    next_process = process;
    pthread_mutex_unlock(&next_process_mutex);

    return 0;
}


int hdsched_reschedule(struct hprocess* process, enum hdsevent event)
{
    pthread_mutex_lock(&resched_queue_mutex);
    hlist_enqueue(resched_queue, process);
    pthread_mutex_unlock(&resched_queue_mutex);

    printf("Rescheduled process. pid=%d\n", hprocess_get_pid(process));

    hdsched_notify(event);

    return 0;
}


int hdsched_add_new_process(struct hprocess* process)
{
    //! TODO: This function always adds the new process to the first of the HPS_NEW-queues.
    //! A future version of the scheduler might want to add it to another queue based on
    //! the process' initial priority.
    pthread_mutex_lock(&new_hsqueue_mutex);
    hsqueue_enqueue_process(hsqlist_get_first_queue(HPS_NEW), process);
    pthread_mutex_unlock(&new_hsqueue_mutex);

    return 0;
}


struct hprocess* hdsched_get_new_process()
{
    struct hprocess* new_process = NULL;

    pthread_mutex_lock(&new_hsqueue_mutex);
    new_process = hsqueue_dequeue_process(hsqlist_get_first_queue(HPS_NEW));
    pthread_mutex_unlock(&new_hsqueue_mutex);

    return new_process;
}


struct hprocess* hdsched_get_next_process()
{
    struct hprocess* ret = NULL;
    // Lock the mutex to get a stable value of the process.
    pthread_mutex_lock(&next_process_mutex);
    ret = next_process;
    pthread_mutex_unlock(&next_process_mutex);

    return ret;
}


int hdsched_processes_number()
{
    int ret = -1;
    pthread_mutex_lock(&processes_num_mutex);
    ret = processes_num;
    pthread_mutex_unlock(&processes_num_mutex);

    return ret;
}


int hdsched_notify(enum hdsevent event)
{
    hevent_notify(event_handler, (int) event);

    return 0;
}


void* hdsched_main()
{
    hsqlist_create(HPS_READY);
    struct hsqueue* new_hsqueue = hsqueue_create();
    hsqlist_create(HPS_NEW);
    hsqlist_add_queue(HPS_NEW, new_hsqueue);

    // Create event handler.
    event_handler = hevent_create(HEVENT_BLOCKING);
    if (event_handler == NULL) {
        notify_server_quit = 0;
        cleanup(EXIT_FAILURE);
    }

    // Create queue for processes that need rescheduling.
    resched_queue = hlist_create();

    // Service loop for scheduler.
    while (1) {
        printf("hdsched_main: Waiting\n");
        // Suspend the scheduler until a new event.
        wait_event();
    }

    hevent_remove(event_handler);
    cleanup(EXIT_SUCCESS);
}
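The scheduler keeps its ready queues ordered by priority and looks up, or creates, the queue that matches a given priority. The following is a minimal, self-contained sketch of that idea only. It is not the HWOS implementation: it uses a singly linked list walked from the highest priority downwards instead of the hsqlist structure, and the names `struct level`, `level_get_or_create` and `level_free_all` are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one ready queue: only its priority is modelled. */
struct level {
    int priority;
    struct level *next;   /* next lower priority level */
};

/* Return the level with the given priority. If none exists, create one and
 * insert it so that the list stays sorted from highest to lowest priority.
 * Returns NULL if memory allocation fails. */
struct level *level_get_or_create(struct level **head, int priority)
{
    struct level **link = head;

    /* Skip every level whose priority is strictly higher than requested. */
    while (*link != NULL && (*link)->priority > priority)
        link = &(*link)->next;

    /* Exact match: reuse the existing level. */
    if (*link != NULL && (*link)->priority == priority)
        return *link;

    /* No match: insert a new level in front of the first lower one. */
    struct level *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->priority = priority;
    node->next = *link;
    *link = node;
    return node;
}

/* Free all levels in the list. */
void level_free_all(struct level *head)
{
    while (head != NULL) {
        struct level *next = head->next;
        free(head);
        head = next;
    }
}
```

Requesting the priorities 5, 9, 1 and then 5 again yields a list ordered 9, 5, 1, and the second request for 5 returns the level created by the first. Using a pointer to the link being examined avoids special cases for insertion at the front or at the end of the list.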
Listing D.10: hdtimer.c
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

#include <hevent.h>

#include "hdplacer.h"
#include "hdmsg.h"
#include "hdtimer.h"


static void cleanup(int exit_code);

/*! \brief Handle an incoming event.
 *
 * This function locks a mutex, but does not wait on a
 * condition variable (events are handled synchronously).
 */
static int handle_event();

//! The event handler for this thread.
static struct hevent* event_handler = NULL;

//! If this is true, this thread should ask the server to quit all threads when exiting.
static int notify_server_quit = 1;


static int handle_event()
{
    enum hdtevent incoming_event = (enum hdtevent) hevent_wait(event_handler);

    // An incoming event is handled here.
    switch (incoming_event) {
        case HDTE_QUIT:
            // Do not ask server to quit (the server asked this thread to quit).
            notify_server_quit = 0;
            // Quit the thread.
            cleanup(EXIT_SUCCESS);
            break;
    }

    return 0;
}


static void cleanup(int exit_code)
{
    // Only notify server if not done already.
    if (notify_server_quit) {
        notify_server_quit = 0;
        hdmsg_notify(HDME_QUIT);
    }

    hdmsg_synchronize();
    pthread_exit(NULL);
}


int hdtimer_notify(enum hdtevent event)
{
    hevent_notify(event_handler, (int) event);

    return 0;
}

const int hdtimer_get_timeslice()
{
    // Length of one time slice in microseconds (the value is passed to usleep).
    return 1000000;
}


72
73 void∗ hdtimer main ( )
74 {
75 // Create event hand ler .
76 event hand le r = hevent c r ea t e (HEVENTNONBLOCKING) ;
77 i f ( event hand le r == NULL) {
78 n o t i f y s e r v e r q u i t = 0 ;
79 c l eanup (EXIT FAILURE) ;
80 }
81
82 // S leep a b i t to i n i t i a l i z e s chedu l e r .
83 s l e e p (1 ) ;
84 // Serv i c e loop f o r p l a c e r .
85 while (1 ) {
86 // Wait f o r one t im e s l i c e .
87 us l e ep ( hd t ime r g e t t ime s l i c e ( ) ) ;
88 // Not i f y p l a c e r t ha t t imer has exp i r ed .
89 hdp l a c e r no t i f y (HDPE TIMER) ;
90 // Handle an incoming event .
91 hand l e event ( ) ;
92 }
93
94 hevent remove ( event hand le r ) ;
95 c l eanup (EXIT SUCCESS) ;
96 }
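Taken together, the timer, placer and scheduler threads in the listings above give a round-robin rotation: on every time slice the scheduler takes the head of the ready queue as the next process, and the placer swaps it in and sends the old process back for rescheduling. The sketch below simulates that rotation in a single thread to make the ordering easy to follow. It is an illustration under simplifying assumptions, not part of the HWOS: processes are plain integer pids, there is one ready queue, no threads or events, and all names (`rr_state`, `rr_tick`, and so on) are hypothetical.

```c
#include <stddef.h>

#define RR_CAPACITY 8
#define RR_NONE (-1)

/* Hypothetical fixed-size FIFO of pids, standing in for the ready queue. */
struct rr_queue {
    int pids[RR_CAPACITY];
    size_t head;
    size_t count;
};

struct rr_state {
    struct rr_queue ready;
    int next;      /* chosen by the short-term scheduling step */
    int current;   /* process currently placed */
};

int rr_enqueue(struct rr_queue *q, int pid)
{
    if (q->count == RR_CAPACITY)
        return -1;
    q->pids[(q->head + q->count) % RR_CAPACITY] = pid;
    q->count++;
    return 0;
}

int rr_dequeue(struct rr_queue *q)
{
    if (q->count == 0)
        return RR_NONE;
    int pid = q->pids[q->head];
    q->head = (q->head + 1) % RR_CAPACITY;
    q->count--;
    return pid;
}

void rr_init(struct rr_state *s)
{
    s->ready.head = 0;
    s->ready.count = 0;
    s->next = RR_NONE;
    s->current = RR_NONE;
}

/* Simulate one time slice and return the pid that is placed afterwards. */
int rr_tick(struct rr_state *s)
{
    /* Short-term scheduling: the head of the ready queue becomes "next".
     * If the queue is empty, the previous choice is kept. */
    int candidate = rr_dequeue(&s->ready);
    if (candidate != RR_NONE)
        s->next = candidate;

    /* Timer expiry in the placer: swap only if the choice changed, and
     * hand the old process back to the ready queue. */
    if (s->current != s->next) {
        int old = s->current;
        s->current = s->next;
        if (old != RR_NONE)
            rr_enqueue(&s->ready, old);
    }
    return s->current;
}
```

With pids 1, 2 and 3 in the ready queue, successive calls to `rr_tick` place 1, 2, 3, 1, and so on. With a single process in the system the queue is empty after the first tick, so the same process simply stays placed.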
D.2 HWOS-library
These are selected header files (*.h) from the HWOS library.
Listing D.11: hdev.h
/*! \file hdev.h
 * \brief [Hardware OS Device Driver Interface] Keep track of device drivers in the HWOS.
 *
 * This functionality is not really used in this version of the HWOS.
 * The module can function as a base for further development. It is probably a good idea
 * to perform the actual creation of a device driver in hdev_add (now it only adds a
 * description to a list).
 *
 * Based on devicelist by Vegard Endresen.
 * Sindre Hansen (2011): Rewritten.
 * - Changed names and public interface.
 * - Hidden the device structure.
 * - Encapsulated variables and data structures.
 */


#include "hstructures.h"

#ifndef HDEV_H
#define HDEV_H

//! Longest allowed path for device file.
#define HDEV_PATH_SIZE 40

/*! \brief Size of the device list.
 *
 * @return Number of elements in the device list.
 */
int hdev_size();

/*! \brief Get device by major number.
 *
 * @param major The major number of the device.
 * @return Pointer to the device on success. NULL on failure.
 */
struct hdev* hdev_get_by_major(int major);

/*! \brief Add a device description.
 *
 * This adds a description of a character device
 * with a path and a major number.
 *
 * @param major The major number of the device.
 * @param device_path Path of the device.
 * @return A pointer to a struct representing the device on success. NULL on failure.
 */
struct hdev* hdev_add(int major, char* device_path);

/*! \brief Remove a device from the HWOS.
 *
 * @param device Pointer to the struct representing the device.
 * @return 0 on success. Negative on failure.
 */
int hdev_remove(struct hdev* device);

/*! \brief Get major number for device.
 *
 * @param device Pointer to the device.
 * @return Major number (positive integer) on success. Negative on failure.
 */
int hdev_get_major(struct hdev* device);

/*! \brief Remove device by major number.
 *
 * @param major Major number.
 * @return 0 on success. Negative on failure.
 */
int hdev_remove_by_major(int major);

/*! \brief Remove all devices in the system.
 *
 * @return 0 on success. Negative on failure.
 */
int hdev_remove_all();

/*! \brief Get first device in list.
 *
 * @return Pointer to the device on success. NULL on failure.
 */
struct hdev* hdev_get_first();

/*! \brief Get last device in list.
 *
 * @return Pointer to the device on success. NULL on failure.
 */
struct hdev* hdev_get_last();

/*! \brief Get next device in list.
 *
 * @param device Pointer to the device that has the next field.
 * @return Pointer to the device on success. NULL on failure.
 */
struct hdev* hdev_get_next(struct hdev* device);

/*! \brief Get previous device in list.
 *
 * @param device Pointer to the device that has the previous field.
 * @return Pointer to the device on success. NULL on failure.
 */
struct hdev* hdev_get_prev(struct hdev* device);


#endif
Listing D.12: hevent.h
/*! \file hevent.h
 * \brief [Hardware OS Event Interface] Notify and handle events between threads.
 *
 * This module contains functionality for handling events between threads.
 * Events can be handled either blocking (asynchronously) or non-blocking (synchronously).
 * The underlying implementation uses System V message queues.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HEVENT_H
#define HEVENT_H

#define HEVENT_BLOCKING 1
#define HEVENT_NONBLOCKING 0

#include "hstructures.h"

/*! \brief Creates an event handler.
 *
 * @param blocking Set to true if hevent_wait should block while waiting for events.
 * @return The event handler object.
 */
struct hevent* hevent_create(int blocking);

/*! \brief Notify an event.
 *
 * @param event Event handler.
 * @param notification The actual event/notification. Must be an integer.
 * @return 0 on success. Negative on failure.
 */
int hevent_notify(struct hevent* event, int notification);

/*! \brief Wait on event.
 *
 * Wait may be performed in a blocking or
 * non-blocking manner (see hevent_create).
 *
 * @param event Event handler.
 * @return The event on success. -1 on failure.
 */
int hevent_wait(struct hevent* event);

/*! \brief Remove an event handler.
 *
 * @param event Event handler.
 * @return 0 on success. -1 on failure.
 */
int hevent_remove(struct hevent* event);


#endif
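The header above states that the event interface is built on System V message queues. The sketch below shows one way such an interface could be realised with `msgget`, `msgsnd`, `msgrcv` and `msgctl`. It is an assumption about the general approach, not the thesis implementation: the names `ev_handler`, `ev_create`, `ev_notify`, `ev_wait` and `ev_remove` are hypothetical, and a private queue (`IPC_PRIVATE`) is used for simplicity.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical event handler: one message queue plus the blocking mode. */
struct ev_handler {
    int msqid;
    int blocking;
};

/* System V message layout: a long type field followed by the payload. */
struct ev_message {
    long mtype;
    int notification;
};

struct ev_handler *ev_create(int blocking)
{
    struct ev_handler *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (h->msqid == -1) {
        free(h);
        return NULL;
    }
    h->blocking = blocking;
    return h;
}

int ev_notify(struct ev_handler *h, int notification)
{
    struct ev_message m = { .mtype = 1, .notification = notification };
    /* The size argument counts the payload only, not the type field. */
    return msgsnd(h->msqid, &m, sizeof m.notification, 0) == 0 ? 0 : -1;
}

int ev_wait(struct ev_handler *h)
{
    struct ev_message m;
    /* IPC_NOWAIT makes msgrcv return at once when the queue is empty. */
    int flags = h->blocking ? 0 : IPC_NOWAIT;
    if (msgrcv(h->msqid, &m, sizeof m.notification, 0, flags) == -1)
        return -1;
    return m.notification;
}

int ev_remove(struct ev_handler *h)
{
    int result = msgctl(h->msqid, IPC_RMID, NULL);
    free(h);
    return result == 0 ? 0 : -1;
}
```

In non-blocking mode `ev_wait` returns -1 when no event is pending, so a polling thread such as the timer can continue with its loop. As in the interface above, the value -1 cannot be told apart from a failure, so it should not be used as an event code.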
Listing D.13: hicap.h
/*! \file hicap.h
 * \brief [Internal Configuration Port Interface] Interface for partial reconfiguration on the FPGA.
 *
 * This module makes it possible to write a partial bitstream to the ICAP on the FPGA. Based on
 * the important work done by Sverre Hamre in his thesis.
 *
 * Original author (2009): Sverre Hamre
 * Modified (2011): Sindre Hansen
 * - Customized interface for the HWOS.
 * - Added conversion from CLB column to absolute column.
 *
 */

#ifndef HICAP_H
#define HICAP_H

//! Number of CLB columns on the XC4VFX12 (UG070, page 184).
#define HC_CLBCOLS_XC4VFX12 24
/*! Number of CLB minor rows on the XC4VFX12 (UG070, page 184).
 * A minor row is defined here as a row that is 1 CLB in vertical width and
 * spanning the entire device horizontally.
 */
#define HC_CLBMINORROWS_XC4VFX12 64


// TODO: These defines should be given the HC_-prefix (Sindre Hansen).

#define DUMMY_PACKET    0xFFFFFFFFUL
#define SYNC_PACKET     0xAA995566UL
#define NOP_PACKET      0x20000000UL
#define CMD_PACKET      0x30008001UL
#define FAR_PACKET      0x30002001UL
#define IDCODE_PACKET   0x30018001UL
#define STAT_PACKET     0x2800E001UL


#define WCFGCMD     1   // Write configuration command.
#define LFRMCMD     3   // Write last frame command.
#define RCFGCMD     4
#define RCRCCMD     7
#define DESYNCCMD   14


/* Configuration registers */
#define FDRI    2   // Frame data register input
#define STAT    7   // Status register

/* Constant to use for CRC check when CRC has been disabled */
#define XHI_DISABLED_AUTO_CRC       0x0000DEFCUL
#define XHI_TYPE_1_PACKET_MAX_WORDS 1024

#define WORDSPRFRAME 41

#define TYPE2   30  // Bit position for type 2 header
#define TYPE1   29  // Bit position for type 1 header

#define REGADDR 13  // Bit position for register address in type 1 header

#define OP_READ     27  // Bit position for read op
#define OP_WRITE    28  // Bit position for write op

/* Device specific ID code */
#define V4FX12ID 0x21E58093 // Value read out from the Virtex FX12 used.


/*! \brief Open the ICAP-device.
 *
 * @param icap_device Path to the device in the system. Example: /dev/icap
 * @param device_type Type of the Virtex-4 device. Only "XC4VFX12" is supported at this time.
 * @return A file descriptor for the device. -1 on failure.
 */
int hicap_open(char* icap_device, char* device_type);

/*! \brief Close the ICAP-device.
 *
 * @param icap_handle The handle/file descriptor to the device as returned by hicap_open.
 * @return 0 on success. Negative on failure.
 */
int hicap_close(int icap_handle);

/*! \brief Write a bitstream to ICAP.
 *
 * Each row spans the entire FPGA horizontally and is 16 CLBs in vertical width.
 * The real row address starts from 0 at the top of the device and increments downwards.
 *
 * @param icap_handle The handle/file descriptor to the device as returned by hicap_open.
 * @param in_filename The filename for the bitfile.
 * @param real_row Specify a row from 0 to [max rows - 1]. See definition of row in description.
 * @param clb_column Specify a CLB column from 0 to [max CLB columns - 1] to start from.
 * @param frames_offset Specify an offset of frames within the given address (normally 0).
 * @param frames The number of frames to be written. 22 frames are 8 CLBs + 1 HCLK + 8 CLBs.
 * @return 0 on success. Negative on failure.
 */
int hicap_write(int icap_handle, char* in_filename, int real_row, int clb_column, int frames_offset, int frames);

/*! \brief Get status of the ICAP-device.
 *
 * Refer to the work of Sverre Hamre to find out what this function should
 * return.
 *
 * @param icap_handle Handle for icap device. Returned by hicap_open.
 * @return 0 on failure. Undefined on success?
 */
int hicap_get_status(int icap_handle);


#endif
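The packet constants in hicap.h can be reproduced from the bit positions defined in the same header (type 1 at bit 29, write at bit 28, read at bit 27, register address from bit 13). The sketch below composes a type 1 packet header from these fields as a worked example. The function and macro names are hypothetical, the field widths used for masking (5 address bits, 11 word count bits) are assumptions, and the register numbers 4, 1 and 12 in the note below are inferred from the constants; the header itself names only FDRI (2) and STAT (7).

```c
#include <stdint.h>

/* Bit positions as given in hicap.h. */
#define PKT_TYPE1_BIT     29
#define PKT_OP_WRITE_BIT  28
#define PKT_OP_READ_BIT   27
#define PKT_REGADDR_BIT   13

enum pkt_op { PKT_OP_NONE, PKT_OP_READ, PKT_OP_WRITE };

/* Compose a type 1 packet header from operation, register address and
 * word count. The masks are assumptions about the field widths. */
uint32_t pkt_type1_header(enum pkt_op op, unsigned reg, unsigned word_count)
{
    uint32_t header = UINT32_C(1) << PKT_TYPE1_BIT;

    if (op == PKT_OP_WRITE)
        header |= UINT32_C(1) << PKT_OP_WRITE_BIT;
    else if (op == PKT_OP_READ)
        header |= UINT32_C(1) << PKT_OP_READ_BIT;

    header |= (uint32_t)(reg & 0x1Fu) << PKT_REGADDR_BIT;   /* register address */
    header |= (uint32_t)(word_count & 0x7FFu);              /* words that follow */
    return header;
}
```

A write of one word to register 4 gives 0x30008001, which matches the command packet constant. Likewise register 1 gives 0x30002001 (frame address), register 12 gives 0x30018001 (ID code), and a read of one word from register 7 (STAT) gives 0x2800E001. A header with no operation, register 0 and no words is the NOP packet 0x20000000.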
Listing D.14: hlist.h
/*! \file hlist.h
 * \brief [Hardware OS Double Linked List Interface] General functionality for double linked lists.
 *
 * General purpose double linked list structure. Each list (hlist) has elements (hlelement).
 * Each element has data pointed to by void-pointers.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HLIST_H
#define HLIST_H

#include "hstructures.h"

/*! \brief Get number of elements in list.
 *
 * @param list Pointer to list.
 * @return Number of elements in list.
 */
int hlist_size(struct hlist* list);

/*! \brief Get data for the given element.
 *
 * @param element Pointer to element.
 * @return Pointer to the element's data.
 */
void* hlelement_get_data(struct hlelement* element);

/*! \brief Set data for the given element.
 *
 * @param element Pointer to element.
 * @param data Pointer to the data.
 * @return 0 on success. Negative on failure.
 */
int hlelement_set_data(struct hlelement* element, void* data);

/*! \brief Get previous element in list.
 *
 * @param element Pointer to element that has the previous field.
 * @return Pointer to the previous element on success. NULL on failure.
 */
struct hlelement* hlelement_get_prev(struct hlelement* element);

/*! \brief Get next element in list.
 *
 * @param element Pointer to element that has the next field.
 * @return Pointer to the next element on success. NULL on failure.
 */
struct hlelement* hlelement_get_next(struct hlelement* element);

/*! \brief Create a new list.
 *
 * @return Pointer to the new list structure on success. NULL on failure.
 */
struct hlist* hlist_create();

/*! \brief Enqueue a new element in a list.
 *
 * @param list Pointer to the given list.
 * @param data_element Pointer to the data that should be connected to the element.
 * @return Pointer to the new element on success. NULL on failure.
 */
struct hlelement* hlist_enqueue(struct hlist* list, void* data_element);

/*! \brief Insert a new element after another in the given list.
 *
 * @param list Pointer to the given list.
 * @param element Element that the new element should be placed after.
 * @param data_element Pointer to the data that should be connected to the element.
 * @return Pointer to the new element on success. NULL on failure.
 */
struct hlelement* hlist_insert_after(struct hlist* list, struct hlelement* element, void* data_element);

/*! \brief Insert a new element before another in the given list.
 *
 * @param list Pointer to the given list.
 * @param element Element that the new element should be placed before.
 * @param data_element Pointer to the data that should be connected to the element.
 * @return Pointer to the new element on success. NULL on failure.
 */
struct hlelement* hlist_insert_before(struct hlist* list, struct hlelement* element, void* data_element);

/*! \brief Remove a given element from the given list.
 *
 * @param element Pointer to the given element.
 * @param list Pointer to the given list.
 * @return Pointer to the element's data on success. NULL on failure.
 */
void* hlelement_remove(struct hlelement* element, struct hlist* list);

/*! \brief Create a list element that does not belong to any list.
 *
 * @param data_element Pointer to the data that should be connected to the element.
 * @return Pointer to the new element on success. NULL on failure.
 */
struct hlorphan* hlorphan_create(void* data_element);

/*! \brief Get next element for the given element.
 *
 * @param orphan Pointer to the element that has the next field.
 * @return Pointer to the element on success. NULL on failure.
 */
struct hlorphan* hlorphan_get_next(struct hlorphan* orphan);

/*! \brief Set next field for the given element.
 *
 * @param orphan Pointer to the element that has the next field.
 * @param next Pointer to the next element.
 * @return 0 on success. Negative on failure.
 */
int hlorphan_set_next(struct hlorphan* orphan, struct hlorphan* next);

/*! \brief Get previous element for the given element.
 *
 * @param orphan Pointer to the element that has the previous field.
 * @return Pointer to the element on success. NULL on failure.
 */
struct hlorphan* hlorphan_get_prev(struct hlorphan* orphan);

/*! \brief Set previous field for the given element.
 *
 * @param orphan Pointer to the element that has the previous field.
 * @param prev Pointer to the previous element.
 * @return 0 on success. Negative on failure.
 */
int hlorphan_set_prev(struct hlorphan* orphan, struct hlorphan* prev);

/*! \brief Get pointer to data for the given element.
 *
 * @param orphan Pointer to the element that has the data field.
 * @return Pointer to the data on success. NULL on failure.
 */
void* hlorphan_get_data(struct hlorphan* orphan);

136 /∗ ! \ b r i e f Remove the f i r s t e lement ( dequeue ) from the l i s t .
137 ∗
138 ∗ @param l i s t Pointer to the g iven l i s t .
139 ∗ @return Pointer to the e lement ’ s data on succe s s . NULL
on f a i l u r e .
140 ∗/
141 void∗ h l i s t d equeue ( struct h l i s t ∗ l i s t ) ;
142
143 /∗ ! \ b r i e f Remove the g iven l i s t and i t ’ s e lements .
144 ∗
145 ∗ Note t ha t t h i s w i l l NOT remove the data e lements
146 ∗ connected to each l i s t e lement .
147 ∗
148 ∗ @param l i s t Pointer to the g iven l i s t .
149 ∗ @return 0 on succe s s . Negat ive on f a i l u r e .
150 ∗/
151 int h l i s t r emove ( struct h l i s t ∗ l i s t ) ;
152
153 /∗ ! \ b r i e f Get f i r s t e lement o f the g iven l i s t .
154 ∗
 * @param list Pointer to the given list.
 * @return Pointer to the first element on success. NULL on failure.
 */
struct hlelement* hlist_get_first(struct hlist* list);

/*! \brief Get last element of the given list.
 *
 * @param list Pointer to the given list.
 * @return Pointer to the last element on success. NULL on failure.
 */
struct hlelement* hlist_get_last(struct hlist* list);

#endif
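The list interface only exposes opaque structs, so the element layout is not visible in the listing. The following is a minimal sketch of how insert-after, dequeue and list removal could behave, assuming a doubly linked layout. All `sk_` names and struct fields are hypothetical, and treating a NULL anchor element as "append at the end" is an assumption of this sketch rather than documented behaviour.

```c
#include <stdlib.h>

/* Hypothetical layouts: the listing only forward-declares the real structs. */
struct sk_element {
    struct sk_element *prev;
    struct sk_element *next;
    void *data;                 /* caller-owned payload */
};

struct sk_list {
    struct sk_element *first;
    struct sk_element *last;
    int size;
};

/* Allocate an empty list. Returns NULL on failure. */
struct sk_list *sk_list_create(void)
{
    return calloc(1, sizeof(struct sk_list));
}

/* Insert a new element after `element`; a NULL `element` appends (assumption). */
struct sk_element *sk_list_insert_after(struct sk_list *list,
                                        struct sk_element *element, void *data)
{
    struct sk_element *e;

    if (list == NULL)
        return NULL;
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;

    e->data = data;
    e->prev = (element != NULL) ? element : list->last;
    e->next = (element != NULL) ? element->next : NULL;

    /* Relink the neighbours, or update the list ends when there are none. */
    if (e->prev != NULL) e->prev->next = e; else list->first = e;
    if (e->next != NULL) e->next->prev = e; else list->last = e;
    list->size++;
    return e;
}

/* Unlink the first element and return its data; NULL if the list is empty. */
void *sk_list_dequeue(struct sk_list *list)
{
    struct sk_element *e;
    void *data;

    if (list == NULL || list->first == NULL)
        return NULL;
    e = list->first;
    data = e->data;
    list->first = e->next;
    if (list->first != NULL) list->first->prev = NULL; else list->last = NULL;
    free(e);
    list->size--;
    return data;
}

/* Free the list and its elements, but not the data they point to. */
int sk_list_remove(struct sk_list *list)
{
    if (list == NULL)
        return -1;
    while (list->first != NULL) {
        struct sk_element *next = list->first->next;
        free(list->first);
        list->first = next;
    }
    free(list);
    return 0;
}
```

As in the documented interface, removing the list leaves the data elements untouched, so the caller remains responsible for freeing them.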
Listing D.15: hevent.h
/*! \file hevent.h
 * \brief [Hardware OS Event Interface] Notify and handle events between threads.
 *
 * This module contains functionality for handling events between threads.
 * Events can be handled either blocking (asynchronously) or non-blocking (synchronously).
 * The underlying implementation uses System V message queues.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HEVENT_H
#define HEVENT_H

#define HEVENT_BLOCKING 1
#define HEVENT_NONBLOCKING 0

#include "hstructures.h"

/*! \brief Creates an event handler.
 *
 * @param blocking Set to true if hevent_wait should block while waiting for events.
 * @return The event handler object.
 */
struct hevent* hevent_create(int blocking);

/*! \brief Notify an event.
 *
 * @param event Event handler.
 * @param notification The actual event/notification. Must be an integer.
 * @return 0 on success. Negative on failure.
 */
int hevent_notify(struct hevent* event, int notification);

/*! \brief Wait on event.
 *
 * Wait may be performed in a blocking or
 * non-blocking manner (see hevent_create).
 *
 * @param event Event handler.
 * @return The event on success. -1 on failure.
 */
int hevent_wait(struct hevent* event);

/*! \brief Remove an event handler.
 *
 * @param event Event handler.
 * @return 0 on success. -1 on failure.
 */
int hevent_remove(struct hevent* event);


#endif
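The header states that events are carried over System V message queues. The sketch below illustrates one way the create/notify/wait/remove cycle could map onto `msgget`, `msgsnd`, `msgrcv` and `msgctl`. The `sk_` names, the struct layouts, the use of `IPC_PRIVATE` and the fixed message type 1 are assumptions made for illustration; the actual implementation is not shown in the listing.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical layout: the real struct hevent is opaque in the listing. */
struct sk_event {
    int msqid;      /* System V message queue identifier */
    int blocking;   /* nonzero: sk_event_wait blocks until an event arrives */
};

/* Wire format for msgsnd/msgrcv: a long type followed by the payload. */
struct sk_event_msg {
    long mtype;
    int notification;
};

/* Create a private queue for the event handler. Returns NULL on failure. */
struct sk_event *sk_event_create(int blocking)
{
    struct sk_event *ev = malloc(sizeof *ev);

    if (ev == NULL)
        return NULL;
    ev->msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (ev->msqid < 0) {
        free(ev);
        return NULL;
    }
    ev->blocking = blocking;
    return ev;
}

/* Post one integer notification. Returns 0 on success, -1 on failure. */
int sk_event_notify(struct sk_event *ev, int notification)
{
    struct sk_event_msg msg;

    if (ev == NULL)
        return -1;
    msg.mtype = 1;                      /* msgsnd requires a type > 0 */
    msg.notification = notification;
    return msgsnd(ev->msqid, &msg, sizeof msg.notification, 0) == 0 ? 0 : -1;
}

/* Fetch the oldest notification; -1 on failure or when nothing is pending
 * in non-blocking mode. */
int sk_event_wait(struct sk_event *ev)
{
    struct sk_event_msg msg;
    int flags;

    if (ev == NULL)
        return -1;
    flags = ev->blocking ? 0 : IPC_NOWAIT;
    if (msgrcv(ev->msqid, &msg, sizeof msg.notification, 0, flags) < 0)
        return -1;
    return msg.notification;
}

/* Delete the queue and free the handler. Returns 0 on success, -1 on failure. */
int sk_event_remove(struct sk_event *ev)
{
    int rc;

    if (ev == NULL)
        return -1;
    rc = msgctl(ev->msqid, IPC_RMID, NULL);
    free(ev);
    return rc == 0 ? 0 : -1;
}
```

Because -1 is reserved for failure, a design like this only works cleanly when notifications are non-negative integers.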
Listing D.16: hmqueue.h
/*! \file hmqueue.h
 * \brief [Hardware OS Message Queue Interface] Functionality for keeping message queue identifiers.
 *
 * This is simply a doubly linked list of System V message queues.
 * This module should probably be rewritten so it uses hlist internally.
 *
 * Original author (2010): Vegard Endresen
 *
 * Modified (2011): Sindre Hansen
 * - Changed naming of functions.
 * - Used more encapsulation.
 *
 */

#ifndef HMQUEUE_H
#define HMQUEUE_H

//! Types of modes when creating a message queue.
enum hmqueue_mode {
    HMQ_CREATE,
    HMQ_CONNECT
};

/*! \brief Get previous element in list.
 *
 * @param element Pointer to element that has the previous field.
 * @return Pointer to the previous element on success. NULL on failure.
 */
struct hmqueue* hmqueue_get_prev(struct hmqueue* element);

/*! \brief Get next element in list.
 *
 * @param element Pointer to element that has the next field.
 * @return Pointer to the next element on success. NULL on failure.
 */
struct hmqueue* hmqueue_get_next(struct hmqueue* element);

/*! \brief Get key of a given message queue.
 *
 * @param element Pointer to list element that has the message queue.
 * @return The key (positive integer) on success. Negative on failure.
 */
int hmqueue_get_key(struct hmqueue* element);

/*! \brief Get ID of message queue.
 *
 * @param element Pointer to list element that has the message queue.
 * @return ID of the message queue on success. Negative on failure.
 */
int hmqueue_get_id(struct hmqueue* element);

/*! \brief Get number of elements in message queue list.
 *
 * @return Number of elements in list.
 */
int hmqueue_size();

/*! \brief Get ID of message queue identified by key.
 *
 * @param key A key that identifies the message queue.
 * @return ID of the message queue on success. Negative on failure.
 */
int hmqueue_get_id_by_key(int key);

/*! \brief Get message queue element identified by key.
 *
 * @param key A key that identifies the message queue.
 * @return List element on success. NULL on failure.
 */
struct hmqueue* hmqueue_get(int key);

/*! \brief Adds a message queue and returns the queue identifier.
 *
 * Based on "mode" this function either connects to an existing
 * message queue or creates a new message queue.
 *
 * @param key Unique key for the new queue.
 * @param mode Mode for adding a message queue to the list.
 * @return The message queue struct for the new queue on success. NULL on failure.
 */
struct hmqueue* hmqueue_add(int key, enum hmqueue_mode mode);

/*! \brief Remove the given list element.
 *
 * @param element Pointer to list element that has the message queue.
 * @return 0 on success. Negative on failure.
 */
int hmqueue_remove(struct hmqueue* element);

/*! \brief Remove the given message queue by key.
 *
 * @param key Key that identifies the message queue.
 * @return 0 on success. Negative on failure.
 */
int hmqueue_remove_by_key(int key);

/*! \brief Remove all message queues in list.
 *
 * @return 0 on success. Negative on failure.
 */
int hmqueue_remove_all();

/*! \brief Get first element in list.
 *
 * @return The first list element.
 */
struct hmqueue* hmqueue_get_first();

/*! \brief Get last element in list.
 *
 * @return The last list element.
 */
struct hmqueue* hmqueue_get_last();

/*! \brief Get the key of the message server's message queue.
 *
 * @return The unique key of the message server.
 */
int hmqueue_get_daemonkey();

#endif
Listing D.17: hmsg.h
/*! \file hmsg.h
 * \brief [Hardware OS Message Interface] Functionality for messages and message queues.
 *
 * Defines an interface for message passing between the HWOS and a client application.
 * The message system is implemented using System V Message Queues.
 *
 * Based on (2010): Vegard Endresen
 *
 * Author (2011): Sindre Hansen
 * - Some commands in enum hmsg_command taken from work done by Vegard Endresen (2010).
 */

#include "hstructures.h"

#ifndef HMSG_H
#define HMSG_H

//! Size of data for messages sent to HWOS.
#define HMSG_DATA_SIZE 128

//! Definition of message types.
enum hmsg_type {
     HMT_ALL
    ,HMT_CTRL
    ,HMT_CTRLMEM
    ,HMT_REGPROC
};


//! Definition of commands that can be sent through messages from a client application.
enum hmsg_command {
     HMC_ALLOC          //! Allocate BRAM memory.
    ,HMC_EXEC           //! Write program to instruction segment of BRAM.
    ,HMC_FREE           //! Free BRAM memory.
    ,HMC_LDDEV          //! Load device driver.
    ,HMC_READ           //! Read data from backend.
    ,HMC_RMDEV          //! Remove device driver.
    ,HMC_RMQUE          //! Register receive message queue to HWOS.
    ,HMC_UMQUE          //! Unregister receive message queue to HWOS.
    ,HMC_WRITE          //! Write data to backend.
    ,HMC_SET_BITFILE    //! Set filename for bitstream file (deprecated?)
    ,HMC_REGPROC        //! Register a process.
};


//! Definition of return values that can be sent back to a client application.
enum hmsg_return {
     HMR_OK=0           //! No errors.
    ,HMR_ERROR=-1       //! Errors when processing command.
    ,HMR_NOPID=-2       //! No process exists with the given PID.
};


/*! \brief Get filename for FPGA bitstream.
 *
 * @param msg Message of type HMT_REGPROC.
 * @return The filename.
 */
char* hmsg_get_bitfilename(void* msg);

/*! \brief Set filename for FPGA bitstream.
 *
 * @param msg Message of type HMT_REGPROC.
 * @param bitfilename The filename.
 * @return 0 on success. Negative on failure.
 */
int hmsg_set_bitfilename(void* msg, char* bitfilename);

/*! \brief Get nice value (user defined priority).
 *
 * @param msg Message of type HMT_REGPROC.
 * @return Positive nice value on success. Negative on failure.
 */
int hmsg_get_nice(void* msg);

/*! \brief Set nice value (user defined priority).
 *
 * @param msg Message of type HMT_REGPROC.
 * @param nice The positive value.
 * @return 0 on success. Negative on failure.
 */
int hmsg_set_nice(void* msg, int nice);

/*! \brief Get sender ID for the message.
 *
 * In this version of the HWOS, this is
 * always the message queue ID of the queue
 * owned by the sender.
 *
 * @param msg Message of any type.
 * @return Positive message queue ID on success. Negative on failure.
 */
int hmsg_get_sender(void* msg);

/*! \brief Set sender ID for the message.
 *
 * In this version of the HWOS, this is
 * always the message queue ID of the queue
 * owned by the sender.
 *
 * @param msg Message of any type.
 * @param id The message queue ID.
 * @return 0 on success. Negative on failure.
 */
int hmsg_set_sender(void* msg, int id);

/*! \brief Get type of the message.
 *
 * @param msg Message of any type.
 * @return The type (positive integer) on success. Negative on failure.
 */
int hmsg_get_type(void* msg);

/*! \brief Get size of the data field.
 *
 * @param msg Message of type HMT_CTRLMEM.
 * @return The size (positive integer) on success. Negative on failure.
 */
int hmsg_get_size(void* msg);

/*! \brief Set size of the data field.
 *
 * @param msg Message of type HMT_CTRLMEM.
 * @param size The size of the data field.
 * @return 0 on success. Negative on failure.
 */
int hmsg_set_size(void* msg, int size);

/*! \brief Set return value for the specified message.
 *
 * @param msg Pointer to a response message.
 * @param retval Return value.
 * @return Positive on success. Negative on failure.
 */
int hmsg_set_return(void* msg, enum hmsg_return retval);

/*! \brief Get return value for the specified message.
 *
 * @param msg Message of type HMT_CTRL or HMT_CTRLMEM.
 * @return The return value (constrained by enum hmsg_return).
 */
enum hmsg_return hmsg_get_return(void* msg);

/*! \brief Get data (string of characters) for message.
 *
 * @param msg Message of type HMT_CTRLMEM.
 * @return Pointer to data on success. NULL on failure.
 */
char* hmsg_get_data(void* msg);

/*! \brief Set data (string of characters) for message.
 *
 * @param msg Message of type HMT_CTRLMEM.
 * @param data Pointer to the data. Size not larger than HMSG_DATA_SIZE.
 * @return 0 on success. Negative on failure.
 */
int hmsg_set_data(void* msg, char* data);

/*! \brief Set address field for a message.
 *
 * @param msg Message of type HMT_CTRLMEM.
 * @return Positive on success. Negative on failure.
 */
int hmsg_set_address(void* msg, int address);

/*! \brief Get address field for a message.
 *
 * @param msg Message of type HMT_CTRLMEM.
 * @return Positive address on success. Negative on failure.
 */
int hmsg_get_address(void* msg);

/*! \brief Get command for message.
 *
 * @param msg Message of any type.
 * @return The command entry (positive integer) for the message. Negative on failure.
 */
enum hmsg_command hmsg_get_command(void* msg);

/*! \brief Set command for message.
 *
 * @param msg Message of any type.
 * @param command The command entry (positive integer) for the message.
 * @return 0 on success. Negative on failure.
 */
int hmsg_set_command(void* msg, enum hmsg_command command);

/*! \brief Create and allocate memory for a message.
 *
 * @param type The type of the new message.
 * @return Pointer to the memory of the new message on success. NULL on failure.
 */
void* hmsg_create(int type);

/*! \brief Destroy message.
 *
 * Actually just runs free(msg) if msg != NULL.
 *
 * @param msg Message of any type.
 * @return 0 on success. NULL on failure.
 */
int hmsg_remove(void* msg);

/*! \brief Send a message.
 *
 * Puts a message in a message queue.
 *
 * @param msg Message of any type.
 * @param msq Message queue. Message is put in this queue.
 * @return 0 on success. Negative value on failure.
 */
int hmsg_send(void* msg, struct hmqueue* msq);

/*! \brief Blocking wait for an incoming message.
 *
 * @param msq The message queue where the message arrives.
 * @return Pointer to the received message on success. NULL on failure.
 */
void* hmsg_receive(struct hmqueue* msq);


#endif
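The interface passes messages as `void*` and documents that the data field must not exceed `HMSG_DATA_SIZE`. As an illustration only, the sketch below shows how a message with a bounded data field could be created and filled. The struct layout, the `sk_` names and the explicit `size` argument are assumptions; the real message structs are private to the implementation and the real `hmsg_set_data` takes no length.

```c
#include <stdlib.h>
#include <string.h>

#define SK_DATA_SIZE 128   /* mirrors HMSG_DATA_SIZE from the listing */

/* Hypothetical layout; not taken from the thesis code. */
struct sk_msg {
    long type;                 /* msgsnd expects a leading long type > 0 */
    int  sender;               /* message queue ID of the sender */
    int  command;
    int  size;                 /* number of valid bytes in data */
    char data[SK_DATA_SIZE];
};

/* Allocate a zeroed message of the given type. Returns NULL on failure. */
struct sk_msg *sk_msg_create(long type)
{
    struct sk_msg *msg;

    if (type <= 0)
        return NULL;
    msg = calloc(1, sizeof *msg);
    if (msg != NULL)
        msg->type = type;
    return msg;
}

/* Copy `size` bytes into the data field, rejecting oversized payloads.
 * Returns 0 on success, -1 on failure (message left unchanged). */
int sk_msg_set_data(struct sk_msg *msg, const char *data, int size)
{
    if (msg == NULL || data == NULL || size < 0 || size > SK_DATA_SIZE)
        return -1;
    memcpy(msg->data, data, (size_t)size);
    msg->size = size;
    return 0;
}
```

Keeping the size check in the setter means a message that reaches the queue can always be sent with a fixed payload length.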
Listing D.18: hplacer.h
/*! \file hplacer.h
 * \brief [Hardware OS Placer Interface] Responsible for interrupting and replacing a running process on the FPGA.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HPLACER_H
#define HPLACER_H


#include "hstructures.h"


/*! \brief Set the full filename of the ICAP device.
 *
 * Example: "/dev/icap"
 *
 * @param device The full filename to the device.
 */
int hplacer_set_icapdevice(const char* device);

/*! \brief Set the path of the bitfiles.
 *
 * Sets the path to the directory where the process
 * bitfiles are located.
 *
 * @param path The path without the '/' at the end.
 */
int hplacer_set_bitfiles_path(const char* path);

/*! \brief Create full bitfilename.
 *
 * This will allocate a string containing the path of the bitfiles and
 * the file name of the given process' bitfilename.
 *
 * @param process Take bitfilename from this process.
 */
char* hplacer_create_full_bitfilename(struct hprocess* process);

/*! \brief Interrupt a process running on the FPGA.
 *
 * This assumes that the process is already placed on the FPGA and is running.
 *
 */
int hplacer_interrupt_process(struct hprocess* process);

/*! \brief Start a process running on the FPGA.
 *
 * This assumes that the process is already placed on the FPGA, but not running.
 *
 */
int hplacer_start_process(struct hprocess* process);

/*! \brief Communicate with the running FPGA-process and make it send the
 * state information back to the HWOS.
 *
 * This assumes that the process is already placed on the FPGA and is stopped.
 * This version of the HWOS will always store the state information in a file.
 *
 */
int hplacer_save_state(struct hprocess* process);

/*! \brief Load state information from memory/disk into the running process
 * on the FPGA.
 *
 * This version of the HWOS will always fetch the state information from
 * a file on disk.
 *
 */
int hplacer_load_state(struct hprocess* process);

/*! \brief Load process into FPGA area.
 *
 * This will place the given process on the FPGA.
 *
 */
int hplacer_load_process(struct hprocess* process);


#endif
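The placer documents that the bitfile path is stored without a trailing '/' and that the full bitfilename is allocated from the path and the process' bitfilename. A minimal sketch of that concatenation follows. The function name `sk_full_bitfilename` and its two string parameters are hypothetical; the real function takes a `struct hprocess*` and reads the path from module state.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Join a directory path (no trailing '/') and a file name into a newly
 * allocated string. The caller frees the result. Returns NULL on failure. */
char *sk_full_bitfilename(const char *path, const char *filename)
{
    size_t len;
    char *full;

    if (path == NULL || filename == NULL)
        return NULL;

    /* path + '/' + filename + terminating NUL */
    len = strlen(path) + 1 + strlen(filename) + 1;
    full = malloc(len);
    if (full == NULL)
        return NULL;

    snprintf(full, len, "%s/%s", path, filename);
    return full;
}
```

For example, a path of `/mnt/bitfiles` and a file name of `counter.bit` (both made up here) give `/mnt/bitfiles/counter.bit`.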
Listing D.19: hprocess.h
/*! \file hprocess.h
 * \brief [Hardware OS Process Interface] Interface for a process on the FPGA.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HPROCESS_H
#define HPROCESS_H

#include "hstructures.h"

//! Lowest priority possible for a process.
//#define HPROCESS_LOWEST_PRIORITY 0

#define HPROCESS_MAX_FILENAME_SIZE 100

//! Number of possible states.
#define HPS_NUMBER 5
//! Definition of states for a process in the HWOS.
enum hpstate {
     HPS_NEW             //! Not running, just registered. An application has just asked to run this process.
    ,HPS_READY           //! Not running, but is scheduled for placement and runtime on the FPGA. Not placed on FPGA.
    ,HPS_BLOCKED         //! Blocked (waiting for an event or preempted). Not placed on FPGA.
    ,HPS_BLOCKED_PLACED  //! Blocked (waiting for an event or preempted). Placed on FPGA.
    ,HPS_RUNNING         //! Running. Process is placed and running on the FPGA.
};


/*! \brief Get base priority.
 *
 * Returns the priority given to processes when
 * they are first accepted to the system.
 *
 * @return Base priority of processes.
 */
const int hprocess_base_priority();

/*! \brief Get maximum processes in system.
 *
 * @return Maximum number of processes in the system.
 */
const int hprocess_max_processes();

/*! \brief Set bitfilename for the process.
 *
 * @param process Pointer to process.
 * @param filename Name of file where the FPGA bitstream is.
 * @return Maximum number of processes in the system.
 */
int hprocess_set_bitfilename(struct hprocess* process, char* filename);

/*! \brief Get bitfilename for the process.
 *
 * @return Name of file where the FPGA bitstream is.
 */
char* hprocess_get_bitfilename(struct hprocess* process);

/*! \brief Get process by PID.
 *
 * Worst case time of this function is O(n),
 * where n is the number of processes.
 *
 * @param pid Process ID.
 * @return Pointer to the process on success. NULL on failure.
 */
struct hprocess* hprocess_get(int pid);

/*! \brief Create process.
 *
 * @param nice User defined value to define priority for process.
 * @return Process pointer on success. NULL on failure.
 */
struct hprocess* hprocess_create(int nice);

/*! \brief Set priority for process.
 *
 * @param process Pointer to process.
 * @param priority Priority for process.
 * @return The new priority (larger than 0) on success. Negative on failure.
 */
int hprocess_set_priority(struct hprocess* process, int priority);

/*! \brief Get priority for process.
 *
 * @return The priority for the process on success. Negative on failure.
 */
int hprocess_get_priority(struct hprocess* process);

/*! \brief Set nice value for process.
 *
 * This is a user defined value that affects the priority of the process.
 *
 * @param process Affected process.
 * @param nice Nice value for process.
 * @return The new nice value (larger than 0) on success. Negative on failure.
 */
int hprocess_set_nice(struct hprocess* process, int nice);

/*! \brief Get nice value for process.
 *
 * @param process Pointer to process.
 * @return The nice value for the process on success. Negative on failure.
 */
int hprocess_get_nice(struct hprocess* process);

/*! \brief Get PID for process.
 *
 * @param process Pointer to process.
 * @return The PID for the process on success. Negative on failure.
 */
int hprocess_get_pid(struct hprocess* process);

/*! \brief Get next process in the list of processes.
 *
 * @param process Pointer to process that has the next field.
 * @return Pointer to the next process. NULL on failure.
 */
struct hprocess* hprocess_get_next(struct hprocess* process);

/*! \brief Get previous process in the list of processes.
 *
 * @param process Pointer to process that has the previous field.
 * @return Pointer to the previous process. NULL on failure.
 */
struct hprocess* hprocess_get_prev(struct hprocess* process);

/*! \brief Get state of process.
 *
 * @param process Pointer to process.
 * @return State (positive integer) for process. Negative on failure.
 */
int hprocess_get_state(struct hprocess* process);

/*! \brief Check whether the given PID is valid.
 *
 * @param pid Process ID.
 * @return State (positive integer) for process. Negative on failure.
 */
int hprocess_is_valid_pid(int pid);

/*! \brief Remove process.
 *
 * @param process Pointer to process.
 * @return 0 on success. Negative on failure.
 */
int hprocess_remove(struct hprocess* process);

/*! \brief Get pointer to the queue the process belongs to.
 *
 * @param process Pointer to process.
 * @return Pointer to the queue on success. NULL on failure.
 */
struct hsqueue* hprocess_get_parent_queue(struct hprocess* process);

#endif
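The state comments in `enum hpstate` say, for each state, whether the process occupies FPGA area. The sketch below encodes that information in small helper functions. The `sk_` names are hypothetical and these helpers are not part of the HWOS interface; only the five states and their placement are taken from the listing.

```c
#include <stddef.h>

/* Mirrors enum hpstate from the listing; SK_ names are local to this sketch. */
enum sk_state {
    SK_NEW,
    SK_READY,
    SK_BLOCKED,
    SK_BLOCKED_PLACED,
    SK_RUNNING,
    SK_NUMBER            /* number of states, 5 as in HPS_NUMBER */
};

/* Return 1 if `state` names one of the five states, 0 otherwise. */
int sk_state_is_valid(int state)
{
    return state >= 0 && state < SK_NUMBER;
}

/* Return 1 if a process in `state` is placed on the FPGA, 0 otherwise.
 * Per the listing, only BLOCKED_PLACED and RUNNING are placed. */
int sk_state_is_placed(int state)
{
    return state == SK_BLOCKED_PLACED || state == SK_RUNNING;
}

/* Return a printable name for `state`, or NULL if it is out of range. */
const char *sk_state_name(int state)
{
    static const char *const names[SK_NUMBER] = {
        "NEW", "READY", "BLOCKED", "BLOCKED_PLACED", "RUNNING"
    };
    return sk_state_is_valid(state) ? names[state] : NULL;
}
```

A scheduler could use a predicate like `sk_state_is_placed` to decide whether a reconfiguration is needed before a process can run.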
Listing D.20: hsignal.h
/*! \file hsignal.h
 * \brief [Hardware OS Signal Interface] Defines possible signals that can be connected between modules.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HSIGNAL_H
#define HSIGNAL_H

enum hsignal {
    //! hsqlist to hsqueue. A queue has been added to a list.
    HSIGNAL_ADD_QUEUE,
    //! hsqlist to hsqueue. A queue has been removed from a list.
    HSIGNAL_REMOVE_QUEUE,
    //! hsqueue to hprocess. A process has been added to a queue.
    HSIGNAL_ADD_PROCESS,
    //! hsqueue to hprocess. A process has been removed from a queue.
    HSIGNAL_REMOVE_PROCESS
};

#endif
Listing D.21: hsqlist.h
/*! \file hsqlist.h
 * \brief [Hardware OS Scheduler Interface] Defines a list of process queues.
 *
 * The main purpose of this module is to keep lists of process queues.
 * Since there is a small and constant number of lists (as many as there are
 * process states), lists are referenced by a number from 0 to HPS_NUMBER.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HSQLIST_H
#define HSQLIST_H


#include "hstructures.h"

/*! \brief Get size of list.
 *
 * @param list The list.
 * @return Size of the list.
 */
int hsqlist_size(int list);

/*! \brief Create new list.
 *
 * @param list The list.
 * @return 0 on success. Negative on failure.
 */
int hsqlist_create(int list);

/*! \brief Get queue by priority in list.
 *
 * @param list The list.
 * @param priority The priority of the processes in the queue.
 * @return Pointer to the queue on success. NULL on failure.
 */
struct hsqueue* hsqlist_get_queue(int list, int priority);

/*! \brief Add a queue to the list.
 *
 * @param list The list.
 * @param queue The queue to be added.
 * @return 0 on success. Negative on failure.
 */
int hsqlist_add_queue(int list, struct hsqueue* queue);

/*! \brief Insert queue before another queue in the list.
 *
 * @param list The list.
 * @param second_queue The queue that is already in the list.
 * @param queue The queue to be inserted.
 * @return 0 on success. Negative on failure.
 */
int hsqlist_insert_queue_before(int list, struct hsqueue* second_queue, struct hsqueue* queue);

/*! \brief Get state of the processes in the list of queues.
 *
 * This function assumes that this list has a list of queues
 * and that all processes in the queues have the same state.
 *
 * @param list The list.
 * @return State (positive integer) of the processes. Negative on failure.
 */
int hsqlist_get_state(int list);

/*! \brief Get first queue in the list.
 *
 * @param list The list.
 * @return Pointer to queue on success. NULL on failure.
 */
struct hsqueue* hsqlist_get_first_queue(int list);

/*! \brief Get last queue in the list.
 *
 * @param list The list.
 * @return Pointer to queue on success. NULL on failure.
 */
struct hsqueue* hsqlist_get_last_queue(int list);

/*! \brief Checks if the given list value is in valid range.
 *
 * @param list The list.
 * @return 1 if list value is in valid range. 0 if not.
 */
int hsqlist_is_valid_range(int list);

/*! \brief Remove the given queue from the list.
 *
 * @param list The list.
 * @param queue Pointer to the queue.
 * @return 0 on success. Negative on failure.
 */
int hsqlist_remove_queue(int list, struct hsqueue* queue);

/*! \brief Remove the given list.
 *
 * @param list The list.
 * @return 0 on success. Negative on failure.
 */
int hsqlist_remove(int list);

/*! \brief Register the given signal.
 *
 * This functionality is part of an
 * observer-observable pattern
 * between the list module and the queue
 * module (to keep correct
 * encapsulation).
 *
 * The queue module registers a given signal
 * with the list module.
 *
 * After a signal is registered, changes done
 * in the list will make sure the internal structure
 * in the affected queues is updated.
 *
 * @param signal The signal as defined by enum hsignal in hsignal.h.
 * @param function Pointer to the callback function in the hqueue module.
 * @return 0 on success. Negative on failure.
 */
int hsqlist_register_signal(int signal, int (*function)());

#endif
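The signal registration is described as an observer-observable pattern between the list module and the queue module. A minimal sketch of such a callback table is given below, assuming one callback slot per signal. The `sk_` names, the typed callback signature and the `sk_emit` function are hypothetical; the real interface declares the callback as `int (*function)()` and does not show how signals are raised.

```c
#include <stddef.h>

/* Mirrors enum hsignal from the listing; SK_ names are local to this sketch. */
enum sk_signal {
    SK_ADD_QUEUE,
    SK_REMOVE_QUEUE,
    SK_ADD_PROCESS,
    SK_REMOVE_PROCESS,
    SK_SIGNAL_COUNT
};

/* Callback type: receives the object the signal is about (assumption). */
typedef int (*sk_callback)(void *subject);

/* One registered observer per signal; NULL means none registered. */
static sk_callback sk_callbacks[SK_SIGNAL_COUNT];

/* Observer side: register a callback. Returns 0 on success, -1 on failure. */
int sk_register_signal(int signal, sk_callback function)
{
    if (signal < 0 || signal >= SK_SIGNAL_COUNT || function == NULL)
        return -1;
    sk_callbacks[signal] = function;
    return 0;
}

/* Observable side: invoke the registered callback for `signal`.
 * Returns the callback's result, or -1 if none is registered. */
int sk_emit(int signal, void *subject)
{
    if (signal < 0 || signal >= SK_SIGNAL_COUNT || sk_callbacks[signal] == NULL)
        return -1;
    return sk_callbacks[signal](subject);
}

/* Example observer: counts how many times it was notified. */
int sk_count_event(void *subject)
{
    int *counter = subject;

    if (counter == NULL)
        return -1;
    (*counter)++;
    return 0;
}
```

With this arrangement the list module only needs the signal number and a function pointer, so it never has to include the queue module's internals.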
Listing D.22: hstructures.h
/*! \file hstructures.h
 * \brief [Hardware OS public structures] Public structures used in the HWOS.
 *
 * Author (2011): Sindre Hansen
 *
 */

#ifndef HSTRUCTURES_H
#define HSTRUCTURES_H

//! A process in the HWOS. Can be running on the FPGA.
struct hprocess;

//! A message queue in the HWOS. Used by the message server.
struct hmqueue;

//! A queue of processes in the HWOS.
struct hsqueue;

//! A device driver for a backend on the FPGA.
struct hdev;

//! A generic list structure for use in the HWOS.
struct hlist;

//! A generic list element for use in the generic list structure.
struct hlelement;

//! A generic list element that does not belong to any list.
//! It has a next and a previous field.
struct hlorphan;

//! A pointer to a block of BRAM-memory on the FPGA.
struct hvmemptr;

//! An event for use in the event system between threads in the HWOS.
struct hevent;

#endif
