Case Study of Finite Resource Optimization in FPGA Using Genetic Algorithm by Wang, JingXia & Loo, Sin Ming
Boise State University
ScholarWorks
Electrical and Computer Engineering Faculty
Publications and Presentations
Department of Electrical and Computer
Engineering
6-1-2010
Case Study of Finite Resource Optimization in
FPGA Using Genetic Algorithm
JingXia Wang
Shenzhen Polytechnic
Sin Ming Loo
Boise State University
This document was originally published by International Society for Computers and Their Applications in International Journal of Computers and Their
Applications. Copyright restrictions may apply. http://www.isca-hq.org/journal.htm
IJCA, Vol. 17, No. 2, June 2010
Case Study of Finite Resource Optimization in
FPGA Using Genetic Algorithm¹
JingXia Wang*
Shenzhen Polytechnic, ShenZhen, GuangDong, 518055, PRC
Sin Ming Loo†
Boise State University, Boise, ID 83725, USA
Abstract 
Modern Field-Programmable Gate Arrays (FPGAs) are becoming very
popular in embedded systems and high-performance applications. FPGAs
have benefited from the shrinking of transistor feature size, which
makes more on-chip reconfigurable resources (e.g., memories and
look-up tables) and routing resources available. Unfortunately, the
amount of reconfigurable resources in an FPGA is fixed and limited.
This paper investigates schemes for mapping applications onto an FPGA
by combining sequential processing (e.g., Altera Nios II or Xilinx
Microblaze, programmed in C) with task-specific hardware (described in
a hardware description language). A Genetic Algorithm is used in this
study. We found that placing sequential processor cores into the FPGA
can improve resource utilization efficiency while achieving acceptable
system performance. In this paper, three cases are studied to
determine the trade-off between resource optimization and system
performance.
Key Words: FPGA, resource utilization, genetic algorithm,
scheduling.
1 Introduction 
In recent years, Field-Programmable Gate Arrays (FPGAs) have gained
popularity in the digital integrated circuit market, specifically in
high-performance embedded applications. One of the most significant
features of FPGAs is that designers can configure them to implement
complex hardware in the field. With the improvement of integrated
circuit technology, very large logic structures can reside in a single
FPGA chip [5]. Not only can hardware function units (implemented using
a hardware description language) be placed and routed into FPGAs, but
embedded processors can also be configured into them. There are two
types of embedded Intellectual Property (IP) processor cores: hard and
soft cores. Hard cores are physical manifestations (built into the
chip by the designer and foundry) of the IP design. Soft cores, which
are more portable and flexible than hard cores, exist logically as an
integrated circuit netlist or hardware description language code. In
general, an IP-based processor core, such as Xilinx Microblaze or
Altera Nios II, is much more flexible than general hardware logic.
Once a processor is implemented in the FPGA, it can be reused many
times in a time-shared manner. Unfortunately, its sequential operation
limits system performance [1].

1 A subset of this paper was published at the 2009 World Summit on
Genetic and Evolutionary Computation Conference, June 12-14, 2009,
Shanghai, China.
* Department of Electrical Engineering, XiLi Lake, NanShan District.
Email: ljwjxlyr2005@yahoo.com.
† Department of Electrical and Computer Engineering. Email:
smloo@boisestate.edu.

It is important to note that FPGA resources are limited. When the
hardware resources required by a task system exceed the resources
available in a single FPGA chip, it is generally impossible to realize
the system as pure hardware on that FPGA. Instead, we can configure
processor IP cores into the FPGA alongside the application-specific
hardware. This allows some portions of the design to be implemented in
C and the rest in a hardware description language. For example, a task
system that would consume 6,674 Configurable Logic Blocks (CLBs) as
pure hardware can be implemented on a single FPGA chip with just 1,920
CLBs by placing one soft processor IP core into the FPGA. This allows
the finite resources of an FPGA to be used efficiently. More than one
processor IP core may be used in a single FPGA to implement complex
applications. However, processor IP cores consume both the hardware
logic (LUTs) and the block RAM resources of the FPGA, so the number of
processor IP cores that can be integrated is limited by the FPGA's
finite hardware resources. We found that it is not advisable to put
the maximum number of processor IP cores into an FPGA, because the
sequential operation of the processors restricts system performance
too much. In this paper, three different cases are studied with the
FPGA Finite Resource Optimization Analysis Model in order to obtain
the optimal FPGA finite resource utilization scheme.
This paper is organized as follows. An overview of FPGA architecture
is given in Section 2. Section 3 presents the FPGA Finite Resource
Optimization Analysis Model with Genetic Algorithm. Sections 4 and 5
provide a computational complexity analysis with an example.
Simulation setup and results are presented in Section 6. The
conclusion is presented in Section 7.
2 FPGA Architectures 
The basic architecture of FPGAs consists of an array of logic
blocks, programmable interconnect, and I/O blocks. A logic block that
includes a fixed number of LUTs and flip-flops is called a
configurable logic block (CLB) or a logic array block (LAB). In this
paper, the term CLB is used. Programmable interconnect joins these
logic blocks to provide the required interconnections. An I/O block is
a pin-level interface circuit that provides the interface between the
package pins and the internal configurable logic.
With the development of microelectronic technology, the architecture
of modern FPGAs is more complicated than before. It comprises more
resource elements, such as embedded memory blocks (bRAMs),
multipliers, and even processor IP cores. Figure 1 shows the generic
architecture overview of the Xilinx Virtex-II Pro FPGA [5]. The Xilinx
Virtex-II Pro FPGA includes an array of CLBs, IOBs, multipliers, block
RAM, embedded RocketIO transceivers, and two processors (IBM PowerPC
405) [5].
The embedded IBM PowerPC 405 is a 32-bit, high-performance,
low-power RISC hard processor core. It is fully compliant with the
32-bit implementations of the PowerPC User Instruction Set
Architecture (ISA) and can be reprogrammed together with the other
resources during FPGA configuration. Xilinx Virtex-4 is the newest
generation, with more high-speed I/O technology and digital signal
processing features. With these modern FPGAs, designers can configure
multiple soft processors into a single FPGA device.
Processors in FPGAs provide great flexibility in the trade-off
between software and hardware. In general, the number of soft
processors that can fit into an FPGA is limited only by the resources
of the FPGA.
Figure 1: The generic architecture overview of Xilinx Virtex-II Pro
FPGA (labeled blocks include DCM, CLB, RocketIO or RocketIO X
Multi-Gigabit Transceiver, and SelectIO-Ultra)
3 FPGA Finite Resource Optimization Analysis Model
with Genetic Algorithm
The FPGA Finite Resource Optimization Analysis Model is shown in
Figure 2. It describes the major elements of the FPGA resources
assignment algorithm. The nucleus of this model is the FPGA resources
assignment algorithm, which is based on a Genetic Algorithm. One input
to the algorithm is the FPGA resources list, which includes the number
of CLBs in the given FPGA device, the number of soft processors to be
integrated into it, and the number of CLBs and the size of the bRAMs
that each soft processor uses. The other input is the task system
information list, which gives the detailed information about the
application (the number of tasks and how these tasks are related). The
algorithm generates the FPGA resource utilization information and the
task system schedule. The former gives the percentage of the FPGA's
hardware (CLBs) used by the given task system. The latter gives the
execution time (in unit time) of the task system. The shorter the
schedule length, the more desirable the solution.

Figure 2: FPGA finite resource optimization analysis model (inputs:
FPGA resources list and task system list; core: the FPGA resources
assignment algorithm; outputs: FPGA resources utilization information
and task system schedule)
3.1 FPGA Resources
A single FPGA chip contains a number of hardware resources,
including CLBs, IOBs, bRAMs, special logic (e.g., multipliers and
digital signal processing blocks), and routing resources. A soft
processor placed and routed in an FPGA uses a fixed amount of CLBs and
bRAM. For example, experiments have shown that a Xilinx Microblaze
uses about 400-500 CLBs. In this paper, the number of CLBs in an FPGA
is treated as the hardware resource constraint. In addition, the
amount of bRAM, which is used for internal program and data storage of
the configured soft processors, limits the tasks that can be assigned
to the soft processors. Thus, the FPGA resources list includes the
number of CLBs to which hardware applications can be assigned and the
size of the memory that stores the program and data for the soft
processors.
3.2 Task System
A task system is a model of the application. The application is
decomposed into a set of tasks, each of which can be executed either
as a software process on a soft processor core or as a hardware
function (implemented using a hardware description language) within
the FPGA. When the task system is created, we define the execution
time of each task for the software and the hardware implementation
respectively. Sample task system data are shown in Table 1 [2]. We use
a directed acyclic graph (DAG) to represent a task system [1]. A
sample task system DAG is shown in Figure 3 [2].
Table 1: A sample task system resources

          Soft Processor                     FPGA Hardware Logic
Task      Execution time  Program memory     Execution time  FPGA resources
name      (unit time)     required (bytes)   (unit time)     required (CLBs)
Task 1    74,640          184                2,560           32,220
Task 2    5,836,560       19,240             208,400         30,509
Task 3    34,200          22,348             400             30,509
Figure 3: A sample task system DAG
3.3 FPGA Resource Utilization Information
One output of the FPGA resources assignment algorithm is the FPGA
resource utilization report. It describes the allocation of every
task, that is, whether the task is placed on a processor or in
task-specific logic, and the percentage of the FPGA device's CLBs that
are used.
3.4 Task System Schedule
The other output of the FPGA resources assignment algorithm is the
task system schedule. This report gives the execution time of the task
system and the order of execution of the tasks within each soft
processor in the FPGA chip. The execution time of the task system
determines the system performance.
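As an illustration of how a schedule length could be derived from a task assignment, the following Python sketch walks a DAG in topological order. It is not the authors' implementation: the four tasks and their times are hypothetical, hardware tasks are assumed to run on dedicated logic without contention, each soft processor runs its tasks one at a time, and communication delays are ignored.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    sw_time: float     # execution time on a soft processor (unit time)
    hw_time: float     # execution time as dedicated hardware logic (unit time)
    preds: tuple = ()  # names of tasks that must finish first (DAG edges)


def schedule_length(tasks, assignment):
    """Finish time of the last task.

    tasks must be listed in topological order; assignment[name] is either
    "hw" or the index of the soft processor that runs the task.
    """
    finish = {}
    cpu_free = {}  # time at which each soft processor becomes idle
    for t in tasks:
        ready = max((finish[p] for p in t.preds), default=0.0)
        where = assignment[t.name]
        if where == "hw":
            # Assumption: dedicated logic starts as soon as inputs are ready.
            finish[t.name] = ready + t.hw_time
        else:
            # A soft processor executes its tasks sequentially.
            start = max(ready, cpu_free.get(where, 0.0))
            finish[t.name] = start + t.sw_time
            cpu_free[where] = finish[t.name]
    return max(finish.values())


# Hypothetical four-task DAG: A feeds B and C, which both feed D.
tasks = [
    Task("A", sw_time=10, hw_time=2),
    Task("B", sw_time=8, hw_time=3, preds=("A",)),
    Task("C", sw_time=6, hw_time=1, preds=("A",)),
    Task("D", sw_time=4, hw_time=2, preds=("B", "C")),
]

all_hw = schedule_length(tasks, {n: "hw" for n in "ABCD"})
all_sw = schedule_length(tasks, {n: 0 for n in "ABCD"})
mixed = schedule_length(tasks, {"A": "hw", "B": 0, "C": "hw", "D": 0})
```

In this toy case the all-hardware assignment is fastest and the single-processor assignment is slowest, with the mixed assignment in between, which is the kind of area-time trade-off the model explores.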
3.5 The FPGA Resources Assignment Algorithm: Genetic
Algorithm
A Genetic Algorithm (GA), based on Darwinian natural evolution and
selection, is a search technique for finding approximate solutions to
optimization problems. The basic operations of a GA are
initialization, evaluation, selection, reproduction, and termination
[4]. In the initialization step, an initial population is created
(list scheduling is used), with each task randomly assigned either to
the available hardware resources or to a soft processor. Each
individual member of the population is represented by a separate data
structure called a candidate. For each candidate, the schedule length
serves as the fitness function. Candidates are selected for the
subsequent generation based on their fitness: a set of parents with
the best genetic information (shorter schedule length) is selected to
breed the offspring, using roulette-wheel-style selection. The
reproduction process creates the next-generation population through
two genetic operations: crossover and mutation. A single-point
crossover technique is used. We randomly select a location on the
chromosome structure as the crossover point. The new candidate is
generated by replicating and combining two parent candidates at the
chosen crossover point: one part comes from the first parent
chromosome above the crossover point, and the other from the second
parent chromosome below it. Mutation takes place according to a given
mutation probability parameter. When mutation occurs, the task
assignments are selected at random. When mutation does not occur,
genes are chosen from either parent with equal probability [1]. The
evaluation, selection, and reproduction processes are repeated until a
termination condition has been reached. Figure 4 depicts this Genetic
Algorithm process.
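The selection and reproduction steps described above might be sketched as follows in Python. This is a simplified reading, not the paper's code: a candidate is assumed to be a list with one gene per task (either "hw" or a soft processor index), the roulette wheel is assumed to weight candidates by the inverse of their schedule length, and mutation is reduced to reassigning a single randomly chosen task. All function names are hypothetical.

```python
import random


def roulette_select(population, lengths, rng):
    """Pick one candidate; a shorter schedule length gets a larger slice."""
    weights = [1.0 / length for length in lengths]  # assumed weighting
    return rng.choices(population, weights=weights, k=1)[0]


def single_point_crossover(parent_a, parent_b, rng):
    """Genes before the cut come from parent_a, the rest from parent_b."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(candidate, choices, pm, rng):
    """With probability pm, reassign one randomly chosen task."""
    child = list(candidate)
    if rng.random() < pm:
        child[rng.randrange(len(child))] = rng.choice(choices)
    return child


def next_generation(population, fitness, choices, pm, rng):
    """Build a new population of the same size from the current one."""
    lengths = [fitness(c) for c in population]
    children = []
    for _ in population:
        a = roulette_select(population, lengths, rng)
        b = roulette_select(population, lengths, rng)
        child = single_point_crossover(a, b, rng)
        children.append(mutate(child, choices, pm, rng))
    return children
```

Calling `next_generation` repeatedly for the chosen number of generations, with a schedule-length function as `fitness`, gives the evaluation, selection, and reproduction loop in outline. Passing a seeded `random.Random` instance as `rng` makes runs repeatable, in the spirit of the different random seeds used in the simulations.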
4 Computational Complexity of the FPGA Resources
Assignment Algorithm
In this section, we discuss the time complexity and efficiency of
the FPGA resources assignment algorithm. Many factors influence GA
efficiency, including the population encoding, the fitness function,
the population size, the probability of mutation, and the termination
condition. In this paper, the following parameters are considered:
p: the size of the population in the GA
i: the number of evolution generations
pm: the probability of mutation
s: the number of soft processors to be configured into the FPGA
k: the number of tasks in the task system to be assigned
To determine the complexity of this GA and measure its efficiency,
tests were carried out on a computer system with a 3 GHz Pentium 4
processor and 1 GByte of memory running CentOS 4.3 (kernel 2.6.9).
Table 2 and Figure 5 show the execution time of the GA for different
parameter values.
The curve in Figure 5(a) shows that the time complexity of the GA
with respect to parameter p is $O(p^2)$. In Figure 5(b), the
approximately straight lines show a linear characteristic, so the time
complexity of the GA with respect to parameter i is $O(i)$. Both
Figure 5(c) and Figure 5(d) show the characteristic of $\log_2 n$
(where n is a positive integer), so the time complexity of the GA with
respect to parameters pm and s is $O(\log_2 pm)$ and $O(\log_2 s)$
respectively.
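One way to sanity-check these growth rates against the measurements is to estimate the exponent e in time ~ x^e between consecutive measurement points. The Python sketch below does this for the k = 10 column of Table 2; the helper name `growth_exponents` is made up for this example, and only table entries that are unambiguous in the source are used.

```python
import math

# Average GA run times (ms) for k = 10 tasks, copied from Table 2.
time_by_population = {25: 92650, 50: 334360, 100: 1277990}
time_by_generations = {100: 9890, 500: 47600, 1000: 92650, 1500: 137940}


def growth_exponents(samples):
    """Exponent e in time ~ x**e, estimated between consecutive points."""
    points = sorted(samples.items())
    return [
        math.log(t2 / t1) / math.log(x2 / x1)
        for (x1, t1), (x2, t2) in zip(points, points[1:])
    ]


p_exponents = growth_exponents(time_by_population)
i_exponents = growth_exponents(time_by_generations)
print([round(e, 2) for e in p_exponents])
print([round(e, 2) for e in i_exponents])
```

The exponents for p come out a little below 2 and those for i close to 1, which agrees with the quadratic and linear characteristics read from Figures 5(a) and 5(b).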
According to computational complexity theory, the time complexity
of the GA is $O(p^2)$. The algorithm contains a sorting operation that
sorts all candidates in a population with a bubble sort for the
crossover and mutation operations. The time complexity of this
candidate sorting operation is $O(p^2)$ [6]. Because sorting is the
most expensive process in this GA implementation, we can use its time
complexity to represent the time complexity of the GA. On the other
hand, the population size p is an important parameter for the GA and
directly determines the complexity of its search space. If we enlarge
the search space, the computational complexity increases greatly. It
follows that the population size must be selected carefully. If it is
too small, the search space is too small to provide enough samples and
the GA produces poor results. If it is too large, the time complexity
of the algorithm rises and the GA becomes less efficient [3]. Finding
a good termination condition and decreasing the number of evolution
generations can also improve the algorithm's efficiency.

Figure 4: Genetic algorithm process. The flowchart phases are labeled
Initialization, Evaluation, Selection, Reproduction, and Termination,
and its boxes read, in order:
Begin
Read the parameters for GA
Input FPGA resources and task system list
Initialize a population (list scheduling)
Generate the allocation vector for all candidates
Calculate the schedule length as fitness function
Generate the likelihood of all candidates for selection
Use a roulette wheel to select the parents
Crossover with single-point crossover
Determine whether the mutation will happen or not
Reach the given generations? (N: repeat; Y: continue)
Output the results
End
5 FPGA Resource Utilization Analysis Case Studies
In this example, the Power Quality Monitor System (PQMS) [4] is
placed and routed into a Xilinx XC3S1000 FPGA under different
assignment schemes to test the utilization of the FPGA resources. The
PQMS is designed to measure the quality and reliability of a power
system. It can be decomposed into 10 tasks. Figure 6 is a DAG for the
PQMS [2]. Table 3 shows the hardware and software performance data of
the PQMS.
The Xilinx XC3S1000 FPGA consists of 1,920 CLBs and 55,296 bytes of
bRAM. We assume that each Xilinx Microblaze soft processor consumes
500 CLBs. When all the tasks are implemented in FPGA hardware logic
(using HDL), the task system requires 6,674 CLBs (see Table 3 and the
notes to Table 4). This is more than the hardware logic resources
available in the given FPGA, so this task system cannot be implemented
in pure HDL.
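The arithmetic behind this feasibility argument can be written out as a short Python sketch. The device size and the 500-CLB soft processor footprint come from the text; the all-hardware demand of 6,674 CLBs is the figure quoted in the Introduction and in the notes to Table 4. The function and constant names are illustrative only.

```python
TOTAL_CLBS = 1920         # Xilinx XC3S1000
CLBS_PER_CORE = 500       # assumed Microblaze footprint
ALL_HARDWARE_CLBS = 6674  # PQMS demand if every task is built in HDL


def hardware_budget(num_cores):
    """CLBs left for task-specific logic after placing num_cores soft cores."""
    return TOTAL_CLBS - num_cores * CLBS_PER_CORE


max_cores = TOTAL_CLBS // CLBS_PER_CORE
budgets = {s: hardware_budget(s) for s in range(max_cores + 1)}
fits_in_pure_hardware = ALL_HARDWARE_CLBS <= TOTAL_CLBS

print(max_cores)              # 3
print(budgets)                # {0: 1920, 1: 1420, 2: 920, 3: 420}
print(fits_in_pure_hardware)  # False
```

At most three soft processors fit under this assumption, which is why the case study compares configurations with one, two, and three cores. Each added core removes 500 CLBs from the budget for task-specific logic, so more tasks have to run in software.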
As shown in Table 4, three cases with 1, 2, and 3 Microblaze soft
processors in the FPGA are simulated using the Finite Resource
Optimization Analysis Model. When one Xilinx Microblaze is used, the
whole task system fits into the FPGA because some of the tasks are
assigned to the soft processor. The task system consumes 95.408
percent of the FPGA hardware resources on average, and the design
achieves acceptable system performance. When two Xilinx Microblazes
are employed, the hardware resource utilization of the FPGA is 90.668
percent and better system performance is obtained. When three Xilinx
Microblazes are used, the lowest hardware resource utilization is
achieved, but the system performance is the worst. The reason is that
the more tasks are assigned to the soft processors, the more the
performance suffers from the highly sequential operation of the
processors. Thus, the best area-time trade-off in this example is to
integrate two soft processors into a single Xilinx XC3S1000 FPGA. With
this configuration, some tasks execute on the soft processors and the
rest are implemented using HDL, all in one FPGA.
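The comparison above can be reproduced from the average row of Table 4 with a few lines of Python. The selection rule used here, picking the configuration with the shortest average schedule length, is an assumption made for illustration; the paper describes the choice as an area-time trade-off without giving a formula.

```python
# Averages per number of soft processors, copied from Table 4:
# (resource utilization in percent, schedule length in unit time).
table4_average = {
    1: (95.408, 18313500),
    2: (90.668, 15958260),
    3: (88.141, 98340100),
}


def best_configuration(results):
    """Processor count with the shortest average schedule length."""
    return min(results, key=lambda s: results[s][1])


best = best_configuration(table4_average)
slowdown_with_three = table4_average[3][1] / table4_average[best][1]
print(best)                           # 2
print(round(slowdown_with_three, 2))  # 6.16
```

Going from two to three soft processors lowers average utilization by only about 2.5 percentage points but lengthens the schedule roughly sixfold, which is why the two-processor configuration is preferred here.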
Table 2: The execution time of the GA with different parameters*

                                        Execution time (milliseconds)
Test parameter                  Value   k = 10     k = 16     k = 30
Population size p               10      17750      21100      26670
(i = 1000, pm = 0.08, s = 1)    25      92650      108560     141330
                                50      334360     405150     532990
                                100     1277990    1541400    2036900
Generation number i             100     9890       10510      13570
(p = 25, pm = 0.08, s = 1)      500     47600      53550      70900
                                1000    92650      108560     141330
                                1500    137940     164250     211690
Probability of mutation pm      0.08    92650      108560     141330
(p = 25, i = 1000, s = 1)       0.16    123670     141570     162660
                                0.24    143730     157350     171860
                                0.32    153570     164620     173100
Soft processor number s         1       92650      108560     141330
(p = 25, i = 1000, pm = 0.08)   2       116210     146350     156040
                                3       115960     149410     158570
                                4       132370     150050     157760
k: number of tasks
*Five tests are carried out for every item and the average value is
shown in the table.
Figure 5: The execution time with the different parameters, plotted
for k = 10, 16, and 30 tasks
(a) Execution time characteristics with different population size p
(b) Execution time characteristics with generation number i
(c) Execution time characteristics with probability of mutation pm
(d) Execution time characteristics with soft processor number s
Figure 6: The DAG of PQMS
Table 3: Power quality monitor system task system information

          Soft processor                     FPGA hardware logic
Task      Execution time  Program memory     Execution time  FPGA resources
name      (unit time)     required (bytes)   (unit time)     required (CLBs)
DPT       83,124,480      7,688              61,540          697
Vrms      60,520          1,004              92.25           275
Phi       313,460         1,464              200             564
Irms      15,115,930      156                10,240          779
Irms,1    60,520          1,004              92.25           275
Kdisp     1,161,310       1,092              150             122
Pave      1,201,540       1,284              150             1,339
Kdist     26,680          28                 132.02          619
I"        8,600           28                 29.72           14
THDi      73,030          72                 272.42          1,330
Table 4: The resource utilization of different FPGA resource assignment for PQMS

             1 Soft Processor      2 Soft Processors     3 Soft Processors
Simulation   RU (%)   SL (ut)      RU (%)   SL (ut)      RU (%)   SL (ut)
1            95.94    18254300     88.65    15917000     82.34    98340100
2            95.94    18254300     88.70    15844000     82.40    98340100
3            95.94    18254300     88.70    15844000     82.34    98340100
4            95.26    18507300     92.66    15831000     96.72    98340100
5            95.26    18507300     92.71    16965600     82.40    98340100
6            91.98    18340300     92.60    15831000     96.72    98340100
7            95.94    18254300     92.60    15831000     96.82    98340100
8            95.94    18254300     88.70    15844000     82.40    98340100
9            95.94    18254300     92.66    15831000     82.34    98340100
10           95.94    18254300     88.70    15844000     96.93    98340100
Max          95.94    18507300     92.71    16965600     96.93    98340100
Min          91.98    18254300     88.65    15831000     82.34    98340100
Ave          95.408   18313500     90.668   15958260     88.141   98340100
*RU (Resources Utilization), SL (Schedule length), ut (unit time).
*The schedule length is an approximate value.
*It is assumed that each Xilinx Microblaze soft processor consumes 500 CLBs in a single FPGA.
*The number of CLBs in Xilinx XC3S1000 is 1920.
*Total resource usage if all tasks are to assign to: soft processor: 12,020, CLB: 6,674.
*A different random seed of GA is used in the simulation for each row in the above table.

6 More Comprehensive Simulations and Results
In this section, we perform more simulations and extend the results
with a 30-task system, the Space Shuttle Turbo Pump Task System [2].
The given FPGA is a Xilinx XC3S5000 with 8,320 CLBs and 234 Kbytes of
bRAM. We again assume that each soft processor consumes 500 CLBs. The
FPGA Finite Resource Optimization Analysis Model was used to simulate
and test the resource utilization for different numbers of soft
processors in order to obtain the best possible performance within the
available resources. Table 5 contains the results of the simulations.
Table 5: Feasible utilization resource found by GA for different FPGA resource assignment

Number of    Resource utilization (%)       Schedule length (unit time)
soft
processor    Min      Max      Ave           Min     Max     Ave
0            -        -        -             -       -       -
1            95.16    99.87    97.55         325     560     452.1
2            55.89    96.96    73.63         202     436     303.2
3            38.70    99.71    [illegible]   139     225     179.6
4            65.44    99.31    90.42         111     151     124.9
5            54.54    93.51    76.91         86      139     115.0
6            63.17    95.50    86.66         84      96      90.7
7            63.15    98.50    91.14         85      99      91.8
8            75.12    99.27    85.56         84      98      88.8
*Ten simulations with different random seeds of GA are made for each row in the above table.
*It is assumed that each Xilinx Microblaze soft processor consumes 500 CLBs in a single FPGA.
*The number of CLBs in Xilinx XC3S5000 is 8320.
*Total resource usage if all tasks are to assign to: soft processor: 24,580, CLB: 281,575.

For each configuration in Table 5 (1 to 8 soft processors), 10
simulations with different random seeds of the Genetic Algorithm were
completed, for a total of eighty simulations. In general, when we put
eight soft processors into the FPGA, better resource utilization and
the best system performance were achieved. But this result does not
mean that the more soft processors we use, the better the solution
will be.
The Global System for Mobile Communication (GSM) [2] task system,
with 16 tasks, is used to carry out the same simulations. In this
simulation, we use a Xilinx XC3S1500 with 3,328 CLBs and 72 Kbytes of
bRAM. Table 6 shows the results of the simulations for the 16 tasks.
In these cases, better resource utilization and the best system
performance were found when one soft processor was integrated into the
FPGA.
Table 6: Feasible utilization resource found by GA for different FPGA resource assignment for
task system with 16 tasks

Number of    Resource utilization (%)    Schedule length (unit time)
soft
processor    Min      Max      Ave        Min       Max       Ave
0            -        -        -          -         -         -
1            83.44    99.43    94.88      156615    162516    160075.2
2            89.24    99.19    94.46      167131    173384    170734.1
3            83.02    99.90    97.05      172455    179460    176498.2
4            94.32    98.02    97.02      177675    188379    181404.6
5            93.72    99.76    96.61      199260    204091    201716.5
6            97.45    98.44    98.31      230581    234769    231999.8
*Ten simulations with different random seeds of GA are made for each row in the above table.
*It is assumed that each Xilinx Microblaze soft processor consumes 500 CLBs in a single FPGA.
*The number of CLBs in Xilinx XC3S1500 is 3,328.
*Total resource usage if all tasks are to assign to: soft processor: 11,858, CLB: 12,014.
7 Conclusions
This paper presents a basic overview of the genetic algorithm with
resource utilization analysis for FPGAs. How to utilize the finite
FPGA resources optimally and efficiently is a significant question in
FPGA design. Integrating soft processors into FPGAs can greatly
improve the FPGA's resource utilization efficiency, but this does not
mean that more soft processors are better for performance. In fact,
the reverse can be the case. For different applications and given FPGA
devices, we show that the trade-off between resource utilization and
system performance can be found using the FPGA Finite Resource
Optimization Analysis Model.
References 
[1] Sin Ming Loo and Earl Wells, "Task Scheduling in a
Finite-Resource, Reconfigurable Hardware/Software Codesign
Environment," INFORMS Journal on Computing, 18(2):151-172,
Spring 2006.
[2] David Lee McCarver, Finite Resource Reconfigurable
Hardware/Software Codesign: Case Studies, M.S. Dissertation in
Computer Engineering, Boise State University, June 2005.
[3] Ling Wang, Intelligent Optimization Algorithms with
Applications, Peking, TsingHua Univ Press, in Chinese, 2001.
[4] Theerayod WiangTong, Peter Y. K. Cheung, and Wayne Luk,
"Comparing Three Heuristic Search Methods for Functional
Partitioning in Hardware-Software Codesign," Design Automation
for Embedded Systems, 6:425-449, 2002.
[5] Xilinx Virtex-II Pro and Virtex-II Pro X Platform FPGAs:
Complete Data Sheet,
http://direct.xilinx.com/bvdocs/publications/ds083.pdf, 2007.
[6] ZhaoKun Yang and ChongJun Yang, The Challenge of the
Mathematicians in the Future,
http://episte.math.ntu.edu.tw/articles/mm/mm_10_2_04/index.html,
in Chinese, 6:28-36, 2002.

JingXia Wang received her M.S. in Electrical Engineering from the
University of WuHan Surveying and Mapping Technology in 1994. From
1994 to 2003, she was employed as an Assistant Professor and currently
is an Associate Professor in the Department of Electrical Engineering
in Shenzhen Polytechnic. From 2005 to 2006, she worked in the
Department of Electrical and Computer Engineering at Boise State
University in the USA as a visiting scholar. Her research interests
include embedded systems, reconfigurable computing, computer
architecture, and hardware/software codesign.

Sin Ming Loo received his Ph.D. in Computer Engineering from the
University of Alabama at Birmingham and the University of Alabama in
Huntsville in 2003. From 1998 to 2003, he was involved in research
projects that included development of a space plasma simulation model
utilizing parallel processing on a 256-processor HP Exemplar X2000 and
a 128-processor SGI Origin, building and maintaining an 80-node
Pentium-4 Beowulf cluster, and development of digital system rapid
prototyping course materials. Dr. Loo is presently an Associate
Professor of Electrical and Computer Engineering at Boise State
University. His research interests include scheduling, parallel
processing, distributed sensor networks, embedded systems,
hardware/software codesign, and reconfigurable computing.
