At present, the main research in Evolvable Hardware (EHW) 
Introduction
Evolutionary algorithms (EA) [1] are stochastic search methods that mimic the metaphor of natural biological evolution. Evolutionary algorithms operate on a population of potential solutions, applying the principle of survival of the fittest to produce better and better approximations to a solution. Inspired by Charles Darwin, Prof. John Holland of the University of Michigan viewed the process of biological evolution as a process of optimization, where nature selects the best evolutionary settings to survive in the next generation of offspring [2]. At each generation, a new set of approximations is created by selecting individuals according to their level of fitness in the problem domain and breeding them together using operators borrowed from natural genetics. This process leads to the evolution of populations of individuals that are better suited to their environment than the individuals they were created from, just as in natural adaptation. Evolutionary algorithms model natural processes such as selection, recombination, mutation, migration, locality and neighborhood. Figure 1 shows the structure of a simple evolutionary algorithm. Evolutionary algorithms work on populations of individuals instead of single solutions; in this way the search is performed in a parallel manner.
Fig 1. The evolutionary algorithm operators
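The loop of selection, recombination and mutation described above can be sketched as follows. This is a minimal illustration, not the algorithm used in this paper: the function name `evolve`, the binary tournament selection, the elitism step, and all parameter values are assumptions chosen for brevity.

```python
import random

def evolve(fitness, genome_len=16, pop_size=20, generations=200,
           mutation_rate=0.05, seed=0):
    """Minimal generational EA over bit strings (illustrative only)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Selection: binary tournament keeps the fitter of two random individuals.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        best = max(pop, key=fitness)
        children = [best[:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, genome_len)   # one-point recombination
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each bit is inverted with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

# Toy problem ("OneMax"): fitness is simply the number of 1 bits.
best = evolve(sum)
```

Because the whole population is evaluated in every generation, the fitness calls are independent of each other, which is the property that hardware evaluation exploits.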
The process of developing an EA for a particular application consists of the following chief phases: 
Evolvable Hardware
Inspired by natural evolution, evolvable hardware (EHW) [3] was developed in the early 1990s as a new concept in the development of adaptive machines. In contrast to traditional hardware, where structure and functions are irreversibly fixed once implemented, EHW refers to a type of hardware whose architecture and functions can change dynamically and autonomously by interacting with its environment. This adaptation ability of EHW, achieved by employing evolutionary learning and reconfigurable devices, has great potential for the development of innovative and powerful real-world applications [4].
Fig 2. The algorithm for evolving circuits
The basic idea of Evolvable Hardware (EHW) is to regard the configuration bits of a software-reconfigurable device as the chromosome of an EA. The search space of the configuration bits is very large, but EAs are effective without prior knowledge of the search space. As the fitness function, we choose the performance of the hardware circuit. For example, in data compression with EHW, a predictive function is implemented in hardware and the data compression rate is chosen as the fitness function. When a good chromosome is obtained, it is immediately downloaded into the reconfigurable device.
In EHW, it is not required to specify the detailed hardware design. Instead, we define a fitness function. The fitness function is the instinct that drives the circuit to evolve itself. If the fitness value of a hardware circuit is degraded due to a partial malfunction or some change in the environment, the EA process of EHW is invoked and the search for a better hardware configuration is started. Hence, EHW continues to reconfigure itself in order to obtain better performance.
The chromosome of EHW specifies two things. One is the function type of each evolution unit. In figure 3, the evolution units correspond to gates such as the AND gate and the OR gate. The other is the interconnection among the evolution units. EHW can be classified into two classes according to the grain size of an evolution unit: gate-level and function-level. The figure is an example of gate-level evolution. In function-level evolution, each evolution unit is a higher-level hardware function than in gate-level evolution [5].
Fig 3. Evolvable Hardware
There are many methods to implement intrinsic evolution. A methodology for directly evolving the configuration bitstream has been described; it is the most intuitive form of intrinsic evolution. However, this approach requires an expert understanding of a given FPGA as well as familiarity with the structure of its configuration bitstream. Although the author introduces the bitstream composition of the XC2V40 and shows how to localize the LUT contents in a configuration bitstream, there are two fatal limitations in its flexibility. On the one hand, even when produced by the same corporation, different types of FPGA have different configuration compositions. If the experimental environment is changed, we must spend additional effort familiarizing ourselves with the new environment, re-parse the configuration, locate the LUT contents among millions of configuration bitstream bits, and link them as a gene for evolution. This is clearly a lack of portability. On the other hand, an illegal bitstream may destroy the FPGA, and evolving the configuration bitstream directly will easily generate illegal bitstreams. When these are downloaded to the FPGA, the chip will not work and may even be damaged.
Complete intrinsic evolution on FPGAs

Virtual reconfigurable architecture
With the rapid development of reconfigurable devices, different intrinsic EHW approaches have been proposed in recent years to speed up fitness evaluation. FPGAs, which offer both the high performance of a dedicated hardware solution and the flexibility of software through their inherent reprogrammability, are the most widely used commercial reconfigurable logic devices for digital intrinsic EHW. A number of works have been done in the area of FPGA-based intrinsic EHW.
In this paper, we develop and design an EHW platform based on a virtual reconfiguration technique, named virtual reconfigurable architecture (VRA) [6]. The VRA, which is described in an HDL, is a second reconfiguration layer developed on top of an FPGA. The key goals of our proposed VRA are to provide a much simpler intrinsic EHW platform, thus reducing the length of the genotype description, and to provide fast internal reconfiguration.
In the configurable cell, MUX1, MUX2 and MUX3 are used to select the inputs to the LUT, so each LUT input can be driven from one of a set of 8 inputs.
Fig 4. The architecture of Configurable Cell
In total, this configurable cell requires 17 configuration bits. Each of the three multiplexers requires 3 selection bits, and the 3-input LUT has 8 entries, each set by one configuration bit. So the total number of configuration bits = (3 * 3) + 8 = 17 bits. Fig 5 shows that our VRA consists of function element (FE) arrays. The function of the FE array is configured using the chromosomes generated by the EA, which is implemented on the same FPGA. The chromosome encodes the functions performed by each FE and the interconnection of the FE array. As the fitness calculation is also carried out in the same FPGA, we can benefit from pipeline processing, which allows a candidate circuit to be evaluated in reasonable time.
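A small software model may help make the 17-bit cell configuration concrete. The sketch below assumes a particular bit layout (three 3-bit multiplexer selects, most significant bit first, followed by the 8 LUT entries); the paper does not specify the ordering, so the layout and the names `eval_cell` and `and3` are hypothetical.

```python
def eval_cell(config, inputs):
    """Evaluate one configurable cell.

    config: 17 bits (list of 0/1). Assumed layout, not taken from the paper:
            bits 0-2, 3-5 and 6-8 are the select lines of MUX1..MUX3,
            bits 9-16 are the 8 LUT entries.
    inputs: the 8 candidate input signals (list of 0/1).
    """
    assert len(config) == 17 and len(inputs) == 8

    def to_int(bits):
        value = 0
        for b in bits:
            value = (value << 1) | b
        return value

    # Each multiplexer picks one of the 8 inputs using its 3 select bits.
    a = inputs[to_int(config[0:3])]
    b = inputs[to_int(config[3:6])]
    c = inputs[to_int(config[6:9])]
    lut = config[9:17]
    # The three selected signals address one of the 8 LUT entries.
    return lut[to_int([a, b, c])]

# Example: select inputs 0, 1 and 2 and program the LUT as a 3-input AND.
and3 = [0, 0, 0,  0, 0, 1,  0, 1, 0] + [0, 0, 0, 0, 0, 0, 0, 1]
```

With this configuration, the cell outputs 1 only when inputs 0, 1 and 2 are all 1; changing only the last 8 bits turns the same cell into any other 3-input Boolean function.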
The configurable circuit is created from an array of the cells shown in fig 4. The array is shown below. Each cell input is connected to the outputs of the previous column. For the special case of the first column, the cell inputs are connected to the cell array inputs.
overcome. (2) Because the VRA is available at the level of HDL source code, it can easily be modified and synthesized for various target platforms. (3) The VRA can be designed exactly according to the requirements of a given problem. This feature means that the granularity of the FE array can exactly fit the needs of a given application.
Genotype-phenotype mapping
A pivotal problem in the EHW domain is how to encode circuits as chromosomes, which has a direct influence on both the course of evolution and the final outcome. A good coding should satisfy at least three conditions: first, its length should be kept within the range that the EA can handle; second, it should be convenient to decode into a practical circuit; third, the evolution of the coding should reflect the evolution of both functionality and routing. The methodology proposed in this paper has an advantage in obtaining such a coding.
In the experiment, we evolved a 2 × 2 bit multiplier on an 8*8 VRA. The modules in every column had 8 inputs numbered from '0' to '7'; the first 8 sequence numbers represented the 8 primary inputs and the latter 8 sequence numbers represented the 8 inputs from the previous column's outputs. The 8*8 VRA has 64 such modules. Joining these 64*17 configuration values together gives the chromosome, so the length of the chromosome in our experiment is 1088.
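The genotype-phenotype mapping can be illustrated by cutting the 1088-bit chromosome into 64 cell configurations of 17 bits and propagating signals column by column. This is only a sketch under stated assumptions: the column-major ordering of cells in the chromosome, the bit layout inside each cell, and the simplification that every cell sees only the 8 outputs of the previous column are all hypothetical, as are the function names.

```python
CELL_BITS = 17   # (3 * 3) multiplexer select bits + 8 LUT bits
ROWS, COLS = 8, 8

def split_chromosome(chrom):
    """Cut a flat bit list into one 17-bit configuration per cell."""
    assert len(chrom) == ROWS * COLS * CELL_BITS  # 64 * 17 = 1088
    return [chrom[i:i + CELL_BITS] for i in range(0, len(chrom), CELL_BITS)]

def bits_to_int(bits):
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def eval_cell(cfg, signals):
    # Assumed layout: three 3-bit mux selects followed by 8 LUT entries.
    picked = [signals[bits_to_int(cfg[k:k + 3])] for k in (0, 3, 6)]
    return cfg[9 + bits_to_int(picked)]

def eval_array(chrom, array_inputs):
    """Propagate 8 input bits through the 8 columns, left to right."""
    cells = split_chromosome(chrom)
    signals = list(array_inputs)
    for col in range(COLS):
        # Assumed column-major order: 8 consecutive cells form one column.
        column_cfgs = cells[col * ROWS:(col + 1) * ROWS]
        signals = [eval_cell(cfg, signals) for cfg in column_cfgs]
    return signals
```

Every bit string of length 1088 decodes to a valid circuit under this scheme, which is one reason a virtual architecture avoids the illegal-bitstream problem of direct bitstream evolution.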
Experiment
Implementation of VRA
In the proposed system, designing the virtual circuit in VHDL is the preliminary work. The virtual circuit contains three modules: the first module, marked "FPGA Space", is a location allocated from the usable memory resources mentioned in the previous section; the second is the VRA; and the third is a Counter.
The FPGA Space module receives the chromosome and returns the Truth Table. It serves as the interface between the EA process and the VRA: the EA process writes every chromosome to the FPGA Space module, and the FPGA Space module returns the Truth Table to the EA process. The VRA module reads data from the FPGA Space module and maps them to a real functional circuit. The Counter module's outputs are the inputs of the VRA module. It starts generating the input combinations of the VRA module as soon as the circuit has finished mapping. Once an input combination has been generated, the corresponding output combination is written to an appointed location of the FPGA Space module. When all the input combinations have been completed, these locations constitute the Truth Table of the circuit.
Once this preliminary work is finished, the design is universal for evolving different target circuits, or needs only minor modifications, such as enlarging the VRA or changing its routing, to evolve more complex target circuits.
The cell array provides us with a method of testing candidate circuit solutions as part of an evolutionary algorithm. A major advantage of performing fitness evaluations in hardware is that multiple evaluations can be performed in parallel quite simply. In fact, 8 cell arrays have been combined into a single block called the EvoBlock. The EvoBlock also contains extra circuitry to help determine the fitness of an individual.
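The role of the Counter module, stepping through every input combination and recording the outputs as a truth table, can be mimicked in software. In this sketch the function `build_truth_table` and the `half_adder` stand-in for a mapped VRA circuit are hypothetical; the bit ordering of the counter is also an assumption.

```python
def build_truth_table(circuit, n_inputs):
    """Apply every input combination, as the Counter module does, and
    record the circuit's output for each one."""
    table = []
    for count in range(2 ** n_inputs):
        # Counter value -> input bits, most significant bit first.
        inputs = [(count >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        table.append(circuit(inputs))
    return table

def half_adder(inputs):
    """Hypothetical stand-in for the mapped VRA circuit: returns [carry, sum]."""
    a, b = inputs
    return [a & b, a ^ b]

table = build_truth_table(half_adder, 2)
```

Row `i` of `table` plays the part of the appointed FPGA Space location written for the i-th input combination.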
The figure to the left shows the basic parts of the EvoBlock Module. The block of RAM is used to store the truth table of the target circuit. When test vectors are applied to the EvoBlock the correct output value is output from the RAM. This can then be compared to the output from each cell array.
The direct output from each cell array can be read, but a match output is also generated. This output shows which bits of the RAM output and the cell array outputs match, i.e. which bits are correct. The XNOR gate outputs a '1' on a match and a '0' when the bits differ. To determine how many bits the current individual has correct for a particular input vector, you just need to count the number of 1s in the match output.
The final feature of the EvoBlock is the mask register. The mask register is used to mask out unwanted outputs from the match result. This is used when the circuit being evolved does not make use of all 8 cell array outputs. For example, if the circuit has only three outputs and these signals map to cell array outputs dOut2 to dOut0, the mask register will be loaded with 0x7. The match values for bits 7 down to 3 will then always be '0'.
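The match and mask logic can be modelled with integer bit operations. The sketch below treats the RAM output and the cell array output as 8-bit integers; the function names are hypothetical and the model ignores timing entirely.

```python
WIDTH = 8  # the EvoBlock compares 8 cell array outputs

def match_output(ram_out, cell_out, mask=0xFF):
    """XNOR the expected and actual outputs, then apply the mask register."""
    xnor = ~(ram_out ^ cell_out) & ((1 << WIDTH) - 1)  # 1 where the bits agree
    return xnor & mask                                  # masked-out bits read as 0

def correct_bits(ram_out, cell_out, mask=0xFF):
    """Count the 1s in the match output for one input vector."""
    return bin(match_output(ram_out, cell_out, mask)).count("1")

# Three used outputs (dOut2..dOut0): expected 0b101, circuit produced 0b100.
print(format(match_output(0b101, 0b100, mask=0x7), "08b"))  # 00000110
print(correct_bits(0b101, 0b100, mask=0x7))                 # 2
```

Without the mask, the five unused upper bits would all count as matches here and inflate the score, which is why the mask is applied before counting.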
Result
The fitness is evaluated by the GA. Once the GA process obtains the Truth Table of the current circuit from the appointed locations of the FPGA, it compares this truth table with the Truth Table of the target circuit bit by bit and counts the matching bits; the percentage of matching bits is the fitness of the current circuit. There will be some trivial differences in fitness evaluation between different GA processes.
Evolution runs were conducted on our on-chip evolution system and on a Pentium 4 (P4) workstation for speed comparisons. The P4 workstation has a clock frequency of 2 GHz.
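The fitness measure described above, the percentage of truth-table bits that agree with the target, can be written in a few lines. The representation of a truth table as a list of output rows and the name `fitness` are assumptions made for this sketch.

```python
def fitness(candidate_table, target_table):
    """Percentage of truth-table bits that agree with the target."""
    total = matched = 0
    for got_row, want_row in zip(candidate_table, target_table):
        for got, want in zip(got_row, want_row):
            total += 1
            matched += (got == want)
    return 100.0 * matched / total

# One of the four output bits differs, so the fitness is 75%.
print(fitness([[0, 1], [1, 1]], [[0, 1], [1, 0]]))  # 75.0
```

A fitness of 100.0 means the evolved circuit reproduces the target truth table exactly.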
For the speed test, 10,000 generations of 20 individuals were evolved. The fitness evaluation was for a 2 × 2 bit multiplier [7, 8, 9]. Correct 2 × 2 bit multiplier circuits were evolved after an average of 4902 generations over 10 evolution runs. The same experiment was conducted as verification on the PC platform, where the average was 5349. The difference can be explained by the two programs using different random number generators. Still, the results are rather similar, which indicates that the FPGA implementation works correctly. In our experiment, the maximum number of GA generations was 2,000,000 and the program was executed 10 times. Nine runs fell into the "fitness stalling effect" and only one reached a fitness of 100%. The low success rate is mainly attributed to the following reasons:
Firstly, an 8*8 VRA is theoretically enough to evolve a 2-bit multiplier, but in practice it is difficult to evolve a fully functional 4-bit adder in a finite number of generations with a size of 8*8, so increasing the size of the VRA will help to improve the success rate.
Secondly, the routing between multiplexer modules in our experiment was rigid. Every module could only connect to the modules of the previous column or to the primary inputs, and each module has only one output, so many modules were not fully utilized. A more flexible routing would increase the likelihood of successful evolution within a limited size.
Thirdly, the function styles defined in the proposed system are too simple (see table 1). All the function styles have only one output; adding multi-output or more complex function styles would improve the efficiency of evolution.
In addition, the GA operators in our experiment were not efficient enough, because we did not search for a better evolutionary algorithm. This may be another reason for the low success rate.
Conclusion
In this paper, we discussed the application of Evolutionary Algorithms to the problem of developing Evolvable Hardware. Evolvable Hardware can deal with practical industrial applications, especially those that require the ability to cope with time-varying problems and real-time constraints.
The powerful computation ability of our proposed evolvable platform is presented in the experimental results. When compared with the equivalent software simulation, our FPGA implementation obtains a performance increase of over 100 times in all cases. In addition, according to the analysis of the fitness evaluation process in intrinsic evolution, the proposed VRA shows promise in avoiding the bottleneck introduced by the slow reconfiguration speed of traditional intrinsic EHW based on the FPGA configuration bitstream. Future work will concentrate on developing the reported VRA for solving more complex and demanding real-world industrial problems.
To conclude, Evolutionary Algorithms are currently a fertile ground for research and application development. While a rich set of techniques and models is available, covering a range of domains, many areas remain to be understood and exploited. It must be noted, however, that this technique (like every other technique) places some limitations on the application areas it can be used in, and hence the scope for further development and research in this area is vast.
