Abstract-We present an IP core called PHCA, for Programmable Hardware Cellular Automaton (CA). PHCA is a fully programmable hardware implementation of a general-purpose cellular automaton. The heart of this structure is a Processing Element (PE) array with reconfigurable side links, which allows either a 2-D CA or a 1-D CA to be implemented. As an illustration of a PHCA program, we present the implementation of a symmetric cryptography algorithm called ISEA, for Ising Spin Encryption Algorithm. ISEA is based on a 2-D Ising spin lattice that produces random series of disordered spin configurations. The main idea of ISEA is to use these disordered spin configurations to encrypt data.
I. INTRODUCTION
Cellular automata were originally introduced by von Neumann for studying self-reproduction in biological systems [1] . They have since been used for language recognition and for modelling physical systems [2] . Their mathematical properties have also been studied, and more recently the concept of quantum cellular automata has been defined [3] . This paper proposes an IP (Intellectual Property) core for a Programmable Hardware Cellular Automaton (PHCA). PHCA is a powerful tool for designing and testing applications based on 1-D or 2-D cellular automata rules. Both an acyclic one-dimensional cellular automaton and a cyclic two-dimensional cellular automaton can be implemented in the PHCA. The architecture of the PHCA is a fine-grained, fully parallel structure inspired by a classic SIMD (Single Instruction Multiple Data) structure composed of 1-bit Processing Elements (PE) [4] .
An example PHCA program concerns a cryptography application. The field of cryptography continues to grow: electronic transactions have become very important and require security, since most of them involve payments or confidential data. Public key and secret key cryptographic algorithms provide a solution to this security problem. These algorithms can ensure data authenticity, integrity and confidentiality [5] . Secret key algorithms are more suitable for hardware implementation.
In the context of secret key algorithms, we propose a symmetric algorithm based on cellular automata rules. This algorithm is called Ising Spin Encryption Algorithm (ISEA) because it uses a system of Ising spins. In this paper we focus on a 2-D Ising spin lattice [6; 7] . The time evolution of the spin configuration in this lattice is governed by local rules that lead, under certain conditions, to disordered configurations. The configuration space is explored by a random walk imposed by a microcanonical Monte Carlo method [8] . ISEA uses the disordered spin configurations to encrypt data by combining the spin lattice with an array of data to be encrypted. This encryption process is rather fast. Moreover, the permanent exchanges between neighbour sites introduce a constant noise that is useful against attacks based on power analysis.
The PHCA may be programmed according to 1-D or 2-D cellular automata (CA) rules. This work focuses on the PHCA programmed according to ISEA rules. Each site of the spin lattice system is updated by a PE, and all the PEs apply the same rule concurrently. An example of a resulting encrypted data array is given below. A first version of a PHCA with a 32x32 PE array has been implemented in a Xilinx XC3S5000 FPGA.

More precisely, the next state of each cell depends on the present state of the neighbour cells [9] . The cell to be updated may or may not be included in this neighbourhood. A cellular automaton can be of any dimension and can be cyclic or acyclic. In order to design an IP core that maps this definition, we chose to describe a fine-grained multiprocessor structure operating in fully parallel mode. For the instruction stream organization, we chose a SIMD scheme in order to avoid synchronization and connection problems.
The heart of this SIMD structure is an array of Processing Elements (PE) controlled by the same instruction. The memory is distributed. At each clock cycle, all the PEs execute concurrently the same instruction on the data stored in their internal memory elements.
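The lock-step behaviour described above can be illustrated in software. The following is a minimal sketch, not the PHCA instruction set: it assumes a cyclic 2-D array of 1-bit cells, and the names `simd_step` and `parity_rule` are hypothetical. Every cell applies the same rule to the present state of its four neighbours, and the next state is built from a snapshot so that all cells appear to update concurrently.

```python
def simd_step(cells, rule):
    """Apply the same local rule to every cell of a cyclic 2-D array at once."""
    m, n = len(cells), len(cells[0])
    # The next state is computed from the present state only, so no cell
    # ever sees a half-updated neighbour (this mimics a common clock).
    return [
        [
            rule(
                cells[(r - 1) % m][c],  # north neighbour (wraps around)
                cells[(r + 1) % m][c],  # south neighbour
                cells[r][(c + 1) % n],  # east neighbour
                cells[r][(c - 1) % n],  # west neighbour
                cells[r][c],            # the cell itself; a rule may ignore it
            )
            for c in range(n)
        ]
        for r in range(m)
    ]


def parity_rule(north, south, east, west, centre):
    """Illustrative rule (an assumption): XOR of the four neighbours."""
    return north ^ south ^ east ^ west


# A single set cell in the middle of a 5x5 cyclic array.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1
next_grid = simd_step(grid, parity_rule)
```

With this rule the single set cell is replaced by its four neighbours, while the input array is left unchanged. Dropping the wrap-around indexing on one axis would give an acyclic variant.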
B. PHCA Symbol and Interconnections
The PHCA logic symbol for an MxN PE array is given in Fig. 1 . The external data enter through the N-bit south data bus (CMS) and exit through the N-bit north data bus (CMN). The thirteen Control lines bring the same instruction word to each PE. As we shall see below, each PE has a private 32x1-bit RAM controlled by the W/R input and addressed by the 5-bit ADD input bus. All the registers of the PHCA are synchronized by the same clock Clk.
C. PHCA Processing Element
The IP core contains MxN single-bit Processing Elements. The structure of a PE is detailed in Fig. 3 . A PE is equipped with a 32-bit RAM, five multiplexers, one single-bit Arithmetic and Logic Unit (ALU), four registers (NS, EW, C, CM), and input/output ports on all four sides. The multiplexer inputs are selected by the Control word. The registers and RAM accept data from up to eight possible sources through the five multiplexers. The ALU is a full adder/subtractor. The result of an addition is given on the ALU outputs CY and SM, and the result of a subtraction on the ALU outputs BW and SM. Under certain conditions, the ALU outputs CY, SM and BW also correspond to logic operations. N/S and E/W links connect a processor cell to its four neighbours. CMS/CMN links provide the PE array with a second vertical link system, which is particularly useful because it does not communicate with the ALU. These CMS/CMN links therefore allow a South-to-North shift of the data stream through the whole array concurrently with other PE operations. The dark-grey outputs are re-injected as multiplexer inputs into the PE itself. Up to five commands can be executed simultaneously during each instruction cycle.
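As an illustration of the single-bit full adder/subtractor, the following sketch computes the three outputs named in the text. The input names `a`, `b` and `c_in` and the function name `alu` are assumptions, since the operand routing of the actual PE is defined by Fig. 3 and the Control word rather than by this text. It uses the standard full-adder and full-subtractor equations.

```python
def alu(a, b, c_in):
    """1-bit full adder/subtractor sketch; returns (SM, CY, BW).

    Hypothetical inputs: two operand bits and a carry/borrow-in bit.
    """
    sm = a ^ b ^ c_in                            # sum bit, also the difference bit
    cy = (a & b) | (c_in & (a ^ b))              # carry-out of a + b + c_in
    bw = ((a ^ 1) & b) | (c_in & ((a ^ b) ^ 1))  # borrow-out of a - b - c_in
    return sm, cy, bw


# Exhaustive truth table over the eight input combinations.
table = {(a, b, c): alu(a, b, c)
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}
```

One standard way in which such outputs double as logic operations: with `c_in = 0`, SM is `a xor b` and CY is `a and b`; with `c_in = 1`, CY is `a or b`. Whether the PHCA uses exactly these conditions is not stated in the text.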
III. LOCAL RULES FOR EN/DECRYPTION
Since a 2-D Ising spin lattice presents a random series of disordered spin configurations, we decided to use these configurations to encrypt data.
A. Main idea
Numerical simulations are powerful tools for studying phase transitions in statistical systems. Monte Carlo and molecular dynamics represent two complementary schemes for such simulations. The Microcanonical Monte Carlo (MMC) method [8] is a simulation algorithm that interpolates between the Monte Carlo and molecular dynamics techniques. The MMC method consists of taking a random walk on a surface of constant energy. This random walk generates successive configurations of the statistical system. In order to ensure a fast and secure encryption of sensitive data through the PHCA, we propose to use these configurations. The PHCA has to perform the following three actions:
* storing the data to be encrypted coming from the south input bus CMS and shifting these data through the PE array up to the north output bus CMN;
* ensuring a permanent random walk by executing the Microcanonical Monte Carlo local rules;
* combining the data flow and the lattice statistical system configurations in order to encrypt the data.
B. Microcanonical Monte Carlo Method
The statistical system to be simulated is the 2-D Ising model. Let us consider a square lattice of MxN sites with one spin S at each site. The spins may be up or down. With the MMC method, each site i is also provided with a reservoir containing an energy Er,i.
Two kinds of energies are involved in this model. The first one is a magnetic interaction energy; for a link (ij) between two neighbour sites i and j, the magnetic energy is expressed by mij = Si xor Sj. So mij = 0 if the two considered spins are parallel, otherwise mij = 1. The second kind of energy is called "reservoir" energy; it is the sum of all the private site reservoir energies Er,i.

So two arrays of 1-bit values coexist simultaneously in the PHCA: the array of spins, updated at each time step, and the array of data shifting to the north. In order to encrypt the data, each PE XORs the data bit with the spin bit.
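The two energies defined above can be made concrete with a short sketch. The local update rule itself is not reproduced in this text, so the `sweep` function below is only an assumption modelled on a demon-style microcanonical move: a spin flips when its private reservoir can pay for, or absorb, the resulting change in magnetic energy. The reservoir `capacity` and the sequential site order are also hypothetical, and the lattice is assumed cyclic with at least three sites per side.

```python
def magnetic_energy(spins):
    """Sum of m_ij = S_i xor S_j over all links of a cyclic lattice."""
    m, n = len(spins), len(spins[0])
    total = 0
    for r in range(m):
        for c in range(n):
            total += spins[r][c] ^ spins[r][(c + 1) % n]  # east link
            total += spins[r][c] ^ spins[(r + 1) % m][c]  # south link
    return total


def total_energy(spins, reservoir):
    """Magnetic energy plus the sum of all private reservoir energies."""
    return magnetic_energy(spins) + sum(map(sum, reservoir))


def sweep(spins, reservoir, capacity=3):
    """One sequential sweep of a hypothetical demon-style move (in place)."""
    m, n = len(spins), len(spins[0])
    for r in range(m):
        for c in range(n):
            s = spins[r][c]
            neighbours = (spins[(r - 1) % m][c], spins[(r + 1) % m][c],
                          spins[r][(c - 1) % n], spins[r][(c + 1) % n])
            local = sum(s ^ nb for nb in neighbours)  # antiparallel links
            delta = 4 - 2 * local       # magnetic energy change if s flips
            new_res = reservoir[r][c] - delta
            if 0 <= new_res <= capacity:  # reservoir can pay or absorb
                spins[r][c] = s ^ 1
                reservoir[r][c] = new_res


# 4x4 checkerboard: every one of the 32 links is antiparallel.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
spins = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
reservoir = [[1, 0, 2, 1], [0, 1, 1, 0], [2, 0, 0, 1], [1, 1, 0, 2]]
energy_before = total_energy(spins, reservoir)
for _ in range(3):
    sweep(spins, reservoir)
energy_after = total_energy(spins, reservoir)
```

Because each accepted flip moves exactly `delta` units between the links and the site reservoir, the total energy is unchanged by a sweep, which is the constant-energy random walk the MMC method relies on. A fully concurrent update of neighbouring sites would need extra care to keep this property; how the PHCA schedules its updates is not covered by this sketch.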
During the initialisation phase, the programmer has to choose the initial spin configuration and to distribute the reservoir energy. The programmer then has to choose the number of iterations of the MMC rules to compute before XORing the spin bit and the data bit. These choices constitute the key of the encryption process. This cryptography algorithm is symmetric and the key is secret.
V. RESULTS AND DISCUSSION
The first sub-section describes the sender and receiver actions needed to manage the encryption and decryption of the data flow, and gives an illustration of the corresponding results. The second sub-section presents our first FPGA implementation of PHCA. The resulting resource usage and performance are shown and compared with those of other cores.
A. Encryption/decryption process and results
The simulation results of a data stream encryption and decryption are presented in Fig. 5 . The sender and the receiver each need a PHCA. Let us consider for instance an array of 256x512 PEs. Fig. 5.a shows the clear message, where a black site represents a data bit equal to '1'. The encryption process follows the steps described below:
1) The sender imposes the initial spin values S and distributes the reservoir energy R. The sender then programs the PHCA to perform U (one thousand in the example) initial spin lattice configuration updates.
2) The sender introduces the clear message through the south side of the PHCA, row by row. The message shifts to the north, and after each shift step it is XORed with the spin lattice configuration. The result of this clear message transformation is given in Fig. 5.b. One can notice that the initial message becomes completely scrambled.
3) The receiver has to get the secret key (i.e. S, R and U) through a secure channel. The receiver then initialises the PE array and programs it to perform U spin lattice configuration updates.
4) The receiver introduces the encrypted message into the south side of its PHCA.
These operations recover the initial data message (Fig. 5.c) at the north side of the receiver PHCA.

The implementation of ISEA in PHCA leads to a data rate of 3 Mbits/s, which is much lower than the performance obtained from the core dedicated to ISEA. Nevertheless, PHCA has the important advantage of being programmable and of having configurable interconnect. It could easily support multiple algorithms.
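The sender and receiver steps of this sub-section can be sketched in software to show why running the same procedure twice recovers the message. This is a simplified model under stated assumptions: `step_spins` is a hypothetical stand-in for the lattice update and is not the MMC rule, so the key here reduces to the initial spins S and the warm-up count U (the reservoir energy R is omitted). The function name `run_phca` and the small array size are also illustrative only.

```python
def step_spins(spins):
    """Hypothetical stand-in lattice update (NOT the MMC rules)."""
    m, n = len(spins), len(spins[0])
    return [[spins[r][c] ^ (spins[(r - 1) % m][c] & spins[r][(c + 1) % n])
             for c in range(n)] for r in range(m)]


def run_phca(rows_in, spins, warmup):
    """Shift rows from south to north, XORing with the spins after each shift."""
    m, n = len(spins), len(spins[0])
    for _ in range(warmup):               # the U initial lattice updates
        spins = step_spins(spins)
    data = [[0] * n for _ in range(m)]    # row 0 is north, row m-1 is south
    out = []
    for t in range(len(rows_in) + m):     # m extra steps flush the array
        incoming = rows_in[t] if t < len(rows_in) else [0] * n
        out.append(data[0])               # row leaving through the north bus
        data = data[1:] + [list(incoming)]  # shift north, new row enters south
        data = [[d ^ s for d, s in zip(d_row, s_row)]
                for d_row, s_row in zip(data, spins)]
        spins = step_spins(spins)         # the lattice keeps evolving
    return out[m:]                        # drop the m initial empty rows


key_spins = [[0, 1, 1, 0], [1, 0, 0, 1], [1, 1, 0, 1]]   # S (illustrative)
key_updates = 5                                            # U (illustrative)
message = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0], [0, 1, 0, 1]]

cipher = run_phca(message, key_spins, key_updates)     # sender
recovered = run_phca(cipher, key_spins, key_updates)   # receiver, same key
```

In this model the spin evolution does not depend on the data, so a row entering at a given step meets the same sequence of spin bits on both sides, and the second pass cancels the first. This holds for any deterministic lattice update, which is why the stand-in rule is sufficient to illustrate the symmetry.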
