Abstract
Introduction
Radio Frequency Identification (RFID) is a promising technology for the automatic identification of remote objects. In an RFID system, each object is labeled with a small transponder, called an RFID tag, which receives and responds to radio-frequency queries from a transceiver, called an RFID reader. An RFID tag is composed of a tiny integrated circuit for storing and processing identification information, as well as a radio antenna for wireless data transmission. There are three basic types of RFID tags: active, semi-passive, and passive tags. Active tags contain internal batteries so that they can initiate communications with the reader, whereas a passive tag does not contain any battery and obtains power solely from the reader for both computation and communication. Semi-passive tags use batteries only to power their circuits and harvest power from the reader for communication. Passive RFID tags usually have constrained capabilities in every aspect of computation, communication and storage because of their extremely low production cost. The reading range of a passive tag is up to several meters. For most RFID applications, security and privacy are important or even crucial requirements [16]. Since most protocols for securing RFID systems proposed so far are based on the usage of an on-board true random and/or pseudorandom number generator (TRNG/PRNG), a number of solutions have been proposed in the literature for implementing TRNGs/PRNGs on RFID tags [2, 7, 15, 19, 20, 22]. In particular, the EPCglobal Class-1 Generation-2 (EPC C1 Gen2 in brief) standard [10] uses a couple of 16-bit random numbers in the tag identification protocol. All of the proposals for TRNGs are based on analog circuits that sample a random physical phenomenon like thermal noise.
To the best of our knowledge, only three PRNGs have been proposed for EPC C1 Gen2 passive tags [7, 20, 22], among which two proposals use TRNGs as a component, and the security properties of those two PRNGs rely on the security of the TRNGs.
Considering the high power consumption, large area and low throughput of TRNGs, we propose a lightweight PRNG for low-cost EPC C1 Gen2 tags in this paper. The basic idea of our design is to replace the TRNG in [7, 20] by a lightweight pseudorandom sequence generator with good statistical properties. To this end, nonlinear feedback shift registers (NLFSRs) have been fully exploited in our design. The security properties of the proposed PRNG are analyzed in great detail by using the cryptographic statistical tests specified by the EPC C1 Gen2 standard as well as the NIST test suite. Various cryptanalysis techniques have been applied to demonstrate the attack-resistant properties of the proposed PRNG. Furthermore, a hardware implementation on a Xilinx Spartan-3 FPGA device shows that the new PRNG can be implemented using 46 slices.
Upon receiving a request from the reader, a tag generates a 16-bit random number, denoted by RN16, and temporarily stores the number in a slot counter. When the slot counter is zero, the tag backscatters RN16 to the reader. Thereafter, the reader copies RN16 to an acknowledgement packet to be sent to the tag. When the tag receives the acknowledgement packet, it first compares the random number in the acknowledgement packet with RN16. If these two numbers are the same, then the tag backscatters the acknowledgement packet.
Figure 1. EPC C1 Gen 2 Tag Identification Protocol
In the access operation (Steps 5-7 in Figure 1), after receiving a request, denoted by Req_RN, from the reader, the tag compares the random number in the request Req_RN with the stored RN16. If these two random numbers match, the tag generates another random number RN16', called a handle, and backscatters it to the reader. The reader then issues commands such as Read, Write, and BlockWrite. Steps 8-10 in Figure 1 demonstrate a further access operation. Note that for each access operation the tag generates a new random number.
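The tag-side logic of the handshake described above can be sketched as follows. This is a minimal illustration, not the standard's state machine: the class name `Tag`, its method names, the stored identifier `epc` returned on a valid acknowledgement, and the use of Python's `secrets` module as the randomness source are all assumptions made for the example, and the slot counter is omitted.

```python
import secrets


class Tag:
    """Toy model of the tag side of the EPC C1 Gen2 handshake (hypothetical names)."""

    def __init__(self, epc, rng=None):
        self.epc = epc
        # Stand-in randomness source; a real tag would use its on-board PRNG.
        self.rng = rng or (lambda: secrets.randbits(16))
        self.rn16 = None
        self.handle = None

    def on_query(self):
        """Identification: draw a fresh RN16 and backscatter it."""
        self.rn16 = self.rng()
        return self.rn16

    def on_ack(self, rn):
        """Reply only if the reader echoed the RN16 that was just sent."""
        return self.epc if rn == self.rn16 else None

    def on_req_rn(self, rn):
        """Access: a matching Req_RN yields a new random handle RN16'."""
        if rn != self.rn16:
            return None
        self.handle = self.rng()
        return self.handle
```

Every successful step consumes one fresh 16-bit number, which is why the quality of the tag's generator matters for the protocol.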
Related Work
In this section, we give a brief overview of three previous PRNG proposals for EPC C1 Gen2 passive RFID tags.
Che et al.'s PRNG
Che et al. [7] designed a PRNG based on a combination of an oscillator-based TRNG and a linear feedback shift register (LFSR) with 16 stages. In their design, the TRNG is implemented using an analog circuit and exploits the thermal noise of the circuit. To introduce randomness, one truly random bit from the TRNG is XORed with each bit of a 16-bit sequence generated from the LFSR. In 16 clock cycles, a 16-bit random number is generated by the PRNG. Due to its linear structure, Che et al.'s scheme has been attacked by Melia-Segui et al. in [20] with a high success probability +1 8 , where n is the length of the LFSR.
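The structure of such an LFSR-plus-TRNG generator can be sketched as below. The sketch is illustrative only: the feedback taps (16, 14, 13, 11) are a commonly used maximal-length choice and are an assumption, since the polynomial used in [7] is not given here, and the TRNG is replaced by `secrets.randbits(1)`.

```python
import secrets


def lfsr16_step(state):
    """One step of a 16-stage Fibonacci LFSR (assumed taps 16, 14, 13, 11).

    Returns (new_state, output_bit).
    """
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15), state & 1


def che_style_rn16(state, trng_bit=lambda: secrets.randbits(1)):
    """Return (new_state, 16-bit number); each LFSR bit is XORed with one TRNG bit."""
    out = 0
    for _ in range(16):
        state, lfsr_bit = lfsr16_step(state)
        out = (out << 1) | (lfsr_bit ^ trng_bit())
    return state, out
```

Because the only nonlinearity-free component is the LFSR, an observer who can cancel or guess the TRNG bits is left with a purely linear recurrence, which is the weakness the attack in [20] exploits.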
Melia-Segui et al.'s PRNG
To avoid such an attack on Che et al.'s PRNG, Melia-Segui et al. [20] proposed a similar design by employing multiple primitive polynomials instead of one in the LFSR. The design consists of a true random source, a module with eight primitive polynomials, and a decoding circuit taking inputs from the true random source, where the decoding circuit is designed in such a way that the same primitive polynomial is not chosen consecutively. At each clock cycle, one primitive polynomial is chosen according to the decoding logic and true random bits for producing a pseudorandom bit. Thus, the PRNG produces a 16-bit random number in 16 clock cycles and the security of the PRNG relies on the TRNG.
Peris-Lopez et al.'s PRNG
In [22], Peris-Lopez et al. proposed a PRNG named LAMED for RFID tags, which is in compliance with the EPC C1 Gen2 standard and can provide 32-bit as well as 16-bit random numbers. The basic operations for updating the internal state of LAMED consist of bitwise XOR operations, modular algebra, and bit rotations. The internal state of LAMED is 64 bits long, including a 32-bit key and a 32-bit initial vector (IV). The key length can be further increased by replacing the IV bits with key bits. Note that LAMED always outputs a 32-bit random number; a 16-bit random number is obtained by dividing the 32-bit number into two equal halves and XORing them together.
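The folding step that turns a 32-bit output into a 16-bit number can be written in a few lines. The function name `fold32_to_16` is hypothetical and the sketch covers only the final XOR of the two halves, not LAMED's internal state update.

```python
def fold32_to_16(x):
    """XOR the upper and lower 16-bit halves of a 32-bit word."""
    x &= 0xFFFFFFFF
    return (x >> 16) ^ (x & 0xFFFF)
```

For example, `fold32_to_16(0x12345678)` combines `0x1234` and `0x5678`.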
Preliminaries
In this section, we define some terms and notations that will be used to describe the proposed pseudorandom number generator. The WG transformation can be used with a decimation $d$, where $d$ is co-prime to $2^m - 1$ [12]. The WG transformation has excellent cryptographic properties such as high nonlinearity, high algebraic degree and at least 1-order resiliency for a proper selection of basis. Moreover, a sequence generated by the WG transformation has high linear complexity.
Notations:
- $\mathbb{F}_2 = GF(2) = \{0, 1\}$, the Galois field with two elements.
Description of the Proposed PRNG
The proposed PRNG is composed of two main building blocks. The first one consists of two NLFSRs of length 17 and 18 over $\mathbb{F}_2$, each one generating a span n sequence or modified de Bruijn sequence with optimal linear complexity, whereas the second one includes an NLFSR over $\mathbb{F}_{2^5}$, and each NLFSR uses one or two WG transformation modules. In our design, the binary sequence generated by the first building block is converted to a sequence over $\mathbb{F}_{2^5}$ and this sequence is used in the recurrence relation in the second building block. The final output sequence is filtered by the WG transformation and n-bit random numbers are generated by taking disjoint n-bit sequences from the final output sequence. A high-level architecture of the proposed PRNG is illustrated in Figure 2.
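The two data-format conversions mentioned above, grouping a binary sequence into 5-bit symbols and cutting an output bit stream into disjoint n-bit numbers, can be sketched as follows. The function names are hypothetical, and the bit order (first bit taken as most significant) is an assumption, since the ordering is not fixed by the description.

```python
def bits_to_symbols(bits, width=5):
    """Group a bit list into width-bit integers; leftover bits are dropped.

    Assumption: the first bit of each group is the most significant one.
    """
    usable = len(bits) - len(bits) % width
    symbols = []
    for i in range(0, usable, width):
        value = 0
        for bit in bits[i:i + width]:
            value = (value << 1) | bit
        symbols.append(value)
    return symbols


def disjoint_numbers(bits, n=16):
    """Cut an output bit stream into disjoint n-bit random numbers."""
    return bits_to_symbols(bits, n)
```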
Building Block I: An Alternative to TRNG
The first building block contains two NLFSRs whose lengths are chosen to be co-prime in order to achieve the maximum period. Two shorter NLFSRs are used instead of a long one because it is impossible to generate shift distinct sequences from a long NLFSR for different initial states. In other words, by XORing the output sequences from two NLFSRs we can obtain shift distinct sequences for different initial states. In our design, the WG transformation with decimation $d = 3$ over $\mathbb{F}_{2^5}$, denoted by WG5 in Figure 2, is used as a nonlinear feedback function to generate span n sequences. For $m = 5$, the WG permutation is defined as
$$\mathrm{WGP}(x) = x + (x+1)^5 + (x+1)^{13} + (x+1)^{19} + (x+1)^{21}, \quad x \in \mathbb{F}_{2^5},$$
and the WG transformation over $\mathbb{F}_{2^5}$ is given by
$$\mathrm{WG}(x) = \mathrm{Tr}(\mathrm{WGP}(x)), \quad x \in \mathbb{F}_{2^5},$$
where $\mathrm{Tr}$ denotes the trace function from $\mathbb{F}_{2^5}$ to $\mathbb{F}_2$, which has the maximum nonlinearity 12, the algebraic degree 3 and the maximum algebraic immunity 3. The n-stage nonlinear recurrence relation is defined as
$$a_{n+k} = a_k \oplus \mathrm{WG}(x^d), \quad x = (a_{r_1+k}, a_{r_2+k}, a_{r_3+k}, a_{r_4+k}, a_{r_5+k}) \in \mathbb{F}_{2^5},$$
for all $k \ge 0$, and $0 < r_1 < \cdots < r_5 < n$ are tap positions of two NLFSRs, where $\oplus$ denotes addition over $\mathbb{F}_2$. Using the parameters and recurrence relations in Table 1, we can generate two span n sequences $a = \{a_i\}_{i \ge 0}$ and $b = \{b_i\}_{i \ge 0}$ with NLFSR1 and NLFSR2, respectively. The output sequence of the first building block is denoted by $s = \{s_i\} = \{a_i \oplus b_i\}$, $i \ge 0$, which is almost balanced and has the following statistical properties:
a) The period is $(2^{17} - 1)(2^{18} - 1)$ [17]. The sequence $t$ is used in the second building block for introducing nonlinearity in the recurrence relation every 5 clock cycles (see Section 5 for details). This building block is used as an alternative to the TRNG in [7, 20].
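To make the WG transformation over $\mathbb{F}_{2^5}$ concrete, the sketch below evaluates it in software. Several choices are assumptions made for illustration: the field is represented in a polynomial basis modulo $x^5 + x^2 + 1$ (the basis used in the design is not stated here), the WG permutation is taken in its usual five-term form with exponents 5, 13, 19 and 21, and the function names are hypothetical. Nonlinearity and algebraic degree are not computed.

```python
MOD = 0b100101  # x^5 + x^2 + 1, an assumed irreducible polynomial for GF(2^5)


def gf_mul(a, b):
    """Multiply two elements of GF(2^5) given as 5-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100000:
            a ^= MOD
    return result


def gf_pow(a, e):
    """Raise a to the power e (e >= 1) by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result


def trace(x):
    """Tr(x) = x + x^2 + x^4 + x^8 + x^16, which is always 0 or 1."""
    total = 0
    for _ in range(5):
        total ^= x
        x = gf_mul(x, x)
    return total


def wgp(x):
    """WG permutation: x + (x+1)^5 + (x+1)^13 + (x+1)^19 + (x+1)^21."""
    y = x ^ 1  # addition of 1 in characteristic 2
    return x ^ gf_pow(y, 5) ^ gf_pow(y, 13) ^ gf_pow(y, 19) ^ gf_pow(y, 21)


def wg(x, d=1):
    """WG transformation with decimation d: Tr(WGP(x^d))."""
    return trace(wgp(gf_pow(x, d)))
```

Evaluating `wg(x, 3)` for all 32 field elements gives the truth table of the 5-input Boolean function that serves as the feedback and filter function; in hardware this would typically be a small lookup table or combinational circuit rather than field arithmetic.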
Building Block II: Pseudorandom Number Generator
The second building block consists of an NLFSR and two WG transformation modules given by $\mathrm{WG}(x)$ and $\mathrm{WG}(x^3)$, respectively. Let the length of NLFSR3 be $n = 6$ and the primitive polynomial be $p(x) = x^6 + x + \gamma$, where $\gamma = \alpha^{15} \in \mathbb{F}_{2^5}$. The recurrence relation (1) of NLFSR3 extends the linear recurrence given by $p(x)$ with two additional terms: a nonlinear feedback whose least significant bit is generated by the WG transformation $\mathrm{WG}(x)$, and the sequence $t = \{t_k\}_{k \ge 0}$ over $\mathbb{F}_{2^5}$ that is defined in the previous subsection. While the WG transformation $\mathrm{WG}(x)$ (i.e., VWG5 in Figure 2) is only used as a nonlinear feedback function in NLFSR3, the WG transformation $\mathrm{WG}(x^3)$ (i.e., WG5 in Figure 2) is employed to generate nonlinear feedback for NLFSR1 and NLFSR2 as well as to filter the output sequences. In the recurrence relation (1), the nonlinearity is introduced by these two terms, and this feedback will affect other bit positions after multiplication by $\gamma$. Note that the period of the sequence $c = \{c_k\}_{k \ge 0}$ of NLFSR3 is a multiple of the period of $t$. Moreover, the final output sequence $o = \{o_k\}$ of the second building block is defined by $o_k = \mathrm{WG}(c_{k+5}^3)$ for $k \ge 0$. The period of $o$ is a multiple of
System Initialization
The proposed PRNG has an internal state of 65 bits, including a 45-bit secret seed as well as a 20-bit initial vector (IV). While the secret seed and the IV are preloaded into RFID tags at the very beginning, the 20-bit IV is also updated at the end of each protocol session. Before generating random numbers, a 36-round initialization phase is applied to mix the key and IV properly. In our design, the secret seed and IV are preloaded as follows: the first consecutive 12, 11 and 22 positions of NLFSR1, NLFSR2 and NLFSR3 are respectively reserved for key bits, whereas the remaining positions in each NLFSR are for the IV. The initialization process is illustrated in Figure 3. During the initialization phase the internal states of the three NLFSRs are updated as follows:
Here $k \ge 0$, and $a_{18+k}$, $b_{17+k}$ and $c_{k+6}$ are the updated values of NLFSR1, NLFSR2 and NLFSR3, respectively, and the nonlinear feedback term of NLFSR3 is generated by the WG transformation $\mathrm{WG}(x)$. Sequence $\{s_k\}$ is the XOR sequence of two output bits from NLFSR1 and NLFSR2, and five consecutive $s_k$'s are collected to form a 5-bit vector $t_k$. The output of NLFSR3 is used as a nonlinear feedback to affect the internal states of both NLFSR1 and NLFSR2.
Remark 1. A 20-bit IV can be generated from the initial SRAM state of tags when tags are powered up (see [15]). The entropy of the IV can also be increased by employing the von Neumann technique, which can be efficiently implemented in hardware [24]. However, the implementation of these components needs additional hardware support.
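The von Neumann technique mentioned in Remark 1 is simple enough to show in full. The sketch below is the textbook debiasing procedure, not code from [24]; the function name and the choice to emit the first bit of each unequal pair are conventions assumed for the example.

```python
def von_neumann(bits):
    """Debias a bit list: the pair 01 gives 0, 10 gives 1, 00 and 11 are discarded."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out
```

The output is shorter than the input (at most half as long), so collecting a 20-bit IV this way needs correspondingly more raw SRAM bits.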
Security Analysis of the PRNG
The security analysis of the proposed PRNG is conducted in two steps. In the first step, we performed all cryptographic statistical tests that are specified in the EPC C1 Gen2 standard [10] and the NIST standard [23] on several sets of pseudorandom sequences generated by the proposed PRNG with different initial states. In the second step, we investigate the attack resistant properties of the new PRNG by launching the algebraic attacks, cube attacks, and time-memory-data tradeoff attacks.
Randomness Analysis of the PRNG
According to the EPC C1 Gen2 standard, a true random or pseudorandom number generator must satisfy the following three statistical properties:
- Probability of a single sequence: The probability that any 16-bit random sequence (RN16) drawn from the PRNG has value j shall be bounded by $0.8/2^{16} < \Pr(RN16 = j) < 1.25/2^{16}$.
- Probability of simultaneously identical sequences: For a tag population up to 10,000, the probability that any two or more tags simultaneously generate the same sequence of bits shall be less than 0.1%, regardless of when the tags are energized.
- Probability of predicting a sequence: A sequence drawn from the PRNG 10 ms after the end of transmission shall not be predictable with a probability greater than 0.025% if the outcomes of prior draws from the PRNG, performed under identical conditions, are known.
We implemented our PRNG in software to check whether it meets the above three criteria. To verify the first criterion, we generated 18 different test sequences for different initial states of the NLFSRs and calculated the probability of occurrence of 16-bit numbers. Our experimental results show that the probability of any 16-bit number j, i.e., $\Pr(RN16 = j)$, lies within bounds that are better than those obtained in [20]. The upper and lower bounds of the probability values for different tests are given in the 2nd and 3rd columns of Table 2.
Due to the high linear span of the sequence s, it is impossible in practice to generate the next consecutive 80 bits from the previous known 80 bits. Furthermore, it is also difficult for an adversary to intercept $2^{16.26}$ consecutive random numbers in one protocol session because the communication session in RFID systems is usually quite short and the IV differs from session to session. Moreover, the secret seed can also be updated for different sessions. Hence, the attacker can guess the next 16-bit random number with probability at best $2^{-16}$, which is much less than 0.025% as specified in the EPC C1 Gen2 standard.
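A check of the first criterion on a sample of 16-bit outputs might look like the sketch below. It is only an illustration of the bound $0.8/2^{16} < \Pr(RN16 = j) < 1.25/2^{16}$; the function names are hypothetical, and the sample size needed for a statistically meaningful estimate is not addressed.

```python
from collections import Counter

LOW = 0.8 / 2**16
HIGH = 1.25 / 2**16


def rn16_probability_range(samples):
    """Return (min, max) of the empirical probabilities over all 2^16 values."""
    counts = Counter(samples)
    total = len(samples)
    probs = [counts.get(j, 0) / total for j in range(2**16)]
    return min(probs), max(probs)


def meets_first_criterion(samples):
    """True if every empirical probability lies strictly inside the bounds."""
    lowest, highest = rn16_probability_range(samples)
    return LOW < lowest and highest < HIGH
```

For the third criterion, a guess that succeeds with probability $2^{-16}$ is about 0.0015%, which is well below the 0.025% limit.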
To measure the linear dependency between an n-bit output and the previous n-bit output, we performed a serial correlation test [18] on the sequences generated by the PRNG. We generated 18 distinct sequences for different initial values of the NLFSRs, each of size $2^{26}$ bytes, and calculated the serial correlation coefficient for 1-bit, 1-byte and 2-byte lags. Our experimental results demonstrate that the serial correlation coefficients are close to zero, which indicates the good randomness of the generated sequences. The serial correlation coefficients for different sequences are given in the 4th, 5th and 6th columns of Table 2.
Non-overlapping template matching test results are not given in Table 3 because they comprise 148 entries. However, the proposed PRNG has passed this test successfully. Each TS set passes the NIST test suite successfully.
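The serial correlation coefficient can be computed as in the following sketch, which uses the usual circular form of the statistic. Whether this matches the exact variant used in [18] is an assumption; the function name is hypothetical, and the caller decides whether the values are bits, bytes or 16-bit words.

```python
def serial_correlation(values, lag=1):
    """Circular serial correlation coefficient between values[i] and values[i + lag]."""
    n = len(values)
    sum_x = sum(values)
    sum_sq = sum(v * v for v in values)
    sum_lag = sum(values[i] * values[(i + lag) % n] for i in range(n))
    denom = n * sum_sq - sum_x ** 2
    if denom == 0:
        raise ValueError("constant sequence: coefficient undefined")
    return (n * sum_lag - sum_x ** 2) / denom
```

A strictly alternating bit sequence gives a coefficient of -1 at lag 1 and +1 at lag 2, whereas a sequence with good randomness should give values close to zero at every lag.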
Cryptanalysis of the PRNG
In this subsection, the attack-resistant properties of the PRNG are investigated by considering algebraic attacks, cube attacks, and time-memory-data tradeoff attacks in detail. Since our PRNG uses nonlinear feedback shift registers over different fields, we also explain below why correlation attacks [21], Discrete Fourier Transformation (DFT) attacks [13], and differential attacks [25] are not applicable.
Algebraic Attack
Algebraic attack [8]
Cube Attack
Cube attack [9] is a generic key-recovery attack that can be applied to any cryptosystem, provided that the attacker can obtain a bit of information that can be represented by a low-degree multivariate polynomial in Algebraic Normal Form of the secret and public variables of the target cryptosystem. In the setting of the cube attack, our PRNG can be regarded as a system of multivariate polynomials $p(k_1, \dots, k_{45}, v_1, \dots, v_{20})$ with public IV variables $v_1, v_2, \dots, v_{20}$ and secret key variables $k_1, k_2, \dots, k_{45}$.
The polynomial
We implemented the cube attack against our PRNG in CUDA and exploited the power of a GPU (i.e., a Tesla C2070 from NVIDIA) to accelerate the computation significantly. We took the first output bit after the 36-round initialization phase in order to find the maxterms in the master polynomial and performed an exhaustive search over all possible cube dimensions ranging from 1 to 20. However, our experiment did not find any linear or quadratic superpoly equations for any cube dimension.
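The core operation of such an experiment, summing the output bit over all assignments of a cube of IV variables and testing whether the resulting superpoly is linear in the key, can be illustrated on a toy polynomial. Everything in the sketch is hypothetical: `toy_poly` is a made-up three-key-bit, two-IV-bit function and not the PRNG's output, and the exhaustive linearity check is only feasible because the key is tiny (a real search would use random probabilistic tests).

```python
from itertools import product


def toy_poly(key, iv):
    """Made-up master polynomial over GF(2); not the PRNG's real output bit."""
    k0, k1, k2 = key
    v0, v1 = iv
    return (v0 & v1 & (k0 ^ k1)) ^ (v0 & k2) ^ v1 ^ (k0 & k1)


def cube_sum(poly, key, cube, n_iv):
    """XOR poly over all assignments of the cube variables; other IV bits are 0."""
    acc = 0
    for assignment in product((0, 1), repeat=len(cube)):
        iv = [0] * n_iv
        for idx, bit in zip(cube, assignment):
            iv[idx] = bit
        acc ^= poly(key, iv)
    return acc


def looks_linear(poly, cube, n_key, n_iv):
    """Exhaustive affine-linearity test of the superpoly: f(x) + f(y) + f(0) = f(x + y)."""
    keys = list(product((0, 1), repeat=n_key))
    f = {k: cube_sum(poly, k, cube, n_iv) for k in keys}
    zero = f[(0,) * n_key]
    for x in keys:
        for y in keys:
            xy = tuple(a ^ b for a, b in zip(x, y))
            if f[x] ^ f[y] ^ zero != f[xy]:
                return False
    return True
```

For the toy polynomial, the cube over both IV bits isolates the superpoly $k_0 \oplus k_1$, which is the kind of linear relation the experiment on the PRNG did not find.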
Time-Memory-Data Tradeoff Attack
Time-memory-data tradeoff attack is a generic cryptanalytic attack which can be applied to any cipher. For a stream cipher, the complexity of a time-memory-data tradeoff attack depends on the length of the internal state and is given by $O(2^{n/2})$, where n is the length of the internal state [4]. We note that a stream cipher with low sampling resistance is vulnerable to a more flexible time-memory-data tradeoff attack. In our PRNG, the WG transformation is the filtering function as well as the internal state update function, and the number of terms in the algebraic normal form representation of the WG transformation is 15, among which only two terms are linear and the remaining terms are either quadratic or cubic. A linear function in one variable can be obtained only by fixing four input variables of the WG transformation. Thus, the sampling resistance of the proposed PRNG is high. Since the length of the internal state is 65 bits in our PRNG, the expected complexity of the time-memory-data tradeoff attack is $O(2^{l})$, where $l$ equals 32.5.
Other Attacks
In the fast correlation attacks [21] , the internal state of an LFSR based stream cipher can be recovered by first determining a system of linear equations according to a statistical model and then solving the system of linear equations. In our PRNG, the internal state is updated in a nonlinear way. Thus it is hard for an attacker to decide such a system of (non-) linear equations according to some statistical models.
For an LFSR based stream cipher, the DFT attacks [13] can be applied when the exact linear complexity of the output sequence and enough consecutive output bits are known. In our PRNG, the exact linear complexity and period of the output sequence are not known for an initial state. Therefore, the DFT attacks cannot be applied to our PRNG. Moreover, in the EPC C1 Gen2 standard protocol, it is hard for an attacker to obtain enough consecutive bits.
A chosen IV attack on the original version of WG cipher was presented in [25] , where one can distinguish several bits of the output sequence by building a distinguisher based on differential cryptanalysis. In our PRNG, two nonlinear terms (i.e., an output from the WG transformation as well as a 5-bit tuple generated by the first building block) are added to the recurrence relation. Thus the differentials after 36 rounds of the initialization phase will contain most internal state bits. As a result, it would be hard for an attacker to distinguish output bits generated by the proposed PRNG.
Comparisons with Sponge-based PRNGs
A sponge-based PRNG is constructed using a sponge function [3], which is composed of two phases: an absorbing phase and a squeezing phase. While truly random seeds are fed into the internal state of the sponge function in the absorbing phase, the squeezing phase outputs pseudorandom numbers [3]. Any sponge-based hash function such as U-QUARK [1], DM-PRESENT [6], PHOTON [14], and SPONGENT [5] can be used to construct a sponge-based PRNG. Since a sponge-based PRNG requires a multiple seeding mechanism, one needs additional random sources to provide multiple truly random seeds to the PRNG while generating pseudorandom numbers. Note that the multiple seeding mechanism provides forward secrecy for sponge-based PRNGs, which can also be achieved by our PRNG provided that additional random sources are available.¹ Our design has desired randomness properties like period and linear span (see Table 4).
¹ Assume that the length of a seed K is a multiple of the length of the secret key, i.e., $K = K_1 || K_2 || \dots || K_r$, $r \ge 1$. If a multiple seeding mechanism is applied to our PRNG, the seed is updated by repeating the following step r times: $K_i$ ($i = 1, 2, \dots, r$) is XORed with the values in the key bit positions of the current internal state, followed by 18 rounds of the initialization phase.
Hardware Implementation
In this section we report an efficient FPGA implementation of the proposed PRNG core on the low-cost Xilinx Spartan-3 FPGA series and compare our results with other reported lightweight PRNG implementations.
Target Platform and Design Tools
FPGAs are composed of configurable logic blocks (CLBs) and a programmable interconnection network. We implement the proposed PRNG core in VHDL for the low-cost Spartan-3 XC3S50 (Package PQ208 with speed grade -5) FPGA device from Xilinx [26]. Considering the availability of SRL16 in the Spartan-3 generation of FPGAs, we have coded NLFSR1, NLFSR2, and NLFSR3 so as to guide the synthesis tool to infer SRL16 shift register cells, which enables us to reduce the area of the resulting implementation significantly. We use the integrated FPGA development environment Aldec Active-HDL 9.1 for writing, debugging and simulating the VHDL code. Furthermore, Synopsys Synplify Premier with Design Planner E-201103-SP2 and Xilinx ISE Design Suite v13.2 are employed for the design synthesis and implementation, respectively.
The PRNG core works as follows. Under the control of clock enable pins CE1, CE2, and CE3, the 45-bit secret seed and 20-bit IV will be first loaded into NLFSR1, NLFSR2, and NLFSR3 within 18 clock cycles through pins DIN1, DIN2, and DIN3, respectively. After loading the required key and IV, the initialization phase will be performed in the next 36 clock cycles without any output (i.e., CE5 is disabled in this phase). The running phase will start from the 55-th clock cycle and the PRNG core will output one bit every five clock cycles under the control of clock enable pins CE3 and CE5.
The finite state machine (FSM) has two 1-bit input signals CLK and RST and seven 1-bit output control signals CE1--CE5, Init, and Load. In our design, a binary counter is used to keep track of the number of clock cycles elapsed. The FSM starts by pulling up the reset signal RST to '1', which resets the counter to 0. At this time instance, the FSM sets the two control signals Init = '0' and Load = '1' and starts loading the key and IV. When the counter reaches a value of 17, the FSM goes into the initialization phase and the two control signals become Init = '1' and Load = '0', respectively. During the initialization phase, the counter continues increasing by one at every clock cycle until it hits a value of 53. The FSM then transfers to the running phase. During the running phase, both control signals Init and Load are set to '0' and, since one bit is output every five clock cycles, a 16-bit random number will be generated every 80 clock cycles.
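The phase schedule of the FSM can be summarized by a small model of the counter-to-signal mapping. This is one reading of the description, in which counter values 0 to 17 belong to the loading phase and 18 to 53 to the initialization phase; the function name and the tuple layout are hypothetical, and the clock enable signals CE1 to CE5 are not modeled.

```python
def control_signals(counter):
    """Map the cycle counter to (phase, Load, Init) under the assumed schedule."""
    if counter <= 17:
        return "load", 1, 0   # 18 cycles: key and IV are shifted in
    if counter <= 53:
        return "init", 0, 1   # 36 cycles: initialization, no output
    return "run", 0, 0        # running phase: one output bit every 5 cycles


def phase_lengths(n_cycles):
    """Count how many of the first n_cycles fall into each phase."""
    lengths = {"load": 0, "init": 0, "run": 0}
    for counter in range(n_cycles):
        lengths[control_signals(counter)[0]] += 1
    return lengths
```

Under this reading the running phase begins at counter value 54, i.e. in the 55th clock cycle, which agrees with the description of the PRNG core.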
Hardware Architecture

Implementation Results and Comparisons
The hardware implementation shows that the PRNG core occupies 46 slices in total (12 and 34 slices for building blocks I and II, respectively) on the target FPGA device and achieves a throughput of 45 Mbps. Table 4 presents a comparison with other PRNGs in terms of hardware implementation and achieved randomness properties. One can notice that our PRNG has a lower hardware complexity than that in [22]. When compared to the PRNG proposed in [20], our design costs a similar number of logic gates, with two NLFSRs replacing the TRNG in [20]. However, if we only compare the hardware implementation cost for the pseudorandom number generator module (i.e., building block II in our design) in both proposals, our design needs only about half the number of logic gates of that in [20]. Although the hardware complexity of our PRNG is slightly larger than that of SPONGENT-80 [5], our design can provide desired randomness properties such as period and linear complexity that cannot be guaranteed by SPONGENT-80. In terms of the time delay for generating the first 16-bit pseudorandom number, our design requires 134 clock cycles in total, including 18 clock cycles for loading the key and IV, 36 clock cycles for the initialization, and 80 clock cycles for generating the first 16-bit random number. After that, a 16-bit random number can be obtained every 80 clock cycles. Assuming that the EPC tags run at a clock frequency of 100 kHz and two 16-bit random numbers are needed for the tag identification protocol according to the EPC C1 Gen2 standard, one can identify about 510 tags in one second by using the proposed lightweight PRNG.
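The timing figures above follow from simple arithmetic, reproduced in the sketch below. The cycle counts come from the text; the accounting behind the figure of about 510 tags per second (one 36-cycle initialization plus two 80-cycle draws per tag, with loading time excluded) is an assumption that happens to reproduce the stated number, and the names are hypothetical.

```python
LOAD_CYCLES = 18      # loading key and IV
INIT_CYCLES = 36      # initialization phase
CYCLES_PER_BIT = 5    # one output bit every five clock cycles


def cycles_per_rn16(bits=16):
    """Clock cycles needed to produce one 16-bit random number."""
    return bits * CYCLES_PER_BIT


def first_rn16_latency():
    """Cycles from reset until the first 16-bit number is available."""
    return LOAD_CYCLES + INIT_CYCLES + cycles_per_rn16()


def tags_per_second(clock_hz=100_000, rn16_per_tag=2):
    """Assumed accounting: one initialization plus two RN16 draws per tag."""
    cycles = INIT_CYCLES + rn16_per_tag * cycles_per_rn16()
    return clock_hz / cycles
```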
Remark 2.
In the proposed PRNG, we can update the 45-bit key at the end of each session by generating 45 extra bits in 225 clock cycles, and these 45 bits are loaded at the aforementioned key positions. This key updating procedure can be used to provide better security. In this way it is possible to generate at least $2^{16.26} \times 2^{20}$ consecutive random numbers for one key and for different IVs.
Conclusions
In this paper, we propose a lightweight pseudorandom number generator that is in compliance with the EPC Class-1 Generation-2 standard and has guaranteed randomness properties like period and linear span. Considering the high power consumption, large area and low throughput of TRNGs, we replace the TRNG used in previous works by a PRNG with good statistical properties. In our design, the pseudorandom sequence is generated using nonlinear feedback shift registers. Moreover, the statistical tests specified by the EPC C1 Gen2 and the NIST standards, algebraic attacks, cube attacks and time-memory-data tradeoff attacks are employed to characterize the security properties of the proposed PRNG, and a comparison with sponge-based PRNGs is conducted. In addition, an FPGA implementation shows that the proposed PRNG can be implemented using 46 slices and can generate a 16-bit random number every 80 clock cycles after an initialization process of 36 clock cycles.
