A new TRNG based on coherent sampling with self-timed rings by Martín González, Honorio et al.
Universidad Carlos III de Madrid
e-Archivo Institutional Repository
This is a postprint version of the following published document: 
Martín, H., Peris-Lopez, P., Tapiador, J.E., San Millan, E. (2016). A New TRNG Based on Coherent 
Sampling With Self-Timed Rings. IEEE Transactions on Industrial Informatics, vol. 12, no. 1, pp. 
91-100.
Available in http:/10.1109/TII.2015.2502183
© XXXX. IEEE. Personal use of this material is permitted. Permission 
from IEEE must be obtained for all other uses, in any current or future 
media, including reprinting/republishing this material for advertising or 
promotional purposes, creating new collective works, for resale or 
redistribution to servers or lists, or reuse of any copyrighted component of 
this work in other works. 
A New TRNG Based on Coherent Sampling 
With Self-Timed Rings 
Honorio Martin, Pedro Peris-Lopez, Juan E. Tapiador, and Enrique San Millan 
Abstract: Random numbers play a key role in applications such as
industrial simulations, laboratory experimentation, computer games,
and engineering problem solving. The design of new true random
generators (TRNGs) has attracted the attention of the research
community for many years. Designs with low hardware requirements and
high throughput are demanded by new and powerful applications. In this
paper, we introduce the design of a novel TRNG based on the coherent
sampling (CS) phenomenon. Contrary to most designs based on this
phenomenon, ours uses self-timed rings (STRs) instead of the commonly
employed ring oscillators (ROs). Our design has two key advantages
over existing proposals based on CS: it does not depend on the FPGA
vendor used, and it does not need manual placement and routing in the
manufacturing process, resulting in a highly portable generator. Our
experiments show that the TRNG offers a very high throughput at a
moderate hardware cost. The results obtained with the ENT, DIEHARD,
and National Institute of Standards and Technology (NIST) statistical
test suites provide evidence that the output bitstream behaves as a
truly random variable.
Index Terms: Coherent sampling (CS), FPGAs, self-timed ring (STR),
true random generator (TRNG).
I. INTRODUCTION
H. Martín and E. S. Millan are with the Department of Electronics
Technology, Universidad Carlos III de Madrid, 28911 Madrid, Spain
(e-mail: hmartin@ing.uc3m.es; quique@ing.uc3m.es).
P. Peris-Lopez and J. E. Tapiador are with the Computer Security
Laboratory, Department of Computer Science, Universidad Carlos III de
Madrid, 28911 Madrid, Spain (e-mail: pperis@inf.uc3m.es;
jestevez@inf.uc3m.es).
SOURCES of random numbers are always in demand. They play a key role
in computer games, problem-solving techniques in engineering,
industrial simulations, security primitives and protocols, and a
variety of other applications. In many cases, the quality of the
randomness must be as high as possible, e.g., when used in security
applications to generate keys, nonces, session identifiers, etc.,
while in others a very high throughput in the generation process is
also required. In this regard, hardware-based pseudo and true random
number generators [pseudo-random number generator (PRNG) and true
random generator (TRNG), respectively] are very appealing because of
their superior performance when compared with software implementations
[1]-[3].
Motivated by this, many researchers have pointed out the convenience
of using field-programmable gate arrays (FPGAs) as TRNG platforms due
to their low cost and versatility [4]-[6]. However, FPGAs offer a
resource-constrained environment (fixed logic blocks) that does not
include analog blocks, which are frequently employed to generate
highly entropic outputs. Thus, the typical phenomena used to generate
randomness in FPGAs are metastability and jitter [7], [8]. Since FPGAs
are designed and implemented to reduce their random behavior, it is
considerably more challenging to implement a TRNG in an FPGA than in
other digital devices.
From the implementation point of view, a TRNG should not be technology
dependent. For instance, [9] presents a TRNG that exploits the
technique of coherent sampling (CS) and uses an analog phase-locked
loop (PLL) to obtain fine control over the clock signal. In
particular, the proposed TRNG is implemented in an Altera FPGA, which
includes a PLL. Unfortunately, this design is not portable to the
FPGAs of other important manufacturers in the sector, such as Xilinx,
whose FPGAs mainly use delay-locked loops (DLLs) instead of PLLs.
Apart from being technology independent, proposed designs should not
be device dependent. For example, Kohlbrenner and Gaj present in [10]
a TRNG that uses ring oscillators (ROs) and takes advantage of the CS
technique. The design consists of two independent and identically
configured ROs with similar but not identical frequencies. A sampling
circuit uses one clock signal to sample the other. Although the design
works well in theory, Kohlbrenner and Gaj show that the RO frequencies
vary considerably not only within the same FPGA (7%) but also between
different FPGAs. In other words, the design depends so much on the
device used that it has to be manually tuned (placement and routing)
for each FPGA implementation.
To the best of our knowledge, one of the first simple and portable
designs of a TRNG suitable for different FPGAs was presented by Sunar
et al. in [11]. Nevertheless, this design was rapidly discarded since
it suffers from implementation problems, mostly related to the number
of signals handled by the XOR tree. Moreover, the quality of the raw
signal is rather poor and needs postprocessing. In [12], Wold and Tan
present an enhanced design based on that of Sunar et al., which
attempts to solve its implementation problems while avoiding the need
for a postprocessing stage. This enhanced version proposes a reduction
in the number of rings that, as reported in [13], causes a loss of
entropy. However, such a lack of entropy is masked by the
pseudorandomness caused by XOR-ing clock signals having different
frequencies. In addition, these two designs were successfully attacked
in [14] and [15] by exploiting the use of ROs and their vulnerability
to frequency injection, in which the ROs are locked to an injected
frequency and the jitter used as a source of randomness is
neutralized.
More recently, inspired by the design of Sunar et al., Cherkaoui et
al. [16] have presented a new design in which the ROs are replaced by
a self-timed ring (STR). An STR is a multiphase generator that can
maintain a constant phase difference between its different outputs.
Therefore, this construction is resistant to the common
vulnerabilities used to attack TRNGs based on ROs. The TRNG of
Cherkaoui et al. seems to be a secure design (no attack has been
published yet), but it consumes a substantial amount of resources in
terms of power and circuit area. This renders it unsuitable for
constrained devices. More specifically, the STR generates 63 signals
that are sampled and finally passed through an XOR tree, generating
high activity and, correspondingly, high power consumption. Moreover,
the hardware requirements exceed those that can be afforded in most
constrained devices.
In this paper, we present a new TRNG based on the CS technique. On the
one hand, the design takes advantage of some key STR features, which
help us to solve the implementation problems (mainly the device
dependence) suffered by Kohlbrenner and Gaj's design [10], while
simultaneously avoiding the vulnerabilities linked to the use of ROs.
On the other hand, the proposed design is very efficient in hardware,
which makes it suitable for devices with limited capabilities, and
offers a relatively high throughput.
This paper is organized as follows. In Section II, we describe the CS
technique. We explain the phenomenon and the randomness extraction
technique. An overview of STRs and their operation principles is
presented in Section III. In Section IV, our proposal is presented
together with some necessary implementation considerations. The
experimental results, both about the randomness quality and hardware
requirements, are presented in Section V, together with a comparison
between our proposal and the most relevant designs. Finally,
Section VI concludes the paper and summarizes our main contributions.
II. COHERENT SAMPLING
In this section, we first introduce the principles of CS. After that,
we present the main TRNG proposals that exploit this technique.
[Fig. 1 diagram labels: "das", "Random number"]
CS is a well-known technique to sample periodic signals at finer time
intervals. CS refers to an integer number of cycles that fits into a
predefined sampling window. Mathematically, this can be expressed as
$$\frac{f_{in}}{f_{sample}} = \frac{N_{cyc}}{N_{samples}} \qquad (1)$$
where $f_{in}$ is the sampled signal (S1) frequency, $f_{sample}$ is
the sampling signal (S2) frequency, $N_{cyc}$ is the number of cycles
of the sampled signal, and $N_{samples}$ is the number of samples. If
$N_{cyc}$ and $N_{samples}$ are high and coprime, the repetition
period of the samples will be maximum, i.e., we will have the highest
resolution of the sampled signal. This is an interesting feature
because, while the number of periods (frequencies) is constant for
ideal sources of S1 and S2, in physical systems where these clock
signals contain jitter this number will be random because of the
Gaussian random component contained in the jitter.
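
The role of coprimality in (1) can be illustrated with a minimal Python sketch. It assumes ideal, jitter-free sources; the function name and the values 13, 12, and 64 are illustrative assumptions and do not come from the experiments reported here.

```python
from math import gcd


def distinct_sample_phases(n_cyc: int, n_samples: int) -> int:
    """Count the distinct phases of S1 visited by n_samples ideal samples.

    With f_in / f_sample = n_cyc / n_samples, sample k falls at phase
    (k * n_cyc / n_samples) mod 1 of the sampled signal, so it is enough
    to track the integer residue (k * n_cyc) mod n_samples.
    """
    return len({(k * n_cyc) % n_samples for k in range(n_samples)})


# Coprime pair: all 64 possible phases are visited before the pattern repeats.
print(distinct_sample_phases(13, 64), gcd(13, 64))  # 64 1
# Common factor 4: only 64 / 4 = 16 phases are visited.
print(distinct_sample_phases(12, 64), gcd(12, 64))  # 16 4
```

In this idealized model the number of distinct phases equals N_samples divided by gcd(N_cyc, N_samples), which is why coprime values give the finest resolution.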
The general architecture of a TRNG using CS is depicted in Fig. 1. The
signal S1 is sampled by the signal S2, generating a digitized analog
signal known as "das." If the quality of the raw output is not high
enough, a postprocessing stage is added to guarantee a uniform output.
A mathematical model of physical RNGs based on CS can be found in
[17].
To the best of our knowledge, the first time that CS was used in an
FPGA to generate random numbers was in [9]. In that work, Fischer and
Drutarovsky used a PLL embedded in an Altera FPGA to guarantee the
relation between $N_{cyc}$ and $N_{samples}$. As explained in
Section I, the main drawback of this proposal is that the TRNG is not
portable to other FPGA vendors. Besides, PLLs are not supported in all
FPGAs.
In [10], Kohlbrenner and Gaj replaced PLLs by ROs with the aim of
obtaining a portable design for FPGAs from different vendors. The RO
frequencies are selected to be close but not identical. The RO outputs
are connected to a sampler circuit that generates a stream of 0's and
1's. The length of this stream is counted modulo 2 to generate a
random bit. The weakest point of this design is that it requires a
very complicated manual placement and routing process to finely set
the ring frequencies. According to [10], this is a consequence of the
high variation (up to 7%) between the RO frequencies in the same FPGA.
To overcome such sensitivity to placement, the authors suggest a
design with four ROs that are sampled by a fifth one.
In [18], Cret et al. take up the basic idea of using only two ROs. In
this design, the authors introduce a multiplexer to alternate the
sampling signal. They claim that the placement sensitivity is overcome
using a parametrizable postprocessing. The main weakness of this TRNG
is that the quality of the raw output, without the postprocessing
stage, is really poor. In addition, Cret et al. present the cycle
lengths of the signal generated in the sampler, and their distribution
does not support the claimed randomness: it is actually far from a
uniform distribution.
Finally, in [19], the authors present three designs based on
different clock generators for different FPGA models. More
precisely, the generators are RO-RO, RO-PLL (for Altera
FPGAs), and RO-DFS (for Xilinx FPGAs). Apart from the
[Figure: two self-timed rings, STR-A and STR-B, each drawn as a chain of stages (Stage 1, Stage 2, ..., Stage i) with output labels such as SA2, SAL, SB2, and SBL]
Fig. 5. STR structure of our TRNG. 
Fig. 6. Sampler structure of our TRNG. 
1) STR robustness to voltage variations can be enhanced by
adding more stages. ROs do not offer this feature.
2) STRs present a lower extra-device frequency variation
when operating at high frequencies.
3) In STRs, the period jitter does not depend on the number
of stages but is mostly dependent on the jitter generated
at each stage.
From the security point of view, these features are very interesting.
In fact, in [21], the authors conclude that STRs are more robust to
attacks than ROs, and this property is inherited by our proposal.
Furthermore, replacing ROs by STRs provides our design with the
possibility of having at least L different signals in each STR. Each
of those L signals can be used as a sampling or sampled signal. Since
each stage can be considered an independent source of entropy [16],
the number of stages is equal to the number of independent entropy
sources. Moreover, STRs are highly configurable. In particular, it is
very easy to set the desired frequency for the STR output, which
allows fine-grained control over the resulting speed of the TRNG.
A. Architecture Overview
Figs. 5 and 6 show the different blocks that make up our TRNG. Fig. 5
depicts the two STRs used in our design. Both STRs are composed of L
stages that generate L different outputs with a frequency fSTR. The
numbers of tokens and bubbles are selected in the reset phase
according to the frequency and phase requirements.
The jitter contained in the STR outputs is extracted using the sampler
circuit shown in Fig. 6. Each sampler circuit is composed of four
D-type flip-flops and one XOR gate. The first flip-flop uses the
signal SBi to sample the signal SAi. The signal So will be high while
the rising edges of SBi occur during the high level of SAi. In Fig. 7,
we show the behavior of So taking into account that SBi contains
jitter. As a consequence of such jitter, the cycle length of So will
not be constant.
Fig. 7. Sampler behavior.
In our design, both signals SBi and SAi contain jitter. Whereas the
original sampler design includes a 1-bit counter latched by So, our
design uses the simplified version presented in [19]. In this scheme,
instead of counting the cycles of SBi, we count the number of cycles
that So is at a high level. If this number of cycles is even, the
previous output is maintained; otherwise, the output changes. Two D
flip-flops and one XOR gate are involved in this process. Finally, the
last flip-flop samples the signal Co using an external clock. This
external clock determines the TRNG throughput. As our design is
composed of two STRs with L stages each, L sampler circuits are
necessary (see Fig. 1).
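
The sampler behavior described above can be sketched with a purely behavioral Python model. This is a simplified illustration under several assumptions that are not part of the design itself: all jitter is lumped into the sampling signal SBi, SAi is an ideal square wave with a 50% duty cycle, the external clock is modeled as every `decimation`-th edge of SBi, and the periods and jitter value are arbitrary.

```python
import random


def simulate_sampler(t_a, t_b, jitter_sigma, n_edges, decimation, seed=0):
    """Behavioral model of one sampler stage (all parameters are assumptions).

    t_a, t_b      periods of the sampled (SA) and sampling (SB) signals
    jitter_sigma  standard deviation of the Gaussian period jitter on SB
    n_edges       number of SB rising edges to simulate
    decimation    the external clock is modeled as every decimation-th SB edge
    """
    rng = random.Random(seed)
    t = 0.0
    c_o = 0
    bits = []
    for edge in range(1, n_edges + 1):
        # Next rising edge of SB, with Gaussian period jitter.
        t += t_b + rng.gauss(0.0, jitter_sigma)
        # First flip-flop: So is the level of SA at that edge.
        s_o = 1 if (t % t_a) < t_a / 2 else 0
        # Parity stage: Co toggles on every SB cycle in which So is high,
        # so it changes only when the number of high cycles is odd.
        c_o ^= s_o
        # Last flip-flop: the external clock samples Co.
        if edge % decimation == 0:
            bits.append(c_o)
    return bits


# Two signals with close but different periods (in ns); values are illustrative.
bits = simulate_sampler(t_a=3.333, t_b=3.40, jitter_sigma=0.01,
                        n_edges=6000, decimation=300)
print(len(bits), bits)
```

With `jitter_sigma=0.0` the model is fully deterministic, which mirrors the observation that the randomness of the output comes from the jitter and not from the beat pattern between the two periods.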
Finally, our design includes a postprocessing unit that might be
needed depending on the quality, in terms of randomness, of the raw
data. The selected postprocessing is a parity filter, which has been
widely used as postprocessing in previous proposals such as [16] and
[18]. More precisely, an nth-order parity filter takes n consecutive
bits and XORs all of them together to produce one bit. This
postprocessing offers a simple bias reduction at the cost of a
throughput reduction: the filter reduces the bit generation rate by a
factor of n.
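
A minimal software sketch of such a parity filter is given below. The function name and the handling of an incomplete trailing group (it is dropped) are assumptions made for illustration; they do not describe the hardware implementation.

```python
def parity_filter(bits, n):
    """nth-order parity filter: XOR each group of n consecutive bits into one."""
    if n < 1:
        raise ValueError("filter order n must be at least 1")
    usable = len(bits) - len(bits) % n  # drop an incomplete trailing group
    out = []
    for i in range(0, usable, n):
        parity = 0
        for b in bits[i:i + n]:
            parity ^= b
        out.append(parity)
    return out


raw = [1, 1, 0, 1, 0, 0, 1, 0]
print(parity_filter(raw, 2))  # [0, 1, 0, 1]
print(parity_filter(raw, 3))  # [0, 1]
```

The output is n times shorter than the input, which is the throughput penalty mentioned above.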
V. EXPERIMENTAL RESULTS 
In order to evaluate the portability of our proposal, we have
implemented our design on FPGAs from three different manufacturers:
1) a Spartan-3E XC3S500E FPGA from Xilinx; 2) an Igloo M1AGL1000 from
Microsemi; and 3) a Cyclone II EP2C5F256C8 from Altera. As expected,
the obtained results are similar in all of them. In addition, to show
the independence of our design from the manufacturing technology, we
have also implemented one final chosen design on another two FPGAs
that use different process technologies: 1) a Virtex 5 XC5VLX110T
(65 nm); and 2) a Virtex 6 XC6VLX240T (40 nm), both from Xilinx. In
the following, we discuss our results in detail.
Two eight-stage STRs have been implemented and configured in the reset
phase to obtain an STR output frequency of 300 MHz. Several
frequencies have been used for the external clock that samples the
signal Co. Eight bits are generated with each rising edge of the
sampling clock.
In order to obtain almost the same propagation delay in the different
stages, a hard macro (or its equivalent for other FPGA vendors) has
been designed. This hard macro implements a Muller gate and an
inverter using a single look-up table (LUT).
[Fig. 8 plot; horizontal axis: period number]
Fig. 8. Time evolution and histogram of So. 
We have chosen two STRs with eight stages since this configuration is
easily tunable and offers a good tradeoff between area and throughput.
In addition, this configuration allows the generation in parallel of
8 bits (1 byte), which is a typical bit length used in many
applications. The throughput goal has been set to 1 Mb/s to be
comparable to other TRNG proposals based on CS. This throughput
threshold sets the lowest sampling frequency that can be used in our
design.
A. Testing Randomness
In this section, we discuss the quality of the TRNG output in terms of
randomness. Following the standard practice in this field, we first
show that the STR outputs have Gaussian jitter and then report the
results obtained with three widely used suites of statistical tests
for cryptographic applications. We have also carried out a restart
experiment to provide evidence that the output is different after
repeatedly restarting the system under the same conditions.
Due to space limitations, all results reported in this section
correspond to traces obtained with the Spartan-3E XC3S500E FPGA. The
conclusions for the other four FPGAs are identical to those shown
here.
1) Evidence of the Gaussian Jitter: In order to show evidence of the
presence of Gaussian jitter in the STR outputs, we have counted the
number of cycles of the signal So, as done in [19] and [10]. Fig. 8
depicts the time evolution of the So length (top) and a histogram of
the cycles (bottom). The histogram population corresponds to
1.3 × 10^6 measurements. The average period of So is 38.69 ns with a
standard deviation of 0.215 ns. As the frequency of the STR has been
set to 300 MHz, the average cycle length is 11.61 cycles. In
conclusion, the histogram distribution clearly shows evidence of the
underlying randomness in the sampling process and, by extension, in
each stage of the STR.
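
The conversion from the measured period of So to a cycle count is a simple unit change, sketched below in Python using the figures quoted above. The variable names are illustrative, and expressing the standard deviation in STR cycles is an additional step assumed here for illustration.

```python
f_str_hz = 300e6           # STR output frequency reported in the text
mean_period_s = 38.69e-9   # average period of So
std_period_s = 0.215e-9    # standard deviation of the period of So

t_str_s = 1.0 / f_str_hz   # one STR period, about 3.33 ns
mean_cycles = mean_period_s / t_str_s
std_cycles = std_period_s / t_str_s

print(f"{mean_cycles:.2f} cycles")      # 11.61 cycles
print(f"{std_cycles:.4f} cycles")       # 0.0645 cycles
```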
2) Restart Experiment: Following the same procedure used in [12] and
[23] to distinguish the amount of true randomness contained in a
pseudorandom oscillating signal, we have carried out a restart
experiment. In Fig. 9, nine oscillograms of repeated restarts from
identical starting conditions are presented. The horizontal axis
represents the time and shows the first 20 bits generated after each
restart (the period of time shown for each restart is 20 µs using a
sampling clock of 1 MHz). The vertical axis is the voltage of the
output signal. Only nine curves out of the 1000 generated are shown.
It is clear that the TRNG generates different traces after the same
restarting point.
Fig. 9. Nine output sequences captured after restarting the TRNG. Note
that all sequences are different.
3) Statistical Evaluation of the Output: The testing of our proposal
has been carried out using the NIST statistical test suite [24], as
commonly done to validate previous proposals (e.g., [9], [10], [18]).
To transfer the bits generated by the TRNG in the FPGA to the host
computer where the NIST tests are executed, a FIFO memory and an RS232
communication protocol have been used. In addition, the postprocessing
has been conducted in the host computer in order to reduce the
acquisition time of the traces.
We have evaluated the TRNG output for the following set of sampling
frequencies: 50, 25, 10, 5, 1, and 0.5 MHz. A higher sampling
frequency implies a higher throughput, but also a lower quality of the
random bits because the jitter accumulation time is shorter. According
to the study presented in [25], a longer accumulation time is
desirable so that the contribution of the thermal noise (responsible
for the nondeterministic jitter) is perceptible. On the other hand,
with a longer accumulation time the flicker noise (responsible for the
deterministic jitter) dominates the jitter. This paradox forces
designers to find a tradeoff when setting the sampling frequency.
For the postprocessing, we have tested the minimum parity filter order
(bit-wise XOR tree) necessary to pass the NIST tests for the different
sampling frequencies studied. A third-order filter is needed for
50 MHz, while a second-order filter suffices for the rest. As
expected, the postprocessing requirements are higher when higher
sampling frequencies are used. Although many sampling frequencies need
the same parity filter order, it is important to notice that the
proportion of failed tests before the postprocessing rises when the
sampling frequency is increased, as explained below. This is a crucial
point if for some reason the TRNG is to be used without the
postprocessing block.
[Fig. 10 plot: p-value boxplots (vertical axis ticks from 0.1 to 0.9), one panel per sampling frequency: 50 MHz, 25 MHz, 10 MHz, 5 MHz, 1 MHz, 0.5 MHz; horizontal axis: sampling stages (b1-b8) for different frequencies]
Fig. 10. Boxplots of p-value distributions for each sampling stage (b1 to b8) and different frequencies.
TABLE I
EXPERIMENTAL RESULTS: PASS RATE (PR) PROPORTION AND AVERAGE P-VALUE (PV) FOR GENERATED TRACES
        0.5 MHz       1 MHz         5 MHz         10 MHz        25 MHz        50 MHz
        PR     PV     PR     PV     PR     PV     PR     PV     PR     PV     PR     PV
b1      97.91  0.41   98.58  0.44   98.41  0.57   99.08  0.55   99.5   0.58   93.83  0.22
b2      98.41  0.43   99.33  0.34   82.58  0.18   82.5   0.30   84.66  0.22   39.66  0.16
b3      98.83  0.46   99.08  0.62   98.58  0.59   98.91  0.41   98     0.44   99.33  0.60
b4      99.66  0.59   99.41  0.45   99.33  0.36   99.41  0.45   99     0.59   83.41  0.17
b5      92.66  0.25   87.16  0.20   98.5   0.59   98.25  0.55   99.16  0.39   97.91  0.30
b6      98.41  0.54   98.75  0.47   31.66  0.04   31.58  0.01   31.25  0.08   22.25  0.05
b7      98.75  0.36   99.16  0.45   98.83  0.53   98.91  0.61   98.58  0.36   66.58  0.24
b8      99.16  0.39   98.91  0.31   98     0.44   97.5   0.40   96.83  0.43   48.5   0.15
Total   97.97  0.43   97.55  0.41   88.23  0.41   88.27  0.41   88.37  0.39   68.93  0.24
We have evaluated the quality of the raw data before the
postprocessing for the six sampling frequencies studied. Fig. 10 shows
boxplots of the p-value distribution for each sampling stage (b1-b8)
and different frequencies. According to the documentation provided by
NIST, a random stream must present uniformity in the distribution of
its p-values. It can be seen in Fig. 10 that higher sampling
frequencies present less uniform p-value distributions than lower
sampling frequencies.
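
One way to quantify how far a set of p-values is from uniform is a chi-square statistic over equal-width bins. The Python sketch below assumes ten bins; the function name, the bin count, and the sample inputs are illustrative assumptions, and the sketch is not the procedure used to produce Fig. 10.

```python
def uniformity_chi_square(p_values, bins=10):
    """Chi-square statistic of p-values against a uniform distribution on [0, 1].

    A value of 0 means every bin holds exactly the expected count; larger
    values indicate a less uniform distribution.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    counts = [0] * bins
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
        counts[min(int(p * bins), bins - 1)] += 1  # p == 1.0 goes to the last bin
    expected = len(p_values) / bins
    return sum((c - expected) ** 2 / expected for c in counts)


evenly_spread = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
clustered = [0.05] * 10
print(uniformity_chi_square(evenly_spread))  # 0.0
print(uniformity_chi_square(clustered))      # 90.0
```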
Further evidence of this phenomenon is presented in Table I, which
shows the proportion of traces that pass the statistical tests (PR)
and the average p-value (PV) for the different sampling stages and
frequencies. Note that traces corresponding to b5 and b6 perform quite
badly, especially b6. For sampling frequencies of 0.5 and 1 MHz, only
a single trace (b5) fails the NIST tests before the postprocessing. It
is noteworthy, however, that b5 fails the tests by a narrow margin.
Three traces of b5 fail the tests for the sampling frequencies of 5,
10, and 25 MHz, and seven traces fail for 50 MHz. As for b6 traces,
they fail badly for the sampling frequencies between 5 and 50 MHz.
This consistent behavior in b6 is mainly due to the fact that the
synthesizer has placed the sampling stage that generates the b6 stream
in a way that causes a huge delay between the sampling (SA6) and the
sampled (SB6) signals. This problem could be solved using a manual
placement and routing process. In fact, we have tested this: with
manual placement and routing and the sampling frequency set to 50 MHz,
the raw stream of bits passes the NIST tests without postprocessing.
Nevertheless, one major design goal of our proposal is to avoid such
manual procedures.
We have evaluated the quality of our proposed TRNG after
postprocessing. A sampling frequency of 1 MHz has been selected for
this experimentation since this frequency offers a tradeoff between
throughput and randomness quality before the postprocessing stage. We
have opted for a good-quality signal without postprocessing to make
our TRNG proposal stronger against some attacks. The ENT [26],
DIEHARD [27], NIST [24], and AIS31 [28] suites have been used for
analyzing the randomness quality.
TABLE II
ENT RESULTS FOR A SAMPLING FREQUENCY SET TO 1 MHZ
Fig. 11. Distribution of p-values for the DIEHARD and NIST test suites.
TABLE III
AIS31 RESULTS FOR THREE FPGAS
In Table II, we summarize the results obtained with ENT, which
resemble those obtained with a genuine random variable: the chi-square
test is passed, the entropy is extremely high, the serial correlation
is very low, etc. DIEHARD is a much more demanding battery of tests
for checking randomness. As in the case of the NIST suite, DIEHARD is
particularly designed for cryptographic applications and includes a
number of statistical tests (e.g., frequency, rank, FFT, monkey, runs,
and so on). A final p-value is obtained for each test. If we take a
significance level of 0.05, p < 0.025 or p > 0.975 means that the TRNG
fails the test. To show evidence that our proposed TRNG behaves as a
random variable, in Fig. 11 we depict the distribution of p-values for
all tests included in both suites. In particular, all the p-values in
the NIST and DIEHARD suites are within the interval [0.2, 0.8], so the
TRNG passes all tests in both suites.
Finally, we have evaluated the data acquired from the three FPGAs
using the AIS31 statistical test suite. Using two sampling frequencies
of 50 and 1 MHz, we have gathered a 1-MB sequence of raw data. The
results for the AIS31 statistical suite are depicted in Table III.
Note that tests T1-T4 correspond to four FIPS 140-1 tests (poker,
monobit, runs, and long runs). T5 is an autocorrelation test, T6 is a
uniform distribution test, T7 is a comparative test for multinomial
distributions, and T8 is an entropy test. According to the AIS31
recommendations, raw data from the TRNG output, or at least data at
the output of the arithmetic postprocessing, should pass T5 through
T8. The column npmin represents the minimum filter order to comply
with this requirement. For the AIS31 results (as shown in Table III
and, equivalently, for NIST p-values in Fig. 10), we have obtained
better results for lower frequencies.
TABLE IV
HARDWARE RESOURCES
From all of the above, we can conclude that our TRNG outputs a bit
stream that looks like a true random variable.
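
The two-sided pass criterion quoted above for the DIEHARD and NIST p-values (significance level 0.05, failure when p < 0.025 or p > 0.975) can be written as a short Python check. The function name and the sample p-values are illustrative assumptions, not measured data.

```python
def passes_two_sided(p_value, significance=0.05):
    """Return True when the p-value lies inside the two-sided acceptance region."""
    half = significance / 2
    return half <= p_value <= 1 - half


sample_p_values = [0.21, 0.50, 0.79, 0.01, 0.99]
print([passes_two_sided(p) for p in sample_p_values])
# [True, True, True, False, False]
```

Any p-value inside [0.2, 0.8] lies well within the acceptance region [0.025, 0.975].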
B. Hardware Resources
The results presented in this section correspond to our chosen design
with a sampling frequency of 1 MHz. The architecture consists of two
eight-stage STRs, eight sampling stages, and a second-order parity
filter as postprocessing block.
Since each STR stage uses a single LUT, the STRs occupy 2 × L LUTs. As
shown in Fig. 6, the sampler structure uses four registers and an XOR
gate (one LUT). Therefore, the number of LUTs used by the sampler
structure is L and the number of registers is 4L. Finally, the
postprocessing requirements depend on the order n of the parity
filter. The number of LUTs used by the postprocessing is also
conditioned by the number of inputs of each LUT. Since a four-input
XOR gate, like a two-input XOR gate, can be implemented using one LUT,
the numbers of LUTs and registers will be L and nL, respectively (the
filter order is 2 for a 1 MHz sampling frequency, as explained in
Section V-A3).
In summary, from the results above we can conclude that each raw
random bit (before postprocessing) has a cost of three LUTs and four
registers. Therefore, for a given sampling frequency (fsampling), a
designer could improve the throughput by adding more stages to the
STRs. This will result in fsampling bps per additional stage. On the
other hand, this improvement translates into a circuit area penalty of
three LUTs and four registers per additional stage.
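
The resource counts derived above can be collected in a small Python helper. The function name is an illustrative assumption, and the sketch assumes a filter order between 2 and 4 so that the parity XOR of each output bit fits in one LUT, as stated in the text.

```python
def trng_resources(stages, filter_order):
    """Estimate LUTs and registers from the per-block counts given in the text."""
    if not 2 <= filter_order <= 4:
        raise ValueError("the LUT count assumes a parity XOR of two to four inputs")
    str_luts = 2 * stages             # two STRs, one LUT per stage
    sampler_luts = stages             # one XOR gate (one LUT) per sampler
    sampler_regs = 4 * stages         # four registers per sampler
    post_luts = stages                # one XOR LUT per output bit
    post_regs = filter_order * stages
    return {
        "luts": str_luts + sampler_luts + post_luts,
        "registers": sampler_regs + post_regs,
    }


print(trng_resources(stages=8, filter_order=2))
# {'luts': 32, 'registers': 48}
```

For L = 8 and n = 2 this gives 32 LUTs and 48 registers, the figures listed for our proposal in Table V.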
Table IV summarizes the amount of resources needed to implement our
TRNG on five different FPGAs. The differences in terms of the
combinational logic elements for the set of FPGAs analyzed are related
to the optimizations carried out by the different synthesis tools.
Note that the same amount of hardware resources is obtained for the
three Xilinx FPGAs (i.e., Spartan and Virtex). This is due to the fact
that we have tailored the hard macro created for the Spartan-3E to fit
into the Virtex 5 and 6. It is important to emphasize the decision to
implement each STR stage in a single LUT, which yields almost the same
delay between consecutive stages and avoids bottleneck effects.
Regarding throughput, our proposal is able to generate random bits in
parallel. For the proposed architecture, eight random bits are
generated every two clock cycles. This feature can be very interesting
for some applications that require the generation of random bits in
parallel.
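
The throughput figures follow from the number of stages, the sampling frequency, and the parity filter order. The Python sketch below is a back-of-the-envelope calculation under the assumption that every stage delivers one raw bit per sampling clock cycle; the function name is illustrative.

```python
def throughput_bps(stages, f_sampling_hz, filter_order=1):
    """Output bit rate: one raw bit per stage per clock, divided by the filter order."""
    return stages * f_sampling_hz / filter_order


# Eight stages at 1 MHz with a second-order parity filter.
print(throughput_bps(8, 1e6, filter_order=2) / 1e6)  # 4.0 (Mb/s)
# Raw output of the same configuration, before postprocessing.
print(throughput_bps(8, 1e6) / 1e6)                  # 8.0 (Mb/s)
```

The 4 Mb/s result corresponds to eight bits every two clock cycles at 1 MHz.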
TABLE V 
TRNG COMPARISON 
Hardware resources Throughput Hardware complexity Portability 
Our proposal 32 LUTs 48 Registers 4 Mb/s Medium Yes 
Fischer et al. [9] 121 LCs 4 ESBs and I PLL 69 Kb/s Medium No 
Kohl et al. [ 1 O] 12 LUTs 24 Registers 300 Kb/s High Yes 
Valtchanov et al. [ 19) R O -RO 15 LUTs 4 Registers 2 Mb/s High Yes 
RO-PLL 12 LCs 4 Registers and I PLL 2 Mb/s Medium No 
RO-DFS 11 LUTs 6 Registers and 2 DFS 2 Mb/s Medium No 
Cherkaom et al. L 16] 1 320 LUTs 320 Reg1sters 
C. Comparison With TRNGs Based on es
Next, we present a comparison between our proposal and
other TRNG designs that use CS. We also include the proposal 
of Cherkaoui et al. [16] since it is based on STRs. For each 
proposal, we have analyzed the hardware resources needed and 
the offered throughput. In addition, we have also considered 
the hardware complexity, including the degree of automation of 
the design, and its portability (device independence). Regarding 
hardware complexity, we distinguish three categories. 
1) Low complexity is devoted for designs that can be easily
implemented.
2) Medium complexity implies designs that need to use hard
macros or specifi components like PLL or DFS.
3) High complexity considers designs that require a manual
place and route process.
Finally, the portability aspect represents whether the design 
needs special resources or effo1ts to be implemented in different 
FPGA vendors or devices. 
We emphasize here that the hardware results presented in 
Table V constitute an estimation for the designs in which the 
authors do not provide specifi results. For Cherkaoui et al. 's 
proposal, we selected the architecture that implements 255 
stages. 
Table V shows the comparison between our design and 
other TRNG proposals. It can be noted that our proposal offers 
a very good tradeoff among the parameters evaluated. 
TRNGs that need a complicated place and route process (e.g., 
[10] and RO-RO [19]) are superior in terms of hardware 
resources, but these designs have the drawback of requiring a 
specific design for each particular device. Among the TRNGs 
based on CS, our design offers the highest throughput. Note 
that this could be even higher if a higher sampling frequency 
had been selected, although this might degrade the 
quality of the random signal. On the other hand, Cherkaoui et 
al.'s TRNG presents the highest throughput overall, but uses 
around 10 times more resources than our proposal. As mentioned 
earlier, in terms of throughput, our TRNG generates eight random 
bits in parallel. Finally, it is worth mentioning that our proposal is 
highly portable and complies with the two requirements set in 
Section I; i.e., our design is technology and device independent. 
D. Comparison With Other FPGA-Based TRNGs
In Section V-C, we have carried out an exhaustive comparison 
between our proposal and other TRNGs based on CS. Now, 
we present a qualitative comparison with other state-of-the-art 
TRNGs implemented on FPGAs that present some interesting 
metrics. 
Among the several proposals reported in this field, the TRNG 
proposed by Varchola and Drutarovsky [29] stands out because 
it is so lightweight. This design takes advantage of the 
metastability of a new bistable structure, the transition effect ring 
oscillator (TERO). In terms of area (two CLBs), this TRNG 
is more lightweight than our proposal but presents a very poor 
throughput (250 kb/s). In addition, experiments show that 
proper placement and routing strategies are essential. 
Exploiting some features of embedded RAMs has become 
a popular principle because of the high throughput 
that can be achieved. Among the proposals that take advantage 
of SRAMs, those that use write collisions to extract entropy 
are worth mentioning [30], [31]. The key idea here consists 
of generating a conflict at a particular address by trying to 
write opposite values at the same time. In comparison with 
our TRNG, these proposals present better results in terms of 
area and throughput, but their portability constitutes a handicap 
because an enrolment process is necessary in order to identify 
distinctive BRAMs in each FPGA. 
Another interesting proposal based on high fanout nets was 
presented by Cret et al. in [32]. Its performance is remarkable 
regarding portability and its high throughput (60 Mb/s). 
However, in terms of area, our proposal outperforms this 
design. Very recently, Wieczorek presented a new FPGA-based 
TRNG [33] that offers metrics similar to those of our design. 
The portability of this TRNG is currently under study because 
only a Xilinx implementation has been reported. Another 
interesting work is the complementary design proposed in [34], 
which outperforms ours in terms of throughput but whose 
portability has not been deeply studied since the results are only 
validated on a Virtex-6 FPGA. Furthermore, the design in [34] 
includes a place and route procedure to guarantee the TRNG 
randomness, which implies some extra effort for the designer 
when implementing the TRNG in different FPGAs. 
All in all, our TRNG presents an attractive tradeoff among 
hardware footprint, throughput, and portability when compared 
with existing proposals. 
VI. CONCLUSION
There is a wide set of applications, ranging from security 
services to simulations, computer games, and problem solving 
tools, where pseudorandom number generators play a central 
role. In many cases, as a consequence of the high throughput 
required and the high-quality randomness demanded, the use
of software-based solutions is simply infeasible. Motivated by
this, many FPGA-based proposals have appeared recently.
TRNGs in FPGAs mainly exploit metastability and jitter
phenomena as sources of randomness. In this paper, we have
proposed a TRNG based on CS, which is a phenomenon that
seems to provide good results in previous proposals. Most
previous works based on CS rely on either a PLL or an RO. The
use of these components has two major drawbacks.
1) It makes the design dependent on the FPGA vendor; for
instance, not all FPGA vendors support PLLs.
2) It requires manual placement and routing to set
particular frequencies for each device.
To avoid these two drawbacks, we have proposed a novel 
design where ROs or PLLs are replaced by STRs. We argue that 
the use of STRs is very convenient, because it provides 
robustness against frequency and voltage variations while 
simultaneously offering one independent source of entropy for each ring 
stage. Thus, the resulting TRNG combines the power of CS and 
the robustness and portability linked to STRs. Furthermore, our 
design does not depend on the FPGA vendor, and the placement 
and routing is performed automatically by the synthesis tool.
Our proposal outperforms all previous TRNGs based on CS 
and its throughput could be further increased if we relax our 
requirements about the quality of the random signals before the 
postprocessing (e.g., for non-cryptographic applications). We 
have studied in detail the most restrictive design with a 
sampling frequency set to 1 MHz. In terms of randomness, our 
TRNG passes all batteries of tests for checking the randomness 
of a random number generator (ENT, DIEHARD, and AIS31), 
as well as the NIST suite, which is intended for evaluating generators 
designed for cryptographic applications.
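To give a flavor of what such batteries check, the sketch below implements the frequency (monobit) test, one of the simplest tests in the NIST suite. It is an illustrative Python example only and does not reproduce the evaluation carried out in this paper; the function names are hypothetical, and the 0.01 significance level follows the default suggested in the NIST documentation.

```python
import math
from typing import Sequence


def monobit_p_value(bits: Sequence[int]) -> float:
    """Return the p-value of the frequency (monobit) test for a bit sequence."""
    n = len(bits)
    if n == 0:
        raise ValueError("at least one bit is required")
    # Map 0 -> -1 and 1 -> +1, then sum: a balanced sequence sums near zero.
    s_n = sum(1 if b else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    # Under the randomness hypothesis, s_obs follows a half-normal distribution.
    return math.erfc(s_obs / math.sqrt(2))


def passes_monobit(bits: Sequence[int], alpha: float = 0.01) -> bool:
    """Accept the sequence when the p-value is at least the significance level."""
    return monobit_p_value(bits) >= alpha


if __name__ == "__main__":
    balanced = [0, 1] * 500   # perfectly balanced, so it passes this test
    biased = [1] * 1000       # all ones, so it fails
    print(passes_monobit(balanced), passes_monobit(biased))
```

Note that a sequence such as the alternating pattern above passes this single test despite being fully predictable, which is why complete batteries combine many complementary tests.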
REFERENCES
[1] S. Saab, J. Hobeika, and I. Ouaiss, “A novel pseudorandom noise and
band jammer generator using a composite sinusoidal function,” IEEE
Trans. Signal Process., vol. 58, no. 2, pp. 535–543, Feb. 2010.
[2] J.-L. Danger, S. Guilley, and P. Hoogvorst, “High speed true random
number generator based on open loop structures in FPGAs,”
Microelectron. J., vol. 40, no. 11, pp. 1650–1656, 2009.
[3] R. Vaidyanathaswami and A. Thangaraj, “Robustness of physical layer
security primitives against attacks on pseudorandom generators,” IEEE
Trans. Commun., vol. 62, no. 3, pp. 1070–1079, Mar. 2014.
[4] X. Fang, Q. Wang, C. Guyeux, and J. M. Bahi, “FPGA acceleration of a
pseudorandom number generator based on chaotic iterations,” J. Inf. Sec.
Appl., vol. 19, no. 1, pp. 78–87, 2014.
[5] D. B. Thomas and W. Luk, “The LUT-SR family of uniform random number
generators for FPGA architectures,” IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 21, no. 4, pp. 761–770, Apr. 2013.
[6] P. Wieczorek, “Dual-metastability FPGA-based true random number
generator,” Electron. Lett., vol. 49, no. 12, pp. 744–745, Jun. 2013.
[7] D. Lubicz and N. Bochard, “Towards an oscillator based TRNG with a
certified entropy rate,” IEEE Trans. Comput., vol. 64, no. 4, pp. 1191–1200,
Apr. 2015.
[8] V. B. Suresh and W. Burleson, “Entropy extraction in metastability-based
TRNG,” in Proc. IEEE Int. Symp. Hardware-Oriented Sec. Trust (HOST),
Jun. 2010, pp. 135–140.
[9] V. Fischer and M. Drutarovsky, “True random number generator embedded
in reconfigurable hardware,” in Proc. Int. Workshop Cryptogr.
Hardware Embedded Syst. (CHES’02), 2002, vol. 2523, pp. 415–430.
[10] P. Kohlbrenner and K. Gaj, “An embedded true random number generator
for FPGAs,” in Proc. 12th Int. Symp. Field Programm. Gate Arrays
(ACM/SIGDA’04), 2004, pp. 71–78.
[11] S. Sunar, W. J. Martin, and D. R. Stinson, “A provably secure true random
number generator with built-in tolerance to active attacks,” IEEE Trans.
Comput., vol. 58, no. 1, pp. 109–119, Jan. 2007.
[12] K. Wold and C. H. Tan, “Analysis and enhancement of random number
generator in FPGA based on oscillator rings,” Int. J. Reconfig. Comput.,
vol. 2009, pp. 4:1–4:8, 2009.
[13] N. Bochard, F. Bernard, V. Fischer, and B. Valtchanov, “True-randomness
and pseudo-randomness in ring oscillator-based true random number
generators,” Int. J. Reconfig. Comput., vol. 2010, 2010, Article ID 879281
[Online]. Available: http://dx.doi.org/10.1155/2010/879281
[14] P. Bayon et al., “Contactless electromagnetic active attack on ring
oscillator based true random number generator,” in Proc. 3rd Int. Workshop
Construct. Side-Channel Anal. Secure Des. (COSADE), 2012,
pp. 151–166.
[15] A. T. Markettos and S. W. Moore, “The frequency injection attack on
ring-oscillator-based true random number generators,” in Proc. 11th Int.
Workshop Cryptogr. Hardware Embedded Syst., 2009, pp. 317–331.
[16] A. Cherkaoui, V. Fischer, L. Fesquet, and A. Aubert, “A very high
speed true random number generator with entropy assessment,” in Proc.
Int. Workshop Cryptogr. Hardware Embedded Syst. (CHES’13), 2013,
vol. 8086, pp. 179–196.
[17] F. Bernard, V. Fischer, and B. Valtchanov, “Mathematical model of
physical RNGs based on coherent sampling,” Tatra Mt. Math. Publ., vol. 45,
pp. 1–14, 2010.
[18] O. Cret, A. Suciu, and T. Gyorfi, “Practical issues in implementing
TRNGs in FPGAs based on the ring oscillator sampling method,”
in Proc. 10th Int. Symp. Symbol. Numer. Algorithms Sci. Comput.
(SYNASC’08), 2008, pp. 433–438.
[19] B. Valtchanov, V. Fischer, and A. Aubert, “Enhanced TRNG based on
the coherent sampling,” in Proc. 3rd Int. Conf. Signals Circuits Syst.
(SCS’09), 2009, pp. 1–6.
[20] I. E. Sutherland, “Micropipelines,” ACM Commun., vol. 32, no. 6,
pp. 720–738, 1989.
[21] A. Cherkaoui, V. Fischer, A. Aubert, and L. Fesquet, “Comparison of
self-timed ring and inverter ring oscillators as entropy sources in FPGAs,” in
Proc. Des. Autom. Test Eur. Conf. Exhib. (DATE’12), 2012,
pp. 1325–1330.
[22] A. Winstanley and M. Greenstreet, “Temporal properties of self-timed
rings,” in Correct Hardware Design and Verification Methods, vol. 2144.
New York, NY, USA: Springer, 2001, pp. 140–154.
[23] M. Dichtl and J. D. Golic, “High-speed true random number generation
with logic gates only,” in Cryptographic Hardware and Embedded
Systems (CHES), vol. 4727, P. Paillier and I. Verbauwhede, Eds. New
York, NY, USA: Springer, 2007, pp. 45–62.
[24] A. Rukhin et al., “A statistical test suite for random and pseudorandom
number generators for cryptographic applications,” Natl. Inst. Stand.
Technol., Gaithersburg, MD, USA, Tech. Rep., 2010 [Online]. Available:
http://csrc.nist.gov/rng/
[25] P. Haddad, Y. Teglia, F. Bernard, and V. Fischer, “On the assumption of
mutual independence of jitter realizations in P-TRNG stochastic models,”
in Proc. Des. Autom. Test Eur. Conf. Exhib. (DATE’14), 2014, pp. 1–6.
[26] J. Walker. (1998). Randomness Battery [Online]. Available:
http://www.fourmilab.ch/random/
[27] G. Marsaglia. (1996). The Marsaglia Random Number CDROM
Including the Diehard Battery of Tests of Randomness [Online].
Available: http://stat.fsu.edu/pub/diehard
[28] W. Schindler and W. Killmann, “Evaluation criteria for true
(physical) random number generators used in cryptographic
applications,” in Proc. Revised Papers 4th Int. Workshop Cryptogr.
Hardware Embedded Syst., 2003, pp. 431–449 [Online]. Available:
http://dl.acm.org/citation.cfm?id=648255.752732
[29] M. Varchola and M. Drutarovsky, “New high entropy element for FPGA
based true random number generators,” in Cryptographic Hardware
and Embedded Systems (CHES 2010), vol. 6225, S. Mangard and
F.-X. Standaert, Eds. New York, NY, USA: Springer, 2010, pp. 351–365.
[30] T. Guneysu and C. Paar, “Transforming write collisions in block RAMs
into security applications,” in Proc. Int. Conf. Field-Programm. Technol.
(FPT’09), Dec. 2009, pp. 128–134.
[31] T. Gyorfi, O. Cret, and A. Suciu, “High performance true random number
generator based on FPGA block RAMs,” in Proc. IEEE Int. Symp.
Parallel Distrib. Process. (IPDPS’09), May 2009, pp. 1–8.
[32] O. Cret, T. Gyorfi, and A. Suciu, “Implementing true random number
generators based on high fanout nets,” Rom. J. Inf. Sci. Technol., vol. 15,
no. 3, pp. 277–298, 2012 [Online]. Available: www.scopus.com
[33] P. Wieczorek, “An FPGA implementation of the resolve time-based true
random number generator with quality control,” IEEE Trans. Circuits
Syst. I: Reg. Papers, vol. 61, no. 12, pp. 3450–3459, Dec. 2014.
[34] X. Yang and R. C. C. Cheung, “A complementary architecture for
high-speed true random number generator,” in Proc. Int. Conf.
Field-Programm. Technol. (FPT), Dec. 2014, pp. 248–251.
Honorio Martin received the Ph.D. degree in advanced electronics 
systems from Universidad Carlos III de Madrid, Spain, in 2015.
He is a Postdoctoral Researcher with the Department of Electronics 
Technology, Universidad Carlos III de Madrid, Madrid, Spain. His 
research interests include the study of lightweight cryptography 
hardware implementations, radio-frequency identification systems, and 
low-power designs.
Pedro Peris-Lopez received the M.Sc. degree in telecommunications 
engineering and the Ph.D. degree in computer science from Universidad 
Carlos III de Madrid, Spain, in 2007.
He is a Visiting Lecturer with the Department of Computer Science, 
Universidad Carlos III de Madrid, Madrid, Spain. He has authored 
a great number of papers in specialized journals and conference 
proceedings on radio-frequency identification (RFID) systems and 
implantable medical devices (IMD). His research interests include 
protocol design, primitive design, lightweight cryptography, cryptanalysis, 
RFID, and IMD.
Juan E. Tapiador received the M.Sc. and Ph.D. degrees in computer 
science from the University of Granada, Granada, Spain, in 2000 and 
2004, respectively.
He is an Associate Professor with the Department of Computer 
Science, Universidad Carlos III de Madrid (UC3M), Madrid, Spain. 
Between 2009 and 2011, he was a Research Associate with the University 
of York, York, U.K., before joining UC3M. His research interests include 
applied cryptography, and computer and network security.
Enrique San Millan received the M.Sc. degree in mathematics from La 
Rioja University, Logroño, Spain, in 1996 and the Ph.D. degree in 
mathematics engineering from the Universidad Carlos III de Madrid, Madrid, 
Spain, in 2003.
He is an Associate Professor with the Department of Electronics 
Technology, Universidad Carlos III de Madrid. His research interests 
include hardware design of digital circuits and systems for several fields 
(cryptography, biometry, fault tolerant systems, and communications), 
and computer assisted design (CAD) tools for design automation and 
optimization of digital integrated circuits.
