An efficient high speed AES implementation using traditional FPGA and LabVIEW FPGA platforms by Rao, Muzaffar et al.
An efficient high speed AES implementation using 
Traditional FPGA and LabVIEW FPGA platforms 
 
Muzaffar Rao*, Admir Kaknjo, Edin Omerdic, Daniel Toal and Thomas Newe 
Department of Electronic and Computer Engineering 
Centre of Robotics & Intelligent Systems (CRIS) 
University of Limerick, Ireland 
Email: *muzaffar.rao@ul.ie 
 
 
 Abstract – The LabVIEW FPGA platform is based on a
graphical programming approach, which simplifies FPGA
programming and I/O interfacing. LabVIEW FPGA
significantly improves design productivity and helps to reduce
time to market. The traditional FPGA platform, on the other
hand, helps to obtain an efficient, optimized design by
providing bit-level control through HDL programming
languages. This work uses both the traditional and the LabVIEW
FPGA platforms to obtain an optimized high speed design of AES
(Advanced Encryption Standard). AES is considered to be a
secure and reliable cryptographic algorithm and is used
worldwide to provide encryption services, which hide
information during communication over untrusted networks such
as the Internet. Here, an AES core is proposed to secure the
communication between an ROV (Remotely Operated Vehicle) and
its control station in a marine environment, but the core can
fit into any other high speed electronic communication system.
This work provides encryption of 128-byte, 256-byte and
512-byte sets of inputs (individually and simultaneously) using
a 128-bit key. In the simultaneous implementation, all of the
above sets of inputs are encrypted in parallel. The
simultaneous implementation achieves a throughput in the Gbps
range.
   
 Index Terms – AES, FPGA, LabVIEW FPGA, high speed 
 
I.  INTRODUCTION 
 AES (Advanced Encryption Standard) [1] is a broadly
accepted symmetric encryption algorithm, which is
used to secure electronic communication. A symmetric
cipher uses the same key for encryption and decryption, which
means that the sender and the receiver must both use the
same secret key. AES was selected in 2001 by NIST
(National Institute of Standards and Technology) after a long
standardization and evaluation process [2]. AES replaced its
predecessor DES (Data Encryption Standard), which is no
longer considered secure [3].
 Generally, AES can be implemented in hardware or
software. Software solutions are normally slower than
hardware solutions [4] because of the high execution
overhead of each individual operation in a microprocessor.
Hardware solutions are available on two platforms, namely ASIC
(Application Specific Integrated Circuit) and FPGA (Field
Programmable Gate Array). ASICs can provide a highly optimized
design in terms of area, speed and power, but an ASIC design
cannot be changed after the chip has been manufactured. The
FPGA is the platform that fills the gap between ASICs and
software (microprocessors) by providing high speed as well as
high flexibility (re-configurability). For this reason an FPGA
is used here for the AES implementation.
 FPGAs are ICs that contain large numbers of unconnected
logic elements/LUTs, whose function is determined by
downloading a bitstream to the FPGA that specifies how the
gates interconnect. The FPGA turns semiconductor switches on
and off in accordance with this wiring list, which makes the
connections modifiable in the field. Because of their moderate
cost, re-configurability and high operating speed, FPGAs are a
fast-growing way to implement digital hardware. Nowadays, FPGAs
are used everywhere [5], for example in design, test and
control.
 The Xilinx tools use an HDL to program/configure
traditional Xilinx FPGAs [6][7][8].
This approach provides close control over the FPGA, which helps
to design an optimized core within the targeted constraints.
However, it cannot be used without a deep understanding
of HDLs, and it involves additional complications
regarding I/O interfacing and data communication. The rapid
rise in the use of FPGAs has created a need for FPGA design
tools that help designers to become productive quickly. This
need is fulfilled by LabVIEW FPGA.
 LabVIEW FPGA [9] is based on a graphical
programming approach and does not use any HDL.
This platform has direct control over all of the I/O signals.
LabVIEW FPGA significantly reduces the expertise and time
required for application development. However, this approach
does not help to obtain an optimized design, because of the
high level graphical programming approach and the lack of
close control over FPGA resources. For the majority of LabVIEW
FPGA developers this is not a problem, but when an optimized
design is needed it can be a significant hurdle. To
solve this issue, LabVIEW FPGA provides an option called the
IPIN (IP Integration Node) [10], which is used to import an
optimized design from the traditional FPGA platform. The IPIN
creates the interface between LabVIEW FPGA and the
physical interface.
 This work uses the IPIN to import the optimized AES
core design into LabVIEW FPGA, and so takes advantage of both
the traditional and the LabVIEW FPGA platforms. AES is
implemented in two phases. In the first phase the traditional
FPGA approach is used to write an optimized HDL program; in the
second phase the first-phase design is imported into LabVIEW
FPGA using the IPIN. A parallel architecture is used to
provide simultaneous encryption of multiple sets of inputs of
128 bytes, 256 bytes and 512 bytes. This encryption uses a
128-bit key. The AES core is mainly designed to
secure the communication of a marine ROV with its control
station, but the proposed design can fit into any electronic
communication system that needs encryption services.
 This paper is organized as follows: the AES algorithm is
discussed in section II, and traditional FPGA and LabVIEW FPGA
are compared in section III. A brief discussion of why FPGAs
are suited to cryptographic algorithms is presented in section
IV. The marine application is discussed in section V, and the
proposed AES implementation is detailed in section VI.
Performance results and the conclusion are given in section VII
and section VIII respectively.
 
II. ADVANCED ENCRYPTION STANDARD (AES) 
  
 As mentioned earlier, AES is widely regarded as a safe choice
for encryption: it is the standard used by the US government
and is backed by leading experts in cryptography. Since
AES was introduced, attackers have continuously tried to break
it, but have not succeeded. The only practical successful
attacks against AES are side-channel attacks [11], which are
based on weaknesses in the implementation or key management
of a specific AES-based encryption system.
 AES comprises three block ciphers: AES-128 (the variant
used here), AES-192 and AES-256. Each cipher encrypts and
decrypts data in blocks of 128 bits using keys of 128, 192 and
256 bits respectively. All key lengths are considered sufficient
to protect secret information. The number of rounds depends on
the key length: 10 rounds for 128-bit keys, 12 rounds for
192-bit keys and 14 rounds for 256-bit keys. Each round
consists of substitution, transposition and mixing of the
input, and the rounds together transform the plaintext into the
final ciphertext. The first step of AES is to put the data into
an array (the state); the AES transformations are then repeated
over a number of encryption rounds. The first transformation is
the “Byte Substitution” operation, which substitutes each byte
using a substitution table; the second is the “Shift Row”
operation, which shifts the rows of the state; the third is the
“Mix Columns” operation, which mixes the columns. The last
transformation is the “Add Round Key” operation, in which a
simple XOR is performed on each column using a different round
key in each round. The details of each transformation are
given in [1].
 AES has excellent confusion and diffusion
characteristics. Confusion is provided by the “Byte
Substitution” operation through the S-box, which is highly
non-linear and helps to destroy patterns. Diffusion is
achieved by the “Shift Row” and “Mix Column”
operations, in which the influence of each plaintext bit is
spread over many ciphertext bits. To increase the amount of
scrambling, confusion and diffusion are repeated a number
of times for each input. The secret key is mixed in at every
stage by the “Add Round Key” operation so that an
attacker cannot pre-calculate what the cipher does. A
generalized AES structure is given in Fig. 1. The pre-round
transformation XORs the plaintext with the initial key, and the
last round applies three operations: “Byte Substitution”,
“Shift Row” and “Add Round Key”. The intermediate rounds
differ from the last round only in the “Mix Column” operation,
which every intermediate round includes in addition
to the three operations of the last round.
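
The round structure described above can be sketched in software. The following is a minimal Python model of AES-128 block encryption, written for illustration only and not taken from the HDL design of this paper; the function names (`xtime`, `build_sbox`, `expand_key`, `encrypt_block`) and the flat column-major state layout are assumptions made for this sketch.

```python
# Illustrative AES-128 block encryption following the FIPS-197 round structure.
# All names here are hypothetical; this is not the paper's HDL core.

def xtime(b):
    """Multiply a byte by 2 in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b

def gmul(a, b):
    """Multiply two bytes in GF(2^8) using shift and XOR."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r

def build_sbox():
    """Derive the S-box: multiplicative inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next(y for y in range(1, 256) if gmul(x, y) == 1) if x else 0
        s = inv
        for i in range(1, 5):  # XOR with the inverse rotated left by 1..4 bits
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def expand_key(key):
    """Return the 11 round keys (16 bytes each) for a 16-byte key."""
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        t = list(words[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]  # RotWord then SubWord
            t[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], t)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]

def shift_rows(s):
    """Row r of the column-major state is rotated left by r positions."""
    return [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def mix_columns(s):
    """Multiply each column by the fixed matrix (2 3 1 1 / 1 2 3 1 / ...)."""
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        for r in range(4):
            a1 = a[(r + 1) % 4]
            out.append(xtime(a[r]) ^ xtime(a1) ^ a1 ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return out

def encrypt_block(plaintext, key):
    """Encrypt one 16-byte block with a 16-byte key."""
    round_keys = expand_key(key)
    state = [p ^ k for p, k in zip(plaintext, round_keys[0])]  # pre-round XOR
    for rnd in range(1, 11):
        state = shift_rows([SBOX[b] for b in state])
        if rnd < 10:  # the last round has no Mix Column
            state = mix_columns(state)
        state = [a ^ k for a, k in zip(state, round_keys[rnd])]
    return bytes(state)
```

Calling `encrypt_block(bytes.fromhex("00112233445566778899aabbccddeeff"), bytes(range(16)))` reproduces the AES-128 example vector of FIPS-197 [1]. The sketch favours readability over speed, for instance by deriving the S-box at import time rather than storing it as a table.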
 There are a number of block cipher (AES) modes of
operation, such as ECB, CBC, CFB, OFB and CTR. A mode of
operation describes how to repeatedly apply a cipher's
single-block operation to securely transform amounts of data
larger than a block. Here, CTR mode (counter mode) is used
because it is fully parallelizable (for both encryption and
decryption), which ultimately provides high throughput. In CTR
mode, both encryption and decryption use only the
encryption direction of the cipher, so there is no need to
implement the inverse functions and the key scheduling for
decryption. CTR mode encrypts the concatenation of the nonce
(also called the initial value) and an incrementing counter
(each of 64 bits). The encrypted output of this concatenated
input is then XORed with the plaintext to obtain the
ciphertext. The CTR mode operation is given in Fig. 2.
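
As a rough illustration of CTR mode, the Python sketch below builds each counter block from a 64-bit nonce and a 64-bit counter and XORs the encrypted counter block with the data. To stay self-contained it uses a stand-in keyed function (`toy_block_encrypt`, based on SHA-256) in place of AES; that stand-in, the big-endian byte order, and all function names are assumptions of this sketch, not details of the proposed core.

```python
import hashlib

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES block function (NOT AES): a keyed 128-bit output."""
    return hashlib.sha256(key + block).digest()[:16]

def counter_block(nonce: int, counter: int) -> bytes:
    """Concatenate a 64-bit nonce and a 64-bit counter (big-endian is assumed)."""
    return nonce.to_bytes(8, "big") + (counter % 2**64).to_bytes(8, "big")

def ctr_crypt(key, nonce, data, block_encrypt=toy_block_encrypt, first_counter=0):
    """Encrypt or decrypt: CTR mode applies the same operation in both directions."""
    out = bytearray()
    for i in range(0, len(data), 16):
        keystream = block_encrypt(key, counter_block(nonce, first_counter + i // 16))
        out.extend(d ^ k for d, k in zip(data[i:i + 16], keystream))
    return bytes(out)

if __name__ == "__main__":
    key, nonce = b"\x00" * 16, 0x0123456789ABCDEF
    message = bytes(range(40))
    ciphertext = ctr_crypt(key, nonce, message)
    # Decryption reuses the forward direction only; no inverse cipher is needed.
    print(ctr_crypt(key, nonce, ciphertext) == message)
```

Because every keystream block depends only on the key, the nonce and its own counter value, the blocks can be computed independently, which is the property the parallel architecture relies on.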
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Fig. 1 AES Structure (the 128-bit plaintext passes through the pre-round transformation, Round 1, Round 2, ..., Round n (last round) to give the 128-bit ciphertext; key expansion supplies the 128-bit round keys K0, K1, K2, ..., Kn)

III. TRADITIONAL FPGA VS LABVIEW FPGA 
 
 An FPGA is an integrated circuit that can be easily
reconfigured by designers to perform a completely different
function. An FPGA consists of thousands of CLBs (configurable
logic blocks) that are connected by programmable
interconnects. On reconfiguration, these CLBs and
programmable interconnects are changed to build a new digital
circuit. The best way to select an FPGA for a targeted
application is to compare FPGA devices on the basis of
supported clock frequency and space (in terms of slices and
BRAMs). The traditional FPGA and LabVIEW FPGA
platforms are discussed below.
A. Traditional FPGA 
 As mentioned earlier, the traditional FPGA design
platform is based on hardware description languages (VHDL
and Verilog). These low level languages have evolved into the
primary languages for architecting the circuit that runs on the
FPGA chip. The design is synthesized, mapped, placed and
routed using the Xilinx tools, and internal signals are
wired/connected to the external I/O ports. The truly parallel
nature of task execution on this FPGA platform is hard to
visualize in a sequential line-by-line flow. HDLs are based on
a dataflow model in which I/O is connected to a series of
function blocks.
 To verify traditional FPGA logic, a test bench is written
in HDL to wrap around and exercise the FPGA design by
driving inputs and checking outputs. The test bench is run
in a simulation environment that also reports the hardware
timing performance of the FPGA chip. After design and
verification, the FPGA design is fed into a compilation tool
that runs several complex steps, from synthesis to
bit file generation. The resulting bit file contains
information on how the components should be wired together.
There is also an option to specify a mapping of signal names to
the targeted FPGA pins. An overview of the traditional FPGA
architecture is shown in Fig. 3.
 LabVIEW FPGA was introduced to avoid the complexities
involved in traditional FPGA design and to enable graphical
programming (instead of HDL).
B. LabVIEW FPGA 
 The development of graphical high level design tools,
such as LabVIEW, has removed the major hurdles of the
traditional FPGA platform. The LabVIEW programming
environment suits FPGA programmers who do
not have HDL expertise. The graphical programming
environment clearly represents parallelism and data flow, so
users with or without experience in traditional
FPGA design can easily work with LabVIEW FPGA. To
simulate and verify the behavior of FPGA logic, LabVIEW
offers features directly in the development environment. Test
benches can be created without knowledge of low level
HDLs. In addition, the flexibility of the LabVIEW
environment helps more advanced users to model the timing
and logic of their designs by exporting to cycle-accurate
simulators. The LabVIEW FPGA compilation process generates
reports and errors (if any) as the compilation stages are
completed. For example, if timing errors occur because of the
FPGA design, LabVIEW highlights the critical paths graphically
to accelerate debugging. In short, LabVIEW FPGA is
fully equipped with built-in simulation capabilities and
debugging tools that can catch as many implementation errors
as possible before compilation completes. The LabVIEW
FPGA platform is shown in Fig. 4.

Fig. 2 AES CTR mode operation: a) Encryption, b) Decryption

 LabVIEW FPGA provides IP developed by NI (National
Instruments) and Xilinx, ranging from basic functions such as
counters to more advanced algorithms such as video decoding.
Still, a LabVIEW design cannot compete with a traditional
design in terms of optimization. This issue is resolved in the
LabVIEW FPGA platform by the IPIN. Using the IPIN, an
optimized design can be imported into LabVIEW FPGA to
make the post-design process easy and to link the design with
high level LabVIEW applications.
 
 
Fig. 3 Different Parts of traditional FPGA platform 
 
 
 
Fig. 4 Different Parts of LabVIEW FPGA platform 
 
IV. WHY FPGA IS NEEDED FOR CRYPTOGRAPHIC ALGORITHMS? 
 
 An efficient implementation of cryptographic algorithms
such as AES is a challenging task. In general, at least the
following requirements must be satisfied by an
implementation of a cryptographic algorithm.
A. Speed 
 Cryptographic algorithms should be implemented in
such a way that they do not slow down a system considerably.
These algorithms are computationally intensive, so the need
for a fast implementation is obvious [12]. This high speed
requirement can be achieved with an FPGA because of its
high performance features, while it is difficult to achieve the
required speed (performance) with software solutions
[13][14].
B. Resources 
 On an FPGA, cryptographic algorithms can be
implemented by targeting the FPGA resources required for
a specific environment. For example, in a resource-constrained
environment, tight constraints are set on the available
resources, and only a small portion
of them is dedicated to cryptography.
C. Easy to use for an end user 
 Cryptographic algorithms should be implemented in such
a way that the implementation is easy for an end user to use.
Otherwise, it is likely that the algorithms will not be
used or will be misused in a way that compromises security.
This requirement rarely has any direct impact on hardware
implementations, because the user interface is usually the
software engineer's work rather than the hardware engineer's.
However, the hardware designer can take the user's
comfort into account indirectly through other requirements, for
example by using efficient implementation methods to provide an
optimized design that fulfils the user's requirements. The
“Easy to use for an end user” requirement can be readily
achieved with LabVIEW FPGA.
D. Exactness 
 Cryptographic algorithms should be implemented exactly
according to their specifications, as they
will not work correctly if even a single bit is
incorrect. However, optimizations inside the algorithm are
possible as long as they do not change the output. Such
optimizations can be readily achieved on an FPGA while keeping
the implementation of the algorithm exact.
E. Security 
 Cryptographic algorithms should be implemented in such
a way that they do not leak any information that could
compromise data security. In an FPGA implementation,
information can leak only through side
channel attacks, and the chance of this is reduced in a
high speed implementation such as the one presented here.
 
V. MARINE APPLICATION   
 
 The AES core presented here is designed to provide
secure communication between the ROV and the control station of
a marine application [15]. This marine application was
developed to control a mini ROV through the Internet. The mini
ROV has been used as a demonstration platform to validate the
feasibility of long endurance deployment. The application has
been proven to be simple, robust and stable.
 The targeted marine application enables the development
of a completely new generation of “Inspection ROV (I-ROV)”:
low cost, long endurance robotic systems for routine
inspection of marine renewable energy devices and offshore
subsea oil & gas installations, controlled from a remote
control station through the Internet in real time. The main
features of the I-ROV include long endurance deployment on
site; remote control; detection and reporting of abnormalities,
faults and issues for repair and maintenance; and monitoring of
the offshore plant without the expense of mobilizing ships.
 The marine application involves the communication of video
and control signals. This communication between the ROV
and the control station takes place over the Internet, which is
an insecure channel. The security solution presented here is
proposed to secure this communication.
VI. PROPOSED AES IMPLEMENTATION 
 
 As mentioned in section I, the proposed AES
implementation is divided into two phases:
A. Phase 1: Design of an optimized AES core on the
traditional FPGA platform. This phase involves the
following steps:
   1. Implementation of the AES transform operations
   2. Parallel architecture for 128-bytes, 256-bytes
      and 512-bytes encryption
   3. Parallel architecture for simultaneous
      execution of 128-bytes, 256-bytes and 512-bytes
      encryption
B. Phase 2: Import of the “Phase 1” design into LabVIEW
FPGA.
A. Phase 1 
1. AES transform operations implementation  
 The AES transform operations are implemented in the
sequence described in section II. The key can be changed
at any time during operation, provided that both sides of the
transmission hold the same key. Initially, the plaintext is
XORed with the initial key in the initial “Add Round Key”
operation. Meanwhile, the key bytes are applied to the key
expansion module to generate the respective round keys. The
key expansion module is implemented in combinational logic.
The key bytes that need to be substituted (using the S-box)
are first applied to the BRAM module, and the resulting 4
substituted bytes are applied to the key expansion module. The
key expansion is shown in Fig. 5.
 After the initial “Add Round Key” operation, the resulting
128-bit state is saved into 16 registers of 8 bits each.
These registers are updated whenever a new intermediate 128-bit
state is available. The contents of these 16 registers are
applied to an instantiated BRAM module, which implements the
“Byte Substitution” operation of AES. This instantiated block
consists of 8 BRAMs. BRAMs [16] are dedicated memory blocks
available inside FPGAs; the number available depends on the
FPGA device. A BRAM can be configured as a single-port or
dual-port RAM or ROM for efficient data storage. Here, the
BRAMs are generated with the Xilinx CORE Generator tool and
configured as dual-port ROMs that store the AES S-box. With a
dual-port BRAM, two 8-bit look-up values are read at a time,
one for each of the two 8-bit input addresses. Both ports of
each BRAM are driven by the same clock. Eight dual-port BRAMs
are used so that the substituted output for all 16 bytes is
available within a single clock cycle, which improves
throughput and reduces the number of clock cycles. The S-box
values are initialized in the BRAMs using a coefficient (COE)
file. The 16 intermediate registers are divided into pairs, and
the two 8-bit registers of each pair become the addresses of a
single BRAM. In this way, the contents of all sixteen 8-bit
registers become BRAM addresses, while the corresponding
look-up values are taken from the respective BRAM output ports.
 All the resulting bytes of the BRAMs are routed to the
“Mix Column” operation in such a way that they fulfil the
rotation requirement of the “Shift Row” operation. This
technique uses no FPGA logic resources for the “Shift Row”
operation, which helps to improve throughput and reduce area.
In the “Mix Column” operation, the updated state is multiplied
by the multiplication matrix [1]; the multiplication uses a
shift and XOR method [17]. The updated bytes of the “Mix
Column” operation are XORed with the round key to implement the
“Add Round Key” operation. The resulting bytes are stored in
the same 16 intermediate registers, and the remaining rounds
are completed using an iterative approach. AES CTR mode is used
to obtain better throughput. The pseudocode of Fig. 6 is used
for the implementation of the AES transform operations.
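
The S-box look-up scheme described above can be modelled in a few lines of Python. In this sketch, which is only a software analogy of the hardware, each `DualPortRom` object stands for one dual-port BRAM holding the S-box, eight of them serve the 16 state registers in pairs, and the “Shift Row” step is a fixed index permutation (`SHIFT_SRC`) applied to the ROM outputs, so it costs no computation. The class and function names, and the column-major register numbering, are assumptions made for illustration.

```python
# Software analogy of the 8 dual-port S-box ROMs with Shift Row folded into
# the output routing. Names and register numbering are hypothetical.

def _gmul(a, b):
    """GF(2^8) multiplication by shift and XOR (polynomial 0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = ((a << 1) ^ 0x11B) & 0xFF if a & 0x80 else a << 1
        b >>= 1
    return r

def build_sbox():
    """Compute the AES S-box; in the hardware it is preloaded from a COE file."""
    table = []
    for x in range(256):
        inv = next(y for y in range(1, 256) if _gmul(x, y) == 1) if x else 0
        s = inv
        for i in range(1, 5):
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
        table.append(s ^ 0x63)
    return table

class DualPortRom:
    """Read-only memory with two address ports read in the same cycle."""
    def __init__(self, contents):
        self.contents = list(contents)

    def read(self, addr_a, addr_b):
        return self.contents[addr_a], self.contents[addr_b]

SBOX = build_sbox()
ROMS = [DualPortRom(SBOX) for _ in range(8)]  # 8 ROMs x 2 ports = 16 look-ups

# Output byte i (column-major, i = 4*column + row) is wired from this source.
SHIFT_SRC = [4 * ((c + r) % 4) + r for c in range(4) for r in range(4)]

def sub_bytes_shift_rows(registers):
    """One 'cycle': 16 parallel look-ups, then Shift Row as pure wiring."""
    substituted = []
    for k, rom in enumerate(ROMS):
        # Registers 2k and 2k+1 form the address pair of ROM k.
        substituted.extend(rom.read(registers[2 * k], registers[2 * k + 1]))
    return [substituted[src] for src in SHIFT_SRC]
```

With the round 1 input state of the FIPS-197 [1] cipher example, `193de3bea0f4e22b9ac68d2ae9f84808`, this function returns the bytes `d4bf5d30e0b452aeb84111f11e2798e5`, which is the state that the standard lists after “Shift Row”.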
 
 
 
 
 
  
 
 
    
                                  Fig. 5 AES Key Expansion 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
  
 
 
              Fig. 6 Pseudocode for AES Encryption  
START
0: If (reset)
     - Reset all outputs/intermediate registers
     - Go to state 0
   Else if (enable)
     - Take concatenated input of Nonce and counter for encryption
     - Go to state 1
   Else
     - Go to state 0 and wait for “enable” signal
1: If (Key_done)
     - Execute initial “Add Round Key” operation
     - Go to state 2
   Else
     - Go to state 1 and wait until completion of the key generation scheme
2: If (rounds < 10)
     - Execute “Byte Substitution” / “Shift Row”
     - Go to state 3
   Else
     - Execute final round (round without “Mix Column”)
     - Go to state 5
3:   - Execute “Mix Column” operation
     - Go to state 4
4:   - Execute “Add Round Key” (using round keys)
     - Increment round counter
     - Go to state 2
5:   - (resulted cipher) XOR (plain text)
     - Go to state 0
END
 
 
 
[Fig. 5 labels: the AES input key enters the Key Expansion Scheme, which outputs the AES 1st round key through the AES last round key and the Key_Done signal]
 
2. Parallel architecture for 128-bytes, 256-bytes and    
512-bytes encryption. 
 To encrypt 128 bytes, four AES blocks are executed in
parallel, and each block encrypts 256 bits. To
provide 256-bit encryption, the pseudocode of Fig. 6 is run
twice with different concatenated (counter and nonce)
combinations. This means that at state 5 of Fig. 6 another
concatenated input is provided and the same sequence is
repeated. In this way the same hardware resources are used for
each 128-bit encryption.
 Parallel execution of four blocks (each performing 256-bit
encryption), as shown in Fig. 7, encrypts a 128-byte
(1024-bit) input. This parallel execution helps to provide
better throughput. Each AES 128-bit encryption uses a unique
combination of nonce and counter (each of 64 bits).
The block diagram of Fig. 7 is extended to eight and
sixteen blocks for the encryption of 256-byte and 512-byte
inputs respectively.
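
The partitioning described above can be sketched as a small planning function. The Python below lists, for a given input size, which block handles which 128-bit slice on which pass. The counter numbering (`block * 2 + pass`) is an assumption of this sketch; the text only requires that every 128-bit encryption uses a unique nonce and counter combination.

```python
# Hypothetical sketch of how an input is split across the parallel AES blocks.

def plan_parallel_blocks(input_bytes, bits_per_pass=128, passes_per_block=2):
    """Return (block, pass, counter, first_bit, last_bit) for each encryption."""
    bytes_per_block = bits_per_pass // 8 * passes_per_block  # 32 bytes = 256 bits
    if input_bytes % bytes_per_block:
        raise ValueError("input must be a multiple of 32 bytes (padding needed)")
    plan = []
    for block in range(input_bytes // bytes_per_block):
        for pass_index in range(passes_per_block):
            counter = block * passes_per_block + pass_index  # assumed numbering
            first_bit = counter * bits_per_pass
            plan.append((block, pass_index, counter,
                         first_bit, first_bit + bits_per_pass - 1))
    return plan

if __name__ == "__main__":
    for size in (128, 256, 512):
        plan = plan_parallel_blocks(size)
        blocks = len({entry[0] for entry in plan})
        print(f"{size} bytes: {blocks} blocks, {len(plan)} 128-bit encryptions")
```

For 128, 256 and 512 bytes this yields 4, 8 and 16 blocks, matching the block counts given in the text, with two 128-bit encryptions per block.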
   
3. Parallel architecture for simultaneous execution of 
128-bytes, 256-bytes and 512- bytes. 
 For simultaneous execution of the sets of inputs
(128 bytes, 256 bytes and 512 bytes), the inputs are taken
separately and applied to their respective encryption blocks,
as shown in Fig. 8. To control which inputs are encrypted, an
“Enable” option is provided for each set of inputs. The
pseudocode of Fig. 9 is added to that of Fig. 6 to make this
simultaneous execution possible.
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
  
 && -> logical AND gate logic 
 ~ -> Not gate logic 
 
Fig. 9 Pseudocode for simultaneous execution of different set of inputs 
 
B. Phase 2: Import “Phase 1” design into LabVIEW FPGA. 
 As mentioned earlier, this work uses both FPGA
platforms, i.e. traditional FPGA and LabVIEW FPGA.
The implementation proposed in “Phase 1” provides an optimized
high speed AES design. To use this design with a
LabVIEW application (the marine application), the “Phase 1”
core is imported into LabVIEW FPGA. The IPIN,
mentioned in section III, is used for this purpose.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
  
If (enable128 && ~enable256 && ~enable512)
    Execute 128-bytes encryption
Else if (~enable128 && enable256 && ~enable512)
    Execute 256-bytes encryption
Else if (~enable128 && ~enable256 && enable512)
    Execute 512-bytes encryption
Else if (enable128 && enable256 && ~enable512)
    Execute 128-bytes and 256-bytes encryption
    Set the 512-bytes output to zero
Else if (enable128 && ~enable256 && enable512)
    Execute 128-bytes and 512-bytes encryption
    Set the 256-bytes output to zero
Else if (~enable128 && enable256 && enable512)
    Execute 256-bytes and 512-bytes encryption
    Set the 128-bytes output to zero
Else if (enable128 && enable256 && enable512)
    Execute 128-bytes, 256-bytes and 512-bytes encryption
Else
    Set all encryption outputs to zero
    Wait for selection of desired encryption
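
The eight-way selection above amounts to gating each output independently with its own enable signal. The Python sketch below shows this under one assumption that the pseudocode leaves open: when only a single size is enabled, the outputs of the other two sizes are also held at zero. The function name and the use of byte strings for the outputs are hypothetical.

```python
# Hypothetical model of the enable logic: each output is gated by its enable.

SIZES = (128, 256, 512)  # output widths in bytes

def select_outputs(enable128, enable256, enable512, out128, out256, out512):
    """Return the three outputs, forcing every non-enabled one to zero."""
    enables = (enable128, enable256, enable512)
    results = (out128, out256, out512)
    return tuple(
        result if enable else bytes(size)  # bytes(n) is n zero bytes
        for enable, result, size in zip(enables, results, SIZES)
    )

if __name__ == "__main__":
    encrypted = tuple(bytes([0xAA]) * size for size in SIZES)
    o128, o256, o512 = select_outputs(True, False, True, *encrypted)
    print(o128 == encrypted[0], o256 == bytes(256), o512 == encrypted[2])
```

Written this way, all eight branches collapse into three independent conditions, which is also how the logic could be expressed in an HDL.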
 
 
 
 
 
 
 
 
Fig. 7 AES Encryption for 128-bytes (1024-bits) input (four parallel “AES Encryption” blocks share the 128-bit key; they take plain text [0:255], [256:511], [512:767] and [768:1023] and produce cipher text [0:255], [256:511], [512:767] and [768:1023]. Each “AES Encryption” block performs encryption two times: it first encrypts the initial 128 bits of its plain text and then the remaining 128 bits, so that each block performs 256-bit encryption.)

Fig. 8 Simultaneous execution of AES 128-bytes, AES 256-bytes and AES 512-bytes Encryption (blocks: AES_128_bytes Encryption, AES_256_bytes Encryption, AES_512_bytes Encryption; inputs: Key [0:127], Clk, Reset, Enable 128-bytes Encryption, Enable 256-bytes Encryption, Enable 512-bytes Encryption, Input128-bytes [0:1023], Input256-bytes [0:2047], Input512-bytes [0:4095]; outputs: Done 128-bytes Encryption, Done 256-bytes Encryption, Done 512-bytes Encryption, Output128-bytes [0:1023], Output256-bytes [0:2047], Output512-bytes [0:4095])

 
 
Fig. 10 IPIN Node for simultaneous execution of different set of input bytes 
 
 The IPIN accepts VHDL files or synthesized files (NGC
files) generated on the traditional FPGA platform. Here, the
HDL design was implemented in Verilog; therefore, NGC files are
generated using the Xilinx tool and imported
into LabVIEW FPGA, where they are compiled
and a bit file is generated. The IPIN executes only within a
single-cycle Timed Loop. The detailed steps for importing a
traditional design into LabVIEW FPGA are given in [10]. Here,
the traditional designs for 128-byte, 256-byte and 512-byte
inputs are imported individually into the IPIN, and the design
for simultaneous execution of these inputs is imported
separately. The IPIN for simultaneous execution of selected
inputs is shown in Fig. 10.
VII. PERFORMANCE RESULTS 
 
 The proposed implementation uses the Xilinx 14.2 tool as
the traditional FPGA platform to develop/generate the NGC
files, together with LabVIEW 2017. The NI cRIO-9034 [18]
is used as the LabVIEW FPGA platform. The cRIO-9034 device
contains a Kintex-7 (XC7K325T) FPGA.
 The proposed implementations are tested/verified on both
FPGA platforms: the traditional FPGA flow uses the ISim
simulator and the LabVIEW FPGA flow uses the FPGA VI. The
frequencies supported by the targeted LabVIEW FPGA include
40 MHz, 80 MHz, 120 MHz, 160 MHz and 200 MHz. The proposed
implementation is tested at these frequencies to find the
maximum functional frequency of the proposed AES core.
Performance results with frequency and resource utilization
are summarized in Table I. As shown in Table I, the number
of BRAMs is further optimized during the
implementation (place and route) phase.
 Each block of Fig. 7 takes 122 clock cycles. Because these
blocks are executed in parallel, the ciphertext is generated in
122 clock cycles irrespective of the selected input size
(128 bytes, 256 bytes or 512 bytes). The throughput can be
calculated by using (1).
 
Throughput = (Block Size) * (Frequency) / (Number of clock cycles)     (1) 
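
As a check on (1), the short Python sketch below evaluates the formula for the configurations reported in Table I, using the 122 clock cycles stated above. The function names are hypothetical, and the two-decimal truncation is an assumption made because the tabulated values appear to be truncated rather than rounded (for example 1.678... is listed as 1.67).

```python
import math

def throughput_gbps(block_size_bits, frequency_mhz, clock_cycles=122):
    """Equation (1): block size times frequency over clock cycles, in Gbps."""
    return block_size_bits * frequency_mhz * 1e6 / clock_cycles / 1e9

def truncate(value, decimals=2):
    """Drop digits beyond the given number of decimals (assumed table style)."""
    scale = 10 ** decimals
    return math.floor(value * scale) / scale

if __name__ == "__main__":
    # (input size in bytes, clock frequency in MHz), as listed in Table I
    for size_bytes, freq in ((128, 200), (256, 200), (512, 160),
                             (128, 160), (256, 160)):
        gbps = throughput_gbps(size_bytes * 8, freq)
        print(f"{size_bytes} bytes @ {freq} MHz: {truncate(gbps)} Gbps")
```

For example, the 128-byte design at 200 MHz gives 1024 x 200e6 / 122, which is about 1.678 Gbps, in line with the 1.67 Gbps reported.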
 
TABLE I
PERFORMANCE RESULTS OF PROPOSED AES IMPLEMENTATION

Platform   AES Design                             Frequency (MHz)   Area (Slices)    BRAMs        Throughput (Gbps)
Kintex-7   128-bytes                              200               5110 (10%)       20 (4.5%)    1.67
Kintex-7   256-bytes                              200               6768 (13.3%)     36 (8.1%)    3.35
Kintex-7   512-bytes                              160               11187 (22.0%)    68 (15.3%)   5.37
Kintex-7   Simultaneous execution (128/256/512)   160               13588 (26.7%)    68 (15.3%)   1.34/2.68/5.37
 
VIII. CONCLUSION 
 
 This work provides an efficient AES implementation
using the traditional FPGA and LabVIEW FPGA platforms.
FPGAs are well suited hardware platforms for the
implementation of cryptographic algorithms because of their
re-configurability and high performance. The traditional
FPGA platform is used to write the HDL program and obtain an
optimized design, and this design is then imported into
LabVIEW FPGA. In this way an optimized design can be
easily used with any high level LabVIEW application.
 Here, a high speed AES core is proposed to provide
encryption for the marine application, which involves
communication between an ROV and a control station. The
proposed AES design supports individual and simultaneous
implementation of 128-byte, 256-byte and 512-byte
encryption using a parallel architecture. A pipelined
architecture was avoided because it requires buffer storage
to pass the output of each stage to the next, which would
result in more area utilization. The proposed implementation
can be used to provide security for high speed applications, as
it achieves Gbps-range throughput. Padding is needed when
encrypting data blocks of sizes other than 128 bytes, 256 bytes
and 512 bytes.
 Future work includes designing a LabVIEW application
to measure latency (with and without security) between the ROV
and the control station. This setup will be tested for latency
measurement of control as well as video signals during
transmission over a local network and the Internet.
ACKNOWLEDGMENT 
 The authors would like to thank the SFI Centre for Marine 
and Renewable Energy Ireland (MaREI), Grant References: 
12/RC/2302 and 14/SP/2740 for facilitating this work. 
 
 
  
 
REFERENCES 
 
1. NIST, “ADVANCED ENCRYPTION STANDARD (AES),” 2001. [Online]. Available: http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf. [Accessed: 16-Apr-2018].
2. K. Lee, “Advanced Encryption Standard (AES) Selection Process – How Rijndael Won,” March 2015. [Online]. Available: https://www.usna.edu/Users/math/wdj/_files/documents/sm473-capstone/Rinjdael-16WeekFinalDraft-KevinLee.pdf. [Accessed: 16-Apr-2018].
3. M. Curtin, Cracking the Data Encryption Standard. Springer New York, 2005.
4. T. Todman, G. Constantinides, S. Wilton, O. Mencer, W. Luk, and P. Cheung, “Reconfigurable computing: architectures and design methods,” IEE Proceedings: Computers and Digital Techniques, vol. 152, no. 2, Mar. 2005, pp. 193–207.
5. National Instruments, “FPGAs Are Everywhere – In Design, Test & Control.” [Online]. Available: http://www.ni.com/newsletter/50401/en/. [Accessed: 16-Apr-2018].
6. V. Sklyarov, I. Skliarova, and A. Sudnitson, “FPGA-based systems in information and communication,” 2011 5th Int. Conf. Appl. Inf. Commun. Technol., pp. 1–5, 2011.
7. G. Donzellini and D. Ponta, “From gates to FPGA: Learning digital design with Deeds,” Proc. 3rd Interdiscip. Eng. Des. Educ. Conf. IEDEC 2013, pp. 41–48, 2013.
8. “FPGA design flow overview,” 2009. [Online]. Available: http://www.fpgacentral.com/docs/fpga-tutorial/fpga-design-flow-overview. [Accessed: 16-Apr-2018].
9. National Instruments, “Getting Started With LabVIEW FPGA,” 2018. [Online]. Available: http://www.ni.com/tutorial/14532/en/. [Accessed: 16-Apr-2018].
10. National Instruments, “Importing External IP Into LabVIEW FPGA.” [Online]. Available: http://www.ni.com/tutorial/7444/en/. [Accessed: 16-Apr-2018].
11. D. technologies, “An introduction to side-channel attacks.” [Online]. Available: http://gauss.ececs.uc.edu/Courses/C653/lectures/SideC/intro.pdf. [Accessed: 16-Apr-2018].
12. S. Ravi, A. Raghunathan, P. Kocher, and S. Hattangady, “Security in embedded systems: Design challenges,” ACM Transactions on Embedded Computing Systems, vol. 3, no. 3, Aug. 2004, pp. 461–491.
13. P. Hamalainen, Cryptographic Security Design and Hardware Architectures for Wireless Local Area Networks, Ph.D. thesis, Tampere University of Technology, 2006.
14. K. Jarvinen, “STUDIES ON HIGH-SPEED HARDWARE IMPLEMENTATION OF CRYPTOGRAPHIC ALGORITHMS,” Helsinki University of Technology, 2008.
15. Omerdic, T. Toal, G. Dooly, and A. Kaknjo, “Remote Presence: Long Endurance Robotic Systems for Routine Inspection of Offshore Subsea Oil & Gas Installations and Marine Renewable Energy Devices,” in Oceans - St. John’s, 2014, pp. 1–9.
16. Xilinx, “7 Series FPGAs Memory Resources,” 2014. [Online]. Available: http://www.xilinx.com/support/documentation/user_guides/ug473_7Series_Memory_Resources.pdf. [Accessed: 16-Apr-2018].
17. K. Xintong, “Understanding AES Mix-Columns Transformation Calculation.” [Online]. Available: http://www.angelfire.com/biz7/atleast/mix_columns.pdf. [Accessed: 16-Apr-2018].
18. National Instruments, “cRIO-9034.” [Online]. Available: http://www.ni.com/en-ie/support/model.crio-9034.html
 
 
 
 
 
 
