
    A fully pipelined memoryless 17.8 Gbps AES-128 encryptor

    A fully pipelined implementation of the Advanced Encryption Standard encryption algorithm with 128-bit input and key length (AES-128) was developed on Xilinx Virtex-E and Virtex-II devices. The design, called SIG-AES-E, implements the S-boxes combinatorially and thus requires no internal memory. It is concluded that SIG-AES-E is faster than other published FPGA-based implementations of the AES-128 encryption algorithm.
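    As background for the combinatorial S-box claim, the sketch below (my own illustration, not code from the paper) derives the AES S-box from GF(2^8) arithmetic; because the mapping is a fixed Boolean function of eight input bits, an FPGA can realize it as pure combinational logic instead of a block-RAM lookup.

        # Illustrative only: AES S-box = multiplicative inverse in GF(2^8) followed by an affine map.
        def gf_mul(a, b, mod=0x11B):
            """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
            r = 0
            while b:
                if b & 1:
                    r ^= a
                a <<= 1
                if a & 0x100:
                    a ^= mod
                b >>= 1
            return r

        def gf_inv(a):
            """Multiplicative inverse in GF(2^8); 0 maps to 0 by AES convention."""
            if a == 0:
                return 0
            for x in range(1, 256):      # brute force is fine for a software model
                if gf_mul(a, x) == 1:
                    return x

        def sbox(byte):
            """SubBytes: field inversion followed by the affine transform with constant 0x63."""
            x = gf_inv(byte)
            y = 0
            for i in range(8):
                bit = ((x >> i) ^ (x >> ((i + 4) % 8)) ^ (x >> ((i + 5) % 8)) ^
                       (x >> ((i + 6) % 8)) ^ (x >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
                y |= bit << i
            return y

        assert sbox(0x00) == 0x63 and sbox(0x53) == 0xED   # FIPS-197 example value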

    High throughput FPGA Implementation of Advanced Encryption Standard Algorithm

    The growth of computer systems and electronic communications and transactions means that effective security and reliability of data communication, processing, and storage are more important than ever. In this context, cryptography is a high-priority research area in engineering. The Advanced Encryption Standard (AES) is a symmetric-key cryptographic algorithm for protecting sensitive information and is one of the most widely used and trusted algorithms today. High throughput, low power, and compactness have always been topics of interest when implementing this type of algorithm. In this paper, we are interested in the development of a high-throughput architecture and implementation of the AES algorithm using the least amount of hardware possible. We adopt a pipelined approach in order to shorten the critical path and achieve competitive performance in terms of throughput and efficiency. This approach is applied in particular to the AES S-box substitution, a complex transformation and the key point for improving the architecture's performance. Considering the high delay and hardware cost of this transformation, we propose a 7-stage pipelined S-box based on composite field arithmetic to address the critical path and the occupied area. In addition, an efficient AES key expansion architecture suitable for our proposed pipelined AES is presented. The implementation was carried out on Virtex-5 XC5VLX85 and Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) devices using Xilinx ISE v14.7. Our AES design achieves a data encryption rate of 108.69 Gbps and uses only 6361 slices. Compared to the best previous work, this implementation improves data throughput by 5.6% and reduces the slices used to 77.69%.
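    To make the composite-field idea concrete, the sketch below (my own toy model with illustrative field constants, not the paper's 7-stage design) performs GF(2^8) inversion through the subfield GF((2^4)^2): the inversion reduces to a few GF(2^4) multiplications plus one small inversion, and each of those operations can become a short pipeline stage.

        # Toy composite-field inversion.  GF(2^4) uses x^4 + x + 1; the extension uses
        # y^2 + y + L with L = 0x8, which is irreducible for this choice.  Real designs
        # pick their own basis and constants.
        L = 0x8

        def mul4(a, b):
            """Multiply in GF(2^4) modulo x^4 + x + 1."""
            r = 0
            while b:
                if b & 1:
                    r ^= a
                a <<= 1
                if a & 0x10:
                    a ^= 0x13
                b >>= 1
            return r

        def inv4(a):
            """Inverse in GF(2^4) via a^14 = a^-1 (small enough for a tiny table in hardware)."""
            if a == 0:
                return 0
            r = 1
            for _ in range(14):
                r = mul4(r, a)
            return r

        def inv_composite(ah, al):
            """Inverse of ah*y + al: one GF(2^4) inversion plus a few multiplications."""
            d = mul4(mul4(ah, ah), L) ^ mul4(ah, al) ^ mul4(al, al)
            d_inv = inv4(d)
            return mul4(ah, d_inv), mul4(ah ^ al, d_inv)

        def mul_composite(ah, al, bh, bl):
            """Multiplication in GF((2^4)^2), used here only to self-check the inverse."""
            hi = mul4(ah, bl) ^ mul4(al, bh) ^ mul4(ah, bh)      # uses y^2 = y + L
            lo = mul4(al, bl) ^ mul4(mul4(ah, bh), L)
            return hi, lo

        for a in range(1, 256):
            ah, al = a >> 4, a & 0xF
            assert mul_composite(ah, al, *inv_composite(ah, al)) == (0, 1)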

    FPGA based technical solutions for high throughput data processing and encryption for 5G communication: A review

    Field programmable gate array (FPGA) devices are ideal solutions for high-speed processing applications, given their flexibility, parallel processing capability, and power efficiency. In this review paper, an overview of the key applications of FPGA-based platforms in 5G networks/systems is first presented, exploiting the improved performance offered by such devices. FPGA-based implementations of cloud radio access network (C-RAN) accelerators, network function virtualization (NFV)-based network slicers, cognitive radio systems, and multiple-input multiple-output (MIMO) channel characterizers are the main applications considered that can benefit from the high processing rate, power efficiency, and flexibility of FPGAs. Furthermore, implementations of encryption/decryption algorithms on the Xilinx Zynq UltraScale+ MPSoC ZCU102 FPGA platform are discussed, and we then introduce our high-speed and lightweight implementation of the well-known AES-128 algorithm, developed on the same FPGA platform, and compare it with similar solutions already published in the literature. The comparison results indicate that our AES-128 implementation enables efficient hardware usage for a given data rate (up to 28.16 Gbit/s), resulting in higher efficiency (8.64 Mbps/slice) than the other solutions considered. Finally, applications of the ZCU102 platform for high-speed processing are explored, such as image and signal processing, visual recognition, and hardware resource management.
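    The efficiency figure quoted above is throughput normalized by occupied slices; as a quick check (my own arithmetic, not additional data from the review), the two quoted numbers imply a slice count in the low thousands:

        # Efficiency (Mbps/slice) = throughput (Mbps) / slices, so the quoted figures imply:
        throughput_mbps = 28.16 * 1000          # 28.16 Gbit/s
        efficiency = 8.64                       # Mbps per slice
        implied_slices = throughput_mbps / efficiency
        print(round(implied_slices))            # about 3259 slices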

    A High-Throughput Hardware Implementation of NAT Traversal For IPSEC VPN

    In this paper, we present a high-throughput FPGA implementation of an IPsec core. The core supports both NAT and non-NAT modes and can be used in high-speed security gateway devices. Although IPsec ESP is computationally intensive due to its cryptographic processing, our implementation shows that it can achieve high throughput and low latency. The system is realized on the Xilinx Zynq XC7Z045 and was verified and tested in practice. Results show that the design gives a peak throughput of 5.721 Gbps for IPsec ESP tunnel mode with NAT and 7.753 Gbps without NAT, using a single AES encryption core. We also compare the performance of the core when running in other modes of encryption.
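    For context on what NAT traversal adds around the ESP path, the sketch below (my own illustration, not the paper's hardware) shows the RFC 3948 UDP encapsulation: the encrypted ESP packet is simply prefixed with a UDP header on port 4500 so NAT devices can rewrite the outer addresses without touching the protected payload.

        import struct

        def udp_encap_esp(spi, seq, encrypted_payload, src_port=4500, dst_port=4500):
            """Wrap an ESP packet (SPI + sequence number + ciphertext) in a UDP header."""
            esp = struct.pack("!II", spi, seq) + encrypted_payload
            length = 8 + len(esp)                                   # UDP header is 8 bytes
            udp_header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # zero checksum per RFC 3948
            return udp_header + esp

        pkt = udp_encap_esp(spi=0x1001, seq=1, encrypted_payload=b"\x00" * 64)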

    A Hardware Perspective on the ChaCha Ciphers: Scalable Chacha8/12/20 Implementations Ranging from 476 Slices to Bitrates of 175 Gbit/s

    AES (Advanced Encryption Standard) accelerators are commonly used in high-throughput applications, but they have notable resource requirements. We investigate replacing the AES cipher with the ChaCha ciphers and propose the first ChaCha FPGA implementations optimized for data throughput. We compare implementations of three different system architectures and analyze which aspects dominate their performance. Our experimental results indicate that a bandwidth of 175 Gbit/s can be reached with as few as 2982 slices, whereas comparable state-of-the-art AES accelerators require 10 times as many slices. Taking advantage of the flexibility inherent in the ChaCha cipher, we also demonstrate how our implementation scales to even higher throughputs or lower resource usage (down to 476 slices), benefiting applications which previously could not employ cryptography because of resource limitations.
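    The resource advantage comes from ChaCha's add-rotate-XOR structure; the sketch below (a plain software model following RFC 8439, not the paper's FPGA design) shows the quarter-round, the cipher's only primitive, which maps onto FPGA carry chains and LUTs far more cheaply than AES's S-boxes.

        MASK = 0xFFFFFFFF

        def rotl(x, n):
            return ((x << n) | (x >> (32 - n))) & MASK

        def quarter_round(a, b, c, d):
            """ChaCha quarter-round: 32-bit additions, XORs, and fixed rotations only."""
            a = (a + b) & MASK; d = rotl(d ^ a, 16)
            c = (c + d) & MASK; b = rotl(b ^ c, 12)
            a = (a + b) & MASK; d = rotl(d ^ a, 8)
            c = (c + d) & MASK; b = rotl(b ^ c, 7)
            return a, b, c, d

        # Test vector from RFC 8439, section 2.1.1
        assert quarter_round(0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567) == \
               (0xea2a92f4, 0xcb1cf8ce, 0x4581472e, 0x5881c4bb)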

    Hardware Design and Implementation of Role-Based Cryptography

    Traditional public key cryptographic methods provide access control to sensitive data by allowing the message sender to grant a single recipient permission to read the encrypted message. The Need2Know® system (N2K) improves upon these methods by providing role-based access control. N2K defines data access permissions similar to those of a multi-user file system, but N2K strictly enforces access through cryptographic standards. Since custom hardware can efficiently implement many cryptographic algorithms and can provide additional security, N2K stands to benefit greatly from a hardware implementation. To this end, the main N2K algorithm, the Key Protection Module (KPM), is being specified in VHDL. The design is being built and tested incrementally: this first phase implements the core control logic of the KPM without integrating its cryptographic sub-modules. Both RTL simulation and formal verification are used to test the design. This is the first N2K implementation in hardware, and it promises to provide an accelerated and secured alternative to the software-based system. A hardware implementation is a necessary step toward highly secure and flexible deployments of the N2K system.

    Area and Energy Optimizations in ASIC Implementations of AES and PRESENT Block Ciphers

    When small, modern-day devices surface with neoteric features and promise benefits like streamlined business processes, cashierless stores, and autonomous driving, they are all too often accompanied by security risks due to a weak or absent security component. In particular, the lack of data privacy protection is a common concern that can be remedied by implementing encryption. This ensures that data remains undisclosed to unauthorized parties. While having a cryptographic module is often a goal, it is sometimes forfeited because a device's resources do not allow for conventional cryptographic solutions. Thus, smaller, lower-energy security modules are in demand. Implementing a cipher in hardware as an application-specific integrated circuit (ASIC) will usually achieve better efficiency than alternatives like FPGAs or software, and can help towards goals such as extended battery life and a smaller area footprint. The Advanced Encryption Standard (AES) is a block cipher established by the National Institute of Standards and Technology (NIST) in 2001. It has since become the most widely adopted block cipher and is applied in a variety of applications ranging from smartphones to passive RFID tags to high-performance microprocessors. PRESENT, published in 2007, is a smaller lightweight block cipher designed for low-power applications. In this study, low-area and low-energy optimizations in ASICs are addressed for AES and PRESENT. In the low-area work, three existing AES encryption cores are implemented, analyzed, and benchmarked using a common fabrication technology (STM 65 nm). The analysis includes an examination of various implementations of internal AES operations and their suitability for different architectural choices. Using our taxonomy of design choices, we designed Quark-AES, a novel 8-bit AES architecture. At 1960 GE, it features a 13% improvement in area and a 9% improvement in throughput/area² over the prior smallest design. To illustrate the extent of the variations due to the use of different ASIC libraries, Quark-AES and the three analyzed designs are also synthesized using three additional technologies. Even for the same transistor size, different ASIC libraries produce significantly different area results. To accommodate a variety of applications that seek different levels of tradeoffs in area and throughput, we extend all four designs to 16-bit and 32-bit datawidths. In the low-energy work, round unrolling and glitch filtering are applied together to achieve energy savings. Round unrolling, which applies multiple block cipher rounds in a combinational path, reduces the energy due to registers but increases the glitching energy. Glitch filtering complements round unrolling by reducing the amount of glitches and their associated energy consumption. For unrolled designs of PRESENT and AES, two glitch filtering schemes are assessed. One method uses AND-gates in between combinational rounds while the other uses latches. Both methods work by allowing the propagation of signals only after they have stabilized. The experiments assess how energy consumption changes with respect to the degree of unrolling, the glitch filtering scheme, the degree of pipelining, the spacing between glitch filters, and the location of glitch filters when only a limited number of them can be applied due to area constraints. While in PRESENT the optimal configuration depends on all of these variables, in a larger cipher such as AES the latch-based method consistently offers the most energy savings.
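    As a reference for the round-unrolling discussion, the sketch below (a functional software model of mine, not the thesis's netlist) shows one PRESENT round and a two-round unrolled step: in hardware the unrolled version is a single long combinational path between registers, which saves register energy at the cost of extra glitch activity that the AND-gate or latch filters are meant to suppress.

        SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD, 0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

        def present_round(state, round_key):
            """One PRESENT round on a 64-bit state: addRoundKey, sBoxLayer, pLayer."""
            state ^= round_key
            s = 0
            for nibble in range(16):
                s |= SBOX[(state >> (4 * nibble)) & 0xF] << (4 * nibble)
            p = 0
            for i in range(64):                      # bit i moves to 16*i mod 63 (bit 63 stays)
                dst = 63 if i == 63 else (16 * i) % 63
                p |= ((s >> i) & 1) << dst
            return p

        def unrolled_two_rounds(state, k1, k2):
            """Two rounds evaluated back-to-back before the next register stage."""
            return present_round(present_round(state, k1), k2)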

    Improving Hardware Implementation of Cryptographic AES Algorithm and the Block Cipher Modes of Operation

    With ever-increasing Internet traffic, more business and financial transactions are being conducted online. This is even more so during the COVID-19 pandemic, when traditional activities such as face-to-face education have moved online, requiring huge amounts of data to be exchanged over the Internet. The increase in the volume of data sent over the Internet has also increased security vulnerabilities, challenging the confidentiality of the data being exchanged. Given this sheer volume, data must be encrypted effectively, and encryption/decryption functions must also operate at higher speeds to maintain the confidentiality of sensitive data. In this thesis, our goal is to enhance the hardware speed of the encryption process of the standard AES scheme and its variants AES-128, AES-192, AES-256, and the new AES-512, and to implement these functions on an FPGA. We also consider the FPGA implementation of the different modes of AES operation. By employing parallelism and pipelining, we attempt to speed up various computational components of the AES implementations using Quartus II on Intel FPGAs. This approach shows improvements in response speed, data throughput, and latency.
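    The payoff from parallelism and pipelining depends heavily on the mode of operation; the sketch below (my own toy model, with a stand-in block function rather than real AES) contrasts CTR mode, whose blocks are independent and can all be in flight in a pipelined core, with CBC encryption, whose chaining creates a serial dependency that a pipeline cannot hide.

        import hashlib

        def toy_block_cipher(key: bytes, block: bytes) -> bytes:
            """Placeholder for the AES core; only the data dependencies matter here."""
            return hashlib.sha256(key + block).digest()[:16]

        def xor(a, b):
            return bytes(x ^ y for x, y in zip(a, b))

        def ctr_encrypt(key, nonce, plaintext):
            blocks = [plaintext[i:i + 16] for i in range(0, len(plaintext), 16)]
            out = b""
            for ctr, block in enumerate(blocks):
                # Each keystream block depends only on (nonce, ctr): fully parallelizable.
                keystream = toy_block_cipher(key, nonce + ctr.to_bytes(8, "big"))
                out += xor(block, keystream)
            return out

        def cbc_encrypt(key, iv, plaintext):
            out, prev = b"", iv
            for i in range(0, len(plaintext), 16):
                # Each block needs the previous ciphertext: a serial dependency.
                prev = toy_block_cipher(key, xor(plaintext[i:i + 16], prev))
                out += prev
            return out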

    REAL-TIME ADAPTIVE PULSE COMPRESSION ON RECONFIGURABLE, SYSTEM-ON-CHIP (SOC) PLATFORMS

    New radar applications need to perform complex algorithms and process large quantities of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low-power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, a hardware implementation of adaptive pulse compression algorithms for real-time transceiver optimization is presented, based on a system-on-chip architecture for reconfigurable hardware devices. This study also evaluates the performance of dedicated coprocessors as hardware accelerator units to speed up and improve computing-intensive tasks such as matrix multiplication and matrix inversion, which are essential for solving the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through high-performance buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms.
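    To indicate what the matrix coprocessors are accelerating, the sketch below (a generic adaptive-filter example of mine, not the specific algorithm from this work) forms a sample covariance matrix from received snapshots and solves it against the transmit waveform; this covariance solve is the expensive step that dedicated matrix-multiplication and inversion units target.

        import numpy as np

        def adaptive_filter(snapshots, s, loading=1e-3):
            """snapshots: (num_snapshots, N) received data; s: length-N transmit waveform."""
            R = snapshots.conj().T @ snapshots / snapshots.shape[0]       # sample covariance (N x N)
            R += loading * np.trace(R).real / len(s) * np.eye(len(s))     # diagonal loading for stability
            w = np.linalg.solve(R, s)                                     # the costly step in hardware
            return w / (s.conj() @ w)                                     # unit gain on the waveform

        rng = np.random.default_rng(0)
        N = 32
        s = np.exp(1j * np.pi * np.linspace(0, 1, N) ** 2)                # toy chirp waveform
        data = rng.standard_normal((256, N)) + 1j * rng.standard_normal((256, N))
        w = adaptive_filter(data, s)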