112 research outputs found

Improving Hardware Implementation of Cryptographic AES Algorithm and the Block Cipher Modes of Operation

Author: Cheng Chu-Wen
Publication venue: ScholarWorks @ UTRGV
Publication date: 01/08/2020

As Internet traffic keeps growing, more business and financial transactions are conducted online. This is even more true during the COVID-19 pandemic, when traditionally face-to-face activities such as education have moved online, requiring huge amounts of data to be exchanged over the Internet. The increase in the volume of data sent over the Internet has also increased security vulnerabilities, in particular threats to the confidentiality of that data. Given the sheer volume, all data will need to be encrypted effectively, and encryption and decryption functions must run at higher speed to maintain the confidentiality of sensitive data. In this thesis, our goal is to enhance the hardware speed of the encryption process of the standard AES scheme and its four variants (AES-128, AES-192, AES-256 and a new AES-512) and to implement these functions on an FPGA. We also consider the FPGA implementation of different AES modes of operation. By employing parallelism and pipelining, we attempt to speed up various computational components of the AES implementations, using Quartus II to target Intel's FPGA. This approach shows improvement in response speed, data throughput and latency.

Scholarworks@UTRGV Univ. of Texas RioGrande Valley

CUDA capable GPU as an efficient co-processor

Author: Ortega Calle Julián
Publication venue: Escuela de Ingeniería. Departamento de Ingeniería de Sistemas
Publication date: 04/06/2014

Repositorio Institucional Universidad EAFIT

High throughput FPGA Implementation of Advanced Encryption Standard Algorithm

Author: Bri Seddik
Oukili Soufiane
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/03/2017

The growth of computer systems and electronic communications and transactions has meant that the need for effective security and reliability of data communication, processing and storage is more important than ever. In this context, cryptography is a high-priority research area in engineering. The Advanced Encryption Standard (AES) is a symmetric-key cryptographic algorithm for protecting sensitive information and is one of the most secure and widely used algorithms today. High throughput, low power and compactness have always been topics of interest when implementing this type of algorithm. In this paper, we are interested in the development of a high-throughput architecture and implementation of the AES algorithm, using the least amount of hardware possible. We have adopted a pipeline approach in order to reduce the critical path and achieve competitive performance in terms of throughput and efficiency. This approach is effectively tested on the AES S-box substitution. The latter is a complex transformation and the key point for improving architecture performance. Considering the high delay and hardware required for this transformation, we propose a 7-stage pipelined S-box using composite field arithmetic in order to deal with the critical path and the occupied area resources. In addition, an efficient AES key expansion architecture suitable for our proposed pipelined AES is presented. The implementation was successfully carried out on Virtex-5 XC5VLX85 and Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) devices using Xilinx ISE v14.7. Our AES design achieved a data encryption rate of 108.69 Gbps and used only 6361 slices. Compared to the best previous work, this implementation improves data throughput by 5.6% and reduces the used slices to 77.69%.
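As a rough sanity check on the reported rate, the throughput of a fully pipelined block cipher can be related to its clock frequency. The sketch below assumes the design completes one 128-bit AES block per clock cycle once the pipeline is full; the abstract does not state this, and the function names are illustrative only.

```python
AES_BLOCK_BITS = 128  # the AES block size is fixed at 128 bits


def pipelined_throughput_gbps(clock_mhz, blocks_per_cycle=1.0):
    """Throughput in Gbps of a pipeline that completes
    `blocks_per_cycle` AES blocks per clock cycle (an assumption)."""
    return AES_BLOCK_BITS * blocks_per_cycle * clock_mhz * 1e6 / 1e9


def implied_clock_mhz(throughput_gbps, blocks_per_cycle=1.0):
    """Clock frequency in MHz that would yield the given throughput
    under the same one-block-per-cycle assumption."""
    return throughput_gbps * 1e9 / (AES_BLOCK_BITS * blocks_per_cycle) / 1e6


# 108.69 Gbps is the rate reported in the abstract.
clock = implied_clock_mhz(108.69)
print(f"implied clock: {clock:.1f} MHz")
```

Under that assumption the reported 108.69 Gbps corresponds to a clock of roughly 849 MHz; a design that produces blocks at a different rate would imply a different frequency.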

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Parallel arithmetic encryption for high-bandwidth communications on multicore/GPGPU platforms.

Author: Alali Mohamed
Jacquin Ludovic
Roca Vincent
Roch Jean-Louis
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

In this work we study the feasibility of high-bandwidth, secure communications on generic machines equipped with the latest CPUs and General-Purpose Graphical Processing Units (GPGPU). We first analyze the suitability of current Nehalem CPU architectures. We show in particular that high-performance CPUs are not sufficient by themselves to reach our performance objectives, and that encryption is the main bottleneck. Therefore we also consider the use of GPGPU, and more particularly we measure the bandwidth of AES encryption on CUDA. These tests lead us to the conclusion that finding an appropriate solution is extremely difficult.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Cryptographic algorithm acceleration using CUDA enabled GPUs in typical system configurations

Author: Bobrov Maksim
Publication venue: RIT Scholar Works
Publication date: 01/08/2010

Encrypting data is becoming more and more necessary. As the size of datasets continues to grow, the speed of encryption must increase to keep up or it will become a bottleneck. CUDA GPUs have been shown to offer performance improvements versus conventional CPUs for some data-intensive problems. This thesis evaluates the applicability of CUDA GPUs in accelerating the execution of cryptographic algorithms, which are increasingly used for growing amounts of data and thus will require significantly faster encryption and hashing throughput. Specifically, the CUDA environment was used to implement and experiment with three distinct cryptographic algorithms -- AES, SHA-2, and Keccak -- in order to show the applicability for various cryptographic algorithm classes. They were implemented in a system that emulates the conditions present in a real-world environment, and the effects of offloading these tasks from the CPU to the GPU were assessed. Speedups of up to 2.6x relative to the CPU were seen for single-kernel AES, but SHA-2 and Keccak did not perform as well on the GPU as on the CPU. Multi-kernel AES saw speedups over single-kernel AES of up to 1.4x, 1.65x, and 1.8x for two, three, and four kernels, respectively. This translates to speedups between 3.6x and 4.7x over CPU implementations of AES. Introducing a CPU load had a minimal effect on throughput, whereas a GPU load was seen to decrease throughput by as much as 4%. Overall, CUDA GPUs appear to have potential for improving encryption throughput if a parallelizable algorithm is selected.
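The 3.6x to 4.7x range quoted in this abstract follows from multiplying the single-kernel speedup over the CPU by the multi-kernel gain over single-kernel AES. A minimal sketch of that arithmetic, using only the figures from the abstract (the variable and function names are illustrative, and the inputs are "up to" values, so the products are upper bounds):

```python
# Figures taken from the abstract; names are illustrative.
SINGLE_KERNEL_SPEEDUP = 2.6                      # single-kernel AES vs. CPU
MULTI_KERNEL_GAIN = {2: 1.4, 3: 1.65, 4: 1.8}    # vs. single-kernel AES


def combined_speedups(single, gains):
    """Multiply the two relative speedups to get the speedup over the CPU."""
    return {kernels: round(single * gain, 2) for kernels, gain in gains.items()}


overall = combined_speedups(SINGLE_KERNEL_SPEEDUP, MULTI_KERNEL_GAIN)
for kernels, speedup in overall.items():
    print(f"{kernels} kernels: {speedup:.2f}x over CPU")
```

The two-kernel and four-kernel products (about 3.64x and 4.68x) match the endpoints of the range stated in the abstract.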

RIT Scholar Works

High-performance FPGA architecture for data streams processing on example of IPsec gateway

Author: Korona Mateusz
Rawski Mariusz
Skowron Krzysztof
Trzepiński Mateusz
Publication venue: Electronics and Telecommunications Committee
Publication date: 20/07/2018

In the modern digital world, there is strong demand for efficient methods of processing data streams. One application area is cybersecurity: IPsec is a suite of protocols that adds security to communications at the IP level. This paper presents the principles of a high-performance FPGA architecture for data stream processing, using an IPsec gateway implementation as an example. The efficiency of the proposed solution allows it to be used in networks with data rates of several Gbit/s.

International Journal of Electronics and Telecommunications (Warsaw University of Technology)

GPU-based Private Information Retrieval for On-Device Machine Learning Inference

Author: Brooks David
Gupta Udit
Johnson Jeff
Lai Liangzhen
Lam Maximilian
Lee Hsien-Hsin S.
Leontiadis Ilias
Li Yang
Maeng Kiwan
Reddi Vijay Janapa
Rhu Minsoo
Suh G. Edward
Wei Gu-Yeon
Xiong Wenjie
Publication date: 25/09/2023

On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the order of 1-10 GBs of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to directly use for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than 20x over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over 5x additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to 100,000 queries per second -- a >100x throughput improvement over a CPU-based baseline -- while maintaining model accuracy.
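To illustrate what private information retrieval means in this setting, here is a toy two-server, XOR-based PIR scheme over a small table. It is a textbook construction shown only for intuition, not the scheme or the GPU implementation used in the paper, and it assumes two servers that hold identical copies of the table and do not collude. All names are illustrative.

```python
import secrets


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def server_answer(table, selection_bits):
    """Server side: XOR together every row whose selection bit is set."""
    acc = bytes(len(table[0]))
    for row, bit in zip(table, selection_bits):
        if bit:
            acc = xor_bytes(acc, row)
    return acc


def make_queries(num_rows, index):
    """Client side: two bit vectors that differ only at `index`.
    Each vector on its own is uniformly random, so neither server
    learns which row is wanted."""
    q1 = [secrets.randbelow(2) for _ in range(num_rows)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2


def retrieve(table, index):
    """Query both (simulated) servers and combine their answers."""
    q1, q2 = make_queries(len(table), index)
    return xor_bytes(server_answer(table, q1), server_answer(table, q2))


# Toy "embedding table": 8 rows of 4 bytes each.
table = [bytes([i] * 4) for i in range(8)]
print(retrieve(table, 5))
```

Every row selected by both queries cancels out under XOR, so only the requested row survives. Each server still scans the whole table per query, which hints at why PIR is computationally heavy for large tables.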

arXiv.org e-Print Archive

Efficient Design Strategies Based on the AES Round Function

Author: Ivica Nikolic
Jérémy Jean
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 17/03/2016

We show several constructions based on the AES round function that can be used as building blocks for MACs and authenticated encryption schemes. They are found by a search of the space of all secure constructions based on an efficient design strategy that has been shown to be one of the most optimal among all those considered. We implement the constructions on the latest Intel processors. Our benchmarks show that on Intel Skylake the smallest construction runs at 0.188 c/B, while the fastest runs at only 0.125 c/B, i.e. five times faster than AES-128.
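Cycles per byte (c/B) can be converted into a throughput figure once a clock frequency is fixed. The sketch below does that conversion and derives the AES-128 baseline implied by the "five times faster" statement. The 3.0 GHz clock is a hypothetical value chosen for illustration, not a figure from the paper, and the function names are illustrative.

```python
def throughput_gb_per_s(cycles_per_byte, clock_ghz):
    """Bytes processed per second (in GB/s) at a given clock frequency."""
    return clock_ghz / cycles_per_byte


def implied_baseline_cpb(fast_cpb, speedup):
    """Cost of the baseline if `fast_cpb` is `speedup` times faster."""
    return fast_cpb * speedup


ASSUMED_CLOCK_GHZ = 3.0  # hypothetical clock, not taken from the paper

fastest = throughput_gb_per_s(0.125, ASSUMED_CLOCK_GHZ)
smallest = throughput_gb_per_s(0.188, ASSUMED_CLOCK_GHZ)
aes128_cpb = implied_baseline_cpb(0.125, 5)

print(f"fastest construction:  {fastest:.1f} GB/s")
print(f"smallest construction: {smallest:.1f} GB/s")
print(f"implied AES-128 cost:  {aes128_cpb} c/B")
```

At the assumed clock, 0.125 c/B corresponds to 24 GB/s, and the five-fold ratio implies an AES-128 cost of 0.625 c/B.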

Cryptology ePrint Archive