Search CORE

408 research outputs found

Recommended from our members

Cryptoraptor : high throughput reconfigurable cryptographic processor for symmetric key encryption and cryptographic hash functions

Author: Sayilar Gokhan
Publication venue
Publication date: 03/02/2015
Field of study

textIn cryptographic processor design, the selection of functional primitives and connection structures between these primitives are extremely crucial to maximize throughput and flexibility. Hence, detailed analysis on the specifications and requirements of existing crypto-systems plays a crucial role in cryptographic processor design. This thesis provides the most comprehensive literature review that we are aware of on the widest range of existing cryptographic algorithms, their specifications, requirements, and hardware structures. In the light of this analysis, it also describes a high performance, low power, and highly flexible cryptographic processor, Cryptoraptor, that is designed to support both today's and tomorrow's encryption standards. To the best of our knowledge, the proposed cryptographic processor supports the widest range of cryptographic algorithms compared to other solutions in the literature and is the only crypto-specific processor targeting the future standards as well. Unlike previous work, we aim for maximum throughput for all known encryption standards, and to support future standards as well. Our 1GHz design achieves a peak throughput of 128Gbps for AES-128 which is competitive with ASIC designs and has 25X and 160X higher throughput per area than CPU and GPU solutions, respectively.Electrical and Computer Engineerin

Texas ScholarWorks

Baseband analog front-end and digital back-end for reconfigurable multi-standard terminals

Author: Baschirotto
Campi
Castello
Cesura
Guerrieri
Lavagno Luciano
Lodi
Malcovati
Toma
Publication venue
Publication date: 01/01/2006
Field of study

Multimedia applications are driving wireless network operators to add high-speed data services such as Edge (E-GPRS), WCDMA (UMTS) and WLAN (IEEE 802.11a,b,g) to the existing GSM network. This creates the need for multi-mode cellular handsets that support a wide range of communication standards, each with a different RF frequency, signal bandwidth, modulation scheme etc. This in turn generates several design challenges for the analog and digital building blocks of the physical layer. In addition to the above-mentioned protocols, mobile devices often include Bluetooth, GPS, FM-radio and TV services that can work concurrently with data and voice communication. Multi-mode, multi-band, and multi-standard mobile terminals must satisfy all these different requirements. Sharing and/or switching transceiver building blocks in these handsets is mandatory in order to extend battery life and/or reduce cost. Only adaptive circuits that are able to reconfigure themselves within the handover time can meet the design requirements of a single receiver or transmitter covering all the different standards while ensuring seamless inter-interoperability. This paper presents analog and digital base-band circuits that are able to support GSM (with Edge), WCDMA (UMTS), WLAN and Bluetooth using reconfigurable building blocks. The blocks can trade off power consumption for performance on the fly, depending on the standard to be supported and the required QoS (Quality of Service) leve

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Hardware Design and Implementation of Role-Based Cryptography

Author: Fields Scott Edward
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2005
Field of study

Traditional public key cryptographic methods provide access control to sensitive data by allowing the message sender to grant a single recipient permission to read the encrypted message. The Need2Know® system (N2K) improves upon these methods by providing role-based access control. N2K defines data access permissions similar to those of a multi-user file system, but N2K strictly enforces access through cryptographic standards. Since custom hardware can efficiently implement many cryptographic algorithms and can provide additional security, N2K stands to benefit greatly from a hardware implementation. To this end, the main N2K algorithm, the Key Protection Module (KPM), is being specified in VHDL. The design is being built and tested incrementally: this first phase implements the core control logic of the KPM without integrating its cryptographic sub-modules. Both RTL simulation and formal verification are used to test the design. This is the first N2K implementation in hardware, and it promises to provide an accelerated and secured alternative to the software-based system. A hardware implementation is a necessary step toward highly secure and flexible deployments of the N2K system

University of Tennessee, Knoxville: Trace

Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

Author: saqib fareena
Publication venue: UNM Digital Repository
Publication date: 12/09/2014
Field of study

Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations, and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasingly sensitivity to design-context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood\u27 interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; and its architecture measures the path delays more accurately. Design for manufacture-ability (DFM) analysis is done on the on 90 nm ASIC chips and 28nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM\u27s 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Also experimental data has been analyzed for path delay variations in short vs long paths. FPGA results on within-die variation and die-to-die variations on Advanced Encryption System (AES) using single pipelined stage are also presented. Other analysis that have been performed on the calibrated path delays are Flip Flop propagation delays for both rising and falling edge (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths where magnitudes reduced to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increase as temperature and voltage are changed to increase performance. The high level of within-die delay variations are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of \u27secrets\u27 such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die-variations as a means of generating random bit strings for these types of applications, including hardware security and trust. Zynq FPGAs Die-to-Die and within-die variation study shows that on average there is 5% of within-Die variation and the range of die-to-Die variation can go upto 3ns. The die-to-Die variations can be explored in much further detail to study the variations spatial dependance. Additionally, I also carried out research in the area data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed-up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis parallel binary decision tree classification for meeting up with the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data-sets have created abundant opportunities for algorithmic and architectural developments, and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have since been implemented in software programs. Though, the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands in the ever growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware acceleration of pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine is processing the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput, by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.\u2

High-performance AES-128 algorithm implementation by FPGA-based SoC for 5G communications

Author
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 24/03/2021
Field of study

none4siIn this research work, a fast and lightweight AES-128 cypher based on the Xilinx ZCU102 FPGA board is presented, suitable for 5G communications. In particular, both encryption and decryption algorithms have been developed using a pipelined approach, so enabling the simultaneous processing of the rounds on multiple data packets at each clock cycle. Both the encryption and decryption systems support an operative frequency up to 220 MHz, reaching 28.16 Gbit/s maximum data throughput; besides, the encryption and decryption phases last both only ten clock periods. To guarantee the interoperability of the developed encryption/decryption system with the other sections of the 5G communication apparatus, synchronization and control signals have been integrated. The encryption system uses only 1631 CLBs, whereas the decryption one only 3464 CLBs, ascribable, mainly, to the Inverse Mix Columns step. The developed cypher shows higher efficiency (8.63 Mbps/slice) than similar solutions present in literature.openP.Visconti, R. Velazquez, S. Capoccia, R. de FazioVisconti, P.; Velazquez, R.; Capoccia, S.; de Fazio, R

Archivio Istituzionale della Ricerca- Università del Salento

High-performance AES-128 algorithm implementation by FPGA-based SoC for 5G communications

Author: Capoccia Stefano
de Fazio Roberto
Velazquez Ramiro
Visconti Paolo
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/10/2021
Field of study

In this research work, a fast and lightweight AES-128 cypher based on the Xilinx ZCU102 FPGA board is presented, suitable for 5G communications. In particular, both encryption and decryption algorithms have been developed using a pipelined approach, so enabling the simultaneous processing of the rounds on multiple data packets at each clock cycle. Both the encryption and decryption systems support an operative frequency up to 220 MHz, reaching 28.16 Gbit/s maximum data throughput; besides, the encryption and decryption phases last both only ten clock periods. To guarantee the interoperability of the developed encryption/decryption system with the other sections of the 5G communication apparatus, synchronization and control signals have been integrated. The encryption system uses only 1631 CLBs, whereas the decryption one only 3464 CLBs, ascribable, mainly, to the Inverse Mix Columns step. The developed cypher shows higher efficiency (8.63 Mbps/slice) than similar solutions present in literature

ZENODO

Institute of Advanced Engineering and Science

Area-throughput trade-offs for SHA-1 and SHA-256 hash functions’ pipelined designs

Author: Athanasiou George S.
Goutis Costas E.
Kelefouras Vasileios
Michail Harris E.
Stouraitis Thanos
Theodoridis George
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 10/12/2015
Field of study

High-throughput designs of hash functions are strongly demanded due to the need for security in every transmitted packet of worldwide e-transactions. Thus, optimized and non-optimized pipelined architectures have been proposed raising, however, important questions. Which is the optimum number of the pipeline stages? Is it worth to develop optimized designs or could the same results be achieved by increasing only the pipeline stages of the non-optimized designs? The paper answers the above questions studying extensively many pipelined architectures of SHA-1 and SHA-256 hashes, implemented in FPGAs, in terms of throughput/area (T/A) factor. Also, guides for developing efficient security schemes designs are provided. Read More: https://www.worldscientific.com/doi/abs/10.1142/S021812661650032

Ktisis

Sheffield Hallam University Research Archive

Plymouth Electronic Archive and Research Library

High-Performance VLSI Architectures for Lattice-Based Cryptography

Author: Tan Weihang
Publication venue: Clemson University Libraries
Publication date: 01/12/2022
Field of study

Lattice-based cryptography is a cryptographic primitive built upon the hard problems on point lattices. Cryptosystems relying on lattice-based cryptography have attracted huge attention in the last decade since they have post-quantum-resistant security and the remarkable construction of the algorithm. In particular, homomorphic encryption (HE) and post-quantum cryptography (PQC) are the two main applications of lattice-based cryptography. Meanwhile, the efficient hardware implementations for these advanced cryptography schemes are demanding to achieve a high-performance implementation. This dissertation aims to investigate the novel and high-performance very large-scale integration (VLSI) architectures for lattice-based cryptography, including the HE and PQC schemes. This dissertation first presents different architectures for the number-theoretic transform (NTT)-based polynomial multiplication, one of the crucial parts of the fundamental arithmetic for lattice-based HE and PQC schemes. Then a high-speed modular integer multiplier is proposed, particularly for lattice-based cryptography. In addition, a novel modular polynomial multiplier is presented to exploit the fast finite impulse response (FIR) filter architecture to reduce the computational complexity of the schoolbook modular polynomial multiplication for lattice-based PQC scheme. Afterward, an NTT and Chinese remainder theorem (CRT)-based high-speed modular polynomial multiplier is presented for HE schemes whose moduli are large integers

Clemson University: TigerPrints