25 research outputs found
Physical Time-Varying Transfer Functions as Generic Low-Overhead Power-SCA Countermeasure
Mathematically-secure cryptographic algorithms leak significant side channel
information through their power supplies when implemented on a physical
platform. These side channel leakages can be exploited by an attacker to
extract the secret key of an embedded device. The existing state-of-the-art
countermeasures mainly focus on the power balancing, gate-level masking, or
signal-to-noise (SNR) reduction using noise injection and signature
attenuation, all of which suffer either from the limitations of high power/area
overheads, performance degradation or are not synthesizable. In this article,
we propose a generic low-overhead digital-friendly power SCA countermeasure
utilizing physical Time-Varying Transfer Functions (TVTF) by randomly shuffling
distributed switched capacitors to significantly obfuscate the traces in the
time domain. System-level simulation results of the TVTF-AES implemented in
TSMC 65nm CMOS technology show > 4000x MTD improvement over the unprotected
implementation with nearly 1.25x power and 1.2x area overheads, and without any
performance degradation
High Efficiency Power Side-Channel Attack Immunity using Noise Injection in Attenuated Signature Domain
With the advancement of technology in the last few decades, leading to the
widespread availability of miniaturized sensors and internet-connected things
(IoT), security of electronic devices has become a top priority. Side-channel
attack (SCA) is one of the prominent methods to break the security of an
encryption system by exploiting the information leaked from the physical
devices. Correlational power attack (CPA) is an efficient power side-channel
attack technique, which analyses the correlation between the estimated and
measured supply current traces to extract the secret key. The existing
countermeasures to the power attacks are mainly based on reducing the SNR of
the leaked data, or introducing large overhead using techniques like power
balancing. This paper presents an attenuated signature AES (AS-AES), which
resists SCA with minimal noise current overhead. AS-AES uses a shunt
low-drop-out (LDO) regulator to suppress the AES current signature by 400x in
the supply current traces. The shunt LDO has been fabricated and validated in
130 nm CMOS technology. System-level implementation of the AS-AES along with
noise injection, shows that the system remains secure even after 50K
encryptions, with 10x reduction in power overhead compared to that of noise
addition alone.Comment: IEEE International Symposium on Hardware Oriented Security and Trust
(HOST) 201
Optimized Hardware Implementations of Lightweight Cryptography
Radio frequency identification (RFID) is a key technology for the Internet of Things era. One important advantage of RFID over barcodes is that line-of-sight is not required between readers and tags. Therefore, it is widely used to perform automatic and unique identification of objects in various applications, such as product tracking, supply chain management, and animal identification. Due to the vulnerabilities of wireless communication between RFID readers and tags, security and privacy issues are significant challenges. The most popular passive RFID protocol is the Electronic Product Code (EPC) standard. EPC tags have many constraints on power consumption, memory, and computing capability. The field of lightweight cryptography was created to provide secure, compact, and flexible algorithms and protocols suitable for applications where the traditional cryptographic primitives, such as AES, are impractical. In these lightweight algorithms, tradeoffs are made between security, area/power consumption, and throughput.
In this thesis, we focus on the hardware implementations and optimizations of lightweight cryptography and present the Simeck block cipher family, the WG-8 stream cipher, the Warbler pseudorandom number generator (PRNG), and the WGLCE cryptographic engine.
Simeck is a new family of lightweight block ciphers. Simeck takes advantage of the good components and design ideas of the Simon and Speck block ciphers and it has three instances with different block and key sizes. We provide an extensive exploration of different hardware architectures in ASICs and show that Simeck is smaller than Simon in terms of area and power consumption.
For the WG-8 stream cipher, we explore four different approaches for the WG transformation module, where one takes advantage of constant arrays and the other three benefit from the tower field constructions of the finite field \F_{2^8} and also efficient basis conversion matrices. The results in FPGA and ASICs show that the constant arrays based method is the best option. We also propose a hybrid design to improve the throughput with a little additional hardware.
For the Warbler PRNG, we present the first detailed and smallest hardware implementations and optimizations. The results in ASICs show that the area of Warbler with throughput of 1 bit per 5 clock cycles (1/5 bpc) is smaller than that of other PRNGs and is in fact smaller than that of most of the lightweight primitives. We also optimize and improve the throughput from 1/5 bpc to 1 bpc with a little additional area and power consumption.
Finally, we propose a cryptographic engine WGLCE for passive RFID systems. We merge the Warbler PRNG and WG-5 stream cipher together by reusing the finite state machine for both of them. Therefore, WGLCE can provide data confidentiality and generate pseudorandom numbers. After investigating the design rationales and hardware architectures, our results in ASICs show that WGLCE meets the constraints of passive RFID systems
Energy Efficient Hardware Design for Securing the Internet-of-Things
The Internet of Things (IoT) is a rapidly growing field that holds potential to transform our everyday lives by placing tiny devices and sensors everywhere. The ubiquity and scale of IoT devices require them to be extremely energy efficient. Given the physical exposure to malicious agents, security is a critical challenge within the constrained resources. This dissertation presents energy-efficient hardware designs for IoT security.
First, this dissertation presents a lightweight Advanced Encryption Standard (AES) accelerator design. By analyzing the algorithm, a novel method to manipulate two internal steps to eliminate storage registers and replace flip-flops with latches to save area is discovered. The proposed AES accelerator achieves state-of-art area and energy efficiency.
Second, the inflexibility and high Non-Recurring Engineering (NRE) costs of Application-Specific-Integrated-Circuits (ASICs) motivate a more flexible solution. This dissertation presents a reconfigurable cryptographic processor, called Recryptor, which achieves performance and energy improvements for a wide range of security algorithms across public key/secret key cryptography and hash functions. The proposed design employs circuit techniques in-memory and near-memory computing and is more resilient to power analysis attack. In addition, a simulator for in-memory computation is proposed. It is of high cost to design and evaluate new-architecture like in-memory computing in Register-transfer level (RTL). A C-based simulator is designed to enable fast design space exploration and large workload simulations. Elliptic curve arithmetic and Galois counter mode are evaluated in this work.
Lastly, an error resilient register circuit, called iRazor, is designed to tolerate unpredictable variations in manufacturing process operating temperature and voltage of VLSI systems. When integrated into an ARM processor, this adaptive approach outperforms competing industrial techniques such as frequency binning and canary circuits in performance and energy.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147546/1/zhyiqun_1.pd
FAC-V: an FPGA-Based AES Coprocessor for RISC-V
In the new Internet of Things (IoT) era, embedded Field-Programmable Gate Array (FPGA)
technology is enabling the deployment of custom-tailored embedded IoT solutions for handling
different application requirements and workloads. Combined with the open RISC-V Instruction Set
Architecture (ISA), the FPGA technology provides endless opportunities to create reconfigurable IoT
devices with different accelerators and coprocessors tightly and loosely coupled with the processor.
When connecting IoT devices to the Internet, secure communications and data exchange are major
concerns. However, adding security features requires extra capabilities from the already resource constrained IoT devices. This article presents the FAC-V coprocessor, which is an FPGA-based
solution for an RISC-V processor that can be deployed following two different coupling styles. FAC-V
implements in hardware the Advanced Encryption Standard (AES), one of the most widely used
cryptographic algorithms in IoT low-end devices, at the cost of few FPGA resources. The conducted
experiments demonstrate that FAC-V can achieve performance improvements of several orders of
magnitude when compared to the software-only AES implementation; e.g., encrypting a message of
16 bytes with AES-256 can reach a performance gain of around 8000Ć with an energy consumption
of 0.1 ĀµJ
Recommended from our members
Modeling attack resistant strong physical unclonable functions : design and applications
Physical unclonable functions (PUFs) have great promise as hardware authentication primitives due to their physical unclonability, high resistance to reverse engineering, and difficulty of mathematical cloning. Strong PUFs are distinguished by an exponentially large number of challenge-response pairs (CRPs), in contrast with weak PUFs that have a smaller CRP set. Because the adversary cannot create an enumeration clone by recording all CRPs even when in physical possession of a PUF, strong PUFs enable secure direct authentication, that does not require cryptography and are thus attractive to low-energy and IoT applications. The first contribution of this dissertation is the design of a strong silicon PUF resistant to machine learning (ML) attacks. For a strong PUF to be an effective security primitive, the CRPs need to be unpredictable: given a set of known CRPs, it should be difficult to predict the unobserved CRPs. Otherwise, an adversary can succeed in an attack based on building a model of the PUF. Early strong PUFs have shown vulnerability to ML based attacks. We take advantage of the strongly nonlinear I -- V property of MOSFETs operating in subthreshold region to introduce a highly unpredictable PUF. The PUF, termed the subthreshold current array PUF (SCA-PUF), consists of a pair of two-dimensional transistor arrays, a circuit stabilizing the PUF output, and a low-offset comparator. The proposed 65-bit SCA-PUF is fabricated in a 130nm process and allows 2ā¶āµ CRPs. It consumes 68nW and 11pJ/bit while exhibiting high uniqueness, uniformity, and randomness. It achieves bit error rate (BER) of 5.8% for the temperature range of -20 to +80Ā°C and supply voltage variation of Ā±10%. A calibration-based CRP selection method is developed to improve BER to 0.4% with a 42% loss of CRPs. When subjected to ML attacks, the prediction error stays over 40% on 10ā“ training points, which shows negligible loss in PUF unpredictability and about 100X higher resilience than the 65-bit arbiter PUF, 3-XOR PUF, and 3-XOR lightweight PUF. The second contribution is the application of a strong PUF in a secure key update scheme. Side-channel attacks on cryptographic implementations threaten system security via the loss of the secret key. The adversary can recover the key by analyzing side-channel analog behavior of a cryptographic device, such as power consumption. Fresh re-keying techniques aim to mitigate these attacks by regularly updating the key, so that the side-channel exposure of each key is minimized. Existing key update schemes generate fresh keys by processing a root key using arithmetic operations. Unfortunately, such techniques have been demonstrated to also be vulnerable to side-channel attacks. We propose a novel approach to fresh re-keying that replaces the arithmetic key update function with a strong PUF. We show that the security of our scheme hinges on the resilience of the PUF to a power side-channel attack and propose a realization based on the SCA-PUF. We show that the SCA-PUF is resistant to simple power analysis and a modeling attack that uses ML on the power side-channel. We target an insecure device and secure server encryption scenario for which we provide an efficient and scalable method of PUF enrollment. Finally, we develop an end-to-end encryption system with PUF-based fresh re-keying, using a reverse fuzzy extractor construction. The third contribution is the implementation of a strong PUF provably secure against ML attacks. The security is derived from cryptographic hardness of learning decryption functions of semantically secure public-key cryptosystems within the probably approximately correct framework. The proposed PUF, termed the lattice PUF, compactly realizes the decryption function of the learning-with-errors (LWE) public-key cryptosystem as the core block. The lattice PUF is lightweight and fully digital. It is constructed using a weak PUF, as a physically obfuscated key (POK), an LWE decryption function block, a pseudo-random number generator in the form of a linear-feedback shift register (LFSR), a self-incrementing counter, and a control block. The POK provides the secret key of the LWE decryption function. A fuzzy extractor is utilized to ensure stability of the POK. The proposed lattice PUF significantly improves upon a direct implementation of LWE decryption function in terms of challenge transfer cost by exploiting distributional relaxations allowed by recent work in space-efficient LWEs. Specifically, only a small challenge-seed is transmitted while the full-length challenge is re-generated by the LFSR resulting in a 100X reduction of communication cost. To prevent an active attack in which arbitrary challenges can be submitted, the value of a self-incrementing counter is embedded into the challenge seed. We construct a lattice PUF that realizes a challenge-response pair space of size 2Ā¹Ā³ā¶, requires 1160 POK bits, and guarantees 128-bit ML resistance. Assuming a bit error rate of 5% for SRAM-based POK, 6.5K SRAM cells are needed. The PUF shows excellent uniformity, uniqueness, and reliability. We implement the PUF on a Spartan 6 FPGA. It requires only 45 slices for the lattice PUF proper and 233 slices for the fuzzy extractorElectrical and Computer Engineerin
Optimized hardware implementations of cryptography algorithms for resource-constraint IoT devices and high-speed applications
The advent of technologies, including the Internet and smartphones, has made peopleās lives easier. Nowadays, people get used to digital applications for e-business, communicating with others, and sending or receiving sensitive messages. Sending secure data across the private network or the Internet is an open concern for every person. Cryptography plays an important role in privacy, security, and confidentiality against adversaries. Public-key cryptography (PKC) is one of the cryptography techniques that provides security over a large network, such as the Internet of Things (IoT).
The classical PKCs, such as Elliptic Curve Cryptography (ECC) and Rivest-Shamir-Adleman (RSA), are based on the hardness of certain number theoretic problems. According to Shorās algorithm, these algorithms can be solved very efficiently on a quantum computer, and cryptography algorithms will be insecure and weak as quantum computers increase in number. Based on NIST, Lattice-based cryptography (LBC) is one of the accepted quantum-resistant public-key cryptography. Different variants of LBC include Learning With Error (LWE), Ring Learning With Error (Ring-LWE), Binary Ring Learning with Error (Ring-Bin LWE), and etc. AES is also one of the secure cryptography algorithm that has been widely used in different applications and platforms. Also, AES-256 is secure against quantum attack.
It is very important to design a crypto-system based on the need and application. In general, each network has three different layers; cloud, edge, and end-node. The cloud and edge layer require to have a high-speed crypto-system, as it is used in high-traffic application to encrypt and decrypt data. Unfortunately, most of the end-node devices are resource-constraint and do not have enough area for security guard. Providing end-to-end security is vital for every network. To mitigate this issue, designing and implementing a lightweight cryto-system for resource-constraint devices is necessary.
In this thesis, a high-throughput FPGA implementation of AES algorithm for high-traffic edge applications is introduced. To achieve this goal, some part of the algorithm has been modified to balance the latency. Inner and outer pipelining techniques and loop-unrolling have been employed. The proposed high-speed implementation of AES achieves a throughput of 79.7Gbps, FPGA efficiency of 13.3 Mbps/slice, and frequency of 622.4MHz. Compared to the state-of-the-art work, the proposed design has improved data throughput by 8.02% and FPGA-Eff by 22.63%.
Moreover, a lightweight architecture of AES for resource-constraint devices is designed and implemented on FPGA and ASIC. Each module of the architecture is specified in which occupied less area; and some units are shared among different phases. To reduce the power consumption clock gating technique is applied. Application-specific integrated circuit (ASIC) implementation results show a respective improvement in the area over the previous similar works from 35% to 2.4%. Based on the results and NIST report, the proposed design is a suitable crypto-system for tiny devices and can be supplied by low-power devices.
Furthermore, two lightweight crypto-systems based on Binary Ring-LWE are presented for IoT end-node devices. For one of them, a novel column-based multiplication is introduced. To execute the column-based multiplication only one register is employed to store the intermediate results. The multiplication unit for the other Binary Ring-LWE design is optimized in which the multiplication is executed in less clock cycles. Moreover, to increase the security for end-node devices, the fault resiliency architecture has been designed and applied to the architecture of Binary Ring-LWE. Based on the implementation results and NIST report, the proposed Binary Ring-LWE designs is a suitable crypto-system form resource-constraint devices
Recommended from our members
ENABLING IOT AUTHENTICATION, PRIVACY AND SECURITY VIA BLOCKCHAIN
Although low-power and Internet-connected gadgets and sensors are increasingly integrated into our lives, the optimal design of these systems remains an issue. In particular, authentication, privacy, security, and performance are critical success factors. Furthermore, with emerging research areas such as autonomous cars, advanced manufacturing, smart cities, and building, usage of the Internet of Things (IoT) devices is expected to skyrocket. A single compromised node can be turned into a malicious one that brings down whole systems or causes disasters in safety-critical applications. This dissertation addresses the critical problems of (i) device management, (ii) data management, and (iii) service management in IoT systems. In particular, we propose an integrated platform solution for IoT device authentication, data privacy, and service security via blockchain-based smart contracts. We ensure IoT device authentication by blockchain-based IC traceability system, from its fabrication to its end-of-life, allowing both the supplier and a potential customer to verify an ICās provenance. Results show that our proposed consortium blockchain framework implementation in Hyperledger Fabric for IC traceability achieves a throughput of 35 transactions per second (tps). To corroborate the blockchain information, we authenticate the IC securely and uniquely with an embedded Physically Unclonable Function (PUF). For reliable Weak PUF-based authentication, our proposed accelerated aging technique reduces the cumulative burn-in cost by ā¼ 56%. We also propose a blockchain-based solution to integrate the privacy of data generated from the IoT devices by giving users control of their privacy. The smart contract controlled trust-base ensures that the users have private access to their IoT devices and data. We then propose a remote configuration of IC features via smart contracts, where an IC can be programmed repeatedly and securely. This programmability will enable users to upgrade IC features or rent upgraded IC features for a fixed period after users have purchased the IC. We tailor the hardware to meet the blockchain performance. Our on-die hardware module design enforces the hardware configurationās secure execution and uses only 2,844 slices in the Xilinx Zedboard Zynq Evaluation board. The blockchain framework facilitates decentralized IoT, where interacting devices are empowered to execute digital contracts autonomously