Search CORE

13 research outputs found

Automated Design Space Exploration and Datapath Synthesis for Finite Field Arithmetic with Applications to Lightweight Cryptography

Author: Zidaric Nusa
Publication venue: 'University of Waterloo'
Publication date: 11/05/2020
Field of study

Today, emerging technologies are reaching astronomical proportions. For example, the Internet of Things has numerous applications and consists of countless different devices using different technologies with different capabilities. But the one invariant is their connectivity. Consequently, secure communications, and cryptographic hardware as a means of providing them, are faced with new challenges. Cryptographic algorithms intended for hardware implementations must be designed with a good trade-off between implementation efficiency and sufficient cryptographic strength. Finite fields are widely used in cryptography. Examples of algorithm design choices related to finite field arithmetic are the field size, which arithmetic operations to use, how to represent the field elements, etc. As there are many parameters to be considered and analyzed, an automation framework is needed. This thesis proposes a framework for automated design, implementation and verification of finite field arithmetic hardware. The underlying motif throughout this work is “math meets hardware”. The automation framework is designed to bring the awareness of underlying mathematical structures to the hardware design flow. It is implemented in GAP, an open source computer algebra system that can work with finite fields and has symbolic computation capabilities. The framework is roughly divided into two phases, the architectural decisions and the automated design genera- tion. The architectural decisions phase supports parameter search and produces a list of candidates. The automated design generation phase is invoked for each candidate, and the generated VHDL files are passed on to conventional synthesis tools. The candidates and their implementation results form the design space, and the framework allows rapid design space exploration in a systematic way. In this thesis, design space exploration is focused on finite field arithmetic. Three distinctive features of the proposed framework are the structure of finite fields, tower field support, and on the fly submodule generation. Each finite field used in the design is represented as both a field and its corresponding vector space. It is easy for a designer to switch between fields and vector spaces, but strict distinction of the two is necessary for hierarchical designs. When an expression is defined over an extension field, the top-level module contains element signals and submodules for arithmetic operations on those signals. The submodules are generated with corresponding vector signals and the arithmetic operations are now performed on the coordinates. For tower fields, the submodules are generated for the subfield operations, and the design is generated in a top-down fashion. The binding of expressions to the appropriate finite fields or vector spaces and a set of customized methods allow the on the fly generation of expressions for implementation of arithmetic operations, and hence submodule generation. In the light of NIST Lightweight Cryptography Project (LWC), this work focuses mainly on small finite fields. The thesis illustrates the impact of hardware implementation results during the design process of WAGE, a Round 2 candidate in the NIST LWC standardization competition. WAGE is a hardware oriented authenticated encryption scheme. The parameter selection for WAGE was aimed at balancing the security and hardware implementation area, using hardware implementation results for many design decisions, for example field size, representation of field elements, etc. In the proposed framework, the components of WAGE are used as an example to illustrate different automation flows and demonstrate the design space exploration on a real-world algorithm

University of Waterloo's Institutional Repository

Optimized Hardware Implementations of Lightweight Cryptography

Author: Yang Gangqiang
Publication venue: 'University of Waterloo'
Publication date: 05/01/2017
Field of study

Radio frequency identification (RFID) is a key technology for the Internet of Things era. One important advantage of RFID over barcodes is that line-of-sight is not required between readers and tags. Therefore, it is widely used to perform automatic and unique identification of objects in various applications, such as product tracking, supply chain management, and animal identification. Due to the vulnerabilities of wireless communication between RFID readers and tags, security and privacy issues are significant challenges. The most popular passive RFID protocol is the Electronic Product Code (EPC) standard. EPC tags have many constraints on power consumption, memory, and computing capability. The field of lightweight cryptography was created to provide secure, compact, and flexible algorithms and protocols suitable for applications where the traditional cryptographic primitives, such as AES, are impractical. In these lightweight algorithms, tradeoffs are made between security, area/power consumption, and throughput. In this thesis, we focus on the hardware implementations and optimizations of lightweight cryptography and present the Simeck block cipher family, the WG-8 stream cipher, the Warbler pseudorandom number generator (PRNG), and the WGLCE cryptographic engine. Simeck is a new family of lightweight block ciphers. Simeck takes advantage of the good components and design ideas of the Simon and Speck block ciphers and it has three instances with different block and key sizes. We provide an extensive exploration of different hardware architectures in ASICs and show that Simeck is smaller than Simon in terms of area and power consumption. For the WG-8 stream cipher, we explore four different approaches for the WG transformation module, where one takes advantage of constant arrays and the other three benefit from the tower field constructions of the finite field \F_{2^8} and also efficient basis conversion matrices. The results in FPGA and ASICs show that the constant arrays based method is the best option. We also propose a hybrid design to improve the throughput with a little additional hardware. For the Warbler PRNG, we present the first detailed and smallest hardware implementations and optimizations. The results in ASICs show that the area of Warbler with throughput of 1 bit per 5 clock cycles (1/5 bpc) is smaller than that of other PRNGs and is in fact smaller than that of most of the lightweight primitives. We also optimize and improve the throughput from 1/5 bpc to 1 bpc with a little additional area and power consumption. Finally, we propose a cryptographic engine WGLCE for passive RFID systems. We merge the Warbler PRNG and WG-5 stream cipher together by reusing the finite state machine for both of them. Therefore, WGLCE can provide data confidentiality and generate pseudorandom numbers. After investigating the design rationales and hardware architectures, our results in ASICs show that WGLCE meets the constraints of passive RFID systems

University of Waterloo's Institutional Repository

Efficient Hardware Implementations of the Warbler Pseudorandom Number Generator

Author: Gangqiang Yang
Guang Gong
Mark D. Aagaard
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 10/08/2015
Field of study

Pseudorandom number generators (PRNGs) are very important for EPC Class 1 Generation 2 (EPC C1 G2) Radio Frequency Identification (RFID) systems. A PRNG is able to provide a 16-bit random number that is used in many commands of the EPC C1 G2 standard, and it can also be used in future security extensions of the EPC C1 G2 standard, such as mutual authentication protocols between the readers and tags. In this paper, we investigate efficient ASIC hardware implementations of Warbler (a lightweight PRNG), and demonstrate that Warbler can meet the area and power consumption requirements in passive RFID systems. Warbler is built upon three nonlinear feedback shift registers (NLFSRs) and four WG-5 transformation modules. We employ two design options to implement Warbler and three different compilation methods to further optimize the area, maximum operating frequency, and power consumption. We can achieve an area of 498 GEs after the place and route phase in a CMOS 65nm ASIC, with a maximum frequency of 1430 MHz and a total power consumption of 1.239uW at 100 KHz. Accordingly, an area of 534 GEs after the place and route phase, with a maximum frequency of 250 MHz and a total power consumption of 0.296 uW at 100 KHz can be obtained in a CMOS 130nm ASIC. Our results show that the LFSR counter based design is better than the binary counter-based one in terms of area and power consumption. In addition, we show that the areas of WG-5 transformation look-up tables depend on the specific decimation values

CiteSeerX

Cryptology ePrint Archive

Hardware Implementations of the Lightweight Welch-Gong Stream Cipher Family using Polynomial Bases

Author: Sattarov Marat
Publication venue: 'University of Waterloo'
Publication date: 25/01/2019
Field of study

In this thesis we develop a parametrized generic hardware implementation for the Welch-Gong (WG) stream cipher family for low power and low cost applications. WG stream ciphers operate over finite fields, and are comprised of Linear Feedback Shift Register (LFSR) and non-linear WG transformation as filtering function. These stream ciphers provide mathematically proven keystream properties. We begin with design of individual components that perform cryptographic functions. Then we construct WG transformation using these components and perform analysis of dependency between de- sign parameters and circuit area pre place-and-route for ASIC and two FPGAs. We also explored a second implementation approach that uses constant arrays or lookup tables generated with GAP by Zidaric. Finally, instances of the complete cipher of different sizes from WG-5 to WG-16 that output from 1 to 32 bits / cycle are shown, and their performance and area is analyzed for 65nm CMOS technology post place-and-route

University of Waterloo's Institutional Repository

Hardware Implementations of the WG-16 Stream Cipher with Composite Field Arithmetic

Author: Zidaric Nusa
Publication venue: 'University of Waterloo'
Publication date: 18/09/2014
Field of study

The WG stream cipher family consists of stream ciphers based on the Welch-Gong (WG) transformations that are used as a nonlinear filter applied to the output of a linear feedback shift register (LFSR). The aim of this thesis is an exploration of the design space of the WG-16 stream cipher. Five different representations of the field elements were analyzed, namely the polynomial basis representation, the normal basis representation and three isomorphic tower field constructions of F216: F(((22)2)2)2, F(24)4 and F(28)2. Each design option begins with an in-depth description of different field constructions and their impact on the top-level WG transformation circuit. Normal basis representation of elements for each level of the tower was chosen for field constructions F(((22)2)2)2 and F(24)4, and a mixed basis, with polynomial basis for the lower and normal basis for the higher level of the tower for F(28)2. Representation of field elements affects the field arithmetic, which in turn affects the entire design. Targeting high throughput, pipelined architectures were developed, and pipelining was based on the particular field construction: each extension over the prime field offers a new pipelining possibility. Pipelining at a lower level of the tower field reduces the clock period. Most flexible pipelining options are possible for F(((22)2)2)2, a highly regular construction, which permits an algebraic optimization of the WG transformation resulting in two multiplications being removed. High speed, achieved by adequate pipelining granularity, and smaller area due to removed multipliers deem the F(((22)2)2)2 to be the most suitable field construction for the implementation of WG-16. The best WG-16 modules achieve a throughput of 222 Mbit/s with 476 slices used on the Xilinx Spartan-6 FPGA device xc6slx9 (using Xilinx Synthesis Tool (XST) for synthesis and ISE for implementation [47]) and a throughput of 529 Mbit/s with area cost of 12215 GEs for ASIC implementation, using the 65 nm CMOS technology (using Synopsys Design Compiler for synthesis [45] and Cadence SoC Encounter to complete the Place-and-Route phase)

University of Waterloo's Institutional Repository

Security of Ubiquitous Computing Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/02/2021
Field of study

The chapters in this open access book arise out of the EU Cost Action project Cryptacus, the objective of which was to improve and adapt existent cryptanalysis methodologies and tools to the ubiquitous computing framework. The cryptanalysis implemented lies along four axes: cryptographic models, cryptanalysis of building blocks, hardware and software security engineering, and security assessment of real-world systems. The authors are top-class researchers in security and cryptography, and the contributions are of value to researchers and practitioners in these domains. This book is open access under a CC BY license

Directory of Open Access Books (DOAB)

Security of Ubiquitous Computing Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

OAPEN Library

Efficient and Secure Implementations of Lightweight Symmetric Cryptographic Primitives

Author: Dinu Dumitru-Daniel
Publication venue: University of Luxembourg, Luxembourg, Luxembourg
Publication date: 29/11/2017
Field of study

This thesis is devoted to efficient and secure implementations of lightweight symmetric cryptographic primitives for resource-constrained devices such as wireless sensors and actuators that are typically deployed in remote locations. In this setting, cryptographic algorithms must consume few computational resources and withstand a large variety of attacks, including side-channel attacks. The first part of this thesis is concerned with efficient software implementations of lightweight symmetric algorithms on 8, 16, and 32-bit microcontrollers. A first contribution of this part is the development of FELICS, an open-source benchmarking framework that facilitates the extraction of comparative performance figures from implementations of lightweight ciphers. Using FELICS, we conducted a fair evaluation of the implementation properties of 19 lightweight block ciphers in the context of two different usage scenarios, which are representatives for common security services in the Internet of Things (IoT). This study gives new insights into the link between the structure of a cryptographic algorithm and the performance it can achieve on embedded microcontrollers. Then, we present the SPARX family of lightweight ciphers and describe the impact of software efficiency in the process of shaping three instances of the family. Finally, we evaluate the cost of the main building blocks of symmetric algorithms to determine which are the most efficient ones. The contributions of this part are particularly valuable for designers of lightweight ciphers, software and security engineers, as well as standardization organizations. In the second part of this work, we focus on side-channel attacks that exploit the power consumption or the electromagnetic emanations of embedded devices executing unprotected implementations of lightweight algorithms. First, we evaluate different selection functions in the context of Correlation Power Analysis (CPA) to infer which operations are easy to attack. Second, we show that most implementations of the AES present in popular open-source cryptographic libraries are vulnerable to side-channel attacks such as CPA, even in a network protocol scenario where the attacker has limited control of the input. Moreover, we describe an optimal algorithm for recovery of the master key using CPA attacks. Third, we perform the first electromagnetic vulnerability analysis of Thread, a networking stack designed to facilitate secure communication between IoT devices. The third part of this thesis lies in the area of side-channel countermeasures against power and electromagnetic analysis attacks. We study efficient and secure expressions that compute simple bitwise functions on Boolean shares. To this end, we describe an algorithm for efficient search of expressions that have an optimal cost in number of elementary operations. Then, we introduce optimal expressions for first-order Boolean masking of bitwise AND and OR operations. Finally, we analyze the performance of three lightweight block ciphers protected using the optimal expressions

Open Repository and Bibliography - Luxembourg

Kommunikation und Bildverarbeitung in der Automation

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

In diesem Open-Access-Tagungsband sind die besten Beiträge des 9. Jahreskolloquiums "Kommunikation in der Automation" (KommA 2018) und des 6. Jahreskolloquiums "Bildverarbeitung in der Automation" (BVAu 2018) enthalten. Die Kolloquien fanden am 20. und 21. November 2018 in der SmartFactoryOWL, einer gemeinsamen Einrichtung des Fraunhofer IOSB-INA und der Technischen Hochschule Ostwestfalen-Lippe statt. Die vorgestellten neuesten Forschungsergebnisse auf den Gebieten der industriellen Kommunikationstechnik und Bildverarbeitung erweitern den aktuellen Stand der Forschung und Technik. Die in den Beiträgen enthaltenen anschaulichen Beispiele aus dem Bereich der Automation setzen die Ergebnisse in den direkten Anwendungsbezug

OAPEN Library

Secure and Efficient Comparisons between Untrusted Parties

Author: Beck Martin
Publication venue
Publication date: 11/09/2018
Field of study

A vast number of online services is based on users contributing their personal information. Examples are manifold, including social networks, electronic commerce, sharing websites, lodging platforms, and genealogy. In all cases user privacy depends on a collective trust upon all involved intermediaries, like service providers, operators, administrators or even help desk staff. A single adversarial party in the whole chain of trust voids user privacy. Even more, the number of intermediaries is ever growing. Thus, user privacy must be preserved at every time and stage, independent of the intrinsic goals any involved party. Furthermore, next to these new services, traditional offline analytic systems are replaced by online services run in large data centers. Centralized processing of electronic medical records, genomic data or other health-related information is anticipated due to advances in medical research, better analytic results based on large amounts of medical information and lowered costs. In these scenarios privacy is of utmost concern due to the large amount of personal information contained within the centralized data. We focus on the challenge of privacy-preserving processing on genomic data, specifically comparing genomic sequences. The problem that arises is how to efficiently compare private sequences of two parties while preserving confidentiality of the compared data. It follows that the privacy of the data owner must be preserved, which means that as little information as possible must be leaked to any party participating in the comparison. Leakage can happen at several points during a comparison. The secured inputs for the comparing party might leak some information about the original input, or the output might leak information about the inputs. In the latter case, results of several comparisons can be combined to infer information about the confidential input of the party under observation. Genomic sequences serve as a use-case, but the proposed solutions are more general and can be applied to the generic field of privacy-preserving comparison of sequences. The solution should be efficient such that performing a comparison yields runtimes linear in the length of the input sequences and thus producing acceptable costs for a typical use-case. To tackle the problem of efficient, privacy-preserving sequence comparisons, we propose a framework consisting of three main parts. a) The basic protocol presents an efficient sequence comparison algorithm, which transforms a sequence into a set representation, allowing to approximate distance measures over input sequences using distance measures over sets. The sets are then represented by an efficient data structure - the Bloom filter -, which allows evaluation of certain set operations without storing the actual elements of the possibly large set. This representation yields low distortion for comparing similar sequences. Operations upon the set representation are carried out using efficient, partially homomorphic cryptographic systems for data confidentiality of the inputs. The output can be adjusted to either return the actual approximated distance or the result of an in-range check of the approximated distance. b) Building upon this efficient basic protocol we introduce the first mechanism to reduce the success of inference attacks by detecting and rejecting similar queries in a privacy-preserving way. This is achieved by generating generalized commitments for inputs. This generalization is done by treating inputs as messages received from a noise channel, upon which error-correction from coding theory is applied. This way similar inputs are defined as inputs having a hamming distance of their generalized inputs below a certain predefined threshold. We present a protocol to perform a zero-knowledge proof to assess if the generalized input is indeed a generalization of the actual input. Furthermore, we generalize a very efficient inference attack on privacy-preserving sequence comparison protocols and use it to evaluate our inference-control mechanism. c) The third part of the framework lightens the computational load of the client taking part in the comparison protocol by presenting a compression mechanism for partially homomorphic cryptographic schemes. It reduces the transmission and storage overhead induced by the semantically secure homomorphic encryption schemes, as well as encryption latency. The compression is achieved by constructing an asymmetric stream cipher such that the generated ciphertext can be converted into a ciphertext of an associated homomorphic encryption scheme without revealing any information about the plaintext. This is the first compression scheme available for partially homomorphic encryption schemes. Compression of ciphertexts of fully homomorphic encryption schemes are several orders of magnitude slower at the conversion from the transmission ciphertext to the homomorphically encrypted ciphertext. Indeed our compression scheme achieves optimal conversion performance. It further allows to generate keystreams offline and thus supports offloading to trusted devices. This way transmission-, storage- and power-efficiency is improved. We give security proofs for all relevant parts of the proposed protocols and algorithms to evaluate their security. A performance evaluation of the core components demonstrates the practicability of our proposed solutions including a theoretical analysis and practical experiments to show the accuracy as well as efficiency of approximations and probabilistic algorithms. Several variations and configurations to detect similar inputs are studied during an in-depth discussion of the inference-control mechanism. A human mitochondrial genome database is used for the practical evaluation to compare genomic sequences and detect similar inputs as described by the use-case. In summary we show that it is indeed possible to construct an efficient and privacy-preserving (genomic) sequences comparison, while being able to control the amount of information that leaves the comparison. To the best of our knowledge we also contribute to the field by proposing the first efficient privacy-preserving inference detection and control mechanism, as well as the first ciphertext compression system for partially homomorphic cryptographic systems

Technische Universität Dresden: Qucosa