28 research outputs found

    Mixed radix design flow for security applications

    Get PDF
    The purpose of secure devices, such as smartcards, is to protect sensitive information against software and hardware attacks. Implementation of the appropriate protection techniques often implies non-standard methods that are not supported by the conventional design tools. In the recent decade the designers of secure devices have been working hard on customising the workflow. The presented research aims at collecting the up-to-date experiences in this area and create a generic approach to the secure design flow that can be used as guidance by engineers. Well-known countermeasures to hardware attacks imply the use of specific signal encodings. Therefore, multi-valued logic has been considered as a primary aspect of the secure design. The choice of radix is crucial for multi-valued logic synthesis. Practical examples reveal that it is not always possible to find the optimal radix when taking into account actual physical parameters of multi-valued operations. In other words, each radix has its advantages and disadvantages. Our proposal is to synthesise logic in different radices, so it could benefit from their combination. With respect to the design opportunities of the existing tools and the possibilities of developing new tools that would fill the gaps in the flow, two distinct design approaches have been formed: conversion driven design and pre-synthesis. The conversion driven design approach takes the outputs of mature and time-proven electronic design automation (EDA) synthesis tools to generate mixed radix datapath circuits in an endeavour to investigate the added relative advantages or disadvantages. An algorithm underpinning the approach is presented and formally described together with secure gate-level implementations. The obtained results are reported showing an increase in power consumption, thus giving further motivation for the second approach. The pre-synthesis approach is aimed at improving the efficiency by using multivalued logic synthesis techniques to produce an abstract component-level circuit before mapping it into technology libary. Reed-Muller expansions over Galois field arithmetic have been chosen as a theoretical foundation for this approach. In order to enable the combination of radices at the mathematical level, the multi-valued Reed-Muller expansions have been developed into mixed radix Reed-Muller expansions. The goals of the work is to estimate the potential of the new approach and to analyse its impact on circuit parameters down to the level of physical gates. The benchmark results show the approach extends the search space for optimisation and provides information on how the implemented functions are related to different radices. The theory of two-level radix models and corresponding computation methods are the primary theoretical contribution. It has been implemented in RMMixed tool and interfaced to the standard EDA tools to form a complete security-aware design flow.EThOS - Electronic Theses Online ServiceEPSRCGBUnited Kingdo

    A VLSI synthesis of a Reed-Solomon processor for digital communication systems

    Get PDF
    The Reed-Solomon codes have been widely used in digital communication systems such as computer networks, satellites, VCRs, mobile communications and high- definition television (HDTV), in order to protect digital data against erasures, random and burst errors during transmission. Since the encoding and decoding algorithms for such codes are computationally intensive, special purpose hardware implementations are often required to meet the real time requirements. -- One motivation for this thesis is to investigate and introduce reconfigurable Galois field arithmetic structures which exploit the symmetric properties of available architectures. Another is to design and implement an RS encoder/decoder ASIC which can support a wide family of RS codes. -- An m-programmable Galois field multiplier which uses the standard basis representation of the elements is first introduced. It is then demonstrated that the exponentiator can be used to implement a fast inverter which outperforms the available inverters in GF(2m). Using these basic structures, an ASIC design and synthesis of a reconfigurable Reed-Solomon encoder/decoder processor which implements a large family of RS codes is proposed. The design is parameterized in terms of the block length n, Galois field symbol size m, and error correction capability t for the various RS codes. The design has been captured using the VHDL hardware description language and mapped onto CMOS standard cells available in the 0.8-µm BiCMOS design kits for Cadence and Synopsys tools. The experimental chip contains 218,206 logic gates and supports values of the Galois field symbol size m = 3,4,5,6,7,8 and error correction capability t = 1,2,3, ..., 16. Thus, the block length n is variable from 7 to 255. Error correction t and Galois field symbol size m are pin-selectable. -- Since low design complexity and high throughput are desired in the VLSI chip, the algebraic decoding technique has been investigated instead of the time or transform domain. The encoder uses a self-reciprocal generator polynomial which structures the codewords in a systematic form. At the beginning of the decoding process, received words are initially stored in the first-in-first-out (FIFO) buffer as they enter the syndrome module. The Berlekemp-Massey algorithm is used to determine both the error locator and error evaluator polynomials. The Chien Search and Forney's algorithms operate sequentially to solve for the error locations and error values respectively. The error values are exclusive or-ed with the buffered messages in order to correct the errors, as the processed data leave the chip

    Private and Public-Key Side-Channel Threats Against Hardware Accelerated Cryptosystems

    Get PDF
    Modern side-channel attacks (SCA) have the ability to reveal sensitive data from non-protected hardware implementations of cryptographic accelerators whether they be private or public-key systems. These protocols include but are not limited to symmetric, private-key encryption using AES-128, 192, 256, or public-key cryptosystems using elliptic curve cryptography (ECC). Traditionally, scalar point (SP) operations are compelled to be high-speed at any cost to reduce point multiplication latency. The majority of high-speed architectures of contemporary elliptic curve protocols rely on non-secure SP algorithms. This thesis delivers a novel design, analysis, and successful results from a custom differential power analysis attack on AES-128. The resulting SCA can break any 16-byte master key the sophisticated cipher uses and it\u27s direct applications towards public-key cryptosystems will become clear. Further, the architecture of a SCA resistant scalar point algorithm accompanied by an implementation of an optimized serial multiplier will be constructed. The optimized hardware design of the multiplier is highly modular and can use either NIST approved 233 & 283-bit Kobliz curves utilizing a polynomial basis. The proposed architecture will be implemented on Kintex-7 FPGA to later be integrated with the ARM Cortex-A9 processor on the Zynq-7000 AP SoC (XC7Z045) for seamless data transfer and analysis of the vulnerabilities SCAs can exploit

    Improving Safety of an Automotive AES-GCM Core and its Impact on Side-Channel Protection

    Get PDF
    O incremento do número de componentes eletrónicos e o correspondente aumento do fluxo de dados no setor automóvel levou a uma preocupação crescente com a garantia de segurança dos sistemas eletrónicos, especialmente em sistemas críticos cuja violação seja passível de colocar em causa a integridade do sistema e a segurança das pessoas. A utilização de sistemas que implementam o Advanced Encryption Standard (AES) foi vista como uma solução para este problema, impedindo o acesso indevido aos dados dos veículos, através da sua encriptação. O algoritmo AES não possui atualmente nenhuma vulnerabilidade efetiva, mas o mesmo não acontece com as suas implementações, as quais estão sujeitas a ataques ditos side-channel, onde informações que resultam da operação destas implementações são exploradas na tentativa de descobrir os dados encriptados. A aplicação de núcleos IP no setor automóvel requer que as suas implementações cumpram a norma ISO-26262 de forma a garantir que a sua operação não compromete a segurança do veículo e dos ocupantes. Este cumprimento implica alterações na arquitetura dos sistemas que podem influenciar as características de operação que são normalmente exploradas em ataques para obter informação que eventualmente permita ganhar conhecimento sobre os dados encriptados. Assim, o desenvolvimento das componentes de segurança, na perspetiva da segurança informática da informação e no que se refere à segurança de operação do veículo e dos seus ocupantes, que são ainda consideradas como componentes independentes, podem na verdade estar relacionadas, já que as melhorias introduzidas para incrementar a resiliência a falhas e consequentemente a integridade de operação dos sistemas, podem aumentar a fragilidade do sistema a ataques que comprometam a segurança informática dos dados. O presente trabalho tem como objetivo desenvolver uma arquitetura capaz de atingir as métricas para o nível mais alto de certificação em segurança de acordo com a norma ISSO-26262 (certificação ASIL-D), a partir de uma arquitetura já existente, e comparar as duas arquiteturas em termos de vulnerabilidade a ataques ditos side-channel que exploram o seu consumo de potência dinâmica. Os resultados demonstram que para a arquitetura ASIL-D a identificação de pontos de interesse e de dados relevantes no consumo de potência é mais evidente, o que sugere existir uma maior vulnerabilidade da arquitetura desenvolvida a ataques informáticos desenvolvidos por esse processo.The increase in electronic components and the corresponding increment in the data flow among electronic systems in automotive applications made security one of the main concerns in this sector. The use of IP cores that implement the Advanced Encryption Standard (AES) was seen as a solution to this problem, preventing improper access to vehicle data, through its encryption. The AES algorithm does not currently have any effective vulnerability, but the same does not happen with its implementations, which are subject to side-channel attacks, where information that results from the operation of these implementations is exploited in an attempt to discover the encrypted data. The application of IP cores in the automotive sector requires that the implementations comply with the ISO-26262 standard in order to ensure that their operation does not compromise the vehicle's safety. This compliment implies changes in the core architecture that can influence the characteristics of operation that are normally exploited in attacks. Thus, the development of safety and security components in the automotive sector, which are still considered as independent processes, may be related because safety improvements may cause changes in the system's vulnerability to attacks that can compromise its security. This work aims to develop an architecture capable of reaching the metrics for the highest level of safety certification (ASIL-D), based on an existing architecture, and compare the two architectures in terms of vulnerability to side-channel attacks that exploit their dynamic power consumption. The results show that for the ASIL-D architecture, the identification of points of interest and relevant data on the power consumption traces is more evident, which suggests greater effectiveness of the attacks performed in this architecture

    Energy Efficient Hardware Design for Securing the Internet-of-Things

    Full text link
    The Internet of Things (IoT) is a rapidly growing field that holds potential to transform our everyday lives by placing tiny devices and sensors everywhere. The ubiquity and scale of IoT devices require them to be extremely energy efficient. Given the physical exposure to malicious agents, security is a critical challenge within the constrained resources. This dissertation presents energy-efficient hardware designs for IoT security. First, this dissertation presents a lightweight Advanced Encryption Standard (AES) accelerator design. By analyzing the algorithm, a novel method to manipulate two internal steps to eliminate storage registers and replace flip-flops with latches to save area is discovered. The proposed AES accelerator achieves state-of-art area and energy efficiency. Second, the inflexibility and high Non-Recurring Engineering (NRE) costs of Application-Specific-Integrated-Circuits (ASICs) motivate a more flexible solution. This dissertation presents a reconfigurable cryptographic processor, called Recryptor, which achieves performance and energy improvements for a wide range of security algorithms across public key/secret key cryptography and hash functions. The proposed design employs circuit techniques in-memory and near-memory computing and is more resilient to power analysis attack. In addition, a simulator for in-memory computation is proposed. It is of high cost to design and evaluate new-architecture like in-memory computing in Register-transfer level (RTL). A C-based simulator is designed to enable fast design space exploration and large workload simulations. Elliptic curve arithmetic and Galois counter mode are evaluated in this work. Lastly, an error resilient register circuit, called iRazor, is designed to tolerate unpredictable variations in manufacturing process operating temperature and voltage of VLSI systems. When integrated into an ARM processor, this adaptive approach outperforms competing industrial techniques such as frequency binning and canary circuits in performance and energy.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147546/1/zhyiqun_1.pd

    Cryptographic key distribution in wireless sensor networks: a hardware perspective

    Get PDF
    In this work the suitability of different methods of symmetric key distribution for application in wireless sensor networks are discussed. Each method is considered in terms of its security implications for the network. It is concluded that an asymmetric scheme is the optimum choice for key distribution. In particular, Identity-Based Cryptography (IBC) is proposed as the most suitable of the various asymmetric approaches. A protocol for key distribution using identity based Non-Interactive Key Distribution Scheme (NIKDS) and Identity-Based Signature (IBS) scheme is presented. The protocol is analysed on the ARM920T processor and measurements were taken for the run time and energy of its components parts. It was found that the Tate pairing component of the NIKDS consumes significants amounts of energy, and so it should be ported to hardware. An accelerator was implemented in 65nm Complementary Metal Oxide Silicon (CMOS) technology and area, timing and energy figures have been obtained for the design. Initial results indicate that a hardware implementation of IBC would meet the strict energy constraint of a wireless sensor network node

    Comparative Analysis of Different Aes Implementations for 65nm Technologies

    Get PDF
    Encryption has a strong presence in today's digital electronics with the frequent transmission and storage of sensitive data. NIST selected AES as the standard encryption algorithm and it is commonly used as a fast solution to secure data. When designing VLSI systems, the task of balancing the area, power, and speed is a challenge; hardware encryption is no different. System requirements drive certain performance parameters to the forefront, identifying how to alter a design implementations to meet performance requirements is not always apparent. Multiple resources in this research field have identified AES algorithm features of interest and discussed their impact a few of the design trade spaces, however, a single comparative analysis was lacking. This thesis explores six different AES features key size, mode specificity, round key storage, round unraveling, SBOX implementation, and pipelining. A summarized view of the resulting designs allow readers to quickly analyze how each of the six features impacts speed, power, area, latency and throughput using 65nm process.Electrical Engineerin

    Emerging Design Methodology And Its Implementation Through Rns And Qca

    Get PDF
    Digital logic technology has been changing dramatically from integrated circuits, to a Very Large Scale Integrated circuits (VLSI) and to a nanotechnology logic circuits. Research focused on increasing the speed and reducing the size of the circuit design. Residue Number System (RNS) architecture has ability to support high speed concurrent arithmetic applications. To reduce the size, Quantum-Dot Cellular Automata (QCA) has become one of the new nanotechnology research field and has received a lot of attention within the engineering community due to its small size and ultralow power. In the last decade, residue number system has received increased attention due to its ability to support high speed concurrent arithmetic applications such as Fast Fourier Transform (FFT), image processing and digital filters utilizing the efficiencies of RNS arithmetic in addition and multiplication. In spite of its effectiveness, RNS has remained more an academic challenge and has very little impact in practical applications due to the complexity involved in the conversion process, magnitude comparison, overflow detection, sign detection, parity detection, scaling and division. The advancements in very large scale integration technology and demand for parallelism computation have enabled researchers to consider RNS as an alternative approach to high speed concurrent arithmetic. Novel parallel - prefix structure binary to residue number system conversion method and RNS novel scaling method are presented in this thesis. Quantum-dot cellular automata has become one of the new nanotechnology research field and has received a lot of attention within engineering community due to its extremely small feature size and ultralow power consumption compared to COMS technology. Novel methodology for generating QCA Boolean circuits from multi-output Boolean circuits is presented. Our methodology takes as its input a Boolean circuit, generates simplified XOR-AND equivalent circuit and output an equivalent majority gate circuits. During the past decade, quantum-dot cellular automata showed the ability to implement both combinational and sequential logic devices. Unlike conventional Boolean AND-OR-NOT based circuits, the fundamental logical device in QCA Boolean networks is majority gate. With combining these QCA gates with NOT gates any combinational or sequential logical device can be constructed from QCA cells. We present an implementation of generalized pipeline cellular array using quantum-dot cellular automata cells. The proposed QCA pipeline array can perform all basic operations such as multiplication, division, squaring and square rooting. The different mode of operations are controlled by a single control line

    Parallel Multiplier Designs for the Galois/Counter Mode of Operation

    Get PDF
    The Galois/Counter Mode of Operation (GCM), recently standardized by NIST, simultaneously authenticates and encrypts data at speeds not previously possible for both software and hardware implementations. In GCM, data integrity is achieved by chaining Galois field multiplication operations while a symmetric key block cipher such as the Advanced Encryption Standard (AES), is used to meet goals of confidentiality. Area optimization in a number of proposed high throughput GCM designs have been approached through implementing efficient composite Sboxes for AES. Not as much work has been done in reducing area requirements of the Galois multiplication operation in the GCM which consists of up to 30% of the overall area using a bruteforce approach. Current pipelined implementations of GCM also have large key change latencies which potentially reduce the average throughput expected under traditional internet traffic conditions. This thesis aims to address these issues by presenting area efficient parallel multiplier designs for the GCM and provide an approach for achieving low latency key changes. The widely known Karatsuba parallel multiplier (KA) and the recently proposed Fan-Hasan multiplier (FH) were designed for the GCM and implemented on ASIC and FPGA architectures. This is the first time these multipliers have been compared with a practical implementation, and the FH multiplier showed note worthy improvements over the KA multiplier in terms of delay with similar area requirements. Using the composite Sbox, ASIC designs of GCM implemented with subquadratic multipliers are shown to have an area savings of up to 18%, without affecting the throughput, against designs using the brute force Mastrovito multiplier. For low delay LUT Sbox designs in GCM, although the subquadratic multipliers are a part of the critical path, implementations with the FH multiplier showed the highest efficiency in terms of area resources and throughput over all other designs. FPGA results similarly showed a significant reduction in the number of slices using subquadratic multipliers, and the highest throughput to date for FPGA implementations of GCM was also achieved. The proposed reduced latency key change design, which supports all key types of AES, showed a 20% improvement in average throughput over other GCM designs that do not use the same techniques. The GCM implementations provided in this thesis provide some of the most area efficient, yet high throughput designs to date
    corecore