26 research outputs found

    A Power-Gated 8-Transistor Physically Unclonable Function Accelerates Evaluation Speeds

    Get PDF
    \ua9 2023 by the authors.The proposed 8-Transistor (8T) Physically Unclonable Function (PUF), in conjunction with the power gating technique, can significantly accelerate a single evaluation cycle more than 100,000 times faster than a 6-Transistor (6T) Static Random-Access Memory (SRAM) PUF. The 8T PUF is built to swiftly eliminate data remanence and maximise physical mismatch. Moreover, a two-phase power gating module is devised to provide controllable power on/off cycles for the chosen PUF clusters in order to facilitate fast statistical measurements and curb the in-rush current. The architecture and hardware implementation of the power-gated PUF are developed to accommodate fast multiple evaluations of PUF Responses. The fast speed enables a new data processing method, which coordinates Dark-bit masking and Multiple Temporal Majority Voting (TMV) in different Process, Voltage and Temperature (PVT) corners or during field usage, hence greatly reducing the Bit Error Rate (BER) and the hardware penalty for error correction. The designs are based on the UMC 65 nm technology and aim to tape out an Application-Specific Integrated Circuit (ASIC) chip. Post-layout Monte Carlo (MC) simulations are performed with Cadence, and the extracted PUF Responses are processed with Matlab to evaluate the 8T PUF performance and statistical metrics for subsequent inclusion in PUF Responses, which comprise the novelty of this approach

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations, and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasingly sensitivity to design-context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood\u27 interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; and its architecture measures the path delays more accurately. Design for manufacture-ability (DFM) analysis is done on the on 90 nm ASIC chips and 28nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM\u27s 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Also experimental data has been analyzed for path delay variations in short vs long paths. FPGA results on within-die variation and die-to-die variations on Advanced Encryption System (AES) using single pipelined stage are also presented. Other analysis that have been performed on the calibrated path delays are Flip Flop propagation delays for both rising and falling edge (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths where magnitudes reduced to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increase as temperature and voltage are changed to increase performance. The high level of within-die delay variations are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of \u27secrets\u27 such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die-variations as a means of generating random bit strings for these types of applications, including hardware security and trust. Zynq FPGAs Die-to-Die and within-die variation study shows that on average there is 5% of within-Die variation and the range of die-to-Die variation can go upto 3ns. The die-to-Die variations can be explored in much further detail to study the variations spatial dependance. Additionally, I also carried out research in the area data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed-up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis parallel binary decision tree classification for meeting up with the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data-sets have created abundant opportunities for algorithmic and architectural developments, and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have since been implemented in software programs. Though, the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands in the ever growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware acceleration of pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine is processing the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput, by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.\u2

    Novel Transistor Resistance Variation-based Physical Unclonable Functions with On-Chip Voltage-to-Digital Converter Designed for Use in Cryptographic and Authentication Applications

    Get PDF
    Security mechanisms such as encryption, authentication, and feature activation depend on the integrity of embedded secret keys. Currently, this keying material is stored as digital bitstrings in non-volatile memory on FPGAs and ASICs. However, secrets stored this way are not secure against a determined adversary, who can use specialized probing attacks to uncover the secret. Furthermore, storing these pre-determined bitstrings suffers from the disadvantage of not being able to generate the key only when needed. Physical Unclonable Functions (PUFs) have emerged as a superior alternative to this. A PUF is an embedded Integrated Circuit (IC) structure that is designed to leverage random variations in physical parameters of on-chip components as the source of entropy for generating random and unique bitstrings. PUFs also incorporate an on-chip infrastructure for measuring and digitizing these variations in order to produce bitstrings. Additionally, PUFs are designed to reproduce a bitstring on-demand and therefore eliminate the need for on-chip storage. In this work, two novel PUFs are presented that leverage the random variations observed in the resistance of transistors. A thorough analysis of the randomness, uniqueness and stability characteristics of the bitstrings generated by these PUFs is presented. All results shown are based on an exhaustive testing of a set of 63 chips designed with numerous copies of the PUFs on each chip and fabricated in a 90nm nine-metal layer technology. An on-chip voltage-to-digital conversion technique is also presented and tested on the set of 63 chips. Statistical results of the bitstrings generated by the on-chip digitization technique are compared with that of the voltage-derived bitstrings to evaluate the efficacy of the digitization technique. One of the most important quality metrics of the PUF and the on-chip voltage-to-digital converter, the stability, is evaluated through a lengthy temperature-voltage testing over the range of -40C to +85C and voltage variations of +/- 10% of the nominal supply voltage. The stability of both the bitstrings and the underlying physical parameters is evaluated for the PUFs using the data collected from the hardware experiments and supported with software simulations conducted on the devices. Several novel techniques are proposed and successfully tested that address known issues related to instability of PUFs to changing temperature and voltage conditions, thus rendering our PUFs more resilient to these changing conditions faced in practical use. Lastly, an analysis of the stability to changing temperature and voltage variations of a third PUF that leverages random variations in the resistance of the metal wires in the power and ground grids of a chip is also presented

    Lightweight Digital Hardware Random Number Generators

    Get PDF
    Abstract — Random Number Generator (RNG) plays an essential role in many sensor network systems and applications, such as security and robust communication. We have developed the first digital hardware random number generator (DHRNG). DHRNG has a small footprint and requires ultra-low energy. It uses a new recursive structure that directly targets efficient FPGA implementation. The core idea is to place or extract random values in FPGA configuration bits and randomly connect the building blocks. We present our architecture, introduce accompanying protocols for secure public key communication, and adopt the NIST randomness test on the DHRNG’s output stream. I

    A Physical Unclonable Function Based on Inter-Metal Layer Resistance Variations and an Evaluation of its Temperature and Voltage Stability

    Get PDF
    Keying material for encryption is stored as digital bistrings in non-volatile memory (NVM) on FPGAs and ASICs in current technologies. However, secrets stored this way are not secure against a determined adversary, who can use probing attacks to steal the secret. Physical Unclonable functions (PUFs) have emerged as an alternative. PUFs leverage random manufacturing variations as the source of entropy for generating random bitstrings, and incorporate an on-chip infrastructure for measuring and digitizing the corresponding variations in key electrical parameters, such as delay or voltage. PUFs are designed to reproduce a bitstring on demand and therefore eliminate the need for on-chip storage. In this dissertation, I propose a kind of PUF that measures resistance variations in inter-metal layers that define the power grid of the chip and evaluate its temperature and voltage stability. First, I introduce two implementations of a power grid-based PUF (PG-PUF). Then, I analyze the quality of bit strings generated without considering environmental variations from the PG-PUFs that leverage resistance variations in: 1) the power grid metal wires in 60 copies of a 90 nm chip and 2) in the power grid metal wires of 58 copies of a 65 nm chip. Next, I carry out a series of experiments in a set of 63 chips in IBM\u27s 90 nm technology at 9 TV corners, i.e., over all combination of 3 temperatures: -40oC, 25oC and 85oC and 3 voltages: nominal and +/-10% of the nominal supply voltage. The randomness, uniqueness and stability characteristics of bitstrings generated from PG-PUFs are evaluated. The stability of the PG-PUF and an on-chip voltage-to-digital (VDC) are also evaluated at 9 temperature-voltage corners. I introduce several techniques that have not been previously described, including a mechanism to eliminate voltage trends or \u27bias\u27 in the power grid voltage measurements, as well as a voltage threshold, Triple-Module-Redundancy (TMR) and majority voting scheme to identify and exclude unstable bits

    Comprehensive study of physical unclonable functions on FPGAs: correlation driven Implementation, deep learning modeling attacks, and countermeasures

    Get PDF
    For more than a decade and a half, Physical Unclonable Functions (PUFs) have been presented as a promising hardware security primitive. The idea of exploiting variabilities in hardware fabrication to generate a unique fingerprint for every silicon chip introduced a more secure and cheaper alternative. Other solutions using non-volatile memory to store cryptographic keys, require additional processing steps to generate keys externally, and secure environments to exchange generated keys, which introduce many points of attack that can be used to extract the secret keys. PUFs were addressed in the literature from different perspectives. Many publications focused on proposing new PUF architectures and evaluation metrics to improve security properties like response uniqueness per chip, response reproducibility of the same PUF input, and response unpredictability using previous input/response pairs. Other research proposed attack schemes to clone the response of PUFs, using conventional machine learning (ML) algorithms, side-channel attacks using power and electromagnetic traces, and fault injection using laser beams and electromagnetic pulses. However, most attack schemes to be successful, imposed some restrictions on the targeted PUF architectures, which make it simpler and easier to attack. Furthermore, they did not propose solid and provable enhancements on these architectures to countermeasure the attacks. This leaves many open questions concerning how to implement perfect secure PUFs especially on FPGAs, how to extend previous modeling attack schemes to be successful against more complex PUF architectures (and understand why modeling attacks work) and how to detect and countermeasure these attacks to guarantee that secret data are safe from the attackers. This Ph.D. dissertation contributes to the state of the art research on physical unclonable functions in several ways. First, the thesis provides a comprehensive analysis of the implementation of secure PUFs on FPGAs using manual placement and manual routing techniques guided by new performance metrics to overcome FPGAs restrictions with minimum hardware and area overhead. Then the impact of deep learning (DL) algorithms is studied as a promising modeling attack scheme against complex PUF architectures, which were reported immune to conventional (ML) techniques. Furthermore, it is shown that DL modeling attacks successfully overcome the restrictions imposed by previous research even with the lack of accurate mathematical models of these PUF architectures. Finally, this comprehensive analysis is completed by understanding why deep learning attacks are successful and how to build new PUF architectures and extra circuitry to thwart these types of attacks. This research is important for deploying cheap and efficient hardware security primitives in different fields, including IoT applications, embedded systems, automotive and military equipment. Additionally, it puts more focus on the development of strong intrinsic PUFs which are widely proposed and deployed in many security protocols used for authentication, key establishment, and Oblivious transfer protocols

    A hardware-embedded, delay-based PUF engine designed for use in cryptographic and authentication applications

    Get PDF
    Cryptographic and authentication applications in application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs), as well as codes for the activation of on-chip features, require the use of embedded secret information. The generation of secret bitstrings using physical unclonable functions, or PUFs, provides several distinct advantages over conventional methods, including the elimination of costly non-volatile memory, and the potential to increase the random bits available to applications. In this dissertation, a Hardware-Embedded Delay PUF (HELP) is proposed that is designed to leverage path delay variations that occur in the core logic macros of a chip to create random bitstrings. A thorough discussion is provided of the operational details of an embedded path timing structure called REBEL that is used by HELP to provide the timing functionality upon which HELP relies for the entropy source for the cryptographic quality of the bitstrings. Further details of the FPGA-based implementation used to prove the viability of the HELP PUF concept are included, along with a discussion of the evolution of the techniques employed in realizing the final PUF engine design. The bitstrings produced by a set of 30 FPGA boards are evaluated with regard to several statistical quality metrics including uniqueness, randomness, and stability. The stability characteristics of the bitstrings are evaluated by subjecting the FPGAs to commercial-grade temperature and power supply voltage variations. In particular, this work evaluates the reproducibility of the bitstrings generated at 0C, 25C, and 70C, and 10% of the rated supply voltage. A pair of error avoidance schemes are proposed and presented that provide significant improvements to the HELP PUF\u27s resiliency against bit-flip errors in the bitstrings
    corecore