61 research outputs found

    A Group-Based Ring Oscillator Physical Unclonable Function

    Get PDF
    Silicon Physical Unclonable Function (PUF) is a physical structure of the chip which has functional characteristics that are hard to predict before fabrication but are expected to be unique after fabrication. This is caused by the random fabrication variations. The secret characteristics can only be extracted through physical measurement and will vanish immediately when the chip is powered down. PUF promises a securer means for cryptographic key generation and storage among many other security applications. However, there are still many practical challenges to cost effectively build secure and reliable PUF secrecy. This dissertation proposes new architectures for ring oscillator (RO) PUFs to answer these challenges. First, our temperature-aware cooperative (TAC) RO PUF can utilize certain ROs that were otherwise discarded due to their instability. Second, our novel group-based algorithm can generate secrecy higher than the theoretical upper bound of the conventional pairwise comparisons approach. Third, we build the first regression-based entropy distiller that can turn the PUF secrecy statistically random and robust, meeting the NIST standards. Fourth, we develop a unique Kendall syndrome coding (KSC) that makes the PUF secrecy error resilient against potential environmental fluctuations. Each of these methods can improve the hardware efficiency of the RO PUF implementation by 1.5X to 8X while improving the security and reliability of the PUF secrecy

    The Monte Carlo PUF

    Get PDF
    Physically unclonable functions are used for IP protection, hardware authentication and supply chain security. While many PUF constructions have been put forward in the past decade, only few of them are applicable to FPGA platforms. Strict constraints on the placement and routing are the main disadvantages of the existing PUFs on FPGAs, because they place a high effort on the designer. In this paper we propose a new delay-based PUF construction called Monte Carlo PUF, that does not require low-level placement and routing control. This construction relies on the on-chip Monte Carlo method that is applied for measuring the delays of logic elements in order to extract a unique device fingerprint. The proposed construction allows a trade-off between the evaluation time and the error rate. The Monte Carlo PUF is implemented and evaluated on Xilinx Spartan-6 FPGAs

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations, and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasingly sensitivity to design-context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood\u27 interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; and its architecture measures the path delays more accurately. Design for manufacture-ability (DFM) analysis is done on the on 90 nm ASIC chips and 28nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM\u27s 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Also experimental data has been analyzed for path delay variations in short vs long paths. FPGA results on within-die variation and die-to-die variations on Advanced Encryption System (AES) using single pipelined stage are also presented. Other analysis that have been performed on the calibrated path delays are Flip Flop propagation delays for both rising and falling edge (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths where magnitudes reduced to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increase as temperature and voltage are changed to increase performance. The high level of within-die delay variations are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of \u27secrets\u27 such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die-variations as a means of generating random bit strings for these types of applications, including hardware security and trust. Zynq FPGAs Die-to-Die and within-die variation study shows that on average there is 5% of within-Die variation and the range of die-to-Die variation can go upto 3ns. The die-to-Die variations can be explored in much further detail to study the variations spatial dependance. Additionally, I also carried out research in the area data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed-up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis parallel binary decision tree classification for meeting up with the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data-sets have created abundant opportunities for algorithmic and architectural developments, and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have since been implemented in software programs. Though, the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands in the ever growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware acceleration of pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine is processing the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput, by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.\u2

    Design and Evaluation of FPGA-based Hybrid Physically Unclonable Functions

    Get PDF
    A Physically Unclonable Function (PUF) is a new and promising approach to provide security for physical systems and to address the problems associated with traditional approaches. One of the most important performance metrics of a PUF is the randomness of its generated response, which is presented via uniqueness, uniformity, and bit-aliasing. In this study, we implement three known PUF schemes on an FPGA platform, namely SR Latch PUF, Basic RO PUF, and Anderson PUF. We then perform a thorough statistical analysis on their performance. In addition, we propose the idea of the Hybrid PUF structure in which two (or more) sources of randomness are combined in a way to improve randomness. We investigate two methods in combining the sources of randomness and we show that the second one improves the randomness of the response, significantly. For example, in the case of combining the Basic RO PUF and the Anderson PUF, the Hybrid PUF uniqueness is increased nearly 8%, without any pre-processing or post-processing tasks required. Two main categories of applications for PUFs have been introduced and analyzed: authentication and secret key generation. In this study, we introduce another important application for PUFs. In fact, we develop a secret sharing scheme using a PUF to increase the information rate and provide cheater detection capability for the system. We show that, using the proposed method, the information rate of the secret sharing scheme will improve significantly

    A hardware-embedded, delay-based PUF engine designed for use in cryptographic and authentication applications

    Get PDF
    Cryptographic and authentication applications in application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs), as well as codes for the activation of on-chip features, require the use of embedded secret information. The generation of secret bitstrings using physical unclonable functions, or PUFs, provides several distinct advantages over conventional methods, including the elimination of costly non-volatile memory, and the potential to increase the random bits available to applications. In this dissertation, a Hardware-Embedded Delay PUF (HELP) is proposed that is designed to leverage path delay variations that occur in the core logic macros of a chip to create random bitstrings. A thorough discussion is provided of the operational details of an embedded path timing structure called REBEL that is used by HELP to provide the timing functionality upon which HELP relies for the entropy source for the cryptographic quality of the bitstrings. Further details of the FPGA-based implementation used to prove the viability of the HELP PUF concept are included, along with a discussion of the evolution of the techniques employed in realizing the final PUF engine design. The bitstrings produced by a set of 30 FPGA boards are evaluated with regard to several statistical quality metrics including uniqueness, randomness, and stability. The stability characteristics of the bitstrings are evaluated by subjecting the FPGAs to commercial-grade temperature and power supply voltage variations. In particular, this work evaluates the reproducibility of the bitstrings generated at 0C, 25C, and 70C, and 10% of the rated supply voltage. A pair of error avoidance schemes are proposed and presented that provide significant improvements to the HELP PUF\u27s resiliency against bit-flip errors in the bitstrings

    Techniques for Improving Security and Trustworthiness of Integrated Circuits

    Get PDF
    The integrated circuit (IC) development process is becoming increasingly vulnerable to malicious activities because untrusted parties could be involved in this IC development flow. There are four typical problems that impact the security and trustworthiness of ICs used in military, financial, transportation, or other critical systems: (i) Malicious inclusions and alterations, known as hardware Trojans, can be inserted into a design by modifying the design during GDSII development and fabrication. Hardware Trojans in ICs may cause malfunctions, lower the reliability of ICs, leak confidential information to adversaries or even destroy the system under specifically designed conditions. (ii) The number of circuit-related counterfeiting incidents reported by component manufacturers has increased significantly over the past few years with recycled ICs contributing the largest percentage of the total reported counterfeiting incidents. Since these recycled ICs have been used in the field before, the performance and reliability of such ICs has been degraded by aging effects and harsh recycling process. (iii) Reverse engineering (RE) is process of extracting a circuit’s gate-level netlist, and/or inferring its functionality. The RE causes threats to the design because attackers can steal and pirate a design (IP piracy), identify the device technology, or facilitate other hardware attacks. (iv) Traditional tools for uniquely identifying devices are vulnerable to non-invasive or invasive physical attacks. Securing the ID/key is of utmost importance since leakage of even a single device ID/key could be exploited by an adversary to hack other devices or produce pirated devices. In this work, we have developed a series of design and test methodologies to deal with these four challenging issues and thus enhance the security, trustworthiness and reliability of ICs. The techniques proposed in this thesis include: a path delay fingerprinting technique for detection of hardware Trojans, recycled ICs, and other types counterfeit ICs including remarked, overproduced, and cloned ICs with their unique identifiers; a Built-In Self-Authentication (BISA) technique to prevent hardware Trojan insertions by untrusted fabrication facilities; an efficient and secure split manufacturing via Obfuscated Built-In Self-Authentication (OBISA) technique to prevent reverse engineering by untrusted fabrication facilities; and a novel bit selection approach for obtaining the most reliable bits for SRAM-based physical unclonable function (PUF) across environmental conditions and silicon aging effects

    DRAM Bender: An Extensible and Versatile FPGA-based Infrastructure to Easily Test State-of-the-art DRAM Chips

    Full text link
    To understand and improve DRAM performance, reliability, security and energy efficiency, prior works study characteristics of commodity DRAM chips. Unfortunately, state-of-the-art open source infrastructures capable of conducting such studies are obsolete, poorly supported, or difficult to use, or their inflexibility limit the types of studies they can conduct. We propose DRAM Bender, a new FPGA-based infrastructure that enables experimental studies on state-of-the-art DRAM chips. DRAM Bender offers three key features at the same time. First, DRAM Bender enables directly interfacing with a DRAM chip through its low-level interface. This allows users to issue DRAM commands in arbitrary order and with finer-grained time intervals compared to other open source infrastructures. Second, DRAM Bender exposes easy-to-use C++ and Python programming interfaces, allowing users to quickly and easily develop different types of DRAM experiments. Third, DRAM Bender is easily extensible. The modular design of DRAM Bender allows extending it to (i) support existing and emerging DRAM interfaces, and (ii) run on new commercial or custom FPGA boards with little effort. To demonstrate that DRAM Bender is a versatile infrastructure, we conduct three case studies, two of which lead to new observations about the DRAM RowHammer vulnerability. In particular, we show that data patterns supported by DRAM Bender uncovers a larger set of bit-flips on a victim row compared to the data patterns commonly used by prior work. We demonstrate the extensibility of DRAM Bender by implementing it on five different FPGAs with DDR4 and DDR3 support. DRAM Bender is freely and openly available at https://github.com/CMU-SAFARI/DRAM-Bender.Comment: To appear in TCAD 202
    corecore