6 research outputs found

    Improving Security and Reliability of Physical Unclonable Functions Using Machine Learning

    Physical Unclonable Functions (PUFs) are promising security primitives for device authentication and key generation. Because PUF responses are affected by noise, reliability is an important performance metric of PUF-based authentication. In the literature, considerable effort has been devoted to enhancing PUF reliability by using error correction methods such as error-correcting codes and fuzzy extractors. However, one property that most of these prior works overlooked is the non-uniform distribution of PUF response across different bits. This work proposes a two-step methodology to improve the reliability of PUFs under noisy conditions. The first step involves acquiring the parameters of PUF models by using machine learning algorithms. The second step then utilizes these obtained parameters to improve the reliability of PUFs by selectively choosing challenge-response pairs (CRPs) for authentication. Two distinct algorithms for improving the reliability of multiplexer (MUX) PUFs, i.e., total delay difference thresholding and sensitive bits grouping, are presented. It is important to note that the methodology can be easily applied to other types of PUFs as well. Our experimental results show that the reliability of PUF-based authentication can be significantly improved by the proposed approaches. For example, in one experimental setting, the reliability of a MUX PUF is improved from 89.75% to 94.07% using total delay difference thresholding, while 89.30% of generated challenges are stored. As opposed to total delay difference thresholding, sensitive bits grouping possesses higher efficiency, as it can produce reliable CRPs directly. Our experimental results show that the reliability can be improved to 96.91% under the same setting, when we group 12 bits in the challenge vector of a 128-stage MUX PUF. Besides, because the actual noise varies greatly in different conditions, it is hard to predict the error of each individual PUF response bit.
This work proposes a novel methodology to improve the efficiency of PUF response error correction based on error rates. The proposed method first obtains the PUF model by using machine learning techniques, which is then used to predict the error rates. Intuitively, we are inclined to tolerate errors in PUF response bits with relatively higher error rates. Thus, we propose to treat different PUF response bits with different degrees of error tolerance, according to their estimated error rates. Specifically, optimized weights, i.e., 0, 1, 2, 3, and infinity, are assigned to PUF response bits: a small portion of responses with high error rates is truncated, the other responses are duplicated to a limited number of bits according to their error rates before error correction, and a portion of responses with low error rates bypasses the error correction and is used directly as keys. The hardware cost for error correction can also be reduced by employing these methods. Response weighting is capable of reducing the false negative and false positive rates simultaneously. The entropy can also be controlled. Our experimental results show that the response weighting algorithm can reduce not only the false negative rate from 20.60% to 1.71%, but also the false positive rate from 1.26 × 10^-21 to 5.38 × 10^-22 for a PUF-based authentication with 127-bit response and 13-bit error correction. Besides, three case studies about the applications of the proposed algorithm are also discussed. Along with the rapid development of hardware security techniques, the revolutionary growth of countermeasures or attacking methods developed by intelligent and adaptive adversaries has significantly complicated the ability to create secure hardware systems. Thus, there is a critical need to (re)evaluate existing or new hardware security techniques against these state-of-the-art attacking methods. With this in mind, this work presents a novel framework for incorporating active learning techniques into the hardware security field.
We demonstrate that active learning, which samples the least confident and most informative challenge-response pair (CRP) for training in each iteration, can significantly improve the learning efficiency of PUF modeling attacks. For example, our experimental results show that in order to obtain a prediction error below 4%, 2790 CRPs are required in passive learning, while only 811 CRPs are required in active learning. The sampling strategies and detailed applications of PUF modeling attacks under various environmental conditions are also discussed. When the environment is very noisy, active learning may sample a large number of mislabeled CRPs and hence result in high prediction error. We present two methods to mitigate the contradiction between informative and noisy CRPs. Finally, it is critical to design secure PUFs that can withstand countermeasures or modeling attacks from intelligent and adaptive adversaries. Previously, researchers focused on hiding PUF information by pre- or post-processing of the PUF challenge/response. However, these methods are still subject to side-channel analysis based hybrid attacks. Methods for increasing the non-linearity of the PUF structure, such as feedforward PUF, cascade PUF and subthreshold current PUF, have also been proposed. However, these methods significantly degrade the reliability. Based on the previous work, this work proposes a novel concept, the noisy PUF, which achieves modeling attack resistance while maintaining a high degree of reliability for selected CRPs. A possible design of a noisy PUF along with the corresponding experimental results is also presented.
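The total delay difference thresholding idea in the abstract above can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: it assumes the commonly used additive linear delay model of a MUX (arbiter) PUF, draws the model weights at random instead of learning them from measured CRPs, and models noise as a Gaussian offset on the total delay difference. The names `features`, `total_delay_difference`, `reliability` and `demo`, and all parameter values, are hypothetical.

```python
import random


def features(challenge):
    """Parity feature vector of the additive linear delay model (assumed model)."""
    n = len(challenge)
    phi = [1.0] * (n + 1)
    # phi[i] is the product of (1 - 2*c[j]) for j >= i; phi[n] = 1 is the bias term.
    for i in range(n - 1, -1, -1):
        phi[i] = phi[i + 1] * (1 - 2 * challenge[i])
    return phi


def total_delay_difference(weights, challenge):
    """Weighted sum of the parity features; its sign is the noise-free response."""
    return sum(w * p for w, p in zip(weights, features(challenge)))


def reliability(weights, challenges, noise_sigma, rng):
    """Fraction of challenges whose noisy response equals the noise-free response."""
    if not challenges:
        return 1.0
    stable = 0
    for c in challenges:
        delta = total_delay_difference(weights, c)
        noisy = delta + rng.gauss(0.0, noise_sigma)  # assumed Gaussian noise
        stable += (delta > 0) == (noisy > 0)
    return stable / len(challenges)


def demo(seed=0, stages=64, num_challenges=2000, noise_sigma=0.5, threshold=1.0):
    rng = random.Random(seed)
    # Simulated model parameters; in the abstract these come from machine learning.
    weights = [rng.gauss(0.0, 1.0) for _ in range(stages + 1)]
    challenges = [[rng.randint(0, 1) for _ in range(stages)]
                  for _ in range(num_challenges)]
    # Keep only challenges whose predicted delay difference is far from zero.
    selected = [c for c in challenges
                if abs(total_delay_difference(weights, c)) > threshold]
    rel_all = reliability(weights, challenges, noise_sigma, rng)
    rel_sel = reliability(weights, selected, noise_sigma, rng)
    return rel_all, rel_sel, len(selected) / len(challenges)


if __name__ == "__main__":
    rel_all, rel_sel, kept = demo()
    print(f"all challenges: {rel_all:.4f}  selected: {rel_sel:.4f}  kept: {kept:.2%}")
```

Raising `threshold` trades a smaller pool of usable challenges for higher reliability, which mirrors the trade-off between stored challenges and reliability reported in the abstract.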

    Attack-Resistance and Reliability Analysis of Feed-Forward and Feed-Forward XOR PUFs

    University of Minnesota M.S.E.E. thesis. May 2019. Major: Electrical/Computer Engineering. Advisor: Keshab Parhi. 1 computer file (PDF); ix, 75 pages. Physical unclonable functions (PUFs) are lightweight hardware security primitives that are used to authenticate devices or generate cryptographic keys without using non-volatile memories. This is accomplished by harvesting the inherent randomness in manufacturing process variations (e.g. path delays) to generate random yet unique outputs. A multiplexer (MUX) based arbiter PUF comprises two parallel delay chains with MUXs as switching elements. An input to a PUF is called a challenge vector and comprises the select bits of all the MUX elements in the circuit. The output bits are referred to as responses. In other words, when queried with a challenge, the PUF generates a response based on the uncontrollable physical characteristics of the underlying PUF hardware. Thus, the overall path delays of these delay chains are random and unique functions of the challenge. The contributions in this thesis can be classified into four main ideas. First, a novel approach to estimate delay differences of each stage in MUX-based standard arbiter PUFs, feed-forward PUFs (FF PUFs) and modified feed-forward PUFs (MFF PUFs) is presented. Test data collected from PUFs fabricated using a 32 nm process are used to learn models that characterize the PUFs. The delay differences of individual stages of arbiter PUFs correspond to the model parameters. This was accomplished by employing the least mean squares (LMS) adaptive algorithm. The models trained to learn the parameters of two standard arbiter PUF chips were able to predict responses with 97.5% and 99.5% accuracy, respectively. Additionally, it was observed that perceptrons can be used to attain approximately 100% prediction accuracy. A comparison shows that the perceptron model parameters are scaled versions of the model derived by the LMS algorithm.
Since the delay differences are challenge independent, these parameters can be stored on the server, which enables the server to issue random challenges whose responses need not be stored. By extending this analysis to 96 standard arbiter PUFs, we confirm that the delay differences of each MUX stage of the PUFs follow a Gaussian probability distribution. Second, artificial neural network (ANN) models are trained to predict hard and soft responses of the three configurations: standard arbiter PUFs, FF PUFs and MFF PUFs. These models were trained using silicon data extracted from 32-stage arbiter PUF circuits fabricated using the IBM 32 nm HKMG process and achieve a response-prediction accuracy of 99.8% in the case of standard arbiter PUFs, approximately 97% in the case of FF PUFs and approximately 99% in the case of MFF PUFs. Also, a probability based thresholding scheme is used to define soft responses, and artificial neural networks were trained to predict these soft responses. If the response of a given challenge has at least 90% consistency on repeated evaluation, it is considered stable. It is shown that the soft-response models can be used to filter out unstable challenges from a randomly chosen independent test set. From the test measurements, it is observed that the probability of a stable challenge is typically in the range of 87% to 92%. However, if a challenge is chosen with the proposed soft-response model, then its probability of being stable is found to be 99% compared to the ground truth. Third, we provide the first systematic empirical analysis of the effect of FF PUF design choices on their reliability and attack resistance. FF PUFs consist of feed-forward loops that enable internally generated responses to be used as select bits, making them slightly more secure than standard arbiter PUFs. While FF PUFs have been analyzed earlier, no prior study has addressed the effect of loop positions on the security and reliability.
After evaluating the performance of hundreds of PUF structures in various design configurations, it is observed that the locations of the arbiters and their outputs can have a substantial impact on the security and reliability of FF PUFs. By appropriately choosing the input and output locations of the FF loops, the amount of data required to attack can be increased by 7 times and can be further increased by 15 times if two intermediate arbiters are used. It is observed that adding more loops makes PUFs more susceptible to noise; FF PUFs with 5 intermediate arbiters can have reliability values that are as low as 81%. It is further demonstrated that a soft-response thresholding strategy can significantly increase the reliability during authentication to more than 96%. XOR arbiter PUFs (XOR PUFs) were introduced as more secure alternatives to standard arbiter PUFs. XOR PUFs typically contain multiple standard arbiter PUFs as their components, and the outputs of the component PUFs are XOR-ed to generate the final response. Finally, we propose the design of feed-forward XOR PUFs (FFXOR PUFs) where each component PUF is an FF PUF instead of a standard arbiter PUF. Attack-resistance analysis of FFXOR PUFs was carried out by employing artificial neural networks with 2-3 hidden layers and compared with XOR PUFs. It is shown that FFXOR PUFs cannot be accurately modeled if the number of component PUFs is more than 5. However, the increase in the attack resistance comes at the cost of degraded reliability. We also show that the soft-response thresholding strategy can increase the reliability of FFXOR PUFs by about 30%.
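The first contribution in the abstract above, estimating per-stage delay differences of a standard arbiter PUF with the least mean squares (LMS) algorithm, can be sketched on simulated data. This is an illustrative sketch under stated assumptions, not the thesis's procedure: the CRPs come from a randomly drawn additive linear delay model instead of 32 nm silicon, targets are the hard responses encoded as -1/+1, and the names `lms_fit` and `simulate`, the step size and the sample counts are all hypothetical choices.

```python
import random


def features(challenge):
    """Parity features of the additive linear delay model; the last entry is a bias."""
    n = len(challenge)
    phi = [1.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        phi[i] = phi[i + 1] * (1 - 2 * challenge[i])
    return phi


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lms_fit(crps, num_weights, step_size=0.001):
    """One LMS pass over (features, target) pairs with targets in {-1, +1}."""
    w = [0.0] * num_weights
    for phi, target in crps:
        error = target - dot(w, phi)  # instantaneous prediction error
        for i in range(num_weights):
            w[i] += step_size * error * phi[i]  # stochastic-gradient update
    return w


def simulate(seed=1, stages=32, train=10000, test=2000):
    """Train on simulated CRPs and return the prediction accuracy on fresh CRPs."""
    rng = random.Random(seed)
    # Hypothetical "true" delay parameters, Gaussian as reported for real chips.
    true_w = [rng.gauss(0.0, 1.0) for _ in range(stages + 1)]

    def sample(count):
        data = []
        for _ in range(count):
            phi = features([rng.randint(0, 1) for _ in range(stages)])
            data.append((phi, 1.0 if dot(true_w, phi) > 0 else -1.0))
        return data

    w = lms_fit(sample(train), stages + 1)
    test_set = sample(test)
    hits = sum((dot(w, phi) > 0) == (t > 0) for phi, t in test_set)
    return hits / test


if __name__ == "__main__":
    print(f"prediction accuracy on simulated CRPs: {simulate():.3f}")
```

Because only the sign of the weighted sum matters for the hard response, the learned weights need only match the true delay parameters up to a positive scale factor, which is consistent with the abstract's remark that the perceptron parameters are scaled versions of the LMS model.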

    A Study on Modeling of MUX-based Physical Unclonable Functions

    University of Minnesota M.S.E.C.E. thesis. 2018. Major: Electrical/Computer Engineering. Advisor: Keshab Parhi. 1 computer file (PDF); 82 pages. Physical Unclonable Functions (PUFs) are simple circuits that are ideal for hardware security. Typically, they are used for identifying and authenticating integrated circuits (ICs). In this work, we are interested in a class of delay based PUFs which mainly consist of multiplexers. They are known as multiplexer-based PUFs or MUX PUFs, for short. We are interested in modelling their structure and then analyzing their performance. Our work can be divided into several key contributions. First, we discuss the different types of MUX PUFs that we deal with in this work. They are the simple or linear configuration, the feed-forward configuration and the modified feed-forward configuration. We then present a typical scheme used for the authentication of these PUFs. However, much of the work concentrates on a modified version of the authentication scheme, where instead of storing a look-up table (LUT) of challenge-response pairs (CRPs) in the server, we store a set of delay parameters corresponding to the physical attributes of the MUX PUF. These stored parameters are the delay differences of the MUX stages and the arbiter delay. We show that MUX PUFs can be modelled using an additive linear delay model. The additive model helps in the computation of an important parameter, known as the total delay difference. Based on the total delay difference, we can compute two different versions of the output or response: the hard response, which is either a '0' or a '1' bit, and the soft response, which can take continuous values between 0 and 1. We formulate models for obtaining both these responses. Various metrics used for the evaluation of PUF performance are discussed. The general lab setup used to collect the required PUF data is also discussed. Next, we discuss the various effects of aging on the performance of MUX PUFs.
We extend the linear delay model to include the variations in delay parameters due to aging. The model makes certain assumptions about how noise and aging affect the delay chain (consisting of the multiplexers) and the arbiter. We assume that for a fixed set of conditions, the noise can only cause a constant amount of degradation to the performance of an aging PUF. However, aging, which is caused by undesirable changes like negative bias temperature instability (NBTI), hot carrier injection (HCI) and time dependent dielectric breakdown (TDDB), results in a gradual degradation of performance. That is, the variations due to aging gradually increase with time, in contrast to those of noise. In our study, we compare the standalone effects of aging and noise on the PUF. We observe that for the same amount of variation, aging degrades the authentication performance much more than noise. Furthermore, experimental aging data collected from PUFs in our lab suggest that the percent variation in delay parameters can be modelled as a Gaussian distribution. However, there is a small difference in how the percent variations of the delay differences of MUX stages and of the arbiter delay are modelled. The former is a zero mean Gaussian, whereas the latter is a positive mean Gaussian with mean and variance both gradually increasing with aging. In addition, the variation in arbiter delay is assumed to be higher than that of the delay differences due to "asymmetric" aging in the case of the arbiter. This happens under an unequal aging scenario. Using a Monte-Carlo based simulation for aging, the authentication accuracy of the three configurations is studied. We also suggest approaches to improve the authentication accuracy that will increase the lifetime of a PUF. This can be done by either recalibrating the delay parameters or by tuning a threshold based on the total delay difference. Next, we discuss an entropy based approach that can be used to identify whether a MUX PUF is linear or non-linear.
The approach is focused on computing the conditional entropy of responses to a set of predefined challenges. The challenge set consists of randomly chosen challenges and their 1-bit neighbors. The entropy is computed across the responses of two 1-bit neighboring challenges. For non-linear MUX PUFs like feed-forward PUFs, the method determines the MUX stages which are controlled by internally generated challenge bits as opposed to external challenge bits. This is based on the observation that the conditional entropy for each of these stages is zero. Also, the number of zero conditional entropy values across the MUX stages provides an upper bound on the number of internal arbiters present in the PUF. With the proposed approach, we observe 100% sensitivity and 100% specificity for identifying non-linearity. Furthermore, we show that the proposed approach requires very few stable random challenges (about 50) to successfully determine whether a PUF is linear or not for real chips. Our next contribution involves a logistic regression based approach to predict the soft response for a challenge using the total delay difference as an input. This approach enables us to determine whether a challenge is stable or not. The approach learns a logistic function based on the total delay difference which has just 3 parameters. Therefore, this is a simple approach which gives comparable performance to a more complex approach based on artificial neural network (ANN) models. The model demonstrates good sensitivity and precision but poor specificity. Finally, we discuss a bit-flipping algorithm used to convert unstable challenges to stable challenges. It is based on the idea that a threshold on the total delay difference can guarantee stability of challenges. The thresholds can be obtained empirically from the probability distributions of the total delay difference.
A straightforward approach is to discard the current challenge and issue a new random challenge for authentication if the current challenge is unstable. In this paper, we propose a novel bit-flipping based approach in which we claim that by flipping a few bits of the original unstable challenge, we can convert it to a stable one with a minimal number of bit-flips. By using the algorithm, we are able to transform the most likely unstable challenges to stable ones, typically with 1 bit-flip for linear and modified feed-forward PUFs and 3 bit-flips for the feed-forward PUFs. These bit-flips correspond to the flips in the XOR-ed challenge. We also compare the computational complexities of the best, average and worst-case scenarios for the straightforward and proposed approaches. In terms of the number of addition operations, the proposed approach has slightly better average-case performance but much better worst-case performance than the straightforward approach.
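The bit-flipping idea in the abstract above can be sketched for the linear configuration. The abstract does not spell out the algorithm, so this is a hypothetical greedy variant rather than the thesis's method: it assumes the additive linear delay model, treats a challenge as stable when the magnitude of its total delay difference exceeds a threshold, and flips one entry of the XOR-ed (parity) challenge at a time, choosing the flip that moves the total delay difference furthest from zero. The names `stabilize` and `challenge_from_features` and the example weights are assumptions.

```python
def features(challenge):
    """XOR-ed (parity) challenge of the additive linear delay model, plus a bias."""
    n = len(challenge)
    phi = [1.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        phi[i] = phi[i + 1] * (1 - 2 * challenge[i])
    return phi


def challenge_from_features(phi):
    """Invert features(): 1 - 2*c[i] equals phi[i] * phi[i + 1]."""
    return [int((1 - phi[i] * phi[i + 1]) / 2) for i in range(len(phi) - 1)]


def stabilize(weights, challenge, threshold, max_flips=8):
    """Greedily flip parity bits until |total delay difference| exceeds threshold.

    Returns (new_challenge, number_of_flips, total_delay_difference).
    """
    phi = features(challenge)
    delta = sum(w * p for w, p in zip(weights, phi))
    flips = 0
    while abs(delta) <= threshold and flips < max_flips:
        # Flipping parity bit i changes delta by -2 * w[i] * phi[i].
        # The constant bias feature (last entry) is never flipped.
        best = max(range(len(phi) - 1),
                   key=lambda i: abs(delta - 2 * weights[i] * phi[i]))
        delta -= 2 * weights[best] * phi[best]
        phi[best] = -phi[best]
        flips += 1
    return challenge_from_features(phi), flips, delta


if __name__ == "__main__":
    # Hypothetical delay parameters for a 3-stage PUF (last weight is the bias).
    weights = [0.5, -1.0, 2.0, 0.25]
    print(stabilize(weights, [0, 0, 0], threshold=2.0))
```

In this sketch a single flip in the XOR-ed domain maps back to flips of adjacent bits in the original challenge, and each candidate flip costs one addition to evaluate because the other terms of the total delay difference are unchanged.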

    D2.1 - Report on Selected TRNG and PUF Principles

    This report represents the final version of Deliverable 2.1 of the HECTOR work package WP2. It is a result of discussions and work on Task 2.1 of all HECTOR partners involved in WP2. The aim of Deliverable 2.1 is to select principles of random number generators (RNGs) and physical unclonable functions (PUFs) that fulfill strict technology, design and security criteria. For example, the selected RNGs must be suitable for implementation in logic devices according to the German AIS20/31 standard. Correspondingly, the selected PUFs must be suitable for applying a similar security approach. A standard PUF evaluation approach does not exist yet, but it should be proposed in the framework of the project. Selected RNGs and PUFs should then be thoroughly evaluated from the point of view of security, and the most suitable principles should be implemented in logic devices, such as Field Programmable Gate Arrays (FPGAs) and Application Specific Integrated Circuits (ASICs), during the next phases of the project.

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasing sensitivity to design context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by 'neighborhood' interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; its architecture measures the path delays more accurately. Design for manufacturability (DFM) analysis is done on 90 nm ASIC chips and 28 nm Zynq 7000 series FPGA boards.
I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM's 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Experimental data has also been analyzed for path delay variations in short versus long paths. FPGA results on within-die and die-to-die variations on the Advanced Encryption Standard (AES) using a single pipelined stage are also presented. Other analyses that have been performed on the calibrated path delays are flip-flop propagation delays for both rising and falling edges (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths, where the magnitudes reduce to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increases as temperature and voltage are changed to increase performance.
High levels of within-die delay variation are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of 'secrets' such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die variations as a means of generating random bit strings for these types of applications, including hardware security and trust. The study of die-to-die and within-die variation on Zynq FPGAs shows that on average there is 5% within-die variation and that the range of die-to-die variation can go up to 3 ns. The die-to-die variations can be explored in much further detail to study the variations' spatial dependence. Additionally, I also carried out research in the area of data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis-parallel binary decision tree classification to meet the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data sets has created abundant opportunities for algorithmic and architectural developments and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have long been implemented in software programs. Though the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands of the ever-growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail.
Therefore, I propose a hardware acceleration of a pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine processes the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput by reducing the number of clock cycles required to process the data and generate the results, and that it requires minimal resources and hence is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.

    Energy-Efficient Neural Network Hardware Design and Circuit Techniques to Enhance Hardware Security

    University of Minnesota Ph.D. dissertation. May 2019. Major: Electrical Engineering. Advisor: Chris Kim. 1 computer file (PDF); ix, 108 pages. Artificial intelligence (AI) algorithms and hardware are being developed at a rapid pace for emerging applications such as self-driving cars, speech/image/video recognition, deep learning, etc. Today's AI tasks are mostly performed at remote datacenters, while in the future, more AI workloads are expected to run on edge devices. To fulfill this goal, innovative design techniques are needed to improve the energy efficiency, form factor, and security of AI chips. This dissertation focuses on two topics to address these challenges: building energy-efficient AI chips based on various neural network architectures, and designing "chip fingerprint" circuits as well as counterfeit chip sensors to improve hardware security. First, in order to deploy AI tasks on edge devices, we develop various energy- and area-efficient computing platforms. One is a novel time-domain computing scheme for a fully connected multi-layer perceptron (MLP) neural network and the other is an efficient binarized architecture for a long short-term memory (LSTM) neural network. Second, to enhance hardware security and ensure secure data communication between edge devices, we need to ensure the authenticity of the chip. A Physical Unclonable Function (PUF) is a circuit primitive that can serve as a chip "fingerprint" by generating a unique ID for each chip. Another source of security concerns comes from counterfeit ICs; recycled and remarked ICs account for more than 80% of counterfeit electronics. To effectively detect those counterfeit chips that have been physically compromised, we developed a passive IC tamper sensor.
This proposed sensor is demonstrated to be able to efficiently and reliably detect suspicious activities such as high temperature cycling, ambient humidity rise, and increased dust particles in the chip cavity.