1,005 research outputs found

    A hardware-embedded, delay-based PUF engine designed for use in cryptographic and authentication applications

    Get PDF
    Cryptographic and authentication applications in application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs), as well as codes for the activation of on-chip features, require the use of embedded secret information. The generation of secret bitstrings using physical unclonable functions (PUFs) provides several distinct advantages over conventional methods, including the elimination of costly non-volatile memory and the potential to increase the number of random bits available to applications. In this dissertation, a Hardware-Embedded Delay PUF (HELP) is proposed that leverages path delay variations in the core logic macros of a chip to create random bitstrings. A thorough discussion is provided of the operational details of an embedded path timing structure called REBEL, which provides the timing functionality on which HELP relies as the entropy source for the cryptographic quality of the bitstrings. Further details of the FPGA-based implementation used to prove the viability of the HELP PUF concept are included, along with a discussion of the evolution of the techniques employed in realizing the final PUF engine design. The bitstrings produced by a set of 30 FPGA boards are evaluated with regard to several statistical quality metrics, including uniqueness, randomness, and stability. The stability characteristics of the bitstrings are evaluated by subjecting the FPGAs to commercial-grade temperature and power supply voltage variations. In particular, this work evaluates the reproducibility of the bitstrings generated at 0°C, 25°C, and 70°C, and at ±10% of the rated supply voltage. A pair of error avoidance schemes is proposed that provides significant improvements to the HELP PUF's resilience against bit-flip errors in the bitstrings.
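    The uniqueness and stability metrics mentioned above are conventionally computed as inter-chip and intra-chip fractional Hamming distances. A minimal sketch of that evaluation follows; the bitstring data are simulated and the 2% bit-flip rate is an assumption for illustration, not a result from the dissertation.

        import itertools
        import numpy as np

        def fractional_hd(a, b):
            # Fractional Hamming distance between two equal-length bitstrings.
            return np.count_nonzero(a != b) / a.size

        # Hypothetical data: one bitstring per board (rows), regenerated at a
        # second temperature corner for the stability analysis.
        rng = np.random.default_rng(0)
        enrolled = rng.integers(0, 2, size=(30, 2048))    # 30 boards at 25 °C
        regenerated = enrolled.copy()                     # same boards at 70 °C
        flips = rng.random(regenerated.shape) < 0.02      # simulated bit-flip errors
        regenerated[flips] ^= 1

        # Uniqueness: mean pairwise inter-chip HD; the ideal value is 0.5.
        uniqueness = np.mean([fractional_hd(enrolled[i], enrolled[j])
                              for i, j in itertools.combinations(range(30), 2)])

        # Stability: mean intra-chip HD across corners; the ideal value is 0.0.
        stability = np.mean([fractional_hd(enrolled[i], regenerated[i])
                             for i in range(30)])

        print(f"uniqueness = {uniqueness:.3f}, intra-chip HD = {stability:.4f}")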

    Millimeter-Precision Laser Rangefinder Using a Low-Cost Photon Counter

    Get PDF
    In this book we successfully demonstrate a millimeter-precision laser rangefinder using a low-cost photon counter. An application-specific integrated circuit (ASIC) comprises the timing circuitry and single-photon avalanche diodes (SPADs) as the photodetectors. For the timing circuitry, a novel binning architecture for sampling the received signal is proposed, which mitigates non-idealities that are inherent to a system with SPADs and timing circuitry on one chip.
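    The ranging principle behind such a photon-counting rangefinder can be summarized in a few lines: photon arrival times are accumulated into histogram bins, the peak bin gives the round-trip time of flight, and half the round-trip path is the range. A minimal sketch under assumed numbers (the 10 ps bin width and the toy histogram are illustrative, not taken from the book):

        import numpy as np

        C = 299_792_458.0          # speed of light, m/s
        BIN_WIDTH = 10e-12         # assumed TDC bin width: 10 ps -> 1.5 mm per bin

        def range_from_histogram(hist):
            # hist[i] counts photon detections whose arrival fell into bin i,
            # measured relative to the laser pulse emission.
            tof = np.argmax(hist) * BIN_WIDTH    # round-trip time of flight
            return C * tof / 2.0                 # divide by 2: out and back

        # Hypothetical measurement: Poisson background plus a signal peak at
        # bin 4000 (4000 * 10 ps -> 40 ns round trip -> about 6 m range).
        rng = np.random.default_rng(1)
        hist = rng.poisson(2.0, size=8192)
        hist[4000] += 500
        print(f"estimated range: {range_from_histogram(hist):.4f} m")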

    On-Chip Structure for Timing Uncertainty Measurement Induced by Noise in Integrated Circuits

    Get PDF
    Noise effects such as voltage drop and temperature fluctuation in integrated circuits can cause significant performance variation and even functional failure at smaller technology nodes. In this paper, we propose an on-chip structure that measures the timing uncertainty induced by noise during functional and test operations. The proposed on-chip structure facilitates speed characterization under various workloads and test conditions. The basic structure is highly scalable and can be tailored to various applications such as silicon validation, monitoring of operating conditions, and validation of logic built-in self-test conditions. Simulation results show that it offers very high measurement resolution in a highly efficient manner.
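    One way to picture what such a structure measures: timing the same path repeatedly under noise yields a distribution of delays whose spread is the timing uncertainty. A toy model of that measurement (the delay, jitter, and quantization numbers are assumptions for illustration):

        import numpy as np

        rng = np.random.default_rng(2)

        NOMINAL_DELAY = 1.200e-9   # assumed nominal path delay: 1.2 ns
        NOISE_SIGMA = 15e-12       # assumed noise-induced jitter: 15 ps rms
        RESOLUTION = 5e-12         # assumed measurement quantization: 5 ps

        # Each "measurement" is the true delay plus supply/temperature noise,
        # quantized to the resolution of the on-chip timing structure.
        samples = NOMINAL_DELAY + rng.normal(0.0, NOISE_SIGMA, size=10_000)
        measured = np.round(samples / RESOLUTION) * RESOLUTION

        print(f"mean delay     : {measured.mean() * 1e9:.4f} ns")
        print(f"rms uncertainty: {measured.std() * 1e12:.1f} ps")
        print(f"peak-to-peak   : {(measured.max() - measured.min()) * 1e12:.1f} ps")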

    Simulation and real time processing techniques for space instrumentation

    Get PDF
    Designing and developing space instruments involves a wide variety of evaluation and simulation techniques in order to ensure correct operation under all conditions likely to be encountered in space and to allow parallel development of different subsystems of an instrument. This thesis describes three such evaluation and simulation techniques, including real-time processing techniques, devised for two major European space missions, the Solar and Heliospheric Observatory (SOHO) and the X-ray Multi-Mirror (XMM) mission. The design of a Science Data Display Adapter is described, which was developed to provide comprehensive performance evaluation of the detectors of the Grazing Incidence Spectrometer (GIS), part of the Coronal Diagnostic Spectrometer on board SOHO, in the absence of the Command and Data Handling System and the Experiment Ground Support Equipment. The requirements to handle high data rates and to provide significant display flexibility are discussed. This thesis also describes a user-controlled detector simulator developed to carry out full-range tests of the GIS processing electronics in the absence of real detectors, including extreme conditions not easily achievable by other means. With its large degree of flexibility, the simulator provides realistic shapes and a wide range of characteristics for the output events of the Spiral Anode (SPAN). Although the simulator was designed specifically to simulate the SPAN, the design is applicable to any three-channel detector system and has since been used for the FONEMA instrument on the Russian Mars96 mission. Finally, two alternative algorithms are described, which reduce the telemetry requirements of a charge-coupled-device-based, space-borne X-ray spectrometer by on-board reconstruction of X-ray events split over two or more adjacent pixels. The algorithms were developed for the Reflection Grating Spectrometer (RGS) on XMM, and were also used to study the feasibility of a single-processor Data Pre-Processor subsystem as part of the RGS Digital Electronics. Such a design has now been adopted for flight.
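    The split-event reconstruction mentioned at the end works, in essence, by recombining charge that a single X-ray photon spreads over neighbouring CCD pixels, so that only one event record needs to be telemetered. A minimal sketch of that idea (the thresholds, 4-connected neighbourhood, and toy frame are illustrative assumptions, not the RGS flight algorithm):

        import numpy as np

        EVENT_THRESHOLD = 100   # ADU; assumed threshold for a pixel to seed an event
        SPLIT_THRESHOLD = 15    # ADU; assumed threshold for neighbouring charge

        def reconstruct_events(frame):
            # Returns (row, col, total_charge) per event, where the total sums
            # the seed pixel and any 4-connected neighbours above the split
            # threshold, recombining charge spread over adjacent pixels.
            events = []
            rows, cols = frame.shape
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    centre = frame[r, c]
                    neighbours = [frame[r-1, c], frame[r+1, c],
                                  frame[r, c-1], frame[r, c+1]]
                    # Seed pixel: above event threshold and a local maximum.
                    if centre >= EVENT_THRESHOLD and centre >= max(neighbours):
                        total = centre + sum(n for n in neighbours
                                             if n >= SPLIT_THRESHOLD)
                        events.append((r, c, int(total)))
            return events

        frame = np.zeros((8, 8), dtype=int)
        frame[3, 3], frame[3, 4] = 180, 40   # one event split across two pixels
        print(reconstruct_events(frame))     # -> [(3, 3, 220)]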

    Robust Design with Increasing Device Variability in Sub-Micron CMOS and Beyond: A Bottom-Up Framework

    Full text link
    My Ph.D. research develops a tiered, systematic framework for designing process-independent and variability-tolerant integrated circuits. This bottom-up approach starts from designing self-compensated circuits as accurate building blocks, and moves up to sub-systems with negative feedback loops and full system-level calibration.
    a. Design methodology for self-compensated circuits. My collaborators and I proposed a novel design methodology that offers designers intuitive insights to create new topologies that are self-compensated and intrinsically process-independent without an external reference. It is the first systematic approach to creating "correct-by-design" low-variation circuits, and it can scale beyond sub-micron CMOS nodes and extend to emerging non-silicon nano-devices. We demonstrated this methodology with an addition-based current source in both 180 nm and 90 nm CMOS that has 2.5x improved process variation and 6.7x improved temperature sensitivity, and a GHz ring oscillator (RO) in 90 nm CMOS with a 65% reduction in frequency variation and 85 ppm/°C temperature sensitivity. Compared to previous designs, our RO exhibits the lowest temperature sensitivity and process variation while consuming the least power in the GHz range. Another self-compensated low-noise amplifier (LNA) we designed exhibits a 3.5x improvement in both process and temperature variation and enhanced supply voltage regulation. As part of the effort to improve the accuracy of the building blocks, I also demonstrated experimentally that, due to a "diversification effect", the upper bound of circuit accuracy can be better than the minimum tolerance of on-chip devices (MOSFET, R, C, and L), which allows circuit designers to achieve better accuracy with less chip area and power consumption.
    b. Negative-feedback-loop-based sub-system. I explored the feasibility of using high-accuracy DC blocks as low-variation "rulers-on-chip" to regulate high-speed, high-variation blocks (e.g., GHz oscillators). In this way, the trade-off between speed (which can be translated to power) and variation can be effectively de-coupled. I demonstrated this proposed structure in an integrated GHz ring oscillator that achieves 2.6% frequency accuracy and 5x improved temperature sensitivity in 90 nm CMOS.
    c. Power-efficient system-level calibration. To enable full system-level calibration and further reduce the power consumed in active feedback loops, I implemented a successive-approximation-based calibration scheme in a tunable GHz VCO for low-power impulse radio in 65 nm CMOS. Events such as power-up and temperature drift are monitored by the circuits and used to trigger need-based frequency calibration. With my proposed scheme and circuitry, the calibration can be performed under 135 pJ, and the oscillator can operate between 0.8 and 2 GHz at merely 40 ”W, which is ideal for extremely power- and cost-constrained applications such as implantable biomedical devices and wireless sensor networks.
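    The successive-approximation calibration in part (c) amounts to a binary search over the oscillator's digital tuning word, resolved one bit at a time from the MSB down. A sketch of that control loop (the code width, target frequency, and the measure_frequency stub are hypothetical stand-ins, not the thesis circuitry):

        CODE_BITS = 8               # assumed width of the VCO tuning word

        def measure_frequency(code):
            # Stand-in for the on-chip frequency measurement; here a toy model
            # in which a larger code monotonically lowers the frequency (Hz).
            return 2.0e9 - code * 4.7e6

        def sar_calibrate(target_hz):
            # Successive approximation: trial-set each bit, MSB first, and keep
            # it only if the oscillator still runs at or above the target.
            code = 0
            for bit in reversed(range(CODE_BITS)):
                trial = code | (1 << bit)
                if measure_frequency(trial) >= target_hz:
                    code = trial
            return code

        code = sar_calibrate(1.0e9)
        print(f"code={code}, f={measure_frequency(code)/1e9:.3f} GHz")

    After CODE_BITS trials the code is within one LSB of the target, which is what makes the scheme cheap enough to re-run on power-up or temperature-drift events.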

    Design-for-delay-testability techniques for high-speed digital circuits

    Get PDF
    The importance of delay faults is growing with the ever-increasing clock rates and decreasing geometry sizes of today's circuits. This thesis focuses on the development of Design-for-Delay-Testability (DfDT) techniques for high-speed circuits and embedded cores. The rising costs of IC testing, and in particular the costs of Automatic Test Equipment, are major concerns for the semiconductor industry. To reverse the trend of rising testing costs, DfDT is becoming more and more important.

    A flexible, heterogeneous image-processing framework for space-based, reconfigurable data-processing modules

    Get PDF
    Scientific instruments carried as payload on current space missions are often equipped with high-resolution sensors; camera-based instruments in particular produce a vast amount of data. To obtain the desired scientific information, this data is usually processed on ground. Due to the large distances of missions within the solar system, the data rate for the downlink to the ground station is strictly limited, while the volume of scientifically relevant data is usually small compared to the acquired raw data. Therefore, processing already has to be carried out on board the spacecraft. An example of such an instrument is the Polarimetric and Helioseismic Imager (PHI) on board Solar Orbiter. For the acquisition, storage, and processing of images, the instrument is equipped with a Data Processing Module (DPM). It makes use of heterogeneous computing based on a dedicated LEON3 processor in combination with two reconfigurable Xilinx Virtex-4 Field-Programmable Gate Arrays (FPGAs). This thesis provides an overview of the available space-grade processing components (processors and FPGAs) that fulfill the requirements of deep-space missions. It also presents existing processing platforms based on a heterogeneous system combining processors and FPGAs, including the DPM of the PHI instrument, whose architecture is introduced in detail. As the core contribution of this thesis, a framework is presented that enables high-performance image processing on such hardware-based systems while retaining software-like flexibility. This framework consists mainly of a variety of hardware-acceleration modules that are integrated seamlessly into the data flow of the on-board software. In addition, it makes extensive use of the dynamic in-flight reconfigurability of the Virtex-4 FPGAs. The flexibility of the presented framework is demonstrated by means of multiple examples from the image processing of the PHI instrument, and the framework is analyzed with respect to processing performance as well as power consumption.
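    The dispatch pattern such a heterogeneous framework implies can be sketched in a few lines: a task runs on a reconfigurable FPGA region when an accelerator for it exists (reconfiguring the region on demand), with a software path as fallback. All names and the toy kernels below are hypothetical illustrations, not the PHI framework's API:

        from typing import Callable, Optional

        import numpy as np

        class AcceleratorSlot:
            # Models one dynamically reconfigurable FPGA region.
            def __init__(self):
                self.loaded: Optional[str] = None

            def reconfigure(self, bitstream: str):
                # In the real system this would trigger partial reconfiguration.
                self.loaded = bitstream

        def dispatch(task, image, slot, hw_kernels, sw_kernels):
            if task in hw_kernels:
                if slot.loaded != task:
                    slot.reconfigure(task)      # swap in the accelerator on demand
                return hw_kernels[task](image)
            return sw_kernels[task](image)      # software path keeps flexibility

        # Toy kernels standing in for accelerated and software implementations.
        hw = {"bin2x2": lambda im: im.reshape(im.shape[0]//2, 2, -1, 2).sum((1, 3))}
        sw = {"invert": lambda im: im.max() - im}

        slot = AcceleratorSlot()
        img = np.arange(16, dtype=np.int64).reshape(4, 4)
        print(dispatch("bin2x2", img, slot, hw, sw))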

    Characterisation, optimisation and performance studies of pixel vertex detector modules for the Belle II experiment

    Get PDF
    The Standard Model of particle physics is very successful in describing the fundamental particles and their interactions. Still, some questions regarding the details and input parameters of the theory remain unanswered, and some phenomena, such as dark matter and quantum gravity, are as yet not or only unsatisfactorily described. Today's high-energy physics experiments probe this new physics in particle collisions. The Belle II experiment at the SuperKEKB e+e− collider in Japan explores the precision frontier, measuring the properties of particle interactions in great detail in order to push the limits of the existing theory and to determine Standard Model parameters more precisely. The precise and abundant study of decays of B mesons is a particularly good window through which to seek answers to the open questions of the electroweak interaction. Precision measurements in Belle II are made possible in particular by a silicon pixel detector located very close to the interaction region of the electrons and positrons. The pixel detector is based on the depleted field-effect transistor (DEPFET) technology, which is employed here for the first time in a high-energy physics experiment. Modules based on this technology were produced for the Belle II pixel detector, and a characterisation and optimisation procedure for these modules was developed in the scope of this thesis. A total of 17 modules were put through this measurement programme, and the qualification of each module for installation in the final detector was determined. In addition, the performance of the detector modules was evaluated in beam tests. The optimisation procedure yields consistent characteristics among the tested modules: signal-to-noise ratios of 20 to 40 are achieved at an in-pixel amplification factor of about 500 pA/electron in the DEPFET cell. The intrinsic spatial resolution is measured to be on the order of 10 ”m, depending on pixel pitch and incidence angle, with a hit efficiency of 99.6 %. The module performance is in good agreement with the design goals, and the requirements for the Belle II pixel detector are met.
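    In beam tests, an intrinsic resolution like the one quoted above is typically extracted by unfolding the beam telescope's track uncertainty from the measured residual width, sigma_intr = sqrt(sigma_meas^2 - sigma_track^2). A toy version of that unfolding (all numbers are assumed for illustration, not Belle II results):

        import numpy as np

        rng = np.random.default_rng(3)
        sigma_track = 3.0    # um; assumed telescope pointing resolution at the DUT
        sigma_intr = 10.0    # um; the "true" intrinsic resolution to recover

        # Residual = (measured hit position) - (track extrapolation);
        # both terms contribute independent Gaussian noise.
        residuals = (rng.normal(0, sigma_intr, 100_000)
                     + rng.normal(0, sigma_track, 100_000))

        sigma_meas = residuals.std()
        recovered = np.sqrt(sigma_meas**2 - sigma_track**2)
        print(f"measured width {sigma_meas:.2f} um -> intrinsic {recovered:.2f} um")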

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. The escalation in delay and power variation with reductions in feature size places higher demands on the accuracy of variation models, whose availability can be used to improve yield and, correspondingly, the profitability and product quality of fabricated integrated circuits (ICs). Sources of within-die variation include optical source limitations and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PV are becoming more difficult to derive because of their increasing sensitivity to design context. Embedded test structures (ETS) continue to play an important role in the development of PV models and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires an embedded test structure, as opposed to traditional scribe-line test structures such as ring oscillators (ROs). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlation).

    In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion and with improved accuracy. Design-for-manufacturability (DFM) analysis is carried out on 90 nm ASIC chips and on 28 nm Zynq-7000 series FPGA boards. I present ASIC results on within-die path delay variations in a five-stage pipelined floating-point unit (FPU), fabricated in IBM's 90 nm technology, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Experimental data are also analyzed for path delay variations in short versus long paths. FPGA results on within-die and die-to-die variations in a single-pipeline-stage Advanced Encryption Standard (AES) implementation are also presented. Other analyses performed on the calibrated path delays include flip-flop propagation delays for both rising and falling edges (tpHL and tpLH), uncertainty analysis, path distribution analysis, short- versus long-path variations, and within-die variation of mid-length paths. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations.

    From the experimental results, it is established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer: it is able to capture the temporal behavior of a signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them; therefore, the calibration proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20%, at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths, where the magnitudes reduce to 20% and 5%, respectively. The magnitude of delay variation in both of these analyses increases as temperature and voltage are changed to increase performance. High levels of within-die delay variation are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of 'secrets', such as authentication, hardware metering, and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die variations as a means of generating random bit strings for these types of applications, including hardware security and trust. The Zynq FPGA study of die-to-die and within-die variation shows that, on average, there is 5% within-die variation, and that die-to-die variation can range up to 3 ns; the die-to-die variations could be explored in further detail to study their spatial dependence.

    Additionally, I carried out research in the area of data mining for big data, focusing on decision tree classification (DTC) and on speeding up the classification step in a hardware implementation. For this purpose, I devised a pipelined architecture for axis-parallel binary decision tree classification that meets requirements on execution time and minimal resource usage in terms of area. The motivation for this work is that the analysis of ever larger data sets has created abundant opportunities for algorithmic, architectural, and data-mining innovations, and thus a great demand for faster execution of these algorithms. Decision trees (DTs) have long been implemented in software. Although software implementations of DTC are highly accurate, their execution times and resource utilization still require improvement to meet the computational demands of the ever-growing industry. Hardware implementation of DTs, on the other hand, has not been thoroughly investigated or reported in detail. I therefore propose a hardware-accelerated pipelined architecture that acquires data in parallel, with parallel engines working independently on different partitions of the data. Each engine processes its data in a pipelined fashion to utilize resources more efficiently and to reduce the time needed to process all data records/tuples. Experimental results show that the proposed hardware acceleration increases throughput by reducing the number of clock cycles required to process the data and generate the results, while requiring minimal resources, and hence is area-efficient. The architecture also enables the algorithm to scale to increasingly large and complex data sets. The DTC algorithm was developed in detail, and techniques for adapting it to a hardware implementation were successfully explored. This system is 3.5 times faster than the existing hardware implementation of classification.
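    An axis-parallel binary decision tree, as in the DTC work above, compares exactly one feature against a threshold at each node, which is what makes a pipelined hardware datapath natural: one pipeline stage can resolve one tree level from a flat node memory. A software model of that traversal (the flattened node layout, leaf encoding, and example tree are illustrative assumptions, not the thesis RTL):

        from dataclasses import dataclass

        import numpy as np

        @dataclass
        class Node:
            feature: int      # which input attribute this node tests
            threshold: float  # axis-parallel split: x[feature] <= threshold ?
            left: int         # index of next node, or ~class_id for a leaf
            right: int

        # A tiny hand-built tree, stored as a flat array the way a hardware
        # pipeline would hold it in node memories, one level per stage.
        TREE = [
            Node(0, 0.5, 1, 2),       # level 0
            Node(1, 0.3, ~0, ~1),     # level 1: leaves encode class via ~id
            Node(1, 0.7, ~0, ~1),
        ]

        def classify(x):
            idx = 0
            while idx >= 0:           # negative index -> leaf reached
                node = TREE[idx]
                idx = node.left if x[node.feature] <= node.threshold else node.right
            return ~idx               # decode the class id

        for sample in (np.array([0.2, 0.1]), np.array([0.9, 0.9])):
            print(sample, "->", classify(sample))

    Because each stage touches a different tree level, independent samples can occupy different stages simultaneously, which is the source of the throughput gain claimed for the pipelined engines.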
