15 research outputs found

    Serial-data computation in VLSI

    Get PDF

    Microarchitectural Low-Power Design Techniques for Embedded Microprocessors

    Get PDF
    With the omnipresence of embedded processing in all forms of electronics today, there is a strong trend towards wireless, battery-powered, portable embedded systems which have to operate under stringent energy constraints. Consequently, low power consumption and high energy efficiency have emerged as the two key criteria for embedded microprocessor design. In this thesis we present a range of microarchitectural low-power design techniques which enable the increase of performance for embedded microprocessors and/or the reduction of energy consumption, e.g., through voltage scaling. In the context of cryptographic applications, we explore the effectiveness of instruction set extensions (ISEs) for a range of different cryptographic hash functions (SHA-3 candidates) on a 16-bit microcontroller architecture (PIC24). Specifically, we demonstrate the effectiveness of light-weight ISEs based on lookup table integration and microcoded instructions using finite state machines for operand and address generation. On-node processing in autonomous wireless sensor node devices requires deeply embedded cores with extremely low power consumption. To address this need, we present TamaRISC, a custom-designed ISA with a corresponding ultra-low-power microarchitecture implementation. The TamaRISC architecture is employed in conjunction with an ISE and standard cell memories to design a sub-threshold capable processor system targeted at compressed sensing applications. We furthermore employ TamaRISC in a hybrid SIMD/MIMD multi-core architecture targeted at moderate to high processing requirements (> 1 MOPS). A range of different microarchitectural techniques for efficient memory organization are presented. Specifically, we introduce a configurable data memory mapping technique for private and shared access, as well as instruction broadcast together with synchronized code execution based on checkpointing. We then study an inherent suboptimality due to the worst-case design principle in synchronous circuits, and introduce the concept of dynamic timing margins. We show that dynamic timing margins exist in microprocessor circuits, and that these margins are to a large extent state-dependent and that they are correlated to the sequences of instruction types which are executed within the processor pipeline. To perform this analysis we propose a circuit/processor characterization flow and tool called dynamic timing analysis. Moreover, this flow is employed in order to devise a high-level instruction set simulation environment for impact-evaluation of timing errors on application performance. The presented approach improves the state of the art significantly in terms of simulation accuracy through the use of statistical fault injection. The dynamic timing margins in microprocessors are then systematically exploited for throughput improvements or energy reductions via our proposed instruction-based dynamic clock adjustment (DCA) technique. To this end, we introduce a 6-stage 32-bit microprocessor with cycle-by-cycle DCA. Besides a comprehensive design flow and simulation environment for evaluation of the DCA approach, we additionally present a silicon prototype of a DCA-enabled OpenRISC microarchitecture fabricated in 28 nm FD-SOI CMOS. The test chip includes a suitable clock generation unit which allows for cycle-by-cycle DCA over a wide range with fine granularity at frequencies exceeding 1 GHz. Measurement results of speedups and power reductions are provided

    Side Channel Information Leakage: Design and Implementation of Hardware Countermeasure

    Get PDF
    Deployment of Dynamic Differential Logics (DDL) appears to be a promising choice for providing resistance against leakage of side channel information. However, the resistance provided by these logics is too costly for widespread area-constrained applications. Implementation of a secure DDL-based countermeasure also requires a complex layout methodology for balancing the load at the differential outputs. This thesis, unlike previous logic level approaches, presents a novel exploitation of static and single-ended logic for designing the side channel countermeasure. The proposed technique is used in the implementation of a protected crypto core consisting of the AES “AddRoundKey” and “SubByte” transformation. The test chip including the protected and unprotected crypto cores is fabricated in 180nm CMOS technology. A correlation analysis on the unprotected core results in revealing the key at the output of the combinational networks and the registers. The quality of the measurements is further improved by introducing an enhanced data capturing method that inserts a minimum power consuming input as a reference vector. In comparison, no key-related information is leaked from the protected core even with an order of magnitude increase in the number of averaged traces. For the first time, fabricated chip results are used to validate a new logic level side channel countermeasure that offers lower area and reduced circuit design complexity compared to the DDL-based countermeasures. This thesis also provides insight into the side channel vulnerability of cryptosystems in sub-90nm CMOS technology nodes. In particular, data dependency of leakage power is analyzed. The number of traces to disclose the key is seen to decrease by 35% from 90nm to 45nm CMOS technology nodes. Analysis shows that the temperature dependency of the subthreshold leakage has an important role in increasing the ability to attack future nanoscale crypto cores. For the first time, the effectiveness of a circuit-based leakage reduction technique is examined for side channel security. This investigation demonstrates that high threshold voltage transistor assignment improves resistance against information leakage. The analysis initiated in this thesis is crucial for rolling out the guidelines of side channel security for the next generation of Cryptosystem.1 yea

    Designing Novel Hardware Security Primitives for Smart Computing Devices

    Get PDF
    Smart computing devices are miniaturized electronics devices that can sense their surroundings, communicate, and share information autonomously with other devices to work cohesively. Smart devices have played a major role in improving quality of the life and boosting the global economy. They are ubiquitously present, smart home, smart city, smart girds, industry, healthcare, controlling the hazardous environment, and military, etc. However, we have witnessed an exponential rise in potential threat vectors and physical attacks in recent years. The conventional software-based security approaches are not suitable in the smart computing device, therefore, hardware-enabled security solutions have emerged as an attractive choice. Developing hardware security primitives, such as True Random Number Generator (TRNG) and Physically Unclonable Function (PUF) from electrical properties of the sensor could be a novel research direction. Secondly, the Lightweight Cryptographic (LWC) ciphers used in smart computing devices are found vulnerable against Correlation Power Analysis (CPA) attack. The CPA performs statistical analysis of the power consumption of the cryptographic core and reveals the encryption key. The countermeasure against CPA results in an increase in energy consumption, therefore, they are not suitable for battery operated smart computing devices. The primary goal of this dissertation is to develop novel hardware security primitives from existing sensors and energy-efficient LWC circuit implementation with CPA resilience. To achieve these. we focus on developing TRNG and PUF from existing photoresistor and photovoltaic solar cell sensors in smart devices Further, we explored energy recovery computing (also known as adiabatic computing) circuit design technique that reduces the energy consumption compared to baseline CMOS logic design and same time increasing CPA resilience in low-frequency applications, e.g. wearable fitness gadgets, hearing aid and biomedical instruments. The first contribution of this dissertation is to develop a TRNG prototype from the uncertainty present in photoresistor sensors. The existing sensor-based TRNGs suffer a low random bit generation rate, therefore, are not suitable in real-time applications. The proposed prototype has an average random bit generation rate of 8 kbps, 32 times higher than the existing sensor-based TRNG. The proposed lightweight scrambling method results in random bit entropy close to ideal value 1. The proposed TRNG prototype passes all 15 statistical tests of the National Institute of Standards and Technology (NIST) Statistical Test Suite with quality performance. The second contribution of this dissertation is to develop an integrated TRNG-PUF designed using photovoltaic solar cell sensors. The TRNG and PUF are mutually independent in the way they are designed, therefore, integrating them as one architecture can be beneficial in resource-constrained computing devices. We propose a novel histogram-based technique to segregate photovoltaic solar cell sensor response suitable for TRNG and PUF respectively. The proposed prototype archives approximately 34\% improvement in TRNG output. The proposed prototype achieves an average of 92.13\% reliability and 50.91\% uniformity performance in PUF response. The proposed sensor-based hardware security primitives do not require additional interfacing hardware. Therefore, they can be ported as a software update on existing photoresistor and photovoltaic sensor-based devices. Furthermore, the sensor-based design approach can identify physically tempered and faulty sensor nodes during authentication as their response bit differs. The third contribution is towards the development of a novel 2-phase sinusoidal clocking implementation, 2-SPGAL for existing Symmetric Pass Gate Adiabatic Logic (SPGAL). The proposed 2-SPGAL logic-based LWC cipher PRESENT shows an average of 49.34\% energy saving compared to baseline CMOS logic implementation. Furthermore, the 2-SPGAL prototype has an average of 22.76\% better energy saving compared to 2-EE-SPFAL (2-phase Energy-Efficient-Secure Positive Feedback Adiabatic Logic). The proposed 2-SPGAL was tested for energy-efficiency performance for the frequency range of 50 kHz to 250 kHz, used in healthcare gadgets and biomedical instruments. The proposed 2-SPGAL based design saves 16.78\% transistor count compared to 2-EE-SPFAL counterpart. The final contribution is to explore Clocked CMOS Adiabatic Logic (CCAL) to design a cryptographic circuit. Previously proposed 2-SPGAL and 2-EE-SPFAL uses two complementary pairs of the transistor evaluation network, thus resulting in a higher transistor count compared to the CMOS counterpart. The CCAL structure is very similar to CMOS and unlike 2-SPGAL and 2-EE-SPFAL, it does not require discharge circuitry to improve security performance. The case-study implementation LWC cipher PRESENT S-Box using CCAL results into 45.74\% and 34.88\% transistor count saving compared to 2-EE-SPFAL and 2-SPGAL counterpart. Furthermore, the case-study implementation using CCAL shows more than 95\% energy saving compared to CMOS logic at frequency range 50 kHz to 125 kHz, and approximately 60\% energy saving at frequency 250 kHz. The case study also shows 32.67\% and 11.21\% more energy saving compared to 2-EE-SPFAL and 2-SPGAL respectively at frequency 250 kHz. We also show that 200 fF of tank capacitor in the clock generator circuit results in optimum energy and security performance in CCAL

    Branch Prediction For Network Processors

    Get PDF
    Originally designed to favour flexibility over packet processing performance, the future of the programmable network processor is challenged by the need to meet both increasing line rate as well as providing additional processing capabilities. To meet these requirements, trends within networking research has tended to focus on techniques such as offloading computation intensive tasks to dedicated hardware logic or through increased parallelism. While parallelism retains flexibility, challenges such as load-balancing limit its scope. On the other hand, hardware offloading allows complex algorithms to be implemented at high speed but sacrifice flexibility. To this end, the work in this thesis is focused on a more fundamental aspect of a network processor, the data-plane processing engine. Performing both system modelling and analysis of packet processing functions; the goal of this thesis is to identify and extract salient information regarding the performance of multi-processor workloads. Following on from a traditional software based analysis of programme workloads, we develop a method of modelling and analysing hardware accelerators when applied to network processors. Using this quantitative information, this thesis proposes an architecture which allows deeply pipelined micro-architectures to be implemented on the data-plane while reducing the branch penalty associated with these architectures

    Research on performance enhancement for electromagnetic analysis and power analysis in cryptographic LSI

    Get PDF
    制度:新 ; 報告番号:甲3785号 ; 学位の種類:博士(工学) ; 授与年月日:2012/11/19 ; 早大学位記番号:新6161Waseda Universit

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues
    corecore