24 research outputs found

    Low Power Decoding Circuits for Ultra Portable Devices

    Get PDF
    A wide spread of existing and emerging battery driven wireless devices do not necessarily demand high data rates. Rather, ultra low power, portability and low cost are the most desired characteristics. Examples of such applications are wireless sensor networks (WSN), body area networks (BAN), and a variety of medical implants and health-care aids. Being small, cheap and low power for the individual transceiver nodes, let those to be used in abundance in remote places, where access for maintenance or recharging the battery is limited. In such scenarios, the lifetime of the battery, in most cases, determines the lifetime of the individual nodes. Therefore, energy consumption has to be so low that the nodes remain operational for an extended period of time, even up to a few years. It is known that using error correcting codes (ECC) in a wireless link can potentially help to reduce the transmit power considerably. However, the power consumption of the coding-decoding hardware itself is critical in an ultra low power transceiver node. Power and silicon area overhead of coding-decoding circuitry needs to be kept at a minimum in the total energy and cost budget of the transceiver node. In this thesis, low power approaches in decoding circuits in the framework of the mentioned applications and use cases are investigated. The presented work is based on the 65nm CMOS technology and is structured in four parts as follows: In the first part, goals and objectives, background theory and fundamentals of the presented work is introduced. Also, the ECC block in coordination with its surrounding environment, a low power receiver chain, is presented. Designing and implementing an ultra low power and low cost wireless transceiver node introduces challenges that requires special considerations at various levels of abstraction. Similarly, a competitive solution often occurs after a conclusive design space exploration. The proposed decoder circuits in the following parts are designed to be embedded in the low power receiver chain, that is introduced in the first part. Second part, explores analog decoding method and its capabilities to be embedded in a compact and low power transceiver node. Analog decod- ing method has been theoretically introduced over a decade ago that followed with early proof of concept circuits that promised it to be a feasible low power solution. Still, with the increased popularity of low power sensor networks, it has not been clear how an analog decoding approach performs in terms of power, silicon area, data rate and integrity of calculations in recent technologies and for low data rates. Ultra low power budget, small size requirement and more relaxed demands on data rates suggests a decoding circuit with limited complexity. Therefore, the four-state (7,5) codes are considered for hardware implementation. Simulations to chose the critical design factors are presented. Consequently, to evaluate critical specifications of the decoding circuit, three versions of analog decoding circuit with different transistor dimensions fabricated. The measurements results reveal different trade-off possibilities as well as the potentials and limitations of the analog decoding approach for the target applications. Measurements seem to be crucial, since the available computer-aided design (CAD) tools provide limited assistance and precision, given the amount of calculations and parameters that has to be included in the simulations. The largest analog decoding core (AD1) takes 0.104mm2 on silicon and the other two (AD2 and AD3) take 0.035mm2 and 0.015mm2, respectively. Consequently, coding gain in trade-off with silicon area and throughput is presented. The analog decoders operate with 0.8V supply. The achieved coding gain is 2.3 dB at bit error rates (BER)=0.001 and 10 pico-Joules per bit (pJ/b) energy efficiency is reached at 2 Mbps. Third part of this thesis, proposes an alternative low power digital decoding approach for the same codes. The desired compact and low power goal has been pursued by designing an equivalent digital decoding circuit that is fabricated in 65nm CMOS technology and operates in low voltage (near-threshold) region. The architecture of the design is optimized in system and circuit levels to propose a competitive digital alternative. Similarly, critical specifications of the decoder in terms of power, area, data rate (speed) and integrity are reported according to the measurements. The digital implementation with 0.11mm2 area, consumes minimum energy at 0.32V supply which gives 9 pJ/b energy efficiency at 125 kb/s and 2.9 dB coding gain at BER=0.001. The forth and last part, compares the proposed design alternatives based on the fabricated chips and the results attained from the measurements to conclude the most suitable solution for the considered target applications. Advantages and disadvantages of both approaches are discussed. Possible extensions of this work is introduced as future work

    Circuit Techniques for Low-Power and Secure Internet-of-Things Systems

    Full text link
    The coming of Internet of Things (IoT) is expected to connect the physical world to the cyber world through ubiquitous sensors, actuators and computers. The nature of these applications demand long battery life and strong data security. To connect billions of things in the world, the hardware platform for IoT systems must be optimized towards low power consumption, high energy efficiency and low cost. With these constraints, the security of IoT systems become a even more difficult problem compared to that of computer systems. A new holistic system design considering both hardware and software implementations is demanded to face these new challenges. In this work, highly robust and low-cost true random number generators (TRNGs) and physically unclonable functions (PUFs) are designed and implemented as security primitives for secret key management in IoT systems. They provide three critical functions for crypto systems including runtime secret key generation, secure key storage and lightweight device authentication. To achieve robustness and simplicity, the concept of frequency collapse in multi-mode oscillator is proposed, which can effectively amplify the desired random variable in CMOS devices (i.e. process variation or noise) and provide a runtime monitor of the output quality. A TRNG with self-tuning loop to achieve robust operation across -40 to 120 degree Celsius and 0.6 to 1V variations, a TRNG that can be fully synthesized with only standard cells and commercial placement and routing tools, and a PUF with runtime filtering to achieve robust authentication, are designed based upon this concept and verified in several CMOS technology nodes. In addition, a 2-transistor sub-threshold amplifier based "weak" PUF is also presented for chip identification and key storage. This PUF achieves state-of-the-art 1.65% native unstable bit, 1.5fJ per bit energy efficiency, and 3.16% flipping bits across -40 to 120 degree Celsius range at the same time, while occupying only 553 feature size square area in 180nm CMOS. Secondly, the potential security threats of hardware Trojan is investigated and a new Trojan attack using analog behavior of digital processors is proposed as the first stealthy and controllable fabrication-time hardware attack. Hardware Trojan is an emerging concern about globalization of semiconductor supply chain, which can result in catastrophic attacks that are extremely difficult to find and protect against. Hardware Trojans proposed in previous works are based on either design-time code injection to hardware description language or fabrication-time modification of processing steps. There have been defenses developed for both types of attacks. A third type of attack that combines the benefits of logical stealthy and controllability in design-time attacks and physical "invisibility" is proposed in this work that crosses the analog and digital domains. The attack eludes activation by a diverse set of benchmarks and evades known defenses. Lastly, in addition to security-related circuits, physical sensors are also studied as fundamental building blocks of IoT systems in this work. Temperature sensing is one of the most desired functions for a wide range of IoT applications. A sub-threshold oscillator based digital temperature sensor utilizing the exponential temperature dependence of sub-threshold current is proposed and implemented. In 180nm CMOS, it achieves 0.22/0.19K inaccuracy and 73mK noise-limited resolution with only 8865 square micrometer additional area and 75nW extra power consumption to an existing IoT system.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/138779/1/kaiyuan_1.pd

    Quality-of-Service-Adequate Wireless Receiver Design

    Get PDF

    Quality-of-Service-Adequate Wireless Receiver Design

    Get PDF

    Toward the implementation of analog LDPC decoders for long codewords

    Get PDF
    Error control codes are used in virtually every digital communication system. Traditionally, decoders have been implemented digitally. Analog decoders have been recently shown to have the potential to outperform digital decoders in terms of area and power/speed ratio. Analog designers have attempted to fully understand and exploit this potential for large decoders. However, large codes are generally still implemented with digital circuits. Nevertheless, in this thesis a number of aspects of analog decoder implementation are investigated with the hope of enabling the design of large analog decoders. In this thesis, we study and modify analog circuits used in a decoding algorithm known as the sum-product algorithm for implementation in a CMOS 90 nm technology. We apply a current-mode approach at the input nodes of these circuits and show through simulations that the power/speed ratio will be improved. Interested in studying the dynamics of decoders, we model an LDPC code in MATLAB's Simulink. We then apply the linearization technique on the modeled LDPC code in order to linearize the decoder about an initial state as its solution point. Challenges associated with decoder linearization are discussed. We also design and implement a chip comprised of the sum-product circuits with different configurations and sizes in order to study the effect of mismatch on the accuracy of the outputs. Unfortunately, testing of the chip fails as a result of errors in either the packaging process or fabrication

    An Analog Decoder for Turbo-Structured Low-Density Parity-Check Codes

    Get PDF
    In this work, we consider a class of structured regular LDPC codes, called Turbo-Structured LDPC (TS-LDPC). TS-LDPC codes outperform random LDPC codes and have much lower error floor at high Signal-to-Noise Ratio (SNR). In this thesis, Min-Sum (MS) algorithms are adopted in the decoding of TS-LDPC codes due to their low complexity in the implementation. We show that the error performance of the MS-based TS-LDPC decoder is comparable with the Sum-Product (SP) based decoder and the error floor property of TS-LDPC codes is preserved. The TS-LDPC decoding algorithms can be performed by analog or digital circuitry. Analog decoders are preferred in many communication systems due to their potential for higher speed, lower power dissipation and smaller chip area compared to their digital counterparts. In this work, implementation of the (120, 75) MS-based TS-LDPC analog decoder is considered. The decoder chip consists of an analog decoder heart, digital input and digital output blocks. These digital blocks are required to deliver the received signal to the analog decoder heart and transfer the estimated codewords to the off-chip module. The analog decoder heart is an analog processor performing decoding on the Tanner graph of the code. Variable and check nodes are the main building blocks of analog decoder which are designed and evaluated. The check node is the most complicated unit in MS-based decoders. The minimizer circuit, the fundamental block of a check node, is designed to have a good trade-off between speed and accuracy. In addition, the structure of a high degree minimizer is proposed considering the accuracy, speed, power consumption and robustness against mismatch of the check node unit. The measurement results demonstrate that the error performance of the chip is comparable with theory. The SNR loss at Bit-Error-Rate of 10−5 is only 0.2dB compared to the theory while information throughput is 750Mb/s and the energy efficiency of the decoder chip is 17pJ/b. It is shown that the proposed decoder outperforms the analog decoders that have been fabricated to date in the sense of error performance, throughput and energy efficiency. This decoder is the first analog decoder that has ever been implemented in a sub 100-nm technology and it improves the throughput of analog decoders by a factor of 56. This decoder sets a new state-of-the-art in analog decoding

    Low power JPEG2000 5/3 discrete wavelet transform algorithm and architecture

    Get PDF

    Design Techniques for Energy-Quality Scalable Digital Systems

    Get PDF
    Energy efficiency is one of the key design goals in modern computing. Increasingly complex tasks are being executed in mobile devices and Internet of Things end-nodes, which are expected to operate for long time intervals, in the orders of months or years, with the limited energy budgets provided by small form-factor batteries. Fortunately, many of such tasks are error resilient, meaning that they can toler- ate some relaxation in the accuracy, precision or reliability of internal operations, without a significant impact on the overall output quality. The error resilience of an application may derive from a number of factors. The processing of analog sensor inputs measuring quantities from the physical world may not always require maximum precision, as the amount of information that can be extracted is limited by the presence of external noise. Outputs destined for human consumption may also contain small or occasional errors, thanks to the limited capabilities of our vision and hearing systems. Finally, some computational patterns commonly found in domains such as statistics, machine learning and operational research, naturally tend to reduce or eliminate errors. Energy-Quality (EQ) scalable digital systems systematically trade off the quality of computations with energy efficiency, by relaxing the precision, the accuracy, or the reliability of internal software and hardware components in exchange for energy reductions. This design paradigm is believed to offer one of the most promising solutions to the impelling need for low-energy computing. Despite these high expectations, the current state-of-the-art in EQ scalable design suffers from important shortcomings. First, the great majority of techniques proposed in literature focus only on processing hardware and software components. Nonetheless, for many real devices, processing contributes only to a small portion of the total energy consumption, which is dominated by other components (e.g. I/O, memory or data transfers). Second, in order to fulfill its promises and become diffused in commercial devices, EQ scalable design needs to achieve industrial level maturity. This involves moving from purely academic research based on high-level models and theoretical assumptions to engineered flows compatible with existing industry standards. Third, the time-varying nature of error tolerance, both among different applications and within a single task, should become more central in the proposed design methods. This involves designing “dynamic” systems in which the precision or reliability of operations (and consequently their energy consumption) can be dynamically tuned at runtime, rather than “static” solutions, in which the output quality is fixed at design-time. This thesis introduces several new EQ scalable design techniques for digital systems that take the previous observations into account. Besides processing, the proposed methods apply the principles of EQ scalable design also to interconnects and peripherals, which are often relevant contributors to the total energy in sensor nodes and mobile systems respectively. Regardless of the target component, the presented techniques pay special attention to the accurate evaluation of benefits and overheads deriving from EQ scalability, using industrial-level models, and on the integration with existing standard tools and protocols. Moreover, all the works presented in this thesis allow the dynamic reconfiguration of output quality and energy consumption. More specifically, the contribution of this thesis is divided in three parts. In a first body of work, the design of EQ scalable modules for processing hardware data paths is considered. Three design flows are presented, targeting different technologies and exploiting different ways to achieve EQ scalability, i.e. timing-induced errors and precision reduction. These works are inspired by previous approaches from the literature, namely Reduced-Precision Redundancy and Dynamic Accuracy Scaling, which are re-thought to make them compatible with standard Electronic Design Automation (EDA) tools and flows, providing solutions to overcome their main limitations. The second part of the thesis investigates the application of EQ scalable design to serial interconnects, which are the de facto standard for data exchanges between processing hardware and sensors. In this context, two novel bus encodings are proposed, called Approximate Differential Encoding and Serial-T0, that exploit the statistical characteristics of data produced by sensors to reduce the energy consumption on the bus at the cost of controlled data approximations. The two techniques achieve different results for data of different origins, but share the common features of allowing runtime reconfiguration of the allowed error and being compatible with standard serial bus protocols. Finally, the last part of the manuscript is devoted to the application of EQ scalable design principles to displays, which are often among the most energy- hungry components in mobile systems. The two proposals in this context leverage the emissive nature of Organic Light-Emitting Diode (OLED) displays to save energy by altering the displayed image, thus inducing an output quality reduction that depends on the amount of such alteration. The first technique implements an image-adaptive form of brightness scaling, whose outputs are optimized in terms of balance between power consumption and similarity with the input. The second approach achieves concurrent power reduction and image enhancement, by means of an adaptive polynomial transformation. Both solutions focus on minimizing the overheads associated with a real-time implementation of the transformations in software or hardware, so that these do not offset the savings in the display. For each of these three topics, results show that the aforementioned goal of building EQ scalable systems compatible with existing best practices and mature for being integrated in commercial devices can be effectively achieved. Moreover, they also show that very simple and similar principles can be applied to design EQ scalable versions of different system components (processing, peripherals and I/O), and to equip these components with knobs for the runtime reconfiguration of the energy versus quality tradeoff

    Modeling and Mitigation of Soft Errors in Nanoscale SRAMs

    Get PDF
    Energetic particle (alpha particle, cosmic neutron, etc.) induced single event data upset or soft error has emerged as a key reliability concern in SRAMs in sub-100 nanometre technologies. Low operating voltage, small node capacitance, high packing density, and lack of error masking mechanisms are primarily responsible for the soft error susceptibility of SRAMs. In addition, since SRAM occupies the majority of die area in system-on-chips (SoCs) and microprocessors, different leakage reduction techniques, such as, supply voltage reduction, gated grounding, etc., are applied to SRAMs in order to limit the overall chip leakage. These leakage reduction techniques exponentially increase the soft error rate in SRAMs. The soft error rate is further accentuated by process variations, which are prominent in scaled-down technologies. In this research, we address these concerns and propose techniques to characterize and mitigate soft errors in nanoscale SRAMs. We develop a comprehensive analytical model of the critical charge, which is a key to assessing the soft error susceptibility of SRAMs. The model is based on the dynamic behaviour of the cell and a simple decoupling technique for the non-linearly coupled storage nodes. The model describes the critical charge in terms of NMOS and PMOS transistor parameters, cell supply voltage, and noise current parameters. Consequently, it enables characterizing the spread of critical charge due to process induced variations in these parameters and to manufacturing defects, such as, resistive contacts or vias. In addition, the model can estimate the improvement in critical charge when MIM capacitors are added to the cell in order to improve the soft error robustness. The model is validated by SPICE simulations (90nm CMOS) and radiation test. The critical charge calculated by the model is in good agreement with SPICE simulations with a maximum discrepancy of less than 5%. The soft error rate estimated by the model for low voltage (sub 0.8 V) operations is within 10% of the soft error rate measured in the radiation test. Therefore, the model can serve as a reliable alternative to time consuming SPICE simulations for optimizing the critical charge and hence the soft error rate at the design stage. In order to limit the soft error rate further, we propose an area-efficient multiword based error correction code (MECC) scheme. The MECC scheme combines four 32 bit data words to form a composite 128 bit ECC word and uses an optimized 4-input transmission-gate XOR logic. Thus MECC significantly reduces the area overhead for check-bit storage and the delay penalty for error correction. In addition, MECC interleaves two composite words in a row for limiting cosmic neutron induced multi-bit errors. The ground potentials of the composite words are controlled to minimize leakage power without compromising the read data stability. However, use of composite words involves a unique write operation where one data word is written while other three data words are read to update the check-bits. A power efficient word line signaling technique is developed to facilitate the write operation. A 64 kb SRAM macro with MECC is designed and fabricated in a commercial 90nm CMOS technology. Measurement results show that the SRAM consumes 534 ÎŒW at 100 MHz with a data latency of 3.3 ns for a single bit error correction. This translates into 82% per-bit energy saving and 8x speed improvement over recently reported multiword ECC schemes. Accelerated neutron radiation test carried out at TRIUMF in Vancouver confirms that the proposed MECC scheme can correct up to 85% of soft errors

    Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

    Full text link
    With the increasing digital services demand, performance and power-efficiency become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges of device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made in each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safe margins can hardly be properly exploited. To tackle the challenge, it is therefore advised in this Ph.D. thesis to perform a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. These sparsely occurred circuit errors produced by aggressive voltage over-scaling are mitigated by higher level error resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (soft-errors and timing-errors, respectively). Applications like Massive MIMO systems are robust against lower level errors, thanks to the intrinsically redundant antennas. This property makes it applicable to embrace digital hardware that trades quality for power savings.Comment: 190 page
    corecore