Search CORE

15 research outputs found

공정 변화에 둔감한 자동 온도 보상 셀프 리프레쉬용 모바일 디램 온도계

Author: 심대용
Publication venue: 서울대학교 대학원
Publication date: 01/08/2013
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2013. 8. 김수환.Smaller transistors mean that capacitors are charged less uniformly, which increases the self-refresh current in the DRAMs used in mobile devices. Adaptive self-refresh using an on-chip thermometer can solve this problem. In this thesis, a PVT tolerant on-chip CMOS thermometer specifically designed for controlling the refresh period of a DRAM will be proposed for low power mobile DRAM. Two types of on-chip CMOS thermometer including a novel temperature sensor is proposed, which is implemented in two different DRAM process technologies integrated into mobile LPDDR2 and LPDDR3 products. The on-chip thermometer incorporating in mobile LPDDR2 chip is fabricated in a 44nm DRAM process with a supply of 1.1V. The sensor has a temperature sensitivity of −3.2mV/°C, over a range of 0°C to 110°C. Its resolution is 1.94°C and is only limited by the 6.2mV step of the associated resistor ladder not by its own design. The high linearity of the sensor permits one-point calibration, after which the errors in 61 sample circuits ranged between −1.42°C and +2.66°C. The sensor has an active area of 0.001725mm2 and consumes less than 0.36μW on average with a supply of 1.1V. To improve the overall performance including ultra-low operation voltage, temperature sensitivity, low power consumption, high linearity regardless of process skew variations and high productivity improved by one point calibration, the folded type on-chip thermometer incorporating in mobile LPDDR3 chip which fabricated in a 29nm DRAM process with a supply of 1.1V and 0.8V will be proposed. This folded type sensor exhibits further upgrading properties such as a temperature sensitivity of −3.2mV/°[email protected] &−3.13mV/°C @0.8V, over wide range of -40°C to 110°C. Its resolution is 1.85°[email protected] & 1.98°[email protected] and is only limited by the 6.2mV step. The more linearity of folded type sensor permits one-point calibration, after which the errors in 494 sample circuits ranged between −1.94°C and +1.61°C. The folded type sensor has an active area of 0.001606mm2 and consumes less than 0.19μ[email protected] & 0.14μ[email protected] on average slightly more than unfolded type sensor.ABSTRACT I CONTENTS III LIST OF FIGURES V LIST OF TABLES IX CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 3 CHAPTER 2 ARCHITECTURE OF THERMOMETER 5 2.1 INTRODUCTION TO ON-CHIP THERMOMETER IN MOBILE DRAM 5 2.2 PROPOSED ON-CHIP CMOS THERMOMETER ARCHITECTURE 17 2.3 TEMPERATURE READOUT PROCEDURE OF PROPOSED ON-CHIP CMOS THERMOMETER 23 2.4 PROPOSED FOLDED TYPE ON-CHIP CMOS THERMOMETER ARCHITECTURE 25 2.5 TEMPERATURE READOUT PROCEDURE OF PROPOSED FOLDED TYPE ON-CHIP CMOS THERMOMETER 30 2.6 ONE-POINT CALIBRATION METHOD 32 2.7 TEMPERATURE LINEARITY OF TEMPERATURE SENSOR 35 CHAPTER 3 OPERATIONAL PRINCIPLES OF CMOS TEMPERATURE SENSOR IN MOBILE DRAM 39 3.1 PRIOR WORKS OF ON-CHIP THERMOMETER 39 3.2 PROPOSED CMOS TEMPERATURE SENSOR IN MOBILE DRAM 44 3.3 OPERATION PRINCIPLES OF PROPOSED TEMPERATURE SENSOR 48 3.4 PROPOSED FOLDED TYPE TEMPERATURE SENSOR 55 CHAPTER 4 PERIPHERAL CIRCUITS OF THERMOMETER 60 4.1 REGULATOR FOR VLTCSR SUPPLY 61 4.1.1 DC ANALYSIS 62 4.1.2 AC ANALYSIS 63 4.2 RESISTOR DECK 67 4.3 COMPARATOR 68 CHAPTER 5 EXPERIMENTAL RESULTS 70 5.1 ON-CHIP CMOS THERMOMETER IN 44NM CMOS PROCESS FOR MOBILE LPDDR2 74 5.2 FOLDED TYPE ON-CHIP CMOS THERMOMETER IN 29NM CMOS PROCESS FOR MOBILE LPDDR3 77 CHAPTER 6 CONCLUSIONS 83 BIBLIOGRAPHY 86 ABSTRACT IN KOREAN 89Docto

SNU Open Repository and Archive

적응형 눈 감지 방법을 포함한 저전력 메모리 컨트롤러의 설계

Author: 김민오
Publication venue: 서울대학교 대학원
Publication date: 01/08/2017
Field of study

학위논문 (박사)-- 서울대학교 대학원 공과대학 전기·컴퓨터공학부, 2017. 8. 김수환.and the read margin was enhanced from 0.30UI and 76mV without AF-CTLE to 0.47UI and 80mV to with AF-CTLE. The power efficiency during burst write and read were 5.68pJ/bit and 1.83pJ/bit respectively.A 4266Mb/s/pin LPDDR4 memory controller with an asynchronous feedback continuous-time linear equalizer and an adaptive 3-step eye detection algorithm is presented. The asynchronous feedback continuous-time linear equalizer removes the glitch of DQS without training by applying an offset larger than the noise, and improves read margin by operating as a decision feedback equalizer in DQ path. The adaptive 3-step eye detection algorithm reduces power consumption and black-out time in initialization sequence and retraining in comparison to the 2-dimensional full scanning. In addition, the adaptive 3-step eye detection algorithm can maintain the accuracy by sequentially searching the eye boundaries and initializing the resolution using the binary search method when the eye detection result changes. To achieve high bandwidth, a transmitter and receiver suitable for training are proposed. The transmitter consists of a phase interpolator, a digitally-controlled delay line, a 16:1 serializer, a pre-driver and low-voltage swing terminated logic. The receiver consists of a reference voltage generator, a continuous-time linear equalizer, a phase interpolator, a digitally-controlled delay line, a 1:4 deserializer, and a 4:16 deserializer. The clocking architecture is also designed for low power consumption in idle periods, which are commonly lengthy in mobile applications. A prototype chip was implemented in a 65nm CMOS process with ball grid array package and tested with commodity LPDDR4. The write margin was 0.36UI and 148mVCHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 5 CHAPTER 2 LPDDR4 6 2.1 COMPARISON BETWEEN LPDDR3 AND LPDDR4 6 2.2 SOURCE SYNCHRONOUS CLOCKING SCHEME 9 2.3 SIGNALING STANDARDS 11 2.4 MULTIPLE TRAININGS 14 2.5 RE-TRAINING AND RE-INITIALIZATION 16 CHAPTER 3 ADAPTIVE EYE DETECTION 18 3.1 EYE DETECTION 18 3.2 1X2Y3X EYE DETECTION 20 3.3 ADAPTIVE GAIN CONTROL 22 3.4 ADAPTIVE 1X2Y3X EYE DETECTION 24 CHAPTER 4 LPDDR4 MEMORY CONTROLLER 26 4.1 DESIGN PROCEDURE 26 4.2 ARCHITECTURE 30 4.2.1 TRANSMITTER 33 4.2.2 RECEIVER 35 4.2.3 CLOCKING ARCHITECTURE 38 4.3 CIRCUIT IMPLEMENTATION 43 4.3.1 ADPLL WITH MULTI-MODULUS DIVIDER 43 4.3.2 ADDLL WITH TRIANGULAR-MODULATED PI 45 4.3.3 CTLE WITH AUTO-DQS CLEANING 47 4.3.4 DES WITH CLOCK DOMAIN CROSSING 52 4.3.5 LVSTL WITH ZQ CALIBRATION 54 4.3.6 COARSE-FINE DCDL 56 4.4 LINK TRAINING 57 4.4.1 SIMULATION RESULTS 59 CHAPTER 5 MEASUREMENT RESULTS 72 5.1 MEASUREMENT SETUP 72 5.2 MEASUREMENT RESULTS OF SUB-BLOCK 80 5.2.1 ADPLL WITH MULTI-MODULUS DIVIDER 80 5.2.2 ADDLL WITH TRIANGULAR-MODULATED PI 82 5.2.3 COARSE-FINE DCDL 84 5.3 LPDDR4 INTERFACE MEASUREMENT RESULTS 84 CHAPTER 6 CONCLUSION 88 BIBLIOGRAPHY 90Docto

SNU Open Repository and Archive

Ultra-low Power Circuits for Internet of Things (IOT)

Author: Jeong Seok Hyeon
Publication venue
Publication date
Field of study

Miniaturized sensor nodes offer an unprecedented opportunity for the semiconductor industry which led to a rapid development of the application space: the Internet of Things (IoT). IoT is a global infrastructure that interconnects physical and virtual things which have the potential to dramatically improve people's daily lives. One of key aspect that makes IoT special is that the internet is expanding into places that has been ever reachable as device form factor continue to decreases. Extremely small sensors can be placed on plants, animals, humans, and geologic features, and connected to the Internet. Several challenges, however, exist that could possibly slow the development of IoT. In this thesis, several circuit techniques as well as system level optimizations to meet the challenging power/energy requirement for the IoT design space are described. First, a fully-integrated temperature sensor for battery-operated, ultra-low power microsystems is presented. Sensor operation is based on temperature independent/dependent current sources that are used with oscillators and counters to generate a digital temperature code. Second, an ultra-low power oscillator designed for wake-up timers in compact wireless sensors is presented. The proposed topology separates the continuous comparator from the oscillation path and activates it only for short period when it is required. As a result, both low power tracking and generation of precise wake-up signal is made possible. Third, an 8-bit sub-ranging SAR ADC for biomedical applications is discussed that takes an advantage of signal characteristics. ADC uses a moving window and stores the previous MSBs voltage value on a series capacitor to achieve energy saving compared to a conventional approach while maintaining its accuracy. Finally, an ultra-low power acoustic sensing and object recognition microsystem that uses frequency domain feature extraction and classification is presented. By introducing ultra-low 8-bit SAR-ADC with 50fF input capacitance, power consumption of the frontend amplifier has been reduced to single digit nW-level. Also, serialized discrete Fourier transform (DFT) feature extraction is proposed in a digital back-end, replacing a high-power/area-consuming conventional FFT.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/137157/1/seojeong_1.pd

Deep Blue Documents at the University of Michigan

Current-mode processing based Temperature-to-Digital Converters for MEMS applications

Author: APRILE ANTONIO
Publication venue: Università degli studi di Pavia
Publication date: 05/04/2023
Field of study

This thesis presents novel Temperature-to-Digital Converters (TDCs) designed and fabricated in CMOS technology. These integrated smart temperature sensing circuits are widely employed in the Micro-Electro-Mechanical Systems (MEMS) field in order to mitigate the impact of the ambient temperature on their performance. In this framework, the increasingly stringent demands of the market have led the cost-effectiveness specification of these compensation solutions to an higher and higher level, directly translating into the requirement of more and more compact designs (< 0.1 mm²); in addition to this, considering that the great majority of the systems whose thermal drift needs to be compensated is battery supplied, ultra-low energy-per-conversion (< 10 nJ) is another requirement of primary importance. This thesis provides a detailed description of two different test-chips (mas fuerte and es posible) that have been designed with this orientation and that are the result of three years of research activity; for both devices, the conception, design, layout and testing phases are all described in detail and are supported by simulation and measurement results.This thesis presents novel Temperature-to-Digital Converters (TDCs) designed and fabricated in CMOS technology. These integrated smart temperature sensing circuits are widely employed in the Micro-Electro-Mechanical Systems (MEMS) field in order to mitigate the impact of the ambient temperature on their performance. In this framework, the increasingly stringent demands of the market have led the cost-effectiveness specification of these compensation solutions to an higher and higher level, directly translating into the requirement of more and more compact designs (< 0.1 mm²); in addition to this, considering that the great majority of the systems whose thermal drift needs to be compensated is battery supplied, ultra-low energy-per-conversion (< 10 nJ) is another requirement of primary importance. This thesis provides a detailed description of two different test-chips (mas fuerte and es posible) that have been designed with this orientation and that are the result of three years of research activity; for both devices, the conception, design, layout and testing phases are all described in detail and are supported by simulation and measurement results

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

A LPDDR4 MEMORY CONTROLLER DESIGN WITH EYE CENTER DETECTION ALGORITHM

Author: 홍기문
Publication venue: 서울대학교 대학원
Publication date: 01/02/2016
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2016. 2. 김수환.The demand for higher bandwidth with reduced power consumption in mobile memory is increasing. In this thesis, architecture of the LPDDR4 memory controller, operated with a LPDDR4 memory, is proposed and designed, and efficient training algorithm, which is appropriate for this architecture, is proposed for memory training and verification. The operation speed range of the LPDDR4 memory specification is from 533Mbps to 4266Mbps, and the LPDDR4 memory controller is designed to support that range of the LPDDR4 memory. The phase-locked loop in the LPDDR4 memory controller is designed to operate between 1333MHz and 2133MHz. To cover the range of the LPDDR4 memory, the selectable frequency divider is used to provide operation clock. The output frequency of the phase-locked loop with divider is from 266MHz to 2133MHz. The delay-locked loop in the LPDDR4 memory controller is designed to operate between 266MHz and 2133MHz with 180˚ phase locking. The delay-locked loop is used each training operation, which is command training, data read and write training. To complete training in each training stage, eye center detection algorithm is used. The circuits for the proposed eye center detection algorithm such as delay line, phase interpolator and reference generator are designed and validated. The proposed 1x2y3x eye center detection algorithm is 23 times faster than conventional two-dimensional eye center detection algorithm and it can be implemented simply. Using 65nm CMOS process, the proposed LPDDR4 memory controller occupies 12mm2. The verification of the LPDDR4 memory controller is performed with commodity LPDDR4 memory. The verification of all training sequence, which is power on, initializing, boot up, command training, write leveling, read training, write training, is performed in this environment. The low voltage swing terminated logic driver and other several functions, including write leveling and data transmission, are verified at 4266Mbps and the entire LPDDR4 memory controller operations from 566Mbps to 1600Mbps are verified. The proposed eye center detection algorithm is verified from 566Mbps to 2843Mbps.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 INTRODUCTION 5 1.3 THESIS ORGANIZATION 7 CHAPTER 2 LPDDR4 MEMORY CONTROLLER DESIGN 8 2.1 DIFFERENCE BETWEEN LPDDR3 AND LPDDR4 MEMORY 8 2.1.1 ARCHITECTURAL DIFFERENCE BETWEEN LPDDR3 AND LPDDR4 MEMORY 10 2.1.2 SOURCE SYNCHRONOUS MATCHED SCHEME AND UNMATCHED SCHEME 11 2.1.3 LOW VOLTAGE SWING TERMINATED LOGIC DRIVER AND TERMINATION SCHEME 12 2.2 LPDDR4 MEMORY CONTROLLER SPECIFICATION 15 2.3 DESIGN PROCEDURE 18 CHAPTER 3 LPDDR4 MEMORY CONTROLLER ARCHITECTURE BASED ON MEMORY TRAINING 20 3.1 LPDDR4 MEMORY TRAINING SEQUENCE 20 3.2 LPDDR4 MEMORY TRAINING EYE DETECTION ALGORITHM 24 3.2.1 EYE CENTER DETECTION 24 3.2.2 1X2Y3X EYE CENTER DETECTION ALGORITHM 27 3.3. LPDDR4 MEMORY CONTROLLER DESIGN BASED ON MEMORY TRAINING 31 3.3.1 ARCHITECTURE FOR MEMORY BOOT UP AND POWER UP 31 3.3.2 CLOCK PATH ARCHITECTURE AND CLOCK TREE 34 3.3.3 COMMAND TRAINING AND COMMAND PATH ARCHITECTURE 35 3.3.4 WRITE LEVELING AND DATA STROBE TRANSMISSION PATH ARCHITECTURE 39 3.3.5 READ TRAINING AND READ PATH ARCHITECTURE 41 3.3.6 WRITE TRAINING AND WRITE PATH ARCHITECTURE 43 3.3.7 NORMAL READ/WRITE OPERATION AND MARGIN TEST 46 CHAPTER 4 LPDDR4 MEMORY CONTROLLER ARCHITECTURE MODELING AND CIRCUIT DESIGN 48 4.1 OVERALL LPDDR4 MEMORY CONTROLLER ARCHITECTURE MODELING 48 4.2 SIMULATION RESULT OF LPDDR4 MEMORY CONTROLLER MODELING 51 4.3 LPDDR4 MEMORY CONTROLLER CIRCUIT DESIGN 61 4.3.1 PHASE-LOCKED LOOP 61 4.3.2 DELAY-LOCKED LOOP 65 4.3.3 TRANSMITTER OF LPDDR4 MEMORY CONTROLLER: WRITE PATH 70 4.3.4 DE-SERIALIZER WITH CLOCK DOMAIN CROSSING 75 CHAPTER 5 MEASUREMENT RESULT OF LPDDR4 MEMORY CONTROLLER 77 5.1 LPDDR4 MEMORY CONTROLLER MEASUREMENT SETUP 77 5.1.1 LPDDR4 MEMORY CONTROLLER FLOOR PLAN AND LAYOUT 77 5.1.2 PACKAGE AND TEST BOARD 79 5.2 LPDDR4 MEMORY CONTROLLER SUB-BLOCK MEASUREMENT 81 5.2.1 PHASE-LOCKED LOOP 81 5.2.2 DELAY-LOCKED LOOP 83 5.2.3 200PS AND 800PS DELAY LINE 85 5.2.4 VOLTAGE REFERENCE GENERATOR 86 5.2.5 PHASE INTERPOLATOR 87 5.3 LPDDR4 MEMORY SYSTEM OPERATION MEASUREMENT 90 CHAPTER 6 CONCLUSION 93 APPENDIX OPERATION FLOW CHART OF THE PROPOSED LPDDR4 MEMORY CONTROLLER 95 BIBLIOGRAPHY 118 KOREAN ABSTRACT 124Docto

SNU Open Repository and Archive

Approximation Opportunities in Edge Computing Hardware : A Systematic Literature Review

Author: Damsgaard Hans Jakob
Nurmi Jari
Ometov Aleksandr
Publication venue
Publication date: 29/11/2022
Field of study

With the increasing popularity of the Internet of Things and massive Machine Type Communication technologies, the number of connected devices is rising. However, while enabling valuable effects to our lives, bandwidth and latency constraints challenge Cloud processing of their associated data amounts. A promising solution to these challenges is the combination of Edge and approximate computing techniques that allows for data processing nearer to the user. This paper aims to survey the potential benefits of these paradigms’ intersection. We provide a state-of-the-art review of circuit-level and architecture-level hardware techniques and popular applications. We also outline essential future research directions.publishedVersionPeer reviewe

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Trepo - Institutional Repository of Tampere University

Toward realizing power scalable and energy proportional high-speed wireline links

Author: Anand Tejasvi
Publication venue
Publication date: 01/12/2015
Field of study

Growing computational demand and proliferation of cloud computing has placed high-speed serial links at the center stage. Due to saturating energy efficiency improvements over the last five years, increasing the data throughput comes at the cost of power consumption. Conventionally, serial link power can be reduced by optimizing individual building blocks such as output drivers, receiver, or clock generation and distribution. However, this approach yields very limited efficiency improvement. This dissertation takes an alternative approach toward reducing the serial link power. Instead of optimizing the power of individual building blocks, power of the entire serial link is reduced by exploiting serial link usage by the applications. It has been demonstrated that serial links in servers are underutilized. On average, they are used only 15% of the time, i.e. these links are idle for approximately 85% of the time. Conventional links consume power during idle periods to maintain synchronization between the transmitter and the receiver. However, by powering-off the link when idle and powering it back when needed, power consumption of the serial link can be scaled proportionally to its utilization. This approach of rapid power state transitioning is known as the rapid-on/off approach. For the rapid-on/off to be effective, ideally the power-on time, off-state power, and power state transition energy must all be close to zero. However, in practice, it is very difficult to achieve these ideal conditions. Work presented in this dissertation addresses these challenges. When this research work was started (2011-12), there were only a couple of research papers available in the area of rapid-on/off links. Systematic study or design of a rapid power state transitioning in serial links was not available in the literature. Since rapid-on/off with nanoseconds granularity is not a standard in any wireline communication, even the popular test equipment does not support testing any such feature, neither any formal measurement methodology was available. All these circumstances made the beginning difficult. However, these challenges provided a unique opportunity to explore new architectural techniques and identify trade-offs. The key contributions of this dissertation are as follows. The first and foremost contribution is understanding the underlying limitations of saturating energy efficiency improvements in serial links and why there is a compelling need to find alternative ways to reduce the serial link power. The second contribution is to identify potential power saving techniques and evaluate the challenges they pose and the opportunities they present. The third contribution is the design of a 5Gb/s transmitter with a rapid-on/off feature. The transmitter achieves rapid-on/off capability in voltage mode output driver by using a fast-digital regulator, and in the clock multiplier by accurate frequency pre-setting and periodic reference insertion. To ease timing requirements, an improved edge replacement logic circuit for the clock multiplier is proposed. Mathematical modeling of power-on time as a function of various circuit parameters is also discussed. The proposed transmitter demonstrates energy proportional operation over wide variations of link utilization, and is, therefore, suitable for energy efficient links. Fabricated in 90nm CMOS technology, the voltage mode driver, and the clock multiplier achieve power-on-time of only 2ns and 10ns, respectively. This dissertation highlights key trade-off in the clock multiplier architecture, to achieve fast power-on-lock capability at the cost of jitter performance. The fourth contribution is the design of a 7GHz rapid-on/off LC-PLL based clock multi- plier. The phase locked loop (PLL) based multiplier was developed to overcome the limita- tions of the MDLL based approach. Proposed temperature compensated LC-PLL achieves power-on-lock in 1ns. The fifth and biggest contribution of this dissertation is the design of a 7Gb/s embedded clock transceiver, which achieves rapid-on/off capability in LC-PLL, current-mode transmit- ter and receiver. It was the first reported design of a complete transceiver, with an embedded clock architecture, having rapid-on/off capability. Background phase calibration technique in PLL and CDR phase calibration logic in the receiver enable instantaneous lock on power-on. The proposed transceiver demonstrates power scalability with a wide range of link utiliza- tion and, therefore, helps in improving overall system efficiency. Fabricated in 65nm CMOS technology, the 7Gb/s transceiver achieves power-on-lock in less than 20ns. The transceiver achieves power scaling by 44x (63.7mW-to-1.43mW) and energy efficiency degradation by only 2.2x (9.1pJ/bit-to-20.5pJ/bit), when the effective data rate (link utilization) changes by 100x (7Gb/s-to-70Mb/s). The sixth and final contribution is the design of a temperature sensor to compensate the frequency drifts due to temperature variations, during long power-off periods, in the fast power-on-lock LC-PLL. The proposed self-referenced VCO-based temperature sensor is designed with all digital logic gates and achieves low supply sensitivity. This sensor is suitable for integration in processor and DRAM environments. The proposed sensor works on the principle of directly converting temperature information to frequency and finally to digital bits. A novel sensing technique is proposed in which temperature information is acquired by creating a threshold voltage difference between the transistors used in the oscillators. Reduced supply sensitivity is achieved by employing junction capacitance, and the overhead of voltage regulators and an external ideal reference frequency is avoided. The effect of VCO phase noise on the sensor resolution is mathematically evaluated. Fabricated in the 65nm CMOS process, the prototype can operate with a supply ranging from 0.85V to 1.1V, and it achieves a supply sensitivity of 0.034oC/mV and an inaccuracy of ±0.9oC and ±2.3oC from 0-100oC after 2-point calibration, with and without static nonlinearity correction, respectively. It achieves a resolution of 0.3oC, resolution FoM of 0.3(nJ/conv)res2 , and measurement (conversion) time of 6.5μs

Illinois Digital Environment for Access to Learning and Scholarship Repository

Recommended from our members

Variation-Tolerant and Voltage-Scalable Integrated Circuits Design

Author: Kim Seongjong
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2016
Field of study

Ultra-low-voltage (ULV) operation where the supply voltage of the digital computing hardware is scaled down to the level near or below transistor threshold voltage (e.g. 300-500mV) is a key technique to achieve high computing energy efficiency. It has enabled many new exciting applications in the field of Internet of Things (IoT) devices and energy-constrained applications such as medical implants, environment sensors, and micro-robots. Ultra-low-voltage (ULV) operation is also commonly used with the emerging architectures that are often non Von-Neumann style to empower energy-efficient cognitive computing. One the biggest challenge in realizing ULV design is the large circuit delay variability. To guarantee functionality in the worst-case process, voltage, and temperature (PVT) condition, the traditional safety margin approach requires operating at a slower clock frequency or higher supply voltage which significantly limits the achievable energy efficiency of the hardware. To fully claim the energy efficiency of ULV, the large circuit delay variation needs to be adaptively handled. However, the existing adaptive techniques that are optimized for nominal supply voltage operation and traditional Von-Neumann architectures become inefficient for ULV designs and emerging architectures. This thesis presents adaptive techniques based on timing error detection and correction (EDAC) that are more suitable for the energy-constrained ULV designs and the emerging architectures. The proposed techniques are demonstrated in three test chips: (1) R-Processor: A 0.4V resilient processor with a voltage-scalable and low-overhead in-situ EDAC technique. It achieves 38% energy efficiency improvement or 2.3X throughput improvement as compared to the traditional safety margin approach. (2) A 450mV timing-margin-free waveform sorter for brain computer interface (BCI) microsystem. It achieves 49.3% higher energy efficiency and 35.6% higher throughput than the traditional safety margin approach. (3) Ultra-low-power and robust power-management system which consists of a microprocessor employing ULV EDAC, 63-ratio integrated switched-capacitor DC-DC converter, and a fully-digital error based regulation controller. In this thesis, we also explore circuits for emerging techniques. The first is temperature sensors for dynamic-thermal-management (DTM). The modern high-performance microprocessors suffer from ever-increasing power densities which has led to reliability concerns and increased cooling costs from excessive heat. In order to monitor and manage the thermal behavior, DTM techniques embed multiple temperature sensors and use its information. The size, accuracy, and voltage-scalability of the sensor are critical for the performance of DTM. Therefore, we propose a temperature sensor that directly senses transistor threshold voltage and the test chip demonstrates 9X smaller area, 3X higher accuracy, and 200mV lower voltage scalability (down to 400mV) than the previous state-of-art. Another area of exploration is interconnect design for ultra-dynamic-voltage-scaling (UDVS) systems. UDVS has been proposed for applications that require both high performance and high energy efficiency. UDVS can provide peak performance with nominal supply voltage when work load is high. When work load is moderate or low, UDVS systems can switch to ULV operation for higher energy efficiency. One of the critical challenges for developing UDVS systems is the inflexibility in various circuit fabrics that are often optimized for a single supply voltage. One critical example is conventional repeater based long interconnects which suffers from non-optimal performance and energy efficiency in UDVS systems. Therefore, in this thesis, we propose a reconfigurable interconnect design based on regenerators and demonstrate near optimal performance and energy efficiency across the supply voltage of 0.3V and 1V

Columbia University Academic Commons

Portable Computer Technology (PCT) Research and Development Program Phase 2

Author: Castillo Michael
McGuire Kenyon
Sorgi Alan
Publication venue
Publication date
Field of study

The subject of this project report, focused on: (1) Design and development of two Advanced Portable Workstation 2 (APW 2) units. These units incorporate advanced technology features such as a low power Pentium processor, a high resolution color display, National Television Standards Committee (NTSC) video handling capabilities, a Personal Computer Memory Card International Association (PCMCIA) interface, and Small Computer System Interface (SCSI) and ethernet interfaces. (2) Use these units to integrate and demonstrate advanced wireless network and portable video capabilities. (3) Qualification of the APW 2 systems for use in specific experiments aboard the Mir Space Station. A major objective of the PCT Phase 2 program was to help guide future choices in computing platforms and techniques for meeting National Aeronautics and Space Administration (NASA) mission objectives. The focus being on the development of optimal configurations of computing hardware, software applications, and network technologies for use on NASA missions

NASA Technical Reports Server

CMOS SPAD-based image sensor for single photon counting and time of flight imaging

Author: Dutton Neale Arthur William
Publication venue: The University of Edinburgh
Publication date: 27/06/2016
Field of study

The facility to capture the arrival of a single photon, is the fundamental limit to the detection of quantised electromagnetic radiation. An image sensor capable of capturing a picture with this ultimate optical and temporal precision is the pinnacle of photo-sensing. The creation of high spatial resolution, single photon sensitive, and time-resolved image sensors in complementary metal oxide semiconductor (CMOS) technology offers numerous benefits in a wide field of applications. These CMOS devices will be suitable to replace high sensitivity charge-coupled device (CCD) technology (electron-multiplied or electron bombarded) with significantly lower cost and comparable performance in low light or high speed scenarios. For example, with temporal resolution in the order of nano and picoseconds, detailed three-dimensional (3D) pictures can be formed by measuring the time of flight (TOF) of a light pulse. High frame rate imaging of single photons can yield new capabilities in super-resolution microscopy. Also, the imaging of quantum effects such as the entanglement of photons may be realised. The goal of this research project is the development of such an image sensor by exploiting single photon avalanche diodes (SPAD) in advanced imaging-specific 130nm front side illuminated (FSI) CMOS technology. SPADs have three key combined advantages over other imaging technologies: single photon sensitivity, picosecond temporal resolution and the facility to be integrated in standard CMOS technology. Analogue techniques are employed to create an efficient and compact imager that is scalable to mega-pixel arrays. A SPAD-based image sensor is described with 320 by 240 pixels at a pitch of 8μm and an optical efficiency or fill-factor of 26.8%. Each pixel comprises a SPAD with a hybrid analogue counting and memory circuit that makes novel use of a low-power charge transfer amplifier. Global shutter single photon counting images are captured. These exhibit photon shot noise limited statistics with unprecedented low input-referred noise at an equivalent of 0.06 electrons. The CMOS image sensor (CIS) trends of shrinking pixels, increasing array sizes, decreasing read noise, fast readout and oversampled image formation are projected towards the formation of binary single photon imagers or quanta image sensors (QIS). In a binary digital image capture mode, the image sensor offers a look-ahead to the properties and performance of future QISs with 20,000 binary frames per second readout with a bit error rate of 1.7 x 10-3. The bit density, or cumulative binary intensity, against exposure performance of this image sensor is in the shape of the famous Hurter and Driffield densitometry curves of photographic film. Oversampled time-gated binary image capture is demonstrated, capturing 3D TOF images with 3.8cm precision in a 60cm range

Edinburgh Research Archive