22 research outputs found
Voltage-to-Time Converter for High-Speed Time-Based Analog-to-Digital Converters
In modern complementary metal oxide semiconductor (CMOS) technologies, the supply voltage scales faster than the threshold voltage (Vth) of the transistors in successive smaller nodes. Moreover, the intrinsic gain of the transistors diminishes as well. Consequently, these issues increase the difficulty of designing higher speed and larger resolution analog-to-digital converters (ADCs) employing voltage-domain ADC architectures. Nevertheless, smaller transistor dimensions in state-of-the-art CMOS technologies leads to reduced capacitance, resulting in lower gate delays. Therefore, it becomes beneficial to first convert an input voltage to a 'time signal' using a voltage-to-time converter (VTC), instead of directly converting it into a digital output. This 'time-signal' could then be converted to a digital output through a time-to-digital converter (TDC) for complete analog-to-digital conversion. However, the overall performance of such an ADC will still be limited to the performance level of the voltage-to-time conversion process.
Hence, this thesis presents the design of a linear VTC for a high-speed time-based ADC in 28 nm CMOS process. The proposed VTC consists of a sample-and-hold (S/H) circuit, a ramp generator and a comparator to perform the conversion of the input signal from the voltage to the time domain. Larger linearity is attained by integrating a constant current (with high output impedance) over a capacitor, generating a linear ramp. The VTC operates at 256 MSPS consuming 1.3 mW from 1 V supply with a full-scale 1 V pk-pk differential input signal, while achieving a time-domain output signal with a spurious-free-dynamic-range (SFDR) of 77 dB and a signal-to-noise-and-distortion ratio (SNDR) of 56 dB at close to Nyquist frequency (f = 126.5 MHz). The proposed VTC attains an output range of 2.7 ns, which is the highest linear output range for a VTC at this speed, published to date
Low Power Decoding Circuits for Ultra Portable Devices
A wide spread of existing and emerging battery driven wireless devices do not necessarily demand high data rates. Rather, ultra low power, portability and low cost are the most desired characteristics. Examples of such applications are wireless sensor networks (WSN), body area networks (BAN), and a variety of medical implants and health-care aids. Being small, cheap and low power for the individual transceiver nodes, let those to be used in abundance in remote places, where access for maintenance or recharging the battery is limited. In such scenarios, the lifetime of the battery, in most cases, determines the lifetime of the individual nodes. Therefore, energy consumption has to be so low that the nodes remain operational for an extended period of time, even up to a few years. It is known that using error correcting codes (ECC) in a wireless link can potentially help to reduce the transmit power considerably. However, the power consumption of the coding-decoding hardware itself is critical in an ultra low power transceiver node. Power and silicon area overhead of coding-decoding circuitry needs to be kept at a minimum in the total energy and cost budget of the transceiver node. In this thesis, low power approaches in decoding circuits in the framework of the mentioned applications and use cases are investigated. The presented work is based on the 65nm CMOS technology and is structured in four parts as follows: In the first part, goals and objectives, background theory and fundamentals of the presented work is introduced. Also, the ECC block in coordination with its surrounding environment, a low power receiver chain, is presented. Designing and implementing an ultra low power and low cost wireless transceiver node introduces challenges that requires special considerations at various levels of abstraction. Similarly, a competitive solution often occurs after a conclusive design space exploration. The proposed decoder circuits in the following parts are designed to be embedded in the low power receiver chain, that is introduced in the first part. Second part, explores analog decoding method and its capabilities to be embedded in a compact and low power transceiver node. Analog decod- ing method has been theoretically introduced over a decade ago that followed with early proof of concept circuits that promised it to be a feasible low power solution. Still, with the increased popularity of low power sensor networks, it has not been clear how an analog decoding approach performs in terms of power, silicon area, data rate and integrity of calculations in recent technologies and for low data rates. Ultra low power budget, small size requirement and more relaxed demands on data rates suggests a decoding circuit with limited complexity. Therefore, the four-state (7,5) codes are considered for hardware implementation. Simulations to chose the critical design factors are presented. Consequently, to evaluate critical specifications of the decoding circuit, three versions of analog decoding circuit with different transistor dimensions fabricated. The measurements results reveal different trade-off possibilities as well as the potentials and limitations of the analog decoding approach for the target applications. Measurements seem to be crucial, since the available computer-aided design (CAD) tools provide limited assistance and precision, given the amount of calculations and parameters that has to be included in the simulations. The largest analog decoding core (AD1) takes 0.104mm2 on silicon and the other two (AD2 and AD3) take 0.035mm2 and 0.015mm2, respectively. Consequently, coding gain in trade-off with silicon area and throughput is presented. The analog decoders operate with 0.8V supply. The achieved coding gain is 2.3 dB at bit error rates (BER)=0.001 and 10 pico-Joules per bit (pJ/b) energy efficiency is reached at 2 Mbps. Third part of this thesis, proposes an alternative low power digital decoding approach for the same codes. The desired compact and low power goal has been pursued by designing an equivalent digital decoding circuit that is fabricated in 65nm CMOS technology and operates in low voltage (near-threshold) region. The architecture of the design is optimized in system and circuit levels to propose a competitive digital alternative. Similarly, critical specifications of the decoder in terms of power, area, data rate (speed) and integrity are reported according to the measurements. The digital implementation with 0.11mm2 area, consumes minimum energy at 0.32V supply which gives 9 pJ/b energy efficiency at 125 kb/s and 2.9 dB coding gain at BER=0.001. The forth and last part, compares the proposed design alternatives based on the fabricated chips and the results attained from the measurements to conclude the most suitable solution for the considered target applications. Advantages and disadvantages of both approaches are discussed. Possible extensions of this work is introduced as future work
Techniques for Energy-Efficient Communication Pipeline Design
The performance of many modern computer and
communication systems is dictated by the latency of communication
pipelines. At the same time, power/energy consumption
is often another limiting factor in many portable systems. We
address the problem of how to minimize the power consumption
in system-level pipelines under latency constraints. In particular,
we apply fragmentation technique to achieve parallelism and
exploit advantages provided by variable voltage design methodology
to optimally select voltage and, therefore, speed of each
pipeline stage.We focus our study on the practical case when each
pipeline stage operates at a fixed speed. Unlike the conventional
pipeline system, where all stages run at the same speed, our
system may have different stages running at different speeds to
conserve energy while providing guaranteed latency. For a given
latency requirement, we find explicit solutions for the most energy
efficient fragmentation and voltage setting. We further study a
less practical case when each stage can dynamically change its
speed to get further energy saving. We define the problem and
transform it to a nonlinear system whose solution provides a lower
bound for energy consumption. We apply the obtained theoretical
results to develop algorithms for power/energy minimization of
computer and communication systems. The experimental result
suggests that significant power/energy reduction, is possible
without additional latency. In fact, we achieve almost 40% total
energy saving over the combined minimal supply voltage selection
and system shut-down technique and 85% if none of these two
energy minimization methods is used
Recommended from our members
Design and implementation of Radix-3/Radix-2 based novel hybrid SAR ADC in scaled CMOS technologies
This thesis focuses on low power and high speed design techniques for successive
approximation register (SAR) analog-to-digital converters (ADCs) in nanoscale
CMOS technologies. SAR ADCs’ speed is limited by the number of bits of
resolution. An N-bit conventional SAR ADC takes N conversion cycles. To speed
up the conversion process, we introduce a radix-3 SAR ADC which can compute
1:6 bits per cycle. To our knowledge, it is the first fully programmable and efficiently
hardware controlled radix-3 SAR ADC. We had to use two comparators per
cycle due to ADC architecture and we proposed a simple calibration scheme for
the comparators. Also, as the architecture of the DAC array is completely different
from the architecture of conventional radix-2 SAR ADC’s DAC arrays, we came up
with an algorithm for calibration of capacitors of the DAC.
Low power SAR ADCs face two major challenges especially at high resolutions:
(1) increased comparator power to suppress the noise, and (2) increased
DAC switching energy due to the large DAC size. Due to our proposed architecture,the radix-3 SAR ADC uses two comparators per cycle and two differential DACs.
To improve the comparator’s power efficiency, an efficient and low cost calibration
technique has been introduced. It allows a low power and noisy comparator to
achieve high signal-to-noise ratio (SNR).
To improve the DAC switching energy, we introduced a radix-3/radix-2
based novel hybrid SAR ADC. We use two single ended DACs for radix-3 SAR
ADC and these two single ended DACs can be used as one differential DAC for
radix-2 SAR ADC. So, overall, we only have a single DAC as conventional radix-
2 SAR ADC. In addition, a monotonic switching technique is adopted for radix-2
search to reduce the DAC capacitor size and hence, to reduce switching power. It
can reduce the total number of unit capacitors by four times. Our proposed hybrid
SAR ADC can achieve less DAC energy compared to radix-3 and radix-2 SAR
ADCs. Also, to utilize technology scaling, we used the minimum capacitor size
allowed by thermal noise limitations. To achieve high resolution, we introduced
calibration algorithm for the DAC array.
As mentioned earlier, the radix-3 SAR ADC offers higher power than conventional
radix-2 SAR ADC because of simultaneous use of two comparators. In
the proposed hybrid SAR ADC, we will be using radix-3 search for first few MSB
bits. So, the resolution required for radix-3 comparators are much larger than the
LSB value of 10-bit ADC. By implementing calibration of comparators, we can
use low power, high input referred offset and high speed comparators for radix-3
search. Radix-2 search will be used for rest of the bits and the resolution of the
radix-2 comparator has to be less than the required LSB value. So, a high power, low input referred offset and high speed comparator is used for radix-2 search.
Also, we introduced clock gating for comparators. So, radix-3 comparators will not
toggle during radix-2 search and the radix-2 comparators will be inactive during
radix-3 search. By using the aforementioned techniques, the overall comparator
power is definitely less than a radix-3 SAR ADC and comparable to a conventional
radix-2 SAR ADC.
A prototype radix-3/radix-2 based hybrid SAR ADC with the proposed
technique is designed and fabricated in 40nm CMOS technology. It achieves an
SNDR of 56.9 dB and consumes only 0.38 mW power at 30MS/s, leading to a
Walden figure of merit of 21.5 fJ/conv-step.Electrical and Computer Engineerin
Parametric analog signal amplification applied to nanoscale cmos wireless digital transceivers
Thesis presented in partial fulfillment of the requirements for the degree of Doctor
of Philosophy in the subject of Electrical and Computer Engineering by the Universidade Nova de Lisboa,Faculdade de Ciências e TecnologiaSignal amplification is required in almost every analog electronic system. However
noise is also present, thus imposing limits to the overall circuit performance, e.g., on
the sensitivity of the radio transceiver. This drawback has triggered a major research
on the field, which has been producing several solutions to achieve amplification with minimum added noise. During the Fifties, an interesting out of mainstream path was followed which was based on variable reactance instead of resistance based amplifiers.
The principle of these parametric circuits permits to achieve low noise amplifiers since
the controlled variations of pure reactance elements is intrinsically noiseless. The
amplification is based on a mixing effect which enables energy transfer from an AC
pump source to other related signal frequencies.
While the first implementations of these type of amplifiers were already available at that time, the discrete-time version only became visible more recently. This discrete-time version is a promising technique since it is well adapted to the mainstream nanoscale CMOS technology. The technique itself is based on the principle of changing the surface potential of the MOS device while maintaining the transistor gate in a floating state.
In order words, the voltage amplification is achieved by changing the capacitance value
while maintaining the total charge unchanged during an amplification phase.
Since a parametric amplifier is not intrinsically dependent on the transconductance of the MOS transistor, it does not directly suffer from the intrinsic transconductance MOS gain issues verified in nanoscale MOS technologies. As a consequence, open-loop and opamp free structures can further emerge with this additional contribution.
This thesis is dedicated to the analysis of parametric amplification with special emphasis on the MOS discrete-time implementation. The use of the latter is supported on the presentation of several circuits where the MOS Parametric Amplifier cell is well suited:
small gain amplifier, comparator, discrete-time mixer and filter, and ADC. Relatively to the latter, a high speed time-interleaved pipeline ADC prototype is implemented in a,standard 130 nm CMOS digital technology from United Microelectronics Corporation (UMC). The ADC is fully based on parametric MOS amplification which means that one could achieve a compact and MOS-only implementation. Furthermore, any high
speed opamp has not been used in the signal path, being all the amplification steps
implemented with open-loop parametric MOS amplifiers. To the author’s knowledge,
this is first reported pipeline ADC that extensively used the parametric amplification
concept.Fundação para a Ciência e Tecnologia through
the projects SPEED, LEADER and IMPAC
Computing with Spintronics: Circuits and architectures
This thesis makes the following contributions towards the design of computing platforms with spintronic devices. 1) It explores the use of spintronic memories in the design of a domain-specific processor for an emerging class of data-intensive applications, namely recognition, mining and synthesis (RMS). Two different spintronic memory technologies — Domain Wall Memory (DWM) and STT-MRAM — are utilized to realize the different levels in the memory hierarchy of the domain-specific processor, based on their respective access characteristics. Architectural tradeoffs created by the use of spintronic memories are analyzed. The proposed design achieves 1.5X-4X improvements in energy-delay product compared to a CMOS baseline. 2) It describes the first attempt to use DWM in the cache hierarchy of general-purpose processors. DWM promises unparalleled density by packing several bits of data into each bit-cell. TapeCache, the proposed DWM-based cache architecture, utilizes suitable circuit and architectural optimizations to address two key challenges (i) the high energy and latency requirement of write operations and (ii) the need for shift operations to access the data stored in each DWM bit-cell. At the circuit level, DWM bit-cells that are tailored to the distinct design requirements of different levels in the cache hierarchy are proposed. At the architecture level, TapeCache proposes suitable cache organization and management policies to alleviate the performance impact of shift operations required to access data stored in DWM bit-cells. TapeCache achieves more than 7X improvements in both cache area and energy with virtually identical performance compared to an SRAM-based cache hierarchy. 3) It investigates the design of the on-chip memory hierarchy of general-purpose graphics processing units (GPGPUs)—massively parallel processors that are optimized for data-intensive high-throughput workloads—using DWM. STAG, a high density, energy-efficient Spintronic- Tape Architecture for GPGPU cache hierarchies is described. STAG utilizes different DWM bit-cells to realize different memory arrays in the GPGPU cache hierarchy. To address the challenge of high access latencies due to shifts, STAG predicts upcoming cache accesses by leveraging unique characteristics of GPGPU architectures and workloads, and prefetches data that are both likely to be accessed and require large numbers of shift operations. STAG achieves 3.3X energy reduction and 12.1% performance improvement over CMOS SRAM under iso-area conditions. 4) While the potential of spintronic devices for memories is widely recognized, their utility in realizing logic is much less clear. The thesis presents Spintastic, a new paradigm that utilizes Stochastic Computing (SC) to realize spintronic logic. In SC, data is encoded in the form of pseudo-random bitstreams, such that the probability of a \u271\u27 in a bitstream corresponds to the numerical value that it represents. SC can enable compact, low-complexity logic implementations of various arithmetic functions. Spintastic establishes the synergy between stochastic computing and spin-based logic by demonstrating that they mutually alleviate each other\u27s limitations. On the one hand, various building blocks of SC, which incur significant overheads in CMOS implementations, can be efficiently realized by exploiting the physical characteristics of spin devices. On the other hand, the reduced logic complexity and low logic depth of SC circuits alleviates the shortcomings of spintronic logic. Based on this insight, the design of spin-based stochastic arithmetic circuits, bitstream generators, bitstream permuters and stochastic-to-binary converter circuits are presented. Spintastic achieves 7.1X energy reduction over CMOS implementations for a wide range of benchmarks from the image processing, signal processing, and RMS application domains. 5) In order to evaluate the proposed spintronic designs, the thesis describes various device-to-architecture modeling frameworks. Starting with devices models that are calibrated to measurements, the characteristics of spintronic devices are successively abstracted into circuit-level and architectural models, which are incorporated into suitable simulation frameworks. (Abstract shortened by UMI.
Direct digital synthesizers : theory, design and applications
Traditional designs of high bandwidth frequency synthesizers employ the use of a phase-locked-loop (PLL). A direct digital synthesizer (DDS) provides many significant advantages over the PLL approaches. Fast settling time, sub-Hertz frequency resolution, continuous-phase switching response and low phase noise are features easily obtainable in the DDS systems. Although the principle of the DDS has been known for many years, the DDS did not play a dominant role in wideband frequency generation until recent years. Earlier DDSs were limited to produce narrow bands of closely spaced frequencies, due to limitations of digital logic and D/A-converter technologies. Recent advantages in integrated circuit (IC) technologies have brought about remarkable progress in this area. By programming the DDS, adaptive channel bandwidths, modulation formats, frequency hopping and data rates are easily achieved. This is an important step towards a "software-radio" which can be used in various systems. The DDS could be applied in the modulator or demodulator in the communication systems. The applications of DDS are restricted to the modulator in the base station. The aim of this research was to find an optimal front-end for a transmitter by focusing on the circuit implementations of the DDS, but the research also includes the interface to baseband circuitry and system level design aspects of digital communication systems.
The theoretical analysis gives an overview of the functioning of DDS, especially with respect to noise and spurs. Different spur reduction techniques are studied in detail. Four ICs, which were the circuit implementations of the DDS, were designed. One programmable logic device implementation of the CORDIC based quadrature amplitude modulation (QAM) modulator was designed with a separate D/A converter IC. For the realization of these designs some new building blocks, e.g. a new tunable error feedback structure and a novel and more cost-effective digital power ramp generator, were developed.reviewe
Recommended from our members
Energy Harvesting and Power Management Integrated Circuits for Self-Sustaining Wearables
Harvesting energy from ambient sources can provide power autonomy to energy efficient electronics and sensors. The last decade has seen a multitude of ways to scavenge energy from various sources like solar, thermal, electromagnetic, electrostatic, piezo-electric and many more. Thermal energy from human body heat is ubiquitous and can be harnessed seamlessly across day and night. Micropower generation from human body heat using thermoelectric generators (TEG) can replace battery to power miniaturized, unobtrusive, energy-efficient wearable devices for preventive health care and vital body signs monitoring and make them self-sustainable. This thesis is focused in realizing such a system and presents different integrated power management circuit techniques to solve the primary challenges associated with energy harvesting from human body heat.
The first part of the thesis demonstrates an on-chip electrical cold-start technique to achieve low-voltage and fast start-up of a boost converter for autonomous thermal energy harvesting from human body heat. Improved charge transfer through high gate-boosted switches by means of cross-coupled complementary charge pumps enables voltage multiplication of the low input voltage during cold start. The start-up voltage multiplier operates with an on-chip clock generated by an ultra-low-voltage ring oscillator. The proposed cold-start scheme implemented in a general-purpose 0.18 µm CMOS process assists an inductive boost converter to start operation with a minimum input voltage of 57 mV in 135 ms, while consuming only 90 nJ of energy from the harvesting source, without using additional sources of energy or additional off-chip components.
A single-inductor, self-starting and efficient low-voltage boost converter is described next, suitable for TEG-based body-heat energy harvesting. In order to extract maximum energy from a thermoelectric generator (TEG) at small temperature gradient, a loss-optimized maximum power transfer (LO-MPT) scheme is proposed that enables the harvester to achieve high end-to-end efficiency at small input voltages. The boost converter is implemented in a 0.18 µm CMOS technology and achieves above 75% efficiency for a matched input voltage range of 15 mV-100 mV, with a peak efficiency of 82%. Enhanced power extraction enables the converter to sustain operation at an input voltage as low as 3.5 mV. In addition, the boost converter self-starts in 252 ms with a minimum input voltage of 50 mV utilizing a dual-path architecture and a one-shot cold-start mechanism.
The final section demonstrates a self-sustainable system where a low-power signal conditioning front-end with a unique dynamic threshold tracking loop is designed to decode heart beats from a noisy ECG signal and is powered by human body heat utilizing an autonomous DC-DC converter embedded in the same chip and an off-chip centimeter-scale TEG