678 research outputs found
LOW-JITTER AND LOW-SPUR RING-OSCILLATOR-BASED PHASE-LOCKED LOOPS
Department of Electrical EngineeringIn recent years, ring-oscillator based clock generators have drawn a lot of attention due to the merits of high area efficiency, potentially wide tuning range, and multi-phase generation. However, the key challenge is how to suppress the poor jitter of ring oscillators. There have been many efforts to develop a ring-oscillator-based clock generator targeting very low-jitter performance. However, it remains difficult for conventional architectures to achieve both low RMS jitter and low levels of reference spurs concurrently while having a high multiplication factor. In this dissertation, a time-domain analysis is presented that provides an intuitive understanding of RMS jitter calculation of the clock generators from their phase-error correction mechanisms. Based on this analysis, we propose new designs of a ring-oscillator-based PLL that addresses the challenges of prior-art ring-based architectures.
This dissertation introduces a ring-oscillator-based PLL with the proposed fast phase-error correction (FPEC) technique, which emulates the phase-realignment mechanism of an injection-locked clock multiplier (ILCM). With the FPEC technique, the phase error of the voltage-controlled oscillator (VCO) is quickly removed, achieving ultra-low jitter. In addition, in the transfer function of the proposed architecture, an intrinsic integrator is involved since it is naturally based on a PLL topology. The proposed PLL can thus have low levels of reference spurs while maintaining high stability even for a large multiplication factor.
Furthermore, it presents another design of a digital PLL embodying the FPEC technique (or FPEC DPLL). To overcome the problem of a conventional TDC, a low-power optimally-spaced (OS) TDC capable of effectively minimizing the quantization error is presented. In the proposed FPEC DPLL, background digital controllers continuously calibrate the decision thresholds and the gain of the error correction by the loop to be optimal, thus dramatically reducing the quantization error. Since the proposed architecture is implemented in a digital fashion, the variables defining the characteristics of the loop can be easily estimated and calibrated by digital calibrators. As a result, the performances of an ultra-low jitter and the figure-of-merit can be achieved.clos
Recommended from our members
Architectures and Circuits Leveraging Injection-Locked Oscillators for Ultra-Low Voltage Clock Synthesis and Reference-less Receivers for Dense Chip-to-Chip Communications
High performance computing is critical for the needs of scientific discovery and economic competitiveness. An extreme-scale computing system at 1000x the performance of today’s petaflop machines will exhibit massive parallelism on multiple vertical fronts, from thousands of computational units on a single processor to thousands of processors in a single data center. To facilitate such a massively-parallel extreme-scale computing, a key challenge is power. The challenge is not power associated with base computation but rather the problem of transporting data from one chip to another at high enough rates. This thesis presents architectures and techniques to achieve low power and area footprint while achieving high data rates in a dense very-short reach (VSR) chip-to-chip (C2C) communication network. High-speed serial communication operating at ultra-low supplies improves the energy-efficiency and lowers the power envelop of a system doing an exaflop of loops. One focus area of this thesis is clock synthesis for such energy-efficient interconnect applications operating at high speeds and ultra-low supplies. A sub-integer clockfrequency synthesizer is presented that incorporates a multi-phase injection-locked ring-oscillator-based prescaler for operation at an ultra-low supply voltage of 0.5V, phase-switching based programmable division for sub-integer clock-frequency synthesis, and automatic calibration to ensure injection lock. A record speed of 9GHz has been demonstrated at 0.5V in 45nm SOI CMOS. It consumes 3.5mW of power at 9.12GHz and 0.052 of area, while showing an output phase noise of -100dBc/Hz at 1MHz offset and RMS jitter of 325fs; it achieves a net of -186.5 in a 45-nm SOI CMOS process. This thesis also describes a receiver with a reference-less clocking architecture for high-density VSR-C2C links. This architecture simplifies clock-tree planning in dense extreme-scaling computing environments and has high-bandwidth CDR to enable SSC for suppressing EMI and to mitigate TX jitter requirements. It features clock-less DFE and a high-bandwidth CDR based on master-slave ILOs for phase generation/rotation. The RX is implemented in 14nm CMOS and characterized at 19Gb/s. It is 1.5x faster that previous reference-less embedded-oscillator based designs with greater than 100MHz jitter tolerance bandwidth and recovers error-free data over VSR-C2C channels. It achieves a power-efficiency of 2.9pJ/b while recovering error-free data (BER 200MHz and the INL of the ILO-based phase-rotator (32- Steps/UI) is <1-LSB. Lastly, this thesis develops a time-domain delay-based modeling of injection locking to describe injection-locking phenomena in nonharmonic oscillators. The model is used to predict the locking bandwidth, and the locking dynamics of the locked oscillator. The model predictions are verified against simulations and measurements of a four-stage differential ring oscillator. The model is further used to predict the injection-locking behavior of a single-ended CMOS inverter based ring oscillator, the lock range of a multi-phase injection-locked ring-oscillator-based prescaler, as well as the dynamics of tracking injection phase perturbations in injection-locked masterslave oscillators; demonstrating its versatility in application to any nonharmonic oscillator
Development of high speed integrated circuit for very high resolution timing measurements
A multi-channel high-precision low-power time-to-digital converter application specific integrated circuit for high energy physics applications has been designed and implemented in a 130 nm CMOS process. To reach a target resolution of 24.4 ps, a novel delay element has been conceived. This nominal resolution has been experimentally verified with a prototype, with a minimum resolution of 19 ps. To further improve the resolution, a new interpolation scheme has been described. The ASIC has been designed to use a reference clock with the LHC bunch crossing frequency of 40MHz and generate all required timing signals internally, to ease to use within the framework of an LHC upgrade. Special care has been taken to minimise the power consumption
ULTRA-LOW-JITTER, MMW-BAND FREQUENCY SYNTHESIZERS BASED ON A CASCADED ARCHITECTURE
Department of Electrical EngineeringThis thesis presents an ultra-low-jitter, mmW-band frequency synthesizers based on a cascaded
architecture. First, the mmW-band frequency synthesizer based on a CP PLL is presented. At the
first stage, the CP PLL operating at GHz-band frequencies generated low-jitter output signals due
to a high-Q VCO. At the second stage, an ILFM operating at mmW-band frequencies has a wide
injection bandwidth, so that the jitter performance of the mmW-band output signals is determined
by the GHz-range PLL. The proposed ultra-low-jitter, mmW-band frequency synthesizer based on
a CP PLL, fabricated in a 65-nm CMOS technology, generated output signals from GHz-band
frequencies to mmW-band frequencies, achieving an RMS jitter of 206 fs and an IPN of ???31 dBc.
The active silicon area and the total power consumption were 0.32 mm2 and 42 mW, respectively.
However, due to a large in-band phase noise contribution of a PFD and a CP in the CP PLL, this
first stage was difficult to achieve an ultra-low in-band phase noise. Second, to improve the in-band
phase noise further, the mmW-band frequency synthesizer based on a digital SSPLL is presented.
At the first stage, the digital SSPLL operating at GHz-band frequencies generated ultra-low-jitter
output signals due to its sub-sampling operation and a high-Q GHz VCO. To minimize the
quantization noise of the voltage quantizer in the digital SSPLL, this thesis presents an OSVC as a
voltage quantizer while a small amount of power was consumed. The proposed ultra-low-jitter,
mmW-band frequency synthesizer fabricated in a 65-nm CMOS technology, generated output
signals from GHz-band frequencies to mmW-band frequencies, achieving an RMS jitter of 77 fs
and an IPN of ???40 dBc. The active silicon area and the total power consumption were 0.32 mm2 and
42 mW, respectively.clos
DLWUC: Distance and Load Weight Updated Clustering-Based Clock Distribution for SOC Architecture
High-clock skew variations and degradation of driving ability of buffers lead to an additional power dissipation in Clock Distribution Network (CDN) that increases the dimensionality of buffers and coordination among flip-flops. The manual threshold level to predict the Region of Interest (ROI) is not applicable in clustering process due to the complexities of excessive wire length and critical delay. This paper proposes the Distance and Load Weight Updated Clustering (DLWUC) to determine the suitable position of logical components. Initially, the DLWUC utilizes the Hybrid Weighted Distance (HWD) to estimate the distance and construct the distance matrix. The weight value extracted from the sorted distance matrix facilitates the projection of buffers. The updated weight value serves as the base for clustering with labeled outputs. The placement of buffer at the suitable place from load weight updated clustering provides the necessary trade-off between clock provision and load balance. The DLWUC discussed in this paper reduces the size of buffers, skew, power and latency compared to the existing topologies
Sub-Picosecond Jitter Clock Generation for Time Interleaved Analog to Digital Converter
Nowadays, Multi-GHz analog-to-digital converters (ADCs) are becoming more and more popular in radar systems, software-defined radio (SDR) and wideband communications, because they can realize much higher operation speed through using many interleaved sub-ADCs to relax ADC sampling rates. Although the time interleaved ADC has some issues such as gain mismatch, offset mismatch and timing skew between each ADC channel, these deterministic errors can be solved by previous works such as digital calibration technique. However, time-interleaved ADCs require a precise sample clock to achieve an acceptable effective-numberof-bits (ENOB) which can be degraded by jitter in the sample clock. The clock generation circuits presented in this work achieves sub-picosecond jitter performance in 180nm CMOS which is suitable for time-interleaved ADC. Two different test chips were fabricated in 180nm CMOS to investigate the low jitter design technique. The low jitter delay line in two chips were designed in two different ways, but both of them utilized the low jitter design technique. In first test chip, the measured RMS jitter is 0.1061ps for each delay stage. The second chip uses the proposed low jitter Delay-Locked Loop can work from 80MHz to 120MHz, which means it can provide the time interleaved ADC with 2.4GHz to 3.6GHz low jitter sample clock, the measured delay stage jitter performance in second test chip is 0.1085ps
Energy-efficient wireline transceivers
Power-efficient wireline transceivers are highly demanded by many applications in high performance computation and communication systems. Apart from transferring a wide range of data rates to satisfy the interconnect bandwidth requirement, the transceivers have very tight power budget and are expected to be fully integrated. This thesis explores enabling techniques to implement such transceivers in both circuit and system levels. Specifically, three prototypes will be presented: (1) a 5Gb/s reference-less clock and data recovery circuit (CDR) using phase-rotating phase-locked loop (PRPLL) to conduct phase control so as to break several fundamental trade-offs in conventional receivers; (2) a 4-10.5Gb/s continuous-rate CDR with novel frequency acquisition scheme based on bang-bang phase detector (BBPD) and a ring oscillator-based fractional-N PLL as the low noise wide range DCO in the CDR loop; (3) a source-synchronous energy-proportional link with dynamic voltage and frequency scaling (DVFS) and rapid on/off (ROO) techniques to cut the link power wastage at system level. The receiver/transceiver architectures are highly digital and address the requirements of new receiver architecture development, wide operating range, and low power/area consumption while being fully integrated. Experimental results obtained from the prototypes attest the effectiveness of the proposed techniques
Digital Centric Multi-Gigabit SerDes Design and Verification
Advances in semiconductor manufacturing still lead to ever decreasing feature sizes and constantly allow higher degrees of integration in application specific integrated circuits (ASICs). Therefore the bandwidth requirements on the external interfaces of such systems on chips (SoC) are steadily growing. Yet, as the number of pins on these ASICs is not increasing in the same pace - known as pin limitation - the bandwidth per pin has to be increased.
SerDes (Serializer/Deserializer) technology, which allows to transfer data serially at very high data rates of 25Gbps and more is a key technology to overcome pin limitation and exploit the computing power that can be achieved in todays SoCs. As such SerDes blocks together with the digital logic interfacing them form complex mixed signal systems, verification of performance and functional correctness is very challenging.
In this thesis a novel mixed-signal design methodology is proposed, which tightly couples model and implementation in order to ensure consistency throughout the design cycles and hereby accelerate the overall implementation flow. A tool flow that has been developed is presented, which integrates well into state of the art electronic design automation (EDA) environments and enables the usage of this methodology in practice.
Further, the design space of todays high-speed serial links is analyzed and an architecture is proposed, which pushes complexity into the digital domain in order to achieve robustness, portability between manufacturing processes and scaling with advanced node technologies. The all digital phase locked loop (PLL) and clock data recovery (CDR), which have been developed are described in detail.
The developed design flow was used for the implementation of the SerDes architecture in a 28nm silicon process and proved to be indispensable for future projects
Design of energy efficient high speed I/O interfaces
Energy efficiency has become a key performance metric for wireline high speed I/O interfaces. Consequently, design of low power I/O interfaces has garnered large interest that has mostly been focused on active power reduction techniques at peak data rate. In practice, most systems exhibit a wide range of data transfer patterns. As a result, low energy per bit operation at peak data rate does not necessarily translate to overall low energy operation. Therefore, I/O interfaces that can scale their power consumption with data rate requirement are desirable. Rapid on-off I/O interfaces have a potential to scale power with data rate requirements without severely affecting either latency or the throughput of the I/O interface. In this work, we explore circuit techniques for designing rapid on-off high speed wireline I/O interfaces and digital fractional-N PLLs.
A burst-mode transmitter suitable for rapid on-off I/O interfaces is presented that achieves 6 ns turn-on time by utilizing a fast frequency settling ring oscillator in digital multiplying delay-locked loop and a rapid on-off biasing scheme for current mode output driver. Fabricated in 90 nm CMOS process, the prototype achieves 2.29 mW/Gb/s energy efficiency at peak data rate of 8 Gb/s. A 125X (8 Gb/s to 64 Mb/s) change in effective data rate results in 67X (18.29 mW to 0.27 mW) change in transmitter power consumption corresponding to only 2X (2.29 mW/Gb/s to 4.24 mW/Gb/s) degradation in energy efficiency for 32-byte long data bursts. We also present an analytical bit error rate (BER) computation technique for this transmitter under rapid on-off operation, which uses MDLL settling measurement data in conjunction with always-on transmitter measurements. This technique indicates that the BER bathtub width for 10^(−12) BER is 0.65 UI and 0.72 UI during rapid on-off operation and always-on operation, respectively.
Next, a pulse response estimation-based technique is proposed enabling burst-mode operation for baud-rate sampling receivers that operate over high loss channels. Such receivers typically employ discrete time equalization to combat inter-symbol interference. Implementation details are provided for a receiver chip, fabricated in 65nm CMOS technology, that demonstrates efficacy of the proposed technique. A low complexity pulse response estimation technique is also presented for low power receivers that do not employ discrete time equalizers.
We also present techniques for implementation of highly digital fractional-N PLL employing a phase interpolator based fractional divider to improve the quantization noise shaping properties of a 1-bit ∆Σ frequency-to-digital converter. Fabricated in 65nm CMOS process, the prototype calibration-free fractional-N Type-II PLL employs the proposed frequency-to-digital converter in place of a high resolution time-to-digital converter and achieves 848 fs rms integrated jitter (1 kHz-30 MHz) and -101 dBc/Hz in-band phase noise while generating 5.054 GHz output from 31.25 MHz input
Toward realizing power scalable and energy proportional high-speed wireline links
Growing computational demand and proliferation of cloud computing has placed high-speed
serial links at the center stage. Due to saturating energy efficiency improvements over the
last five years, increasing the data throughput comes at the cost of power consumption. Conventionally, serial link power can be reduced by optimizing individual building blocks such as
output drivers, receiver, or clock generation and distribution. However, this approach yields
very limited efficiency improvement. This dissertation takes an alternative approach toward
reducing the serial link power. Instead of optimizing the power of individual building blocks,
power of the entire serial link is reduced by exploiting serial link usage by the applications.
It has been demonstrated that serial links in servers are underutilized. On average, they
are used only 15% of the time, i.e. these links are idle for approximately 85% of the time.
Conventional links consume power during idle periods to maintain synchronization between
the transmitter and the receiver. However, by powering-off the link when idle and powering
it back when needed, power consumption of the serial link can be scaled proportionally to
its utilization. This approach of rapid power state transitioning is known as the rapid-on/off
approach. For the rapid-on/off to be effective, ideally the power-on time, off-state power,
and power state transition energy must all be close to zero. However, in practice, it is very
difficult to achieve these ideal conditions. Work presented in this dissertation addresses these
challenges.
When this research work was started (2011-12), there were only a couple of research papers
available in the area of rapid-on/off links. Systematic study or design of a rapid power state
transitioning in serial links was not available in the literature. Since rapid-on/off with
nanoseconds granularity is not a standard in any wireline communication, even the popular
test equipment does not support testing any such feature, neither any formal measurement methodology was available. All these circumstances made the beginning difficult. However,
these challenges provided a unique opportunity to explore new architectural techniques and
identify trade-offs. The key contributions of this dissertation are as follows.
The first and foremost contribution is understanding the underlying limitations of saturating energy efficiency improvements in serial links and why there is a compelling need to
find alternative ways to reduce the serial link power.
The second contribution is to identify potential power saving techniques and evaluate the
challenges they pose and the opportunities they present.
The third contribution is the design of a 5Gb/s transmitter with a rapid-on/off feature.
The transmitter achieves rapid-on/off capability in voltage mode output driver by using
a fast-digital regulator, and in the clock multiplier by accurate frequency pre-setting and
periodic reference insertion. To ease timing requirements, an improved edge replacement
logic circuit for the clock multiplier is proposed. Mathematical modeling of power-on time
as a function of various circuit parameters is also discussed. The proposed transmitter
demonstrates energy proportional operation over wide variations of link utilization, and is,
therefore, suitable for energy efficient links. Fabricated in 90nm CMOS technology, the
voltage mode driver, and the clock multiplier achieve power-on-time of only 2ns and 10ns,
respectively. This dissertation highlights key trade-off in the clock multiplier architecture,
to achieve fast power-on-lock capability at the cost of jitter performance.
The fourth contribution is the design of a 7GHz rapid-on/off LC-PLL based clock multi-
plier. The phase locked loop (PLL) based multiplier was developed to overcome the limita-
tions of the MDLL based approach. Proposed temperature compensated LC-PLL achieves
power-on-lock in 1ns.
The fifth and biggest contribution of this dissertation is the design of a 7Gb/s embedded
clock transceiver, which achieves rapid-on/off capability in LC-PLL, current-mode transmit-
ter and receiver. It was the first reported design of a complete transceiver, with an embedded
clock architecture, having rapid-on/off capability. Background phase calibration technique in
PLL and CDR phase calibration logic in the receiver enable instantaneous lock on power-on.
The proposed transceiver demonstrates power scalability with a wide range of link utiliza-
tion and, therefore, helps in improving overall system efficiency. Fabricated in 65nm CMOS technology, the 7Gb/s transceiver achieves power-on-lock in less than 20ns. The transceiver
achieves power scaling by 44x (63.7mW-to-1.43mW) and energy efficiency degradation by
only 2.2x (9.1pJ/bit-to-20.5pJ/bit), when the effective data rate (link utilization) changes
by 100x (7Gb/s-to-70Mb/s).
The sixth and final contribution is the design of a temperature sensor to compensate
the frequency drifts due to temperature variations, during long power-off periods, in the
fast power-on-lock LC-PLL. The proposed self-referenced VCO-based temperature sensor
is designed with all digital logic gates and achieves low supply sensitivity. This sensor is
suitable for integration in processor and DRAM environments. The proposed sensor works
on the principle of directly converting temperature information to frequency and finally
to digital bits. A novel sensing technique is proposed in which temperature information
is acquired by creating a threshold voltage difference between the transistors used in the
oscillators. Reduced supply sensitivity is achieved by employing junction capacitance, and
the overhead of voltage regulators and an external ideal reference frequency is avoided. The
effect of VCO phase noise on the sensor resolution is mathematically evaluated. Fabricated
in the 65nm CMOS process, the prototype can operate with a supply ranging from 0.85V
to 1.1V, and it achieves a supply sensitivity of 0.034oC/mV and an inaccuracy of ±0.9oC
and ±2.3oC from 0-100oC after 2-point calibration, with and without static nonlinearity
correction, respectively. It achieves a resolution of 0.3oC, resolution FoM of 0.3(nJ/conv)res2 ,
and measurement (conversion) time of 6.5μs
- …