83 research outputs found
Design of Energy-Efficient A/D Converters with Partial Embedded Equalization for High-Speed Wireline Receiver Applications
As the data rates of wireline communication links increases, channel impairments such as skin effect, dielectric loss, fiber dispersion, reflections and cross-talk become more pronounced. This warrants more interest in analog-to-digital converter (ADC)-based serial link receivers, as they allow for more complex and flexible back-end digital signal processing (DSP) relative to binary or mixed-signal receivers. Utilizing this back-end DSP allows for complex digital equalization and more bandwidth-efficient modulation schemes, while also displaying reduced process/voltage/temperature (PVT) sensitivity. Furthermore, these architectures offer straightforward design translation and can directly leverage the area and power scaling offered by new CMOS technology nodes. However, the power consumption of the ADC front-end and subsequent digital signal processing is a major issue. Embedding partial equalization inside the front-end ADC can potentially result in lowering the complexity of back-end DSP and/or decreasing the ADC resolution requirement, which results in a more energy-effcient receiver. This dissertation presents efficient implementations for multi-GS/s time-interleaved ADCs with partial embedded equalization. First prototype details a 6b 1.6GS/s ADC with a novel embedded redundant-cycle 1-tap DFE structure in 90nm CMOS. The other two prototypes explain more complex 6b 10GS/s ADCs with efficiently embedded feed-forward equalization (FFE) and decision feedback equalization (DFE) in 65nm CMOS. Leveraging a time-interleaved successive approximation ADC architecture, new structures for embedded DFE and FFE are proposed with low power/area overhead. Measurement results over FR4 channels verify the effectiveness of proposed embedded equalization schemes. The comparison of fabricated prototypes against state-of-the-art general-purpose ADCs at similar speed/resolution range shows comparable performances, while the proposed architectures include embedded equalization as well
Analysis and equalization of data-dependent jitter
Data-dependent jitter limits the bit-error rate (BER) performance of broadband communication systems and aggravates synchronization in phase- and delay-locked loops used for data recovery. A method for calculating the data-dependent jitter in broadband systems from the pulse response is discussed. The impact of jitter on conventional clock and data recovery circuits is studied in the time and frequency domain. The deterministic nature of data-dependent jitter suggests equalization techniques suitable for high-speed circuits. Two equalizer circuit implementations are presented. The first is a SiGe clock and data recovery circuit modified to incorporate a deterministic jitter equalizer. This circuit demonstrates the reduction of jitter in the recovered clock. The second circuit is a MOS implementation of a jitter equalizer with independent control of the rising and falling edge timing. This equalizer demonstrates improvement of the timing margins that achieve 10/sup -12/ BER from 30 to 52 ps at 10 Gb/s
Hybrid NRZ/Multi-Tone Signaling for High-Speed Low-Power Wireline Transceivers
Over the past few decades, incessant growth of Internet networking traffic and High-Performance Computing (HPC) has led to a tremendous demand for data bandwidth. Digital communication technologies combined with advanced integrated circuit scaling trends have enabled the semiconductor and microelectronic industry to dramatically scale the bandwidth of high-loss interfaces such as Ethernet, backplane, and Digital Subscriber Line (DSL). The key to achieving higher bandwidth is to employ equalization technique to compensate the channel impairments such as Inter-Symbol Interference (ISI), crosstalk, and environmental noise. Therefore, todayĂąs advanced input/outputs (I/Os) has been equipped with sophisticated equalization techniques to push beyond the uncompensated bandwidth of the system. To this end, process scaling has continually increased the data processing capability and improved the I/O performance over the last 15 years. However, since the channel bandwidth has not scaled with the same pace, the required signal processing and equalization circuitry becomes more and more complicated. Thereby, the energy efficiency improvements are largely offset by the energy needed to compensate channel impairments. In this design paradigm, re-thinking about the design strategies in order to not only satisfy the bandwidth performance, but also to improve power-performance becomes an important necessity. It is well known in communication theory that coding and signaling schemes have the potential to provide superior performance over band-limited channels. However, the choice of the optimum data communication algorithm should be considered by accounting for the circuit level power-performance trade-offs. In this thesis we have investigated the application of new algorithm and signaling schemes in wireline communications, especially for communication between microprocessors, memories, and peripherals. A new hybrid NRZ/Multi-Tone (NRZ/MT) signaling method has been developed during the course of this research. The system-level and circuit-level analysis, design, and implementation of the proposed signaling method has been performed in the frame of this work, and the silicon measurement results have proved the efficiency and the robustness of the proposed signaling methodology for wireline interfaces. In the first part of this work, a 7.5 Gb/s hybrid NRZ/MT transceiver (TRX) for multi-drop bus (MDB) memory interfaces is designed and fabricated in 40 nm CMOS technology. Reducing the complexity of the equalization circuitry on the receiver (RX) side, the proposed architecture achieves 1 pJ/bit link efficiency for a MDB channel bearing 45 dB loss at 2.5 GHz. The measurement results of the first prototype confirm that NRZ/MT serial data TRX can offer an energy-efficient solution for MDB memory interfaces. Motivated by the satisfying results of the first prototype, in the second phase of this research we have exploited the properties of multi-tone signaling, especially orthogonality among different sub-bands, to reduce the effect of crosstalk in high-dense wireline interconnects. A four-channel transceiver has been implemented in a standard CMOS 40 nm technology in order to demonstrate the performance of NRZ/MT signaling in presence of high channel loss and strong crosstalk noise. The proposed system achieves 1 pJ/bit power efficiency, while communicating over a MDB memory channel at 36 Gb/s aggregate data rate
Recommended from our members
Architectures and Circuits Leveraging Injection-Locked Oscillators for Ultra-Low Voltage Clock Synthesis and Reference-less Receivers for Dense Chip-to-Chip Communications
High performance computing is critical for the needs of scientific discovery and economic competitiveness. An extreme-scale computing system at 1000x the performance of todayâs petaflop machines will exhibit massive parallelism on multiple vertical fronts, from thousands of computational units on a single processor to thousands of processors in a single data center. To facilitate such a massively-parallel extreme-scale computing, a key challenge is power. The challenge is not power associated with base computation but rather the problem of transporting data from one chip to another at high enough rates. This thesis presents architectures and techniques to achieve low power and area footprint while achieving high data rates in a dense very-short reach (VSR) chip-to-chip (C2C) communication network. High-speed serial communication operating at ultra-low supplies improves the energy-efficiency and lowers the power envelop of a system doing an exaflop of loops. One focus area of this thesis is clock synthesis for such energy-efficient interconnect applications operating at high speeds and ultra-low supplies. A sub-integer clockfrequency synthesizer is presented that incorporates a multi-phase injection-locked ring-oscillator-based prescaler for operation at an ultra-low supply voltage of 0.5V, phase-switching based programmable division for sub-integer clock-frequency synthesis, and automatic calibration to ensure injection lock. A record speed of 9GHz has been demonstrated at 0.5V in 45nm SOI CMOS. It consumes 3.5mW of power at 9.12GHz and 0.052 of area, while showing an output phase noise of -100dBc/Hz at 1MHz offset and RMS jitter of 325fs; it achieves a net of -186.5 in a 45-nm SOI CMOS process. This thesis also describes a receiver with a reference-less clocking architecture for high-density VSR-C2C links. This architecture simplifies clock-tree planning in dense extreme-scaling computing environments and has high-bandwidth CDR to enable SSC for suppressing EMI and to mitigate TX jitter requirements. It features clock-less DFE and a high-bandwidth CDR based on master-slave ILOs for phase generation/rotation. The RX is implemented in 14nm CMOS and characterized at 19Gb/s. It is 1.5x faster that previous reference-less embedded-oscillator based designs with greater than 100MHz jitter tolerance bandwidth and recovers error-free data over VSR-C2C channels. It achieves a power-efficiency of 2.9pJ/b while recovering error-free data (BER 200MHz and the INL of the ILO-based phase-rotator (32- Steps/UI) is <1-LSB. Lastly, this thesis develops a time-domain delay-based modeling of injection locking to describe injection-locking phenomena in nonharmonic oscillators. The model is used to predict the locking bandwidth, and the locking dynamics of the locked oscillator. The model predictions are verified against simulations and measurements of a four-stage differential ring oscillator. The model is further used to predict the injection-locking behavior of a single-ended CMOS inverter based ring oscillator, the lock range of a multi-phase injection-locked ring-oscillator-based prescaler, as well as the dynamics of tracking injection phase perturbations in injection-locked masterslave oscillators; demonstrating its versatility in application to any nonharmonic oscillator
Design of Low-Power NRZ/PAM-4 Wireline Transmitters
Rapid growing demand for instant multimedia access in a myriad of digital devices has pushed
the need for higher bandwidth in modern communication hardwares ranging from short-reach (SR)
memory/storage interfaces to long-reach (LR) data center Ethernets. At the same time, comprehensive
design optimization of link system that meets the energy-efficiency is required for mobile
computing and low operational cost at datacenters. This doctoral study consists of design of two
low-swing wireline transmitters featuring a low-power clock distribution and 2-tap equalization in
energy-efficient manners up to 20-Gb/s operation. In spite of the reduced signaling power in the
voltage-mode (VM) transmit driver, the presence of the segment selection logic still diminishes the
power saving benefit.
The first work presents a scalable VM transmitter which offers low static power dissipation
and adopts an impedance-modulated 2-tap equalizer with analog tap control, thereby obviating
driver segmentation and reducing pre-driver complexity and dynamic power. Per-channel quadrature
clock generation with injection-locked oscillators (ILO) allows the generation of rail-to-rail
quadrature clocks. Energy efficiency is further improved with capacitively driven low-swing global
clock distribution and supply scaling at lower data rates, while output eye quality is maintained at
low voltages with automatic phase calibration of the local ILO-generated quarter-rate clocks. A
prototype fabricated in a general purpose 65 nm CMOS process includes a 2 mm global clock
distribution network and two transmitters that support an output swing range of 100-300mV with
up to 12-dB of equalization. The transmitters achieve 8-16 Gb/s operation at 0.65-1.05 pJ/b energy
efficiency.
The second work involves a dual-mode NRZ/PAM-4 differential low-swing voltage-mode (VM)
transmitter. The pulse-selected output multiplexing allows reduction of power supply and deterministic
jitter caused by large on-chip parasitic inherent in the transmission-gate-based multiplexers
in the earlier work. Analog impedance control replica circuits running in the background produce
gate-biasing voltages that control the peaking ratio for 2-tap feed-forward equalization and
PAM-4 symbol levels for high-linearity. This analog control also allows for efficient generation of
the middle levels in PAM-4 operation with good linearity quantified by level separation mismatch
ratio of 95%. In NRZ mode, 2-tap feedforward equalization is configurable in high-performance
controlled-impedance or energy-efficient impedance-modulated settings to provide performance
scalability. Analytic design consideration on dynamic power, data-rate, mismatch, and output
swing brings optimal performance metric on the given technology node. The proof-of-concept
prototype is verified on silicon with 65 nm CMOS process with improved performance in speed
and energy-efficiency owing to double-stack NMOS transistors in the output stage. The transmitter consumes as low as 29.6mW in 20-Gb/s NRZ and 25.5mW in the 28-Gb/s PAM-4 operations
Recommended from our members
Design of Power-Efficient Optical Transceivers and Design of High-Linearity Wireless Wideband Receivers
The combination of silicon photonics and advanced heterogeneous integration is promising for next-generation disaggregated data centers that demand large scale, high throughput, and low power. In this dissertation, we discuss the design and theory of power-efficient optical transceivers with System-in-Package (SiP) 2.5D integration. Combining prior arts and proposed circuit techniques, a receiver chip and a transmitter chip including two 10 Gb/s data channels and one 2.5 GHz clocking channel are designed and implemented in 28 nm CMOS technology.
An innovative transimpedance amplifier (TIA) and a single-ended to differential (S2D) converter are proposed and analyzed for a low-voltage high-sensitivity receiver; a four-to-one serializer, programmable output drivers, AC coupling units, and custom pads are implemented in a low-power transmitter; an improved quadrature locked loop (QLL) is employed to generate accurate quadrature clocks. In addition, we present an analysis for inverter-based shunt-feedback TIA to explicitly depict the trade-off among sensitivity, data rate, and power consumption. At last, the research on CDR-basedâ clocking schemes for optical links is also discussed. We introduce prior arts and propose a power-efficient clocking scheme based on an injection-locked phase rotator. Next, we analyze injection-locked ring oscillators (ILROs) that have been widely used for quadrature clock generators (QCGs) in multi-lane optical or wireline transceivers due to their low power, low area, and technology scalability. The asymmetrical or partial injection locking from 2 phases to 4 phases results in imbalances in amplitude and phase. We propose a modified frequency-domain analysis to provide intuitive insight into the performance design trade-offs. The analysis is validated by comparing analytical predictions with simulations for an ILRO-based QCG in 28 nm CMOS technology.
This dissertation also discusses the design of high-linearity wireless wideband receivers. An out-of-band (OB) IM3 cancellation technique is proposed and analyzed. By exploiting a baseband auxiliary path (AP) with a high-pass feature, the in-band (IB) desired signal and out-of-band interferers are split. OB third-order intermodulation products (IM3) are reconstructed in the AP and cancelled in the baseband (BB). A 0.5-2.5 GHz frequency-translational noise-cancelling (FTNC) receiver is implemented in 65nm CMOS to demonstrate the proposed approach. It consumes 36 mW without cancellation at 1 GHz LO frequency and 1.2 V supply, and it achieves 8.8 MHz baseband bandwidth, 40dB gain, 3.3dB NF, 5dBm OB IIP3, and â6.5dBm OB B1dB. After IM3 cancellation, the effective OB-IIP3 increases to 32.5 dBm with an extra 34 mW for narrow-band interferers (two tones). For wideband interferers, 18.8 dB cancellation is demonstrated over 10 MHz with two â15 dBm modulated interferers. The local oscillator (LO) leakage is â92 dBm and â88 dB at 1 GHz and 2 GHz LO respectively. In summary, this technique achieves both high OB linearity and good LO isolation
A 3.125 Gb/s 5-TAP CMOS Transversal Equalizer
Recently, there is growing interest in high speed circuits for broadband communication, especially in wired networks. As the data rate increases beyond 1 GB/s conventional materials used as communication channels such as PCB traces, coaxial cables, and unshielded twisted pair (UTP) cables, etc. attenuate and distort the transmitted signal causing bit errors in the receiver end. Bit errors make the communication less reliable and in many cases even impossible.
The goal of this work was to analyze, and design an channel equalizer capable of restoring the received signal back to the original transmitted signal. The equalizer was designed in a standard CMOS 0.18 ”m process and it is capable of compensating up to 20 dBâs of attenuation at 1.5625 GHz for 15 and 20 meters of RG-58 A/U coaxial cables. The equalizer is able to remove 0.5 UI ( 160 ps ) of peak-to-peak jitter and output a signal with 0.1 UI ( 32 ps ) for 15 meters of cable at 3.125 Gb/s. The equalizer draws 18 mA from a 1.8 V power supply which is lower than publications [1, 2] for CMOS transversal equalizers
Design of energy efficient high speed I/O interfaces
Energy efficiency has become a key performance metric for wireline high speed I/O interfaces. Consequently, design of low power I/O interfaces has garnered large interest that has mostly been focused on active power reduction techniques at peak data rate. In practice, most systems exhibit a wide range of data transfer patterns. As a result, low energy per bit operation at peak data rate does not necessarily translate to overall low energy operation. Therefore, I/O interfaces that can scale their power consumption with data rate requirement are desirable. Rapid on-off I/O interfaces have a potential to scale power with data rate requirements without severely affecting either latency or the throughput of the I/O interface. In this work, we explore circuit techniques for designing rapid on-off high speed wireline I/O interfaces and digital fractional-N PLLs.
A burst-mode transmitter suitable for rapid on-off I/O interfaces is presented that achieves 6 ns turn-on time by utilizing a fast frequency settling ring oscillator in digital multiplying delay-locked loop and a rapid on-off biasing scheme for current mode output driver. Fabricated in 90 nm CMOS process, the prototype achieves 2.29 mW/Gb/s energy efficiency at peak data rate of 8 Gb/s. A 125X (8 Gb/s to 64 Mb/s) change in effective data rate results in 67X (18.29 mW to 0.27 mW) change in transmitter power consumption corresponding to only 2X (2.29 mW/Gb/s to 4.24 mW/Gb/s) degradation in energy efficiency for 32-byte long data bursts. We also present an analytical bit error rate (BER) computation technique for this transmitter under rapid on-off operation, which uses MDLL settling measurement data in conjunction with always-on transmitter measurements. This technique indicates that the BER bathtub width for 10^(â12) BER is 0.65 UI and 0.72 UI during rapid on-off operation and always-on operation, respectively.
Next, a pulse response estimation-based technique is proposed enabling burst-mode operation for baud-rate sampling receivers that operate over high loss channels. Such receivers typically employ discrete time equalization to combat inter-symbol interference. Implementation details are provided for a receiver chip, fabricated in 65nm CMOS technology, that demonstrates efficacy of the proposed technique. A low complexity pulse response estimation technique is also presented for low power receivers that do not employ discrete time equalizers.
We also present techniques for implementation of highly digital fractional-N PLL employing a phase interpolator based fractional divider to improve the quantization noise shaping properties of a 1-bit âÎŁ frequency-to-digital converter. Fabricated in 65nm CMOS process, the prototype calibration-free fractional-N Type-II PLL employs the proposed frequency-to-digital converter in place of a high resolution time-to-digital converter and achieves 848 fs rms integrated jitter (1 kHz-30 MHz) and -101 dBc/Hz in-band phase noise while generating 5.054 GHz output from 31.25 MHz input
- âŠ