33 research outputs found

    Design of High-Speed Power-Efficient Transmitter with Time-Based Equalization

    Get PDF
    ๋ณธ ๋…ผ๋ฌธ์€ ๊ณ ์†, ์ €์ „๋ ฅ์œผ๋กœ ๋™์ž‘ํ•˜๋Š” ์œ ์„  ์†ก์‹ ๊ธฐ์˜ ์„ค๊ณ„์— ๋Œ€ํ•ด ์„ค๋ช…ํ•˜๊ณ  ์žˆ๋‹ค. ๋ถ„๋ฆฌ๋˜์ง€ ์•Š์€ ์ถœ๋ ฅ ๋“œ๋ผ์ด๋ฒ„๊ฐ€ ์žˆ๋Š” ์—๋„ˆ์ง€ ํšจ์œจ์ ์ธ ์ „์•• ๋ชจ๋“œ ์†ก์‹ ๊ธฐ๋Š” ์œ„์ƒ ์ง€์—ฐ ๋ถ„์„์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์‹œ๊ฐ„ ์˜์—ญ์—์„œ ์ฑ„๋„ ์†์‹ค์„ ๋ณด์ƒํ•œ๋‹ค. ์ง๋ ฌํ™”๋œ ๋ฐ์ดํ„ฐ ์ŠคํŠธ๋ฆผ์ด ์•„๋‹Œ ์†ก์‹  ํด๋Ÿญ์˜ ์œ„์ƒ์„ ๋ณ€์กฐํ•จ์œผ๋กœ์จ ์ œ์•ˆ๋œ ์†ก์‹ ๊ธฐ๋Š” ๋ฐ์ดํ„ฐ ์˜์กด์  ์ง€ํ„ฐ๋ฅผ ํฌ๊ฒŒ ์ค„์ธ๋‹ค. ์ˆ˜ํ‰ ์•„์ด ์˜คํ”„๋‹์€ ์ „์†ก๋œ ๋ฐ์ดํ„ฐ์˜ ์‹คํ–‰ ๊ธธ์ด์— ๋”ฐ๋ผ ์ œ๋กœ ํฌ๋กœ์‹ฑ ์‹œ๊ฐ„ ๋ณ€๋™์„ ๋ณด์ƒํ•จ์œผ๋กœ์จ ๊ฐœ์„ ๋œ๋‹ค. ์ œ์•ˆ๋œ ๋ฐฉ์‹์€ ํฐ ์‹ ํ˜ธ ๋ฐ ์Šค์œ„์นญ ์ „๋ ฅ์„ ์†Œ๋น„ํ•˜๋Š” ๋งŽ์€ ๋“œ๋ผ์ด๋ฒ„ ์Šฌ๋ผ์ด์Šค๋ฅผ ์ œ๊ฑฐํ•จ์œผ๋กœ์จ ๋“œ๋ผ์ด๋ฒ„ ๋ณต์žก์„ฑ์„ ํฌ๊ฒŒ ์ค„์ธ๋‹ค. ํ”„๋กœํ† ํƒ€์ž… ์นฉ์€ 28 nm CMOS ๊ณต์ •์œผ๋กœ ์ œ์ž‘๋˜์—ˆ์œผ๋ฉฐ 0.045 mm2 ์˜ ์‹ค์ œ ๋ฉด์ ์„ ์ฐจ์ง€ํ•œ๋‹ค. ์ธก์ •๋œ ๊ฒฐ๊ณผ๋Š” ์ œ์•ˆ๋œ ์†ก์‹ ๊ธฐ๊ฐ€ 1.0 V ๊ณต๊ธ‰์—์„œ 440 mVppd์˜ ์ถœ๋ ฅ ์Šค์œ™์œผ๋กœ 22 Gb/s์˜ ์†๋„์—์„œ 0.95 pJ/b์˜ ์—๋„ˆ์ง€ ํšจ์œจ์„ ๋‹ฌ์„ฑํ•จ์„ ๋ณด์—ฌ์ค€๋‹ค. ๋˜ํ•œ ํ”ผํฌ ๋Œ€ ํ”ผํฌ ์ง€ํ„ฐ๋Š” 15.0 dB ์†์‹ค์˜ ์ฑ„๋„์— ๋Œ€ํ•ด ์ œ์•ˆ๋œ ์œ„์ƒ ์ง€์—ฐ ๋ณด์ƒ์„ ํ†ตํ•ด 22 Gb/s์˜ ์†๋„์—์„œ 34 ps์—์„œ 20 ps๋กœ ๊ฐ์†Œ๋œ๋‹ค.In this thesis, a design of high-speed, power-efficient wireline transmitter is reported. An energy-efficient voltage-mode transmitter with an un-segmented output driver equalizes channel loss in the time-domain based on the phase de-lay analysis. By modulating the phase of the transmitting clock rather than the serialized data stream, the proposed transmitter significantly reduces the data-dependent jitter. The horizontal eye-opening is improved by compensating for the zero-crossing time variation dependent on the run-length of the transmitted data. The proposed scheme significantly reduces the driver complexity by elim-inating many driver slices that consume significant signaling and switching power. The prototype chip has been fabricated in a 28-nm CMOS process and occupies an active area of 0.045 mm2. The measured results show that the pro-posed transmitter achieves an energy efficiency of 0.95 pJ/b at 22 Gb/s with an output swing of 440 mVppd at 1.0 V supply. In addition, peak-to-peak jitter is reduced from 34 ps to 20 ps at 22 Gb/s with the proposed phase delay compen-sation over the channel with a 15.0 dB loss.CHAPTER 1 INTRODUCTION 1 1.1 MOTIVATION 1 1.2 THESIS ORGANIZATION 4 CHAPTER 2 BACKGROUNDS 5 2.1 OVERVIEW 5 2.2 FEED-FORWARD EQUALIZATION 7 2.2.1 AMPLITUDE-DOMAIN EQUALIZATION 7 2.2.2 PHASE-DOMAIN EQUALIZATION 12 2.2.3 PULSE-WIDTH MODULATION 18 2.3 ADAPTIVE FEED-FORWARD EQUALIZATION 21 2.3.1 AMPLITUDE-DOMAIN EQUALIZATION 21 2.3.2 PULSE-WIDTH MODULATION 24 CHAPTER 3 DESIGN OF THE TIME-BASED FEED-FORWARD EQUALIZATION OF THE TRANSMITTER 26 3.1 OVERVIEW 26 3.2 BASIC CONCEPT OF TIME-BASED FFE 28 3.2.1 ZERO-CROSSING TIME 28 3.2.2 PHASE DELAY 32 3.2.3 FINDING THE OPTIMUM COEFFICIENT 39 3.2.4 COMPARISON WITH CONVENTIONAL FFE 43 3.3 ADAPTIVE TIME-BASED FFE 50 3.3.1 OVERVIEW 50 3.3.2 BEHAVIORAL MODELING 51 3.3.3 SIMULATION RESULTS 53 3.4 TRANSMITTER IMPLEMENTATION 60 3.4.1 OVERVIEW 60 3.4.2 PHASE MODULATION 62 3.4.3 SERIALIZER AND CLOCK PATH 67 CHAPTER 4 MEASUREMENT 71 4.1 OVERVIEW 71 4.2 EYE DIAGRAM 76 4.3 POWER CONSUMPTION 81 CHAPTER 5 CONCLUSION 84 BIBLIOGRAPHY 86 ์ดˆ ๋ก 92๋ฐ•

    PAM4 Transmitter and Receiver Equalizers Optimization for High-Speed Serial Links

    Get PDF
    As the telecommunications markets evolves, the demand of faster data transfers and processing continue to increase. In order to confront this demand, the peripheral component interconnect express (PCIe) has been increasing the data rates from PCIe Gen 1(4 Gb/s) to PCIe Gen 5(32 Gb/s). This evolution has brought new challenges due to the high-speed interconnections effects which can cause data loss and intersymbol interference. Under these conditions the traditional non return to zero modulation (NRZ) scheme became a bottle neck due to bandwidth limitations in the high-speed interconnects. The pulse amplitude modulation 4-level (PAM4) scheme is been implemented in next generation of PCIe (PCIe6) doubling the data rate without increasing the channel bandwidth. However, while PAM4 solve the bandwidth problem it also brings new challenges in post silicon equalization. Tuning the transmitter (Tx) and receiver (Rx) across different interconnect channels can be a very time-consuming task due to multiple equalizers implemented in the serializer/deserializer (SerDes). Typical current industrial practices for SerDes equalizers tuning require massive lab measurements, since they are based on exhaustive enumeration methods, making the equalization process too lengthy and practically prohibitive under current silicon time-to-market commitments. In this masterโ€™s dissertation a numerical method is proposed to optimize the transmitter and receiver equalizers of a PCIe6 link. The experimental results, tested in a MATLAB simulation environment, demonstrate the effectiveness of the proposed approach by delivering optimal PAM4 eye diagrams margins while significantly reducing the jitter.ITESO, A.C

    Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

    Get PDF
    With the continuous increase of on-chip computation capacities and exponential growth of data-intensive applications, the high-speed data transmission through serial links has become the backbone for modern communication systems. To satisfy the massive data-exchanging requirement, the data rate of such serial links has been updated from several Gb/s to tens of Gb/s. Currently, the commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G has been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrications due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state-machine is implemented in 65-nm C-MOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the device noise to phase noise conversion. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery is devised to endow the RILCM with a similar lock-acquisition ability as conventional PLL, thus excluding the initial frequency set- I up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fab- ricated in 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery to extract the sampling clocks and retime the incoming data. To automatically balance the jitter tracking and jitter suppression, passive low-pass filters with adaptively-adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFEโ€™s tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW

    Design Techniques for High Pin Efficiency Wireline Transceivers

    Get PDF
    While the majority of wireline research investigates bandwidth improvement and how to overcome the high channel loss, pin efficiency is also critical in high-performance wireline applications. This dissertation proposes two different implementations for high pin efficiency wireline transceivers. The first prototype achieves twice pin efficiency than unidirectional signaling, which is 32Gb/s simultaneous bidirectional transceiver supporting transmission and reception on the same channel at the same time. It includes an efficient low-swing voltage-mode driver with an R-gm hybrid for signal separation, combining the continuous-time-linear-equalizer (CTLE) and echo cancellation (EC) in a single stage, and employing a low-complexity 5/4X CDA system. Support of a wide range of channels is possible with foreground adaptation of the EC finite impulse response (FIR) filter taps with a sign-sign least-mean-square (SSLMS) algorithm. Fabricated in TSMC 28-nm CMOS, the 32Gb/s SBD transceiver occupies 0.09mm20.09 mm^{2} area and achieves 16Gb/s uni-directional and 32Gb/s simultaneous bi-directional signals. 32Gb/s SBD operation consumes 1.83mW/Gb/s with 10.8dB channel loss at Nyquist rate. The second prototype presents an optical transmitter with a quantum-dot (QD) microring laser. This can support wavelength-division multiplexing allowing for high pin efficiency application by packing multiple high-bandwidth signals onto one optical channel. The development QD microring laser model accurately captures the intrinsic photonic high-speed dynamics and allows for the future co-design of the circuits and photonic device. To achieve higher bandwidth than intrinsic one, utilizing both techniques of optical injection locking (OIL) and 2-tap asymmetric Feed-forward equalizer (FFE) can perform 22Gb/s operation with 3.2mW/Gb/s. The first hybrid-integration directly-modulated OIL QD microring laser system is demonstrated

    Energy-Efficient Receiver Design for High-Speed Interconnects

    Get PDF
    High-speed interconnects are of vital importance to the operation of high-performance computing and communication systems, determining the ultimate bandwidth or data rates at which the information can be exchanged. Optical interconnects and the employment of high-order modulation formats are considered as the solutions to fulfilling the envisioned speed and power efficiency of future interconnects. One common key factor in bringing the success is the availability of energy-efficient receivers with superior sensitivity. To enhance the receiver sensitivity, improvement in the signal-to-noise ratio (SNR) of the front-end circuits, or equalization that mitigates the detrimental inter-symbol interference (ISI) is required. In this dissertation, architectural and circuit-level energy-efficient techniques serving these goals are presented. First, an avalanche photodetector (APD)-based optical receiver is described, which utilizes non-return-to-zero (NRZ) modulation and is applicable to burst-mode operation. For the purposes of improving the overall optical link energy efficiency as well as the link bandwidth, this optical receiver is designed to achieve high sensitivity and high reconfiguration speed. The high sensitivity is enabled by optimizing the SNR at the front-end through adjusting the APD responsivity via its reverse bias voltage, along with the incorporation of 2-tap feedforward equalization (FFE) and 2-tap decision feedback equalization (DFE) implemented in current-integrating fashion. The high reconfiguration speed is empowered by the proposed integrating dc and amplitude comparators, which eliminate the RC settling time constraints. The receiver circuits, excluding the APD die, are fabricated in 28-nm CMOS technology. The optical receiver achieves bit-error-rate (BER) better than 1Eโˆ’12 at โˆ’16-dBm optical modulation amplitude (OMA), 2.24-ns reconfiguration time with 5-dB dynamic range, and 1.37-pJ/b energy efficiency at 25 Gb/s. Second, a 4-level pulse amplitude modulation (PAM4) wireline receiver is described, which incorporates continuous time linear equalizers (CTLEs) and a 2-tap direct DFE dedicated to the compensation for the first and second post-cursor ISI. The direct DFE in a PAM4 receiver (PAM4-DFE) is made possible by the proposed CMOS track-and-regenerate slicer. This proposed slicer offers rail-to-rail digital feedback signals with significantly improved clock-to-Q delay performance. The reduced slicer delay relaxes the settling time constraint of the summer circuits and allows the stringent DFE timing constraint to be satisfied. With the availability of a direct DFE employing the proposed slicer, inductor-based bandwidth enhancement and loop-unrolling techniques, which can be power/area intensive, are not required. Fabricated in 28-nm CMOS technology, the PAM4 receiver achieves BER better than 1Eโˆ’12 and 1.1-pJ/b energy efficiency at 60 Gb/s, measured over a channel with 8.2-dB loss at Nyquist frequency. Third, digital neural-network-enhanced FFEs (NN-FFEs) for PAM4 analog-to-digital converter (ADC)-based optical interconnects are described. The proposed NN-FFEs employ a custom learnable piecewise linear (PWL) activation function to tackle the nonlinearities with short memory lengths. In contrast to the conventional Volterra equalizers where multipliers are utilized to generate the nonlinear terms, the proposed NN-FFEs leverage the custom PWL activation function for nonlinear operations and reduce the required number of multipliers, thereby improving the area and power efficiencies. Applications in the optical interconnects based on micro-ring modulators (MRMs) are demonstrated with simulation results of 50-Gb/s and 100-Gb/s links adopting PAM4 signaling. The proposed NN-FFEs and the conventional Volterra equalizers are synthesized with the standard-cell libraries in a commercial 28-nm CMOS technology, and their power consumptions and performance are compared. Better than 37% lower power overhead can be achieved by employing the proposed NN-FFEs, in comparison with the Volterra equalizer that leads to similar improvement in the symbol-error-rate (SER) performance.</p

    Modeling and Design of High-Speed CMOS Receivers for Short-Reach Photonic Links

    Get PDF
    This dissertation presents several research outcomes towards designing high-speed CMOS optical receivers for energy-efficient short-reach optical links. First, it provides a wide survey of recently published equalizer-based receivers and presents a novel methodology to accurately calculate their noise. The proposed methodology is then used to find the receiver that achieves the best sensitivity. Second, the trade-off between sensitivity and power dissipation of the receiver is optimized to reduce the energy consumption per bit of the overall link. Design trade-offs for the receiver, transmitter, and the overall link are presented, and comparisons are made to study how much receiver sensitivity can be sacrificed to save its power dissipation before this power reduction is outpaced by the transmitterโ€™s increase in power. Unlike conventional wisdom, our results show that energy-efficient links require low-power receivers with input capacitance much smaller than that required for noise-optimum performance. Third, the thesis presents a novel equalization technique for optical receivers. A linear equalizer (LE) is realized by adding a pole in the feedback paths of an active feedback-based wideband amplifier. By embedding the peaking in the main amplifier (MA), the front-end meets the sensitivity and gain of conventional LE-based receivers with better energy efficiency by eliminating the standalone equalizer stage(s). Electrical measurements are presented to demonstrate the capability of the proposed technique in restoring the bandwidth and improving the performance over the conventional design

    ๋ฐ์ดํ„ฐ ์ „์†ก๋กœ ํ™•์žฅ์„ฑ๊ณผ ๋ฃจํ”„ ์„ ํ˜•์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚จ ๋‹ค์ค‘์ฑ„๋„ ์ˆ˜์‹ ๊ธฐ๋“ค์— ๊ด€ํ•œ ์—ฐ๊ตฌ

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2013. 2. ์ •๋•๊ท .Two types of serial data communication receivers that adopt a multichannel architecture for a high aggregate I/O bandwidth are presented. Two techniques for collaboration and sharing among channels are proposed to enhance the loop-linearity and channel-expandability of multichannel receivers, respectively. The first proposed receiver employs a collaborative timing scheme recovery which relies on the sharing of all outputs of phase detectors (PDs) among channels to extract common information about the timing and multilevel signaling architecture of PAM-4. The shared timing information is processed by a common global loop filter and is used to update the phase of the voltage-controlled oscillator with better rejection of per-channel noise. In addition to collaborative timing recovery, a simple linearization technique for binary PDs is proposed. The technique realizes a high-rate oversampling PD while the hardware cost is equivalent to that of a conventional 2x-oversampling clock and data recovery. The first receiver exploiting the collaborative timing recovery architecture is designed using 45-nm CMOS technology. A single data lane occupies a 0.195-mm2 area and consumes a relatively low 17.9 mW at 6 Gb/s at 1.0V. Therefore, the power efficiency is 2.98 mW/Gb/s. The simulated jitter is about 0.034 UI RMS given an input jitter value of 0.03 UI RMS, while the relatively constant loop bandwidth with the PD linearization technique is about 7.3-MHz regardless of the data-stream noise. Unlike the first receiver, the second proposed multichannel receiver was designed to reduce the hardware complexity of each lane. The receiver employs shared calibration logic among channels and yet achieves superior channel expandability with slim data lanes. A shared global calibration control, which is used in a forwarded clock receiver based on a multiphase delay-locked loop, accomplishes skew calibration, equalizer adaptation, and the phase lock of all channels during a calibration period, resulting in reduced hardware overhead and less area required by each data lane. The second forwarded clock receiver is designed in 90-nm CMOS technology. It achieves error-free eye openings of more than 0.5 UI across 9โˆ’ 28 inch Nelco 4000-6 microstrips at 4โˆ’ 7 Gb/s and more than 0.42 UI at data rates of up to 9 Gb/s. The data lane occupies only 0.152 mm2 and consumes 69.8 mW, while the rest of the receiver occupies 0.297 mm2 and consumes 56 mW at a data rate of 7 Gb/s and a supply voltage of 1.35 V.1. Introduction 1 1.1 Motivations 1.2 Thesis Organization 2. Previous Receivers for Serial-Data Communications 2.1 Classification of the Links 2.2 Clocking architecture of transceivers 2.3 Components of receiver 2.3.1 Channel loss 2.3.2 Equalizer 2.3.3 Clock and data recovery circuit 2.3.3.1. Basic architecture 2.3.3.2. Phase detector 2.3.3.2.1. Linear phase detector 2.3.3.2.2. Binary phase detector 2.3.3.3. Frequency detector 2.3.3.4. Charge pump 2.3.3.5. Voltage controlled oscillator and delay-line 2.3.4 Loop dynamics of PLL 2.3.5 Loop dynamics of DLL 3. The Proposed PLL-Based Receiver with Loop Linearization Technique 3.1 Introduction 3.2 Motivation 3.3 Overview of binary phase detection 3.4 The proposed BBPD linearization technique 3.4.1 Architecture of the proposed PLL-based receiver 3.4.2 Linearization technique of binary phase detection 3.4.3 Rotational pattern of sampling phase offset 3.5 PD gain analysis and optimization 3.6 Loop Dynamics of the 2nd-order CDR 3.7 Verification with the time-accurate behavioral simulation 3.8 Summary 4. The Proposed DLL-Based Receiver with Forwarded-Clock 4.1 Introduction 4.2 Motivation 4.3 Design consideration 4.4 Architecture of the proposed forwarded-clock receiver 4.5 Circuit description 4.5.1 Analog multi-phase DLL 4.5.2 Dual-input interpolating deley cells 4.5.3 Dedicated half-rate data samplers 4.5.4 Cherry-Hooper continuous-time linear equalizer 4.5.5 Equalizer adaptation and phase-lock scheme 4.6 Measurement results 5. Conclusion 6. BibliographyDocto

    Converged wireline and wireless signal distribution in optical fiber access networks

    Get PDF

    Toward realizing power scalable and energy proportional high-speed wireline links

    Get PDF
    Growing computational demand and proliferation of cloud computing has placed high-speed serial links at the center stage. Due to saturating energy efficiency improvements over the last five years, increasing the data throughput comes at the cost of power consumption. Conventionally, serial link power can be reduced by optimizing individual building blocks such as output drivers, receiver, or clock generation and distribution. However, this approach yields very limited efficiency improvement. This dissertation takes an alternative approach toward reducing the serial link power. Instead of optimizing the power of individual building blocks, power of the entire serial link is reduced by exploiting serial link usage by the applications. It has been demonstrated that serial links in servers are underutilized. On average, they are used only 15% of the time, i.e. these links are idle for approximately 85% of the time. Conventional links consume power during idle periods to maintain synchronization between the transmitter and the receiver. However, by powering-off the link when idle and powering it back when needed, power consumption of the serial link can be scaled proportionally to its utilization. This approach of rapid power state transitioning is known as the rapid-on/off approach. For the rapid-on/off to be effective, ideally the power-on time, off-state power, and power state transition energy must all be close to zero. However, in practice, it is very difficult to achieve these ideal conditions. Work presented in this dissertation addresses these challenges. When this research work was started (2011-12), there were only a couple of research papers available in the area of rapid-on/off links. Systematic study or design of a rapid power state transitioning in serial links was not available in the literature. Since rapid-on/off with nanoseconds granularity is not a standard in any wireline communication, even the popular test equipment does not support testing any such feature, neither any formal measurement methodology was available. All these circumstances made the beginning difficult. However, these challenges provided a unique opportunity to explore new architectural techniques and identify trade-offs. The key contributions of this dissertation are as follows. The first and foremost contribution is understanding the underlying limitations of saturating energy efficiency improvements in serial links and why there is a compelling need to find alternative ways to reduce the serial link power. The second contribution is to identify potential power saving techniques and evaluate the challenges they pose and the opportunities they present. The third contribution is the design of a 5Gb/s transmitter with a rapid-on/off feature. The transmitter achieves rapid-on/off capability in voltage mode output driver by using a fast-digital regulator, and in the clock multiplier by accurate frequency pre-setting and periodic reference insertion. To ease timing requirements, an improved edge replacement logic circuit for the clock multiplier is proposed. Mathematical modeling of power-on time as a function of various circuit parameters is also discussed. The proposed transmitter demonstrates energy proportional operation over wide variations of link utilization, and is, therefore, suitable for energy efficient links. Fabricated in 90nm CMOS technology, the voltage mode driver, and the clock multiplier achieve power-on-time of only 2ns and 10ns, respectively. This dissertation highlights key trade-off in the clock multiplier architecture, to achieve fast power-on-lock capability at the cost of jitter performance. The fourth contribution is the design of a 7GHz rapid-on/off LC-PLL based clock multi- plier. The phase locked loop (PLL) based multiplier was developed to overcome the limita- tions of the MDLL based approach. Proposed temperature compensated LC-PLL achieves power-on-lock in 1ns. The fifth and biggest contribution of this dissertation is the design of a 7Gb/s embedded clock transceiver, which achieves rapid-on/off capability in LC-PLL, current-mode transmit- ter and receiver. It was the first reported design of a complete transceiver, with an embedded clock architecture, having rapid-on/off capability. Background phase calibration technique in PLL and CDR phase calibration logic in the receiver enable instantaneous lock on power-on. The proposed transceiver demonstrates power scalability with a wide range of link utiliza- tion and, therefore, helps in improving overall system efficiency. Fabricated in 65nm CMOS technology, the 7Gb/s transceiver achieves power-on-lock in less than 20ns. The transceiver achieves power scaling by 44x (63.7mW-to-1.43mW) and energy efficiency degradation by only 2.2x (9.1pJ/bit-to-20.5pJ/bit), when the effective data rate (link utilization) changes by 100x (7Gb/s-to-70Mb/s). The sixth and final contribution is the design of a temperature sensor to compensate the frequency drifts due to temperature variations, during long power-off periods, in the fast power-on-lock LC-PLL. The proposed self-referenced VCO-based temperature sensor is designed with all digital logic gates and achieves low supply sensitivity. This sensor is suitable for integration in processor and DRAM environments. The proposed sensor works on the principle of directly converting temperature information to frequency and finally to digital bits. A novel sensing technique is proposed in which temperature information is acquired by creating a threshold voltage difference between the transistors used in the oscillators. Reduced supply sensitivity is achieved by employing junction capacitance, and the overhead of voltage regulators and an external ideal reference frequency is avoided. The effect of VCO phase noise on the sensor resolution is mathematically evaluated. Fabricated in the 65nm CMOS process, the prototype can operate with a supply ranging from 0.85V to 1.1V, and it achieves a supply sensitivity of 0.034oC/mV and an inaccuracy of ยฑ0.9oC and ยฑ2.3oC from 0-100oC after 2-point calibration, with and without static nonlinearity correction, respectively. It achieves a resolution of 0.3oC, resolution FoM of 0.3(nJ/conv)res2 , and measurement (conversion) time of 6.5ฮผs
    corecore