56 research outputs found

    High-speed equalization and transmission in electrical interconnections

    Get PDF
    The relentless growth of data traffic and increasing digital signal processing capabilities of integrated circuits (IC) are demanding ever faster chip-to-chip / chip-to-module serial electrical interconnects. As data rates increase, the signal quality after transmission over printed circuit board (PCB) interconnections is severely impaired. Frequency-dependent loss and crosstalk noise lead to a reduced eye opening, a reduced signal-to-noise ratio and an increased inter-symbol interference (ISI). This, in turn, requires the use of improved signal processing or PCB materials, in order to overcome the bandwidth (BW) limitations and to improve signal integrity. By applying an optimal combination of equalizer and receiver electronics together with BW-efficient modulation schemes, the transmission rate over serial electrical interconnections can be pushed further. At the start of this research, most industrial backplane connectors, meeting the IEEE and OIF specifications such as manufactured by e.g. FCI or TE connectivity, had operational capabilities of up to 25 Gb/s. This research was mainly performed under the IWT ShortTrack project. The goal of this research was to increase the transmission speed over electrical backplanes up to 100 Gb/s per channel for next-generation telecom systems and data centers. This requirement greatly surpassed the state-ofthe-art reported in previous publications, considering e.g. 25 Gb/s duobinary and 42.8 Gb/s PAM-4 transmission over a low-loss Megtron 6 electrical backplane using off-line processing. The successful implementation of the integrated transmitter (TX) and receiver (RX) (1) , clearly shows the feasibility of single lane interconnections beyond 80 Gb/s and opens the potential of realizing industrial 100 Gb/s links using a recent IC technology process. Besides the advancement of the state-of-the-art in the field of high-speed transceivers and backplane transmission systems, which led to several academic publications, the output of this work also attracts a lot of attention from the industry, showing the potential to commercialize the developed chipset and technologies used in this research for various applications: not only in high-speed electrical transmission links, but also in high-speed opto-electronic communications such as access, active optical cables and optical backplanes. In this dissertation, the background of this research, an overview of this work and the thesis organization are illustrated in Chapter 1. In Chapter 2, a system level analysis is presented, showing that the channel losses are limiting the transmission speed over backplanes. In order to enhance the serial data rate over backplanes and to eliminate the signal degradation, several technologies are discussed, such as signal equalization and modulation techniques. First, a prototype backplane channel, from project partner FCI, implemented with improved backplane connectors is characterized. Second, an integrated transversal filter as a feed-forward equalizer (FFE) is selected to perform the signal equalization, based on a comprehensive consideration of the backplane channel performance, equalization capabilities, implementation complexity and overall power consumption. NRZ, duobinary and PAM-4 are the three most common modulation schemes for ultra-high speed electrical backplane communication. After a system-level simulation and comparison, the duobinary format is selected due to its high BW efficiency and reasonable circuit complexity. Last, different IC technology processes are compared and the ST microelectronics BiCMOS9MW process (featuring a fT value of over 200 GHz) is selected, based on a trade-off between speed and chip cost. Meanwhile it also has a benefit for providing an integrated microstrip model, which is utilized for the delay elements of the FFE. Chapter 3 illustrates the chip design of the high-speed backplane TX, consisting of a multiplexer (MUX) and a 5-tap FFE. The 4:1 MUX combines four lower rate streams into a high-speed differential NRZ signal up to 100 Gb/s as the FFE input. The 5-tap FFE is implemented with a novel topology for improved testability, such that the FFE performance can be individually characterized, in both frequency- and time-domain, which also helps to perform the coefficient optimization of the FFE. Different configurations for the gain cell in the FFE are compared. The gilbert configuration shows most advantages, in both a good high-frequency performance and an easy way to implement positive / negative amplification. The total chip, including the MUX and the FFE, consumes 750mW from a 2.5V supply and occupies an area of 4.4mm × 1.4 mm. In Chapter 4, the TX chip is demonstrated up to 84 Gb/s. First, the FFE performance is characterized in the frequency domain, showing that the FFE is able to work up to 84 Gb/s using duobinary formats. Second, the combination of the MUX and the FFE is tested. The equalized TX outputs are captured after different channels, for both NRZ and duobinary signaling at speeds from 64 Gb/s to 84 Gb/s. Then, by applying the duobinary RX 2, a serial electrical transmission link is demonstrated across a pair of 10 cm coax cables and across a 5 cm FX-2 differential stripline. The 5-tap FFE compensates a total loss between the TX and the RX chips of about 13.5 dB at the Nyquist frequency, while the RX receives the equalized signal and decodes the duobinary signal to 4 quarter rate NRZ streams. This shows a chip-to-chip data link with a bit error rate (BER) lower than 10−11. Last, the electrical data transmission between the TX and the RX over two commercial backplanes is demonstrated. An error-free, serial duobinary transmission across a commercial Megtron 6, 11.5 inch backplane is demonstrated at 48 Gb/s, which indicates that duobinary outperforms NRZ for attaining higher speed or longer reach backplane applications. Later on, using an ExaMAX® backplane demonstrator, duobinary transmission performance is verified and the maximum allowed channel loss at 40 Gb/s transmission is explored. The eye diagram and BER measurements over a backplane channel up to 26.25 inch are performed. The results show that at 40 Gb/s, a total channel loss up to 37 dB at the Nyquist frequency allows for error-free duobinary transmission, while a total channel loss of 42 dB was overcome with a BER below 10−8. An overview of the conclusions is summarized in Chapter 5, along with some suggestions for further research in this field. (1) The duobinary receiver was developed by my colleague Timothy De Keulenaer, as described in his PhD dissertation. (2) Described in the PhD dissertation of Timothy De Keulenaer

    State of the art in chip-to-chip interconnects

    Get PDF
    This thesis presents a study of short-range links for chips mounted in the same package, on printed circuit boards or interposers. Implemented in CMOS technology between 7 and 250 nm, with links that operate at a data rate between 0,4 and 112 Gb/s/pin and with energy efficiencies from 0,3 to 67,7 pJ/bit. The links operate on channels with an attenuation lower than 50 dB. A comparison is made with graphical representations between the different articles that shows the correlation between the different essential metrics of chip-to-chip interconnects, as well as its evolution over the last 20 years.Esta tesis presenta un estudio de enlaces de corto alcance para chips montados en un mismo paquete, en placas de circuito impreso o intercaladores. Implementado en tecnología CMOS entre 7 y 250 nm, con enlaces que operan a una velocidad de datos entre 0,4 y 112 Gb/s/pin y con eficiencias energéticas de 0,3 a 67,7 pJ/bit. Los enlaces operan en canales con una atenuación inferior a 50 dB. Se realiza una comparación con representaciones gráficas entre los diferentes artículos que muestra la correlación entre las distintas métricas esenciales de las interconexiones chip a chip, así como su evolución en los últimos 20 años.Aquesta tesi presenta un estudi d'enllaços de curt abast per a xips muntats en el mateix paquet, en plaques de circuits impresos o interposers. Implementat en tecnologia CMOS entre 7 i 250 nm, amb enllaços que funcionen a una velocitat de dades entre 0,4 i 112 Gb/s/pin i amb eficiències energètiques de 0,3 a 67,7 pJ/bit. Els enllaços funcionen en canals amb una atenuació inferior a 50 dB. Es fa una comparació amb representacions gràfiques entre els diferents articles que mostra la correlació entre les diferents mètriques essencials d'interconnexions xip a xip, així com la seva evolució en els darrers 20 anys

    High Speed Reconfigurable NRZ/PAM4 Transceiver Design Techniques

    Get PDF
    While the majority of wireline standards use simple binary non-return-to-zero (NRZ) signaling, four-level pulse-amplitude modulation (PAM4) standards are emerging to increase bandwidth density. This dissertation proposes efficient implementations for high speed NRZ/PAM4 transceivers. The first prototype includes a dual-mode NRZ/PAM4 serial I/O transmitter which can support both modulations with minimum power and hardware overhead. A source-series-terminated (SST) transmitter achieves 1.2Vpp output swing and employs lookup table (LUT) control of a 31-segment output digital-to-analog converter (DAC) to implement 4/2-tap feed-forward equalization (FFE) in NRZ/PAM4 modes, respectively. Transmitter power is improved with low-overhead analog impedance control in the DAC cells and a quarter-rate serializer based on a tri-state inverter-based mux with dynamic pre-driver gates. The transmitter is designed to work with a receiver that implements an NRZ/PAM4 decision feedback equalizer (DFE) that employs 1 finite impulse response (FIR) and 2 infinite impulse response (IIR) taps for first post-cursor and long-tail ISI cancellation, respectively. Fabricated in GP 65-nm CMOS, the transmitter occupies 0.060mm² area and achieves 16Gb/s NRZ and 32Gb/s PAM4 operation at 10.4 and 4.9 mW/Gb/s while operating over channels with 27.6 and 13.5dB loss at Nyquist, respectively. The second prototype presents a 56Gb/s four-level pulse amplitude modulation (PAM4) quarter-rate wireline receiver which is implemented in a 65nm CMOS process. The frontend utilize a single stage continuous time linear equalizer (CTLE) to boost the main cursor and relax the pre-cursor cancelation requirement, requiring only a 2-tap pre-cursor feed-forward equalization (FFE) on the transmitter side. A 2-tap decision feedback equalizer (DFE) with one finite impulse response (FIR) tap and one infinite impulse response (IIR) tap is employed to cancel first post-cursor and longtail inter-symbol interference (ISI). The FIR tap direct feedback is implemented inside the CML slicers to relax the critical timing of DFE and maximize the achievable data-rate. In addition to the per-slice main 3 data samplers, an error sampler is utilized for background threshold control and an edge-based sampler performs both PLL-based CDR phase detection and generates information for background DFE tap adaptation. The receiver consumes 4.63mW/Gb/s and compensates for up to 20.8dB loss when operated with a 2- tap FFE transmitter. The experimental results and comparison with state-of-the-art shows superior power efficiency of the presented prototypes for similar data-rate and channel loss. The usage of proposed design techniques are not limited to these specific prototypes and can be applied for any wireline transceiver with different modulation, data-rate and CMOS technology

    Design of Energy-Efficient A/D Converters with Partial Embedded Equalization for High-Speed Wireline Receiver Applications

    Get PDF
    As the data rates of wireline communication links increases, channel impairments such as skin effect, dielectric loss, fiber dispersion, reflections and cross-talk become more pronounced. This warrants more interest in analog-to-digital converter (ADC)-based serial link receivers, as they allow for more complex and flexible back-end digital signal processing (DSP) relative to binary or mixed-signal receivers. Utilizing this back-end DSP allows for complex digital equalization and more bandwidth-efficient modulation schemes, while also displaying reduced process/voltage/temperature (PVT) sensitivity. Furthermore, these architectures offer straightforward design translation and can directly leverage the area and power scaling offered by new CMOS technology nodes. However, the power consumption of the ADC front-end and subsequent digital signal processing is a major issue. Embedding partial equalization inside the front-end ADC can potentially result in lowering the complexity of back-end DSP and/or decreasing the ADC resolution requirement, which results in a more energy-effcient receiver. This dissertation presents efficient implementations for multi-GS/s time-interleaved ADCs with partial embedded equalization. First prototype details a 6b 1.6GS/s ADC with a novel embedded redundant-cycle 1-tap DFE structure in 90nm CMOS. The other two prototypes explain more complex 6b 10GS/s ADCs with efficiently embedded feed-forward equalization (FFE) and decision feedback equalization (DFE) in 65nm CMOS. Leveraging a time-interleaved successive approximation ADC architecture, new structures for embedded DFE and FFE are proposed with low power/area overhead. Measurement results over FR4 channels verify the effectiveness of proposed embedded equalization schemes. The comparison of fabricated prototypes against state-of-the-art general-purpose ADCs at similar speed/resolution range shows comparable performances, while the proposed architectures include embedded equalization as well

    Power-efficient high-speed interface circuit techniques

    Get PDF
    Inter- and intra-chip connections have become the new challenge to enable the scaling of computing systems, ranging from mobile devices to high-end servers. Demand for aggregate I/O bandwidth has been driven by applications including high-speed ethernet, backplane micro-servers, memory, graphics, chip-to-chip and network onchip. I/O circuitry is becoming the major power consumer in SoC processors and memories as the increasing bandwidth demands larger per-pin data rate or larger I/O pin count per component. The aggregate I/O bandwidth has approximately doubled every three to four years across a diverse range of standards in different applications. However, in order to keep pace with these standards enabled in part by process-technology scaling, we will require more than just device scaling in the near future. New energy-efficient circuit techniques must be proposed to enable the next generations of handheld and high-performance computers, given the thermal and system-power limits they start facing. ^ In this work, we are proposing circuit architectures that improve energy efficiency without decreasing speed performance for the most power hungry circuits in high speed interfaces. By the introduction of a new kind of logic operators in CMOS, called implication operators, we implemented a new family of high-speed frequency dividers/prescalers with reduced footprint and power consumption. New techniques and circuits for clock distribution, for pre-emphasis and for driver at the transmitter side of the I/O circuitry have been proposed and implemented. At the receiver side, new DFE architecture and CDR have been proposed and have been proven experimentally

    Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

    Get PDF
    With the continuous increase of on-chip computation capacities and exponential growth of data-intensive applications, the high-speed data transmission through serial links has become the backbone for modern communication systems. To satisfy the massive data-exchanging requirement, the data rate of such serial links has been updated from several Gb/s to tens of Gb/s. Currently, the commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G has been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrications due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state-machine is implemented in 65-nm C-MOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the device noise to phase noise conversion. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery is devised to endow the RILCM with a similar lock-acquisition ability as conventional PLL, thus excluding the initial frequency set- I up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fab- ricated in 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery to extract the sampling clocks and retime the incoming data. To automatically balance the jitter tracking and jitter suppression, passive low-pass filters with adaptively-adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE’s tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW

    Modeling, Optimization and Power Efficiency Comparison of High-speed Inter-chip Electrical and Optical Interconnect Architectures in Nanometer CMOS Technologies

    Get PDF
    Inter-chip input-output (I/O) communication bandwidth demand, which rapidly scaled with integrated circuit scaling, has leveraged equalization techniques to operate reliably on band-limited channels at additional power and area complexity. High-bandwidth inter-chip optical interconnect architectures have the potential to address this increasing I/O bandwidth. Considering future tera-scale systems, power dissipation of the high-speed I/O link becomes a significant concern. This work presents a design flow for the power optimization and comparison of high-speed electrical and optical links at a given data rate and channel type in 90 nm and 45 nm CMOS technologies. The electrical I/O design framework combines statistical link analysis techniques, which are used to determine the link margins at a given bit-error rate (BER), with circuit power estimates based on normalized transistor parameters extracted with a constant current density methodology to predict the power-optimum equalization architecture, circuit style, and transmit swing at a given data rate and process node for three different channels. The transmitter output swing is scaled to operate the link at optimal power efficiency. Under consideration for optical links are a near-term architecture consisting of discrete vertical-cavity surface-emitting lasers (VCSEL) with p-i-n photodetectors (PD) and three long-term integrated photonic architectures that use waveguide metal-semiconductor-metal (MSM) photodetectors and either electro-absorption modulator (EAM), ring resonator modulator (RRM), or Mach-Zehnder modulator (MZM) sources. The normalized transistor parameters are applied to jointly optimize the transmitter and receiver circuitry to minimize total optical link power dissipation for a specified data rate and process technology at a given BER. Analysis results shows that low loss channel characteristics and minimal circuit complexity, together with scaling of transmitter output swing, allows electrical links to achieve excellent power efficiency at high data rates. While the high-loss channel is primarily limited by severe frequency dependent losses to 12 Gb/s, the critical timing path of the first tap of the decision feedback equalizer (DFE) limits the operation of low-loss channels above 20 Gb/s. Among the optical links, the VCSEL-based link is limited by its bandwidth and maximum power levels to a data rate of 24 Gb/s whereas EAM and RRM are both attractive integrated photonic technologies capable of scaling data rates past 30 Gb/s achieving excellent power efficiency in the 45 nm node and are primarily limited by coupling and device insertion losses. While MZM offers robust operation due to its wide optical bandwidth, significant improvements in power efficiency must be achieved to become applicable for high density applications

    Design Techniques for High Performance Serial Link Transceivers

    Get PDF
    Increasing data rates over electrical channels with significant frequency-dependent loss is difficult due to excessive inter-symbol interference (ISI). In order to achieve sufficient link margins at high rates, I/O system designers implement equalization in the transmitters and are motivated to consider more spectrally-efficient modulation formats relative to the common PAM-2 scheme, such as PAM-4 and duobinary. The first work, reviews when to consider PAM-4 and duobinary formats, as the modulation scheme which yields the highest system margins at a given data rate is a function of the channel loss profile, and presents a 20Gb/s triple-mode transmitter capable of efficiently implementing these three modulation schemes and three-tap feedforward equalization. A statistical link modeling tool, which models ISI, crosstalk, random noise, and timing jitter, is developed to compare the three common modulation formats operating on electrical backplane channel models. In order to improve duobinary modulation efficiency, a low-power quarter-rate duobinary precoder circuit is proposed which provides significant timing margin improvement relative to full-rate precoders. Also as serial I/O data rates scale above 10 Gb/s, crosstalk between neighboring channels degrades system bit-error rate (BER) performance. The next work presents receive-side circuitry which merges the cancellation of both near-end and far-end crosstalk (NEXT/FEXT) and can automatically adapt to different channel environments and variations in process, voltage, and temperature. NEXT cancellation is realized with a novel 3-tap FIR filter which combines two traditional FIR filter taps and a continuous-time band-pass filter IIR tap for efficient crosstalk cancellation, with all filter tap coefficients automatically determined via an ondie sign-sign least-mean-square (SS-LMS) adaptation engine. FEXT cancellation is realized by coupling the aggressor signal through a differentiator circuit whose gain is automatically adjusted with a power-detection-based adaptation loop. In conclusion, the proposed architectures in the transmitter side and receiver side together are to be good solution in the high speed I/O serial links to improve the performance by overcome the physical channel loss and adjacent channel noise as the system becomes complicated
    corecore