14 research outputs found

    Design Techniques for High Performance Serial Link Transceivers

    Increasing data rates over electrical channels with significant frequency-dependent loss is difficult due to excessive inter-symbol interference (ISI). In order to achieve sufficient link margins at high rates, I/O system designers implement equalization in the transmitters and are motivated to consider modulation formats that are more spectrally efficient than the common PAM-2 scheme, such as PAM-4 and duobinary. The first work reviews when to consider PAM-4 and duobinary formats, as the modulation scheme which yields the highest system margins at a given data rate is a function of the channel loss profile, and presents a 20 Gb/s triple-mode transmitter capable of efficiently implementing these three modulation schemes and three-tap feedforward equalization. A statistical link modeling tool, which models ISI, crosstalk, random noise, and timing jitter, is developed to compare the three common modulation formats operating on electrical backplane channel models. In order to improve duobinary modulation efficiency, a low-power quarter-rate duobinary precoder circuit is proposed which provides significant timing margin improvement relative to full-rate precoders. Also, as serial I/O data rates scale above 10 Gb/s, crosstalk between neighboring channels degrades system bit-error rate (BER) performance. The next work presents receive-side circuitry which merges the cancellation of both near-end and far-end crosstalk (NEXT/FEXT) and can automatically adapt to different channel environments and variations in process, voltage, and temperature. NEXT cancellation is realized with a novel 3-tap FIR filter which combines two traditional FIR filter taps and a continuous-time band-pass filter IIR tap for efficient crosstalk cancellation, with all filter tap coefficients automatically determined via an on-die sign-sign least-mean-square (SS-LMS) adaptation engine.
FEXT cancellation is realized by coupling the aggressor signal through a differentiator circuit whose gain is automatically adjusted with a power-detection-based adaptation loop. In conclusion, the proposed transmitter-side and receiver-side architectures together offer a good solution for high-speed serial I/O links, improving performance by overcoming physical channel loss and adjacent-channel noise as systems become more complicated.
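As an illustration of the duobinary format mentioned in this abstract, the following is a minimal bit-level sketch of textbook duobinary signalling: an XOR precoder, a "current plus previous" three-level encoder, and a memoryless modulo-2 decoder. It is a full-rate behavioural model, not the quarter-rate precoder circuit the thesis proposes, and the function names and the sample bit pattern are assumptions made for this example.

```python
from typing import List


def duobinary_precode(bits: List[int], initial: int = 0) -> List[int]:
    """Precoder: p[n] = d[n] XOR p[n-1], starting from an assumed initial state."""
    out = []
    prev = initial
    for d in bits:
        prev = d ^ prev
        out.append(prev)
    return out


def duobinary_encode(precoded: List[int], initial: int = 0) -> List[int]:
    """Duobinary signalling: y[n] = p[n] + p[n-1], giving the three levels 0, 1, 2."""
    out = []
    prev = initial
    for p in precoded:
        out.append(p + prev)
        prev = p
    return out


def duobinary_decode(levels: List[int]) -> List[int]:
    """Memoryless decision enabled by the precoder: d[n] = y[n] mod 2."""
    return [y % 2 for y in levels]


# Hypothetical test pattern, chosen only for illustration.
data = [1, 0, 1, 1, 0, 0, 1, 0]
levels = duobinary_encode(duobinary_precode(data))
recovered = duobinary_decode(levels)
print(levels, recovered == data)
```

Because the precoder undoes the encoder's memory, each received level can be decided on its own, so a single decision error does not propagate to later bits.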

    High-Speed Low-Voltage Line Driver for SerDes Applications

    The driving factor behind this research was to design and develop a line driver capable of meeting the demanding specifications of the next generation of SerDes devices. In this thesis, various line driver topologies were analysed to identify a topology suited to a high-speed, low-voltage operating environment. This thesis starts off by introducing a relatively new high-speed communication device called SerDes. SerDes is used in wired chip-to-chip communications and operates by converting a parallel data stream into a serial data stream that can then be transmitted at a higher bit rate; existing SerDes devices operate at up to 12.5 Gbps. A matching SerDes device at the destination then converts the serial data stream back into a parallel data stream to be read by the destination ASIC. SerDes typically uses a line driver with a differential output. Using a differential line driver increases resilience to outside sources of noise and reduces the amount of EM radiation produced by transmission. The focus of this research is to design and develop a line driver that can operate at 40 Gbps and can function with a power supply of less than 1 V. This demanding specification was judged to be an accurate representation of the future requirements that a line driver in a SerDes device will have to meet. A suitable line driver with a differential output was identified to meet the demanding specifications and was modified so that it can perform an equalisation technique called pre-distortion. Two variations of the new topology were outlined and a behavioural model was created for both using Matlab Simulink. The behavioural models for both variants proved the concept; however, only one variant maintained its performance once the designs were implemented at transistor level in Cadence, using a 65 nm CMOS technology provided by Texas Instruments.
The final line driver design was then converted into a layout design, again using Cadence, and RC parasitics were extracted to perform a post-layout simulation. The post-layout simulation shows that the novel line driver can operate at 40 Gbps with a power supply of 1 V to 0.8 V and has a power consumption of 4.54 mW/Gbps. The deterministic jitter added by the line driver is 12.9 ps.
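To put the reported figures in context, the short calculation below converts them into total power and jitter as a fraction of the unit interval. It is only a sketch: it assumes the 4.54 mW/Gbps efficiency applies at the full 40 Gbps rate and that signalling is binary, so one unit interval equals one bit period. Neither assumption is stated in the abstract, and the variable names are chosen for this example.

```python
# Figures quoted in the abstract.
data_rate_gbps = 40.0
efficiency_mw_per_gbps = 4.54
deterministic_jitter_ps = 12.9

# Assumption: the quoted efficiency holds at the full data rate.
total_power_mw = efficiency_mw_per_gbps * data_rate_gbps

# mW/Gbps is numerically identical to pJ/bit (1e-3 W / 1e9 bit/s = 1e-12 J/bit).
energy_per_bit_pj = efficiency_mw_per_gbps

# Assumption: binary signalling, so the unit interval is one bit period.
unit_interval_ps = 1000.0 / data_rate_gbps
dj_fraction_of_ui = deterministic_jitter_ps / unit_interval_ps

print(f"total power ~ {total_power_mw:.1f} mW")
print(f"unit interval = {unit_interval_ps:.1f} ps")
print(f"deterministic jitter ~ {dj_fraction_of_ui:.2f} UI")
```

Under these assumptions the driver would draw about 181.6 mW, and the 12.9 ps of deterministic jitter would occupy roughly half of the 25 ps unit interval.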

    Bridging the gap: an optimization-based framework for fast, simultaneous circuit & system design space exploration

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 107-110). Design of modern mixed-signal integrated circuits is becoming increasingly difficult. Continued MOSFET scaling is approaching the global power dissipation limits while increasing transistor variability, thus requiring careful allocation of power and area resources to achieve increasingly aggressive performance specifications. In this tightly constrained environment, the traditional iterative system-to-circuit redesign loop is becoming inefficient. With complex system architectures and circuit specifications approaching the technological limits of the process employed, designers have less room to leave margin for the overhead of strict system and circuit design interdependencies. Severely constrained modern mixed-signal IC design can take many iterations to converge in such a design flow. This is an expensive and time-consuming process. The situation is particularly acute in high-speed links. As they are an important building block of many systems (high-speed I/O, on-chip communication, ...), their power efficiency and area footprint are of utmost importance. Design of these systems is challenging in both the system and circuit domains. On one hand, system architectures are becoming increasingly complex to provide the necessary performance increase. On the other, circuit implementation of these increasingly complicated systems is difficult to achieve under tight power and area budgets. To bridge this gap between system and circuit design, we formulate a circuit-to-system optimization-driven framework. It is an equation-based description, powered by a human designer. Provided with an equation-based model, we use fast optimization tools to quickly scout the available design space.
The presence of a designer in the flow is an invaluable resource, enabling significant savings by simplifying the models to capture only the relevant information and constraining the search space to areas where meaningful solutions might be expected to be found. Thus, the computational overhead that plagues simulation-based design space exploration and design optimization is greatly reduced. The flow is powered by a signomial optimization engine. The key challenge is to bring problems that are very different from the modeling point of view, such as circuit design and system design, into the realm of an optimization engine that can solve them jointly, thus breaking the redesign loop or at least shortening it. Relying on signomial programming is necessary in order to accurately model all the necessary phenomena that arise in electrical circuits and at system level. For example, the regions of operation of transistors under polarization conditions cannot be modeled accurately with simpler types of equations. Similarly, calculating the effect of filtering on a signal also requires the ability to handle signomial equations. Thus, signomial programming is necessary yet not fully explored, and finding a suitable formulation might take some experimenting, as we will see in this thesis. Signomial programming, as a general non-convex optimization problem, is still an active research area. Most of the solutions proposed so far involve local convexification of the problem in addition to a branch-and-bound type of search. Furthermore, most non-convex problems are solved for one particular system of equations, and a general methodology that is reliable and efficient is not known. Thus, a big part of the work presented in this thesis details how to construct a system formulation that the optimization engine can solve efficiently and reliably. We tested different formulations and measured their performance in terms of parsing and solving speed and accuracy.
From these tests we motivate and explain how a series of transformations we introduce improves our formulation and arrives at a well-behaved and reliable form. We show how to apply our design flow in high-speed link design. By restructuring the traditional design flow we derive system and circuit abstractions. These sub-problems are interfaced through a set of well-defined interface variables, which enables code-level separation of problem descriptions, thus building a modular system and circuit model that is easy to read and maintain. Finally, we develop a set of scripts to automate the formulation of a parametrized system-level description. We explain how our transformations influence the speed of this process as well as the size of the model produced. By Ranko Sredojević. S.M.
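The remark that signomial programming is a general non-convex problem can be made concrete with a small numerical check. The sketch below applies the logarithmic change of variables used in geometric programming, g(u) = log f(exp(u)), to one posynomial (all coefficients positive) and one signomial (a negative coefficient present), then tests the midpoint convexity inequality on a single pair of points. Both example functions and the test points are assumptions chosen for illustration and are not taken from the thesis. A single passing pair does not prove convexity, but a single failing pair does disprove it.

```python
import math
from typing import Callable


def log_domain(f: Callable[[float], float]) -> Callable[[float], float]:
    """Return g(u) = log f(exp(u)), the change of variables used in geometric programming."""
    return lambda u: math.log(f(math.exp(u)))


def midpoint_convex(g: Callable[[float], float], u1: float, u2: float,
                    tol: float = 1e-12) -> bool:
    """Check g((u1 + u2) / 2) <= (g(u1) + g(u2)) / 2 for one pair of points."""
    return g(0.5 * (u1 + u2)) <= 0.5 * (g(u1) + g(u2)) + tol


def posynomial(x: float) -> float:
    """Hypothetical posynomial: every coefficient is positive."""
    return x + 1.0 / x


def signomial(x: float) -> float:
    """Hypothetical signomial: one negative coefficient; still positive for all x > 0."""
    return x * x - x + 1.0


u1, u2 = math.log(0.01), math.log(0.2)
posy_ok = midpoint_convex(log_domain(posynomial), u1, u2)
sig_ok = midpoint_convex(log_domain(signomial), u1, u2)
print("posynomial passes midpoint test:", posy_ok)
print("signomial passes midpoint test:", sig_ok)
```

The posynomial passes the test while the signomial fails it, which is consistent with the abstract's point that signomial formulations need local convexification or branch-and-bound style search rather than a single convex solve.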

    Research and design of high-speed advanced analogue front-ends for fibre-optic transmission systems

    In the last decade, we have witnessed the emergence of large, warehouse-scale data centres which have enabled new internet-based software applications such as cloud computing, search engines, social media, e-government etc. Such data centres consist of large collections of servers interconnected using short-reach (reach up to a few hundred meters) optical interconnect. Today, transceivers for these applications achieve up to 100 Gb/s by multiplexing 10x 10 Gb/s or 4x 25 Gb/s channels. In the near future, however, data centre operators have expressed a need for optical links which can support 400 Gb/s up to 1 Tb/s. The crucial challenge is to achieve this in the same footprint (same transceiver module) and with similar power consumption as today's technology. Straightforward scaling of the currently used space or wavelength division multiplexing may be difficult to achieve: indeed, a 1 Tb/s transceiver would require integration of 40 VCSELs (vertical cavity surface emitting laser diodes, widely used for short-reach optical interconnect), 40 photodiodes and the electronics operating at 25 Gb/s in the same module as today's 100 Gb/s transceiver. Pushing the bit rate on such links beyond today's commercially available 100 Gb/s/fibre will require new generations of VCSELs and their driver and receiver electronics. This work looks into a number of state-of-the-art technologies, investigates their performance restraints and recommends a different set of designs, specifically targeting multilevel modulation formats. Several methods to extend the bandwidth using deep submicron (65 nm and 28 nm) CMOS technology are explored in this work, while also maintaining a focus upon reducing power consumption and chip area. The techniques used were pre-emphasis on the rising and falling edges of the signal and bandwidth extension by inductive peaking and different local feedback techniques.
These techniques have been applied to a transmitter and receiver developed for advanced modulation formats such as PAM-4 (4-level pulse amplitude modulation). Such a modulation format can increase the throughput per individual channel, which helps to overcome the challenges mentioned above to realize 400 Gb/s to 1 Tb/s transceivers.
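The lane-count argument in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the symbol rate per lane stays at 25 GBd and ignores coding and framing overhead; the helper name `lanes_needed` is hypothetical. It reproduces the figure of 40 lanes for 1 Tb/s with two-level signalling and shows how PAM-4, which carries two bits per symbol, would halve that count under the same assumptions.

```python
import math


def lanes_needed(target_gbps: float, symbol_rate_gbaud: float, levels: int) -> int:
    """Number of parallel lanes required, ignoring coding and framing overhead."""
    bits_per_symbol = math.log2(levels)
    lane_rate_gbps = symbol_rate_gbaud * bits_per_symbol
    return math.ceil(target_gbps / lane_rate_gbps)


# Assumption: 25 GBd per lane in both cases.
nrz_lanes = lanes_needed(1000, 25, levels=2)
pam4_lanes = lanes_needed(1000, 25, levels=4)
print("two-level lanes for 1 Tb/s:", nrz_lanes)
print("PAM-4 lanes for 1 Tb/s:", pam4_lanes)
```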

    Exploration and Design of High Performance Variation Tolerant On-Chip Interconnects

    Transferred from Doria

    Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

    With the continuous increase of on-chip computation capacities and the exponential growth of data-intensive applications, high-speed data transmission through serial links has become the backbone of modern communication systems. To satisfy the massive data-exchange requirement, the data rate of such serial links has risen from several Gb/s to tens of Gb/s. Currently, commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G are moving towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrication due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low-jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state machine is implemented in a 65-nm CMOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the conversion of device noise to phase noise. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed.
Meanwhile, a lock-loss detection and lock recovery scheme is devised to endow the RILCM with a lock-acquisition ability similar to that of a conventional PLL, thus excluding the initial frequency set-up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fabricated in a 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating the charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery circuit to extract the sampling clocks and retime the incoming data. To automatically balance jitter tracking and jitter suppression, passive low-pass filters with adaptively adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE's tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW.
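The 4-tap feed-forward equalizer described in this abstract can be pictured as a short FIR filter applied to the transmitted symbols. The sketch below is a full-rate behavioural model only, not the quarter-rate MUX-based circuit of the thesis. The tap weights (one pre-cursor, one main, two post-cursor), the bit pattern, and the function name `ffe` are hypothetical values chosen for illustration; the weights are picked so that their absolute values sum to 1, a common way to model a fixed peak output swing.

```python
from typing import List, Sequence


def ffe(symbols: Sequence[float], taps: Sequence[float]) -> List[float]:
    """FIR pre-distortion: y[n] = sum_k taps[k] * x[n-k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, w in enumerate(taps):
            if n - k >= 0:
                acc += w * symbols[n - k]
        out.append(acc)
    return out


# Hypothetical tap weights: pre-cursor, main cursor, two post-cursors.
taps = [-0.10, 0.70, -0.15, -0.05]

# Hypothetical bit pattern mapped to +1 / -1 symbols.
bits = [1, 1, 1, 0, 0, 1, 0, 1]
symbols = [2 * b - 1 for b in bits]

tx = ffe(symbols, taps)
peak = max(abs(v) for v in tx)
print([round(v, 2) for v in tx], "peak =", round(peak, 2))
```

In this model, runs of identical symbols are transmitted at reduced amplitude while transitions are emphasised, which pre-compensates for a channel that attenuates high frequencies more than low ones.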

    Belle II Technical Design Report

    The Belle detector at the KEKB electron-positron collider has collected almost 1 billion Y(4S) events in its decade of operation. Super-KEKB, an upgrade of KEKB, is under construction to increase the luminosity by two orders of magnitude during a three-year shutdown, with an ultimate goal of 8E35 /cm^2 /s luminosity. To exploit the increased luminosity, an upgrade of the Belle detector has been proposed. A new international collaboration, Belle-II, is being formed. The Technical Design Report presents the physics motivation, the basic methods of the accelerator upgrade, and key improvements of the detector. Comment: Edited by: Z. Doležal and S. Un

    Hardware and Methods for Scaling Up Quantum Information Experiments

    Quantum computation promises to solve presently intractable problems, with hopes of yielding solutions to pressing issues for society. Despite this, current machines are limited to tens of qubits. The field is in a state of continuous scaling, with groups around the world working on all aspects of this problem. The work of this thesis aims to contribute to this effort. It is motivated by the goal of increasing both the speed and bandwidth of experiments conducted within our laboratory. Low-loss radio-frequency multiplexers were characterised at cryogenic temperatures, with some shown to operate below 7 mK. The Analog Devices ADG904 was one of these, and its insertion loss was measured at <0.5 dB up to 2 GHz. Their heat load was measured, and it was found that a switching speed of 10 MHz with an RF signal power of -30 dBm dissipates 43 uW. Installing these switches yields a benefit over installing extra cabling in our cryostat for a switching speed of up to 2 MHz and an RF power of -30 dBm. A switch matrix was prototyped for cryogenic operation, enabling re-routing of wiring inside a cryostat with a minimally increased thermal load. This could be used to significantly increase the scale of high-frequency experiments. This switch has also been embedded within a calibration routine, facilitating measurement of a specific feature of interest at millikelvin temperatures. As the field of quantum engineering scales, such measurements will be crucial to close the loop, providing feedback to fabrication and semiconductor growth efforts. Finally, a rapid-turnaround test rig has been developed which has 32 high-frequency and 100 DC lines, enabling tests of significant scale in liquid helium.
This reduces the time per experiment at 4.2 K to hours rather than days, enabling tests such as thermal cycling, as well as the evaluation of on-chip structures, active electronics and classical computing hardware, which are all necessary elements of any solid-state quantum computing architecture.