Design of Low-Power NRZ/PAM-4 Wireline Transmitters
Rapidly growing demand for instant multimedia access across a myriad of digital devices has pushed
the need for higher bandwidth in modern communication hardware, ranging from short-reach (SR)
memory/storage interfaces to long-reach (LR) data center Ethernet. At the same time, comprehensive
design optimization of the link system to meet energy-efficiency targets is required for mobile
computing and for low operational cost in data centers. This doctoral study consists of the design of two
low-swing wireline transmitters featuring low-power clock distribution and energy-efficient 2-tap
equalization at up to 20-Gb/s operation. In spite of the reduced signaling power in the
voltage-mode (VM) transmit driver, the presence of the segment selection logic still diminishes the
power saving benefit.
The first work presents a scalable VM transmitter which offers low static power dissipation
and adopts an impedance-modulated 2-tap equalizer with analog tap control, thereby obviating
driver segmentation and reducing pre-driver complexity and dynamic power. Per-channel quadrature
clock generation with injection-locked oscillators (ILO) allows the generation of rail-to-rail
quadrature clocks. Energy efficiency is further improved with capacitively driven low-swing global
clock distribution and supply scaling at lower data rates, while output eye quality is maintained at
low voltages with automatic phase calibration of the local ILO-generated quarter-rate clocks. A
prototype fabricated in a general purpose 65 nm CMOS process includes a 2 mm global clock
distribution network and two transmitters that support an output swing range of 100-300 mV with
up to 12 dB of equalization. The transmitters achieve 8-16 Gb/s operation at 0.65-1.05 pJ/b energy
efficiency.
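As a rough illustration of what "up to 12 dB of equalization" means for a 2-tap equalizer, the sketch below models a generic 2-tap feed-forward equalizer acting on NRZ symbols. It is a behavioural model under an assumed normalisation (main tap plus post-cursor tap equal to 1), not the impedance-modulated circuit described above; the function names and the tap weight `post` are illustrative.

```python
import math

def deemphasis_db(post):
    """De-emphasis of a 2-tap FFE with taps (1 - post, -post), 0 <= post < 0.5."""
    main = 1.0 - post
    # Transition bits reach main + post = 1; repeated bits settle at main - post.
    return 20.0 * math.log10((main + post) / (main - post))

def post_tap_for_db(db):
    """Inverse of deemphasis_db: post-cursor weight giving `db` of de-emphasis."""
    ratio = 10.0 ** (-db / 20.0)       # settled level / transition level
    return (1.0 - ratio) / 2.0

def ffe_2tap(bits, post):
    """Apply y[n] = main*x[n] - post*x[n-1] to NRZ symbols (+1/-1)."""
    main = 1.0 - post
    symbols = [1.0 if b else -1.0 for b in bits]
    prev = symbols[0]                   # assume the line idled at the first symbol
    out = []
    for s in symbols:
        out.append(main * s - post * prev)
        prev = s
    return out

post = post_tap_for_db(12.0)
levels = ffe_2tap([0, 0, 0, 1, 1, 1, 0], post)
```

In this model the first bit after a transition is sent at full swing, while repeated bits settle at roughly a quarter of that swing, which is the 12 dB ratio.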
The second work involves a dual-mode NRZ/PAM-4 differential low-swing voltage-mode (VM)
transmitter. The pulse-selected output multiplexing allows reduction of the power supply and of the deterministic
jitter caused by the large on-chip parasitics inherent in the transmission-gate-based multiplexers
of the earlier work. Analog impedance control replica circuits running in the background produce
gate-biasing voltages that control the peaking ratio for 2-tap feed-forward equalization and
PAM-4 symbol levels for high linearity. This analog control also allows for efficient generation of
the middle levels in PAM-4 operation with good linearity quantified by level separation mismatch
ratio of 95%. In NRZ mode, 2-tap feedforward equalization is configurable in high-performance
controlled-impedance or energy-efficient impedance-modulated settings to provide performance
scalability. Analytic design considerations on dynamic power, data rate, mismatch, and output
swing yield optimal performance metrics for the given technology node. The proof-of-concept
prototype is verified on silicon in a 65 nm CMOS process, with improved performance in speed
and energy efficiency owing to double-stack NMOS transistors in the output stage. The transmitter consumes as little as 29.6 mW in 20-Gb/s NRZ operation and 25.5 mW in 28-Gb/s PAM-4 operation.
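Two of the figures quoted in this abstract can be reproduced with short calculations. The sketch below converts the reported power and data rate into energy per bit, and evaluates a level separation mismatch ratio (RLM) using the definition commonly applied to PAM-4 transmitters in IEEE 802.3. Whether the thesis uses exactly this definition is an assumption, and the example level voltages are hypothetical.

```python
def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ/b: mW divided by Gb/s gives pJ/b directly."""
    return power_mw / rate_gbps

def rlm(v0, v1, v2, v3):
    """Level separation mismatch ratio for PAM-4 levels v0 < v1 < v2 < v3."""
    vmid = (v0 + v3) / 2.0
    es1 = (v1 - vmid) / (v0 - vmid)    # ideal value 1/3
    es2 = (v2 - vmid) / (v3 - vmid)    # ideal value 1/3
    return min(3 * es1, 3 * es2, 2 - 3 * es1, 2 - 3 * es2)

nrz = energy_per_bit_pj(29.6, 20.0)    # figures from the abstract
pam4 = energy_per_bit_pj(25.5, 28.0)

ideal = rlm(-150.0, -50.0, 50.0, 150.0)        # equally spaced levels (mV)
compressed = rlm(-150.0, -55.0, 50.0, 150.0)   # hypothetical: one inner level shifted
```

With the reported figures this gives about 1.48 pJ/b in NRZ mode and about 0.91 pJ/b in PAM-4 mode. Equally spaced levels give an RLM of 1, and any shift of an inner level lowers it.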
Design of Multi-Gigabit Network Interconnect Elements and Protocols for a Data Acquisition System in Radiation Environments
Modern High Energy Physics (HEP) experiments explore the fundamental nature
of matter in more depth than ever before and thereby benefit greatly from the
advances in the field of communication technology. The huge data volumes
generated by the increasingly precise detector setups pose severe problems for
the Data Acquisition Systems (DAQ), which are used to process and store this
information. In addition, detector setups and their read-out electronics need
to be synchronized precisely to allow a later correlation of experiment events
accurately in time. Moreover, the substantial presence of charged particles from
accelerator-generated beams results in strong ionizing radiation levels, which have
a severe impact on the electronic systems.
This thesis recommends an architecture for unified network protocol IP cores
with custom-developed physical interfaces for use in reliable data acquisition
systems in strong radiation environments. Specially configured serial bidirectional
point-to-point interconnects are proposed to realize high-speed data transmission,
slow control access, synchronization, and global clock distribution on unified links
to reduce costs and to obtain compact and efficient read-out setups. Special features
include functional units hardened against single and multiple
bit upsets, and a common interface for statistical error and diagnostic information,
which integrates well into the protocol capabilities and eases error handling in
large experiment setups. Many innovative designs for several custom FPGA and
ASIC platforms have been implemented and are described in detail. Special focus
is placed on the physical layers and network interface elements from high-speed
serial LVDS interconnects up to 20 Gb/s SSTL links in state-of-the-art process
technology.
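The abstract does not say which hardening techniques the functional units use. As a generic illustration only, the sketch below shows triple modular redundancy (TMR) with a bitwise majority voter, a common way to mask bit upsets in a stored word. It is not taken from the thesis, and the function names are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)

def flip_bit(word, position):
    """Model a single-event upset by inverting one bit."""
    return word ^ (1 << position)

stored = 0b1011_0010
copies = [stored, stored, stored]

copies[1] = flip_bit(copies[1], 5)      # upset in one copy
recovered = majority_vote(*copies)      # masked by the other two copies

copies[2] = flip_bit(copies[2], 0)      # second upset, different bit position
recovered_two = majority_vote(*copies)  # still masked

copies[0] = flip_bit(copies[0], 5)      # same bit now wrong in two copies
corrupted = majority_vote(*copies)      # the voter can no longer correct bit 5
```

The voter tolerates multiple upsets as long as no bit position is corrupted in more than one copy, which is why such schemes are usually paired with periodic rewriting of the voted value.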
The developed IP cores are fully tested by an adapted verification environment for
electronic design automation tools and also by live application. They are available
in a global repository, allowing broad use in further HEP experiments.
Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process
With the continuous increase of on-chip computation capacity and the exponential growth of data-intensive applications, high-speed data transmission through serial links has become the backbone of modern communication systems. To satisfy the massive data-exchange requirement, the data rate of such serial links has risen from several Gb/s to tens of Gb/s. Currently, commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G are developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrication due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques capable of exploiting the maximum operation speed of the CMOS process, and hence to provide potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows.
A low-jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state machine is implemented in a 65-nm CMOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the conversion of device noise to phase noise. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. Meanwhile, a lock-loss detection and lock recovery scheme is devised to endow the RILCM with a lock-acquisition ability similar to that of a conventional PLL, thus removing the need for an initial frequency set-up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs.
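The abstract does not define its figure-of-merit. A minimal sketch, assuming the jitter-power figure-of-merit widely used for PLLs and clock multipliers, FoM = 10*log10[(sigma_t / 1 s)^2 * (P / 1 mW)], is given below. The jitter and power values are hypothetical and are not measurements from the thesis.

```python
import math

def jitter_fom_db(rms_jitter_s, power_mw):
    """Jitter-power FoM in dB: 10*log10((sigma_t / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10((rms_jitter_s ** 2) * power_mw)

# Hypothetical operating points, for illustration only.
fom_a = jitter_fom_db(200e-15, 10.0)   # 200 fs rms jitter at 10 mW
fom_b = jitter_fom_db(100e-15, 10.0)   # half the jitter at the same power
improvement_db = fom_a - fom_b          # about 6 dB better (more negative) FoM
```

Under this definition a more negative value is better: halving the rms jitter at constant power improves the figure by about 6 dB, while halving the power at constant jitter improves it by about 3 dB.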
The transmitter (TX) and receiver (RX) chips are separately designed and fabricated in a 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX capable of eliminating the charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery circuit to extract the sampling clocks and retime the incoming data. To automatically balance jitter tracking and jitter suppression, passive low-pass filters with adaptively adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE's tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW.
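To see how a 4-tap FFE trades low-frequency amplitude for relative boost at the half-baud frequency, the sketch below evaluates the magnitude response of a baud-spaced FIR filter. The tap weights are hypothetical (one pre-cursor, one main, and two post-cursor taps, normalised so their absolute values sum to 1) and are not the values produced by the adaptation algorithm described above.

```python
import cmath
import math

def fir_gain(taps, f_over_baud):
    """|H(f)| of a baud-spaced FIR filter; f is given as a fraction of the baud rate."""
    h = sum(c * cmath.exp(-2j * math.pi * f_over_baud * k)
            for k, c in enumerate(taps))
    return abs(h)

def boost_db(taps):
    """Gain at the half-baud frequency relative to DC, in dB."""
    return 20.0 * math.log10(fir_gain(taps, 0.5) / fir_gain(taps, 0.0))

taps = [-0.10, 0.60, -0.25, 0.05]    # hypothetical weights, sum of |c| = 1
boost = boost_db(taps)

channel_loss_db = 16.0                                # loss quoted in the abstract
amplitude_ratio = 10.0 ** (-channel_loss_db / 20.0)   # about 0.16 of the input swing
residual_db = channel_loss_db - boost                 # left for the RX CTLE in this toy case
```

With these example taps the FFE provides roughly 10.5 dB of relative boost, so in this toy budget the remaining 5.5 dB or so would fall to the receiver-side CTLE. The real split between TX-FFE and RX-CTLE is not given in the abstract.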
Reducing jitter utilising adaptive pre-emphasis FIR filter for high speed serial links
Jitter requirements have become more stringent with higher speed serial communication links. A method for reducing jitter, with the main focus on reducing data-dependent jitter (DDJ), is presented that employs adaptive finite impulse response (FIR) filter pre-emphasis. The adaptive FIR pre-emphasis is implemented in the IBM 7WL 0.18 µm SiGe BiCMOS process. SiGe heterojunction bipolar transistors (HBTs) provide high-bandwidth, low-noise devices which could reduce the total system jitter. The trade-offs between utilising metal oxide semiconductor (MOS) current mode logic (CML) and SiGe bipolar CML are also discussed in comparison with a very high fT process (IBM 8HP, with fT = 200 GHz). A reduction in total system jitter can be achieved by keeping the sub-components of the system jitter constant while optimising the DDJ. High speed CML circuits have been employed to allow data rates in excess of 5 Gb/s to be transmitted whilst still maintaining an internal voltage swing of at least 300 mV. This allows the final FIR filter adaptation scheme to minimise the DDJ to within 12.5 % of a unit interval, at a data rate of 5 Gb/s with 6 FIR pre-emphasis filter taps, for a worst case copper backplane channel (30" FR-4 channel). The implemented integrated circuit (IC) designed as part of the verification process takes up less than 1 mm² of silicon real estate. In this dissertation, SPICE simulation results are presented, as well as the novel IC implementation of the proposed FIR filter adaptation technique as part of the hypothesis verification procedure. The implemented transmitter and receiver were tested for functionality, and showed the successful functional behaviour of all the implemented CML gates associated with the first filter tap. However, due to the slow charge and discharge rate of the pulse generation circuit in both the transmitter and receiver, only the main operational state of the transmitter could be experimentally validated.
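The DDJ target can be put in absolute terms with a short calculation. The sketch below converts a jitter budget expressed as a fraction of a unit interval (UI) into picoseconds, assuming one bit per UI; the function names are illustrative.

```python
def unit_interval_ps(rate_gbps):
    """Unit interval in picoseconds for a given bit rate in Gb/s."""
    return 1000.0 / rate_gbps

def jitter_budget_ps(rate_gbps, fraction_of_ui):
    """Jitter budget in picoseconds for a fraction of one UI."""
    return unit_interval_ps(rate_gbps) * fraction_of_ui

ui = unit_interval_ps(5.0)                 # 200 ps at 5 Gb/s
ddj_limit = jitter_budget_ps(5.0, 0.125)   # 12.5 % of a UI is 25 ps
```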
As a result of the adaptation scheme implemented, the contribution of this research is that a designer utilising such an IC can optimise the DDJ, reducing the total system jitter and hence increasing the data fidelity, with minimal effort.
Dissertation (MEng), University of Pretoria, 2011. Electrical, Electronic and Computer Engineering.
Topical Workshop on Electronics for Particle Physics
The purpose of the workshop was to present results and original concepts for electronics research and development relevant to particle physics experiments as well as accelerator and beam instrumentation at future facilities; to review the status of electronics for the LHC experiments; to identify and encourage common efforts for the development of electronics; and to promote information exchange and collaboration in the relevant engineering and physics communities.