110 research outputs found
An Energy-Efficient Reconfigurable Mobile Memory Interface for Computing Systems
The need for transceivers with higher power efficiency and bandwidth has grown significantly as mobile devices, such as smart phones, laptops, tablets, and ultra-portable personal digital assistants, continue to be built from heterogeneous intellectual properties such as central processing units (CPUs), graphics processing units (GPUs), digital signal processors, dynamic random-access memories (DRAMs), sensors, and graphics/image processing units, and to gain enhanced graphic computing and video processing capabilities. However, the current mobile interface technologies that support CPU-to-memory communication (e.g. baseband-only signaling) have critical limitations, particularly super-linear energy consumption, limited bandwidth, and non-reconfigurable data access. As a consequence, there is a critical need to improve both energy efficiency and bandwidth for future mobile devices. The primary goal of this study is to design an energy-efficient reconfigurable mobile memory interface for mobile computing systems in order to dramatically enhance circuit and system bandwidth and power efficiency. The proposed energy-efficient mobile memory interface, which utilizes advanced baseband (BB) signaling and RF-band signaling, is capable of simultaneous bi-directional communication and reconfigurable data access. It also increases power efficiency and bandwidth between mobile CPUs and memory subsystems on a single-ended shared transmission line. Moreover, because multiple data streams share a single-ended transmission line, the number of transmission lines between the mobile CPU and memories is considerably reduced, enabling more compact devices and lower-cost packaging for the mobile communication interface, and establishing the principles and feasibility of technologies for future mobile system applications.
The operation and performance of the proposed transceiver are analyzed and its circuit implementation is discussed in detail. A chip prototype of the transceiver was implemented in a 65nm CMOS process technology. In measurements, the transceiver exhibits higher aggregate data throughput and better energy efficiency than prior works.
High-Speed and Low-Energy On-Chip Communication Circuits.
Continuous technology scaling sharply reduces transistor delays, while fixed-length global wire delays have increased due to less wiring pitch with higher resistance and coupling capacitance. Due to this ever growing gap, long on-chip interconnects pose well-known latency, bandwidth, and energy challenges to high-performance VLSI systems. Repeaters effectively mitigate wire RC effects but do little to improve their energy costs. Moreover, the increased complexity and high level of integration requires higher wire densities, worsening crosstalk noise and power consumption of conventionally repeated interconnects.
Such increasing concerns in global on-chip wires motivate circuits to improve wire performance and energy while reducing the number of repeaters. This work presents circuit techniques and investigation for high-performance and energy-efficient on-chip communication in the aspects of encoding, data compression, self-timed current injection, signal pre-emphasis, low-swing signaling, and technology mapping. The improved bus designs also consider the constraints of robust operation and performance/energy gains across process corners and design space. Measurement results from 5mm links on 65nm and 90nm prototype chips validate 2.5-3X improvement in energy-delay product.
Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/75800/1/jseo_1.pd
Silicon Photonics for All-Optical Processing and High-Bandwidth-Density Interconnects
Silicon photonics has emerged in recent years as one of the leading technologies poised to enable penetration of optical communications deeper and more intimately into computing systems than ever before. The integration potential of power-efficient WDM links at the first-level package or even deeper has been a strong driver for the rapid development this field has seen in recent years. The integration of photonic communication modules with very high bandwidth densities and virtually no bandwidth-distance limitations at the short-reach regime of high-performance computers and data centers has the potential to alleviate many of the bandwidth bottlenecks currently faced at the board, rack, and facility levels. While networks on chip for chip multiprocessors (CMPs) were initially deemed the target application of silicon photonic components, it has become evident in recent years that the initial lower-hanging fruit is the CMP's I/O links to memory as well as to other CMPs. The first chapter of the thesis provides more detailed motivation for the integration of silicon photonic modules into compute systems and surveys some of the recent developments in the field. The second chapter then details a technical case study of the scalability and power efficiency of silicon photonic microring-based WDM links for these chip I/O applications, which could be developed in the intermediate future. The analysis, initiated originally for a workshop on optical and electrical board- and rack-level interconnects, looks into a detailed model of the optical power budget for such a link, capturing both single-channel aspects and WDM-operation-related considerations that are unique to a microring's physical characteristics.
The holistic analysis for the full link captures the wavelength-channel-spacing-dependent characteristics, provides some methodologies for device design in the WDM-operation context, and provides performance predictions based on current best-of-class silicon photonic devices. The key results of the analysis are the determination of upper bounds on the aggregate achievable communication bandwidth per link, the identification of design trade-offs for bandwidth versus power efficiency, and the need for continued technological improvements in both laser and photodetector technologies to allow such systems to operate with acceptable power efficiency. The third chapter, while continuing on the theme of silicon photonic high-bandwidth-density links, details the first experimental demonstration and characterization of an on-chip spatial division multiplexing (SDM) scheme based on microrings for the multiplexing and demultiplexing functionalities. In the context of more forward-looking optical network-on-chip environments, SDM-enabled WDM photonic interconnects can potentially achieve superior bandwidth densities per waveguide compared to WDM-only photonic interconnects. The microring-based implementation allows dynamic tuning of the multiplexing and demultiplexing characteristics of the system, which allows operation on a WDM grid as well as device tuning to combat intra-channel crosstalk. The characterization focuses on the first reported power penalty measurements for an on-chip silicon photonic SDM link, showing minimal penalties achievable with 3 spatial modes concurrently operating on a single waveguide with 10-Gb/s data carried by each mode. The chapter also details the first demonstration of WDM combined with SDM operation with six separate wavelength-and-spatial 10-Gb/s channels with error-free operation and low power penalties.
The fourth, fifth, and sixth chapters shift in topic from the application of silicon photonics to communication links to the evolving use of silicon waveguides for nonlinear all-optical processing. The unique tight mode confinement in sub-micron cross-sections combined with the high response of silicon have motivated the development of four-wave mixing (FWM)-based silicon processing devices. The key feature of the silicon platform for nonlinear processing is the ability to finely and uniformly control the dispersive properties of the optical structures in a way that completely offsets the material dispersion and achieves the dispersion profiles required for effective parametric interaction of waves in the optical structures. Chapter four primarily introduces and motivates nonlinear processing in communication applications and focuses on recent achievements in non-silicon and silicon FWM platforms. Chapter five describes some of the author's contributions on parametric processing of high-speed data in silicon nonlinear devices, with first-of-a-kind demonstrations of wavelength conversion of 160-Gb/s optically time division multiplexed (OTDM) data as well as the wavelength multicasting of a 320-Gb/s OTDM stream. The chapter then details a methodical characterization and demonstration of several record wavelength conversion experiments in silicon: 40-Gb/s data wavelength-converted across more than 100 nm with only 1.4 dB of power penalty, and wavelength and format conversion of 10-Gb/s data across up to 168 nm, with sensitivity gains of about 2 dB stemming from the format conversion and a residual conversion penalty of only 0.1 dB, achieved by implementing an improved experimental setup. Both experiments highlight the performance uniformity of the conversion process for a wide range of probe-idler detuning settings, showcasing the silicon platform's unique broadband phase matching properties.
The sixth chapter presents a slight shift in motivation for parametric processing, from traditional telecom-wavelength applications to functionalities targeting mid-IR operation. Parametric processing in the silicon platform at long wavelengths holds large potential for performance improvements, due to the elimination of two-photon absorption in silicon at long wavelengths as well as silicon's dispersion engineering capabilities, which uniquely position the silicon platform for effective phase matching of significantly wavelength-detuned waves. Four-wave mixing signal generation and reception at mid-IR wavelengths are attractive candidates for tunable, flexible operation with modulation and detection speeds that are currently available only at telecom wavelengths. With this vision in mind, several contributions are presented that extend FWM functionalities in silicon to operate at wavelengths close to 2 μm with performance equivalent to measurements at much smaller detuning settings. The contributions detail the experimental demonstration of the first silicon optical processing functionalities achieved at such long wavelengths, including the wavelength conversion and unicast of 10-Gb/s signals with up to 700 nm of probe-idler detuning; the combined two-stage 10-Gb/s FWM link in which both data generation and detection at 1900 nm are facilitated by parametric processing in silicon with only 2.1-dB overall penalty; the first ever 40-Gb/s receiver at 1900 nm based on a FWM stage for simultaneous temporal demultiplexing and wavelength conversion; and lastly, the demonstration of 40-Gb/s FWM-link operation with only 3.6 dB of penalty. The chapter concludes with a short discussion on possible extensions to enable silicon parametric processing at even longer wavelengths, targeting the mid-IR spectral transmission window of 3-5 μm.
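The wavelength-conversion results above depend on where the idler lands relative to the pump and probe. As a rough illustration only, the sketch below assumes degenerate (single-pump) FWM, where energy conservation gives 2/λ_pump = 1/λ_probe + 1/λ_idler. The function name and the example wavelengths are hypothetical and are not taken from the thesis.

```python
def idler_wavelength_nm(pump_nm: float, probe_nm: float) -> float:
    """Idler wavelength for degenerate FWM (assumed single pump).

    Energy conservation: 2/lambda_pump = 1/lambda_probe + 1/lambda_idler.
    """
    inv_idler = 2.0 / pump_nm - 1.0 / probe_nm
    if inv_idler <= 0:
        raise ValueError("no physical idler for this pump/probe pair")
    return 1.0 / inv_idler


# Hypothetical example values, not measurements from the thesis.
pump_nm, probe_nm = 1600.0, 1400.0
idler_nm = idler_wavelength_nm(pump_nm, probe_nm)
detuning_nm = abs(idler_nm - probe_nm)
print(f"idler at {idler_nm:.1f} nm, probe-idler detuning {detuning_nm:.1f} nm")
```

The relation fixes only the idler position; conversion efficiency depends on phase matching, which this sketch does not model.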
Chip-based Brillouin processing for carrier recovery in coherent optical communications
Modern fiber-optic coherent communications employ advanced
spectrally-efficient modulation formats that require sophisticated narrow
linewidth local oscillators (LOs) and complex digital signal processing (DSP).
Here, we establish a novel approach to carrier recovery harnessing large-gain
stimulated Brillouin scattering (SBS) on a photonic chip for up to 116.82
Gbit/sec self-coherent optical signals, eliminating the need for a separate LO.
In contrast to SBS processing on-fiber, our solution provides phase and
polarization stability while the narrow SBS linewidth allows for a
record-breaking small guardband of ~265 MHz, resulting in higher
spectral-efficiency than benchmark self-coherent schemes. This approach reveals
comparable performance to state-of-the-art coherent optical receivers without
requiring advanced DSP. Our demonstration develops a low-noise and
frequency-preserving filter that synchronously regenerates a low-power
narrowband optical tone that could relax the requirements on very-high-order
modulation signaling and be useful in long-baseline interferometry for
precision optical timing or reconstructing a reference tone for quantum-state
measurements. Comment: Part of this work has been presented as a postdeadline paper at CLEO
Pacific-Rim'2017 and OSA Optic
Mapping multiplexing technique (MMT): a novel intensity modulated transmission format for high-speed optical communication systems
Data center deployment is growing rapidly, driven mainly by increasing demand for internet services such as video streaming, e-commerce, the Internet of Things (IoT), social media, and cloud computing. As a result, data centers are experiencing a rapid increase in the network traffic they must sustain, since it is required to scale with the processing speed of complementary metal–oxide–semiconductor (CMOS) technology. At the same time, as more data centers and processing cores are demanded, power consumption is becoming a challenging issue. Unless novel power-efficient methodologies are developed, the information technology industry will be increasingly exposed to a future power crunch. Low-complexity transmission formats that offer both power efficiency and low cost are therefore considered key enablers of large-scale, high-performance data transmission for short-haul optical interconnects and metropolitan-range data networks.
In this thesis, a novel high-speed Intensity-Modulated Direct-Detection (IM/DD) transmission format named “Mapping Multiplexing Technique (MMT)” is proposed for high-speed optical fiber networks. Conceptually, the MMT design addresses the high power consumption of high-speed short- and medium-range networks. The proposed scheme provides a low-complexity means of increasing the power efficiency of optical transceivers, with a favorable tradeoff between power efficiency, spectral efficiency, and cost. The scheme has been registered as a patent (Malaysia PI2012700631) and can be employed in applications including, but not limited to, short-haul optical interconnects in data centers and Metropolitan Area Networks (MANs).
A comprehensive mathematical model for the N-channel MMT modulation format has been developed. In addition, a signal space model for N-channel MMT is presented to serve as a platform for comparison with other transmission formats under optical channel constraints. The comparison with M-PAM is of particular practical interest for expanding the capacity of optical interconnect deployments, since M-PAM has recently been standardized for Ethernet IEEE 802.3bs 100Gb/s and is under ongoing investigation by the IEEE 802.3 400Gb/s Ethernet Task Force.
Performance is evaluated by deriving the average electrical and optical power of N-channel MMT symbols and comparing them with the Pulse Amplitude Modulation (M-PAM) format at the same information capacity. Asymptotic power efficiency in multi-dimensional signal space is also evaluated. For information capacities of 2, 3 and 4 bits/symbol, 2-channel, 3-channel and 4-channel MMT modulation formats can reduce the power penalty by 1.76 dB, 2.2 dB and 4 dB compared with 4-PAM, 8-PAM and 16-PAM, respectively. This enhancement is equivalent to a 53%, 60% and 71% reduction in energy per bit when transmitting 2, 3 and 4 bits per symbol with 2-, 3- and 4-channel MMT instead of 4-, 8- and 16-PAM, respectively.
One of the major parameters affecting the immunity of a modulation format to fiber non-linearities is the system baud rate. The propagation of pulses in fiber at bit rates above roughly 10 Gb/s is limited not only by linear fiber impairments but also, strongly, by fiber intra-channel non-linearities (Self Phase Modulation (SPM), Intra-channel Cross-Phase Modulation (IXPM) and Intra-channel Four-Wave Mixing (IFWM)). Hence, in addition to the potential application of MMT in short-haul networks, the thesis validates the practicality of implementing an N-channel MMT system accompanied by dispersion compensation methodologies to extend the reach of error-free transmission (BER ≤ 10^-12) for metro networks. Realistic simulation results show that N-channel MMT outperforms M-PAM in tolerating fiber non-linearities.
By employing pre- and post-compensation to tolerate both residual chromatic dispersion and non-linearity, performance above the error-free transmission limit at a 40 Gb/s bit rate has been attained for 2-, 3- and 4-channel MMT over span lengths of up to 1200 km, 320 km and 320 km, respectively. At an aggregated bit rate of 100 Gb/s, error-free transmission can be achieved for 2-, 3- and 4-channel MMT over span lengths of up to 480 km, 80 km and 160 km, respectively.
At the same spectral efficiency, 4-channel MMT achieved a maximum single-channel error-free transmission over span lengths of up to 320 km and 160 km at 40 Gb/s and 100 Gb/s, respectively, in contrast with 4-PAM, which attained 240 km and 80 km at 40 Gb/s and 100 Gb/s, respectively.
A novel high-speed trellis-coded modulation encoder/decoder ASIC design
Trellis-coded modulation (TCM) is used in bandlimited communication systems. TCM improves coding gain by combining modulation and forward error correction coding in one process. TCM requires no bandwidth expansion because it uses the same symbol rate and power spectrum; the differences are the introduction of a redundancy bit and the use of a constellation with twice as many points. In this thesis, a novel TCM encoder/decoder ASIC chip implementation is presented. This ASIC codec not only increases decoding speed but also reduces hardware complexity. The algorithm and technique are presented for a 16-state convolutional code which is used in standard 256-QAM wireless systems. In the decoder, a Hamming distance is used as the cost function to determine the output of the maximum likelihood Viterbi decoder. Using the relationship between the delay states and the path state in the trellis tree of the code, pre-calculated Hamming distances are stored in a look-up table. In addition, an output look-up table is generated to determine the decoder output. This table is established by the two relative delay states in the code. The thesis provides details of the algorithm and the structure of the TCM codec chip. Besides using parallel processing, the ASIC implementation also uses pipelining to further increase decoding speed. The codec was implemented as an ASIC using standard 0.18 μm CMOS technology; the ASIC core occupies a silicon area of 1.1 mm². All register transfer level code of the codec was simulated and synthesized. The chip layout was generated and the final chip was fabricated by Taiwan Semiconductor Manufacturing Company through the Canadian Microelectronics Corporation. Functional testing of the fabricated codec was partially successful; timing testing has not been fully completed because the chip was not always stable.
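The idea of replacing run-time Hamming distance computation with a precomputed look-up table can be sketched in a few lines. This is a minimal illustration only: it assumes 2-bit code words and a table indexed by (received word, expected word), whereas the thesis indexes its table by delay states and path state, a layout the abstract does not spell out. All names here are hypothetical.

```python
def build_hamming_lut(n_bits: int) -> list[list[int]]:
    """Precompute Hamming distances between every pair of n_bits-wide words."""
    size = 1 << n_bits
    # XOR marks the differing bit positions; counting the 1s gives the distance.
    return [[bin(a ^ b).count("1") for b in range(size)] for a in range(size)]


# Assumed word width for illustration; not taken from the thesis.
HAMMING_LUT = build_hamming_lut(2)


def branch_metric(received: int, expected: int) -> int:
    """Branch cost for a Viterbi decoder, read from the table instead of recomputed."""
    return HAMMING_LUT[received][expected]


print(branch_metric(0b01, 0b11))
```

In hardware the same table would typically be a small ROM, so each branch metric costs one lookup rather than an XOR and a bit count.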