51 research outputs found
High Performance Silicon Photonic Interconnected Systems
Advances in data-driven applications, particularly artificial intelligence and deep learning, are driving the explosive growth of computation and communication in today’s data centers and high-performance computing (HPC) systems. Increasingly, system performance is not constrained by the compute speed at individual nodes, but by the data movement between them. This calls for innovative architectures, smart connectivity, and extreme bandwidth densities in interconnect designs. Silicon photonics technology leverages mature complementary metal-oxide-semiconductor (CMOS) manufacturing infrastructure and is promising for low cost, high-bandwidth, and reconfigurable interconnects. Flexible and high-performance photonic switched architectures are capable of improving the system performance. The work in this dissertation explores various photonic interconnected systems and the associated optical switching functionalities, hardware platforms, and novel architectures. It demonstrates the capabilities of silicon photonics to enable efficient deep learning training.
We first present field programmable gate array (FPGA) based open-loop and closed-loop control for optical spectral-and-spatial switching of silicon photonic cascaded micro-ring resonator (MRR) switches. Our control achieves wavelength locking at the user-defined resonance of the MRR for optical unicast, multicast, and multiwavelength-select functionalities. Digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) are necessary for the control of the switch. We experimentally demonstrate the optical switching functionalities using an FPGA-based switch controller through both traditional multi-bit DAC/ADC and novel single-wired DAC/ADC circuits. For system-level integration, interfaces to the switch controller in a network control plane are developed. The successful control and the switching functionalities achieved are essential for system-level architectural innovations as presented in the following sections.
Next, this thesis presents two novel photonic switched architectures using the MRR-based switches. First, a photonic switched memory system architecture was designed to address memory challenges in deep learning. The reconfigurable photonic interconnects provide scalable solutions and enable efficient use of disaggregated memory resources for deep learning training. An experimental testbed was built with a processing system and two remote memory nodes using silicon photonic switch fabrics and system performance improvements were demonstrated. The collective results and existing high-bandwidth optical I/Os show the potential of integrating the photonic switched memory to state-of-the-art processing systems. Second, the scaling trends of deep learning models and distributed training workloads are challenging network capacities in today’s data centers and HPCs. A system architecture that leverages SiP switch-enabled server regrouping is proposed to tackle the challenges and accelerate distributed deep learning training. An experimental testbed with a SiP switch-enabled reconfigurable fat tree topology was built to evaluate the network performance of distributed ring all-reduce and parameter server workloads. We also present system-scale simulations. Server regrouping and bandwidth steering were performed on a large-scale tapered fat tree with 1024 compute nodes to show the benefits of using photonic switched architectures in systems at scale.
Finally, this dissertation explores high-bandwidth photonic interconnect designs for disaggregated systems. We first introduce and discuss two disaggregated architectures leveraging extreme high bandwidth interconnects with optically interconnected computing resources. We present the concept of rack-scale graphics processing unit (GPU) disaggregation with optical circuit switches and electrical aggregator switches. The architecture can leverage the flexibility of high bandwidth optical switches to increase hardware utilization and reduce application runtimes. A testbed was built to demonstrate resource disaggregation and defragmentation. In addition, we also present an extreme high-bandwidth optical interconnect accelerated low-latency communication architecture for deep learning training. The disaggregated architecture utilizes comb laser sources and MRR-based cross-bar switching fabrics to enable an all-to-all high bandwidth communication with a constant latency cost for distributed deep learning training. We discuss emerging technologies in the silicon photonics platform, including light source, transceivers, and switch architectures, to accommodate extreme high bandwidth requirements in HPC and data center environments. A prototype hardware innovation - Optical Network Interface Cards (comprised of FPGA, photonic integrated circuits (PIC), electronic integrated circuits (EIC), interposer, and high-speed printed circuit board (PCB)) is presented to show the path toward fast lanes for expedited execution at 10 terabits.
Taken together, the work in this dissertation demonstrates the capabilities of high-bandwidth silicon photonic interconnects and innovative architectural designs to accelerate deep learning training in optically connected data center and HPC systems.
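The ring all-reduce workload mentioned in the abstract above can be illustrated with a small simulation. This sketch is not taken from the dissertation: it assumes the textbook two-phase scheme (reduce-scatter followed by all-gather) on a logical ring of `n` workers, and the function name `ring_all_reduce` is hypothetical.

```python
def ring_all_reduce(worker_data):
    """Simulate a sum all-reduce over a logical ring (illustrative only).

    worker_data: list of n lists, each holding n numeric chunks.
    Returns the per-worker buffers after the reduction.
    """
    n = len(worker_data)
    bufs = [list(row) for row in worker_data]

    # Phase 1: reduce-scatter. In step s, worker i sends chunk (i - s) mod n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        sends = [(i, (i - s) % n, bufs[i][(i - s) % n]) for i in range(n)]
        for i, chunk, value in sends:
            bufs[(i + 1) % n][chunk] += value

    # Phase 2: all-gather. Worker i now holds the full sum of chunk
    # (i + 1) mod n and forwards completed chunks around the ring.
    for s in range(n - 1):
        sends = [(i, (i + 1 - s) % n, bufs[i][(i + 1 - s) % n]) for i in range(n)]
        for i, chunk, value in sends:
            bufs[(i + 1) % n][chunk] = value

    return bufs


# Hypothetical gradients on three workers, one value per chunk.
result = ring_all_reduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
```

Each worker sends 2(n - 1) chunks in total and only ever talks to its ring neighbour, which is why the traffic pattern is sensitive to how servers are grouped in the physical topology.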
Enabling Technologies for Optical Data Center Networks: Spatial Division Multiplexing
With the continuously growing popularity of cloud services, the traffic volume inside the data centers is dramatically increasing. As a result, a scalable and efficient infrastructure for data center networks (DCNs) is required. The current optical DCNs using either individual fibers or fiber ribbons are costly, bulky, hard to manage, and not scalable. Spatial division multiplexing (SDM) based on multicore or multimode (few-mode) fibers is recognized as a promising technology to increase the spatial efficiency for optical DCNs, which opens a new way towards high capacity and scalability. This tutorial provides an overview of the components, transmission options, and interconnect architectures for SDM-based DCNs, as well as potential technical challenges and future directions. It also covers the co-existence of SDM and other multiplexing techniques, such as wavelength-division multiplexing and flexible spectrum multiplexing, in optical DCNs.
Future Energy Efficient Data Centers With Disaggregated Servers
The popularity of the Internet and the demand for 24/7 service uptime are driving system performance and reliability requirements to levels that today's data centers can no longer support. This paper examines the traditional monolithic conventional server (CS) design and compares it to a new design paradigm: the disaggregated server (DS) data center design. The DS design arranges data center resources in physical pools, such as processing, memory, and IO module pools, rather than packing each subset of such resources into a single server box. In this paper, we study energy efficient resource provisioning and virtual machine (VM) allocation in DS-based data centers compared to CS-based data centers. First, we present our new design for the photonic DS-based data center architecture, supplemented with a complete description of the architectural components. Second, we develop a mixed integer linear programming (MILP) model to optimize VM allocation for the DS-based data center, including the data center communication fabric power consumption. Our results indicate that, in DS data centers, the optimum allocation of pooled resources and their communication power yields up to 42% average savings in total power consumption when compared with the CS approach. Because of the MILP model's high computational complexity, we developed an energy efficient resource provisioning heuristic for DS with communication fabric (EERP-DSCF), based on the MILP model insights, with comparable power efficiency to the MILP model. With EERP-DSCF, we can extend the number of served VMs, where the MILP model scalability for a large number of VMs is challenging. Furthermore, we assess the energy efficiency of the DS design under stringent conditions by increasing the CPU to memory traffic and by including high noncommunication power consumption to determine the conditions at which the DS and CS designs become comparable in power consumption.
Finally, we present a complete analysis of the communication patterns in our new DS design and some recommendations for design and implementation challenges.
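The difference between packing resources per server box and drawing them from pools can be seen in a toy example. The sketch below is an assumption-laden illustration, not the paper's MILP model or its EERP-DSCF heuristic: it swaps in a plain first-fit packing, ignores communication fabric power entirely, and all function names, capacities, and VM demands are hypothetical.

```python
def first_fit(demands, capacity):
    """Pack demands into modules of the given capacity; return the module count."""
    modules = []
    for demand in demands:
        for i, used in enumerate(modules):
            if used + demand <= capacity:
                modules[i] = used + demand
                break
        else:
            modules.append(demand)
    return len(modules)


def conventional_servers(vms, cpu_cap, mem_cap):
    """CS: a VM's CPU and memory must both fit inside the same server box."""
    servers = []  # each entry is [cpu_used, mem_used]
    for cpu, mem in vms:
        for server in servers:
            if server[0] + cpu <= cpu_cap and server[1] + mem <= mem_cap:
                server[0] += cpu
                server[1] += mem
                break
        else:
            servers.append([cpu, mem])
    return len(servers)


def disaggregated_pools(vms, cpu_cap, mem_cap):
    """DS: CPU and memory modules are packed independently from separate pools."""
    cpu_modules = first_fit([cpu for cpu, _ in vms], cpu_cap)
    mem_modules = first_fit([mem for _, mem in vms], mem_cap)
    return cpu_modules, mem_modules


# Hypothetical CPU-heavy VMs: (cpu_units, mem_units), with capacity 8 per module.
vms = [(6, 2), (6, 2), (6, 2), (6, 2)]
cs_count = conventional_servers(vms, cpu_cap=8, mem_cap=8)   # 4 servers
ds_count = disaggregated_pools(vms, cpu_cap=8, mem_cap=8)    # (4, 1)
```

In this made-up case the CS layout powers four memory banks that are each mostly idle, while the DS layout needs one memory module. Whether that saving survives the added communication power is the question the paper's model addresses.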
Optical Networks and Interconnects
The rapid evolution of communication technologies, such as 5G and beyond, relies
on optical networks to support the challenging and ambitious requirements that
include both capacity and reliability. This chapter begins by giving an
overview of the evolution of optical access networks, focusing on Passive
Optical Networks (PONs). The development of the different PON standards and
requirements aiming at longer reach, higher client count and delivered
bandwidth are presented. PON virtualization is also introduced as the
flexibility enabler. Triggered by the increase of bandwidth supported by access
and aggregation network segments, core networks have also evolved, as presented
in the second part of the chapter. Scaling the physical infrastructure requires
high investment and hence, operators are considering alternatives to optimize
the use of the existing capacity. This chapter introduces different planning
problems such as Routing and Spectrum Assignment problems, placement problems
for regenerators and wavelength converters, and how to offer resilience to
different failures. An overview of control and management is also provided.
Moreover, motivated by the increasing importance of data storage and data
processing, this chapter also addresses different aspects of optical data
center interconnects. Data centers have become critical infrastructure to
operate any service. They are also forced to take advantage of optical
technology in order to keep up with the growing capacity demand and power
consumption. This chapter gives an overview of different optical data center
network architectures as well as some expected directions to improve the
resource utilization and increase the network capacity.
MCF-SMF Hybrid Low-Latency Circuit-Switched Optical Network for Disaggregated Data Centers
This paper proposes and experimentally evaluates a
fully developed novel architecture with purpose built low latency
communication protocols for next generation disaggregated data
centers (DDCs). To accommodate the capacity and
latency needs of disaggregated IT elements (i.e. CPU, memory),
this architecture makes use of a low latency and high capacity
circuit switched optical network for interconnecting various endpoints that are equipped with multi-channel silicon photonic
based integrated transceivers. In a move to further decrease the
perceived latency between various disaggregated IT elements,
this paper proposes a) a novel network topology, which cuts
down the latency over the optical network by 34% while
enhancing system scalability and b) channel bonding over multicore fiber (MCF) switched links to reduce head to tail latency
and in turn increase sustained memory bandwidth for
disaggregated remote memory. Furthermore, to reduce power
consumption and enhance space efficiency, the integration of
novel MCF based transceivers, fibers and
optical switches is proposed and experimentally validated at the
physical layer for this topology. It is shown that the integration of
MCF based subsystems in this topology can improve the
energy efficiency of the optical switching layer
by more than 60%. Finally, the performance of this proposed
architecture and topology is evaluated experimentally at the
application layer where the perceived memory throughput for
accessing remote and local resources is measured and compared
using electrical circuit and packet switching. The results also
highlight a multi fold increase in application perceived memory
throughput over the proposed DDC topology by utilization and
bonding of multiple optical channels to interconnect
disaggregated IT elements that can be carried over MCF links.
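The effect of channel bonding on head-to-tail latency can be approximated with simple arithmetic. The sketch below assumes an idealized model in which a payload is striped evenly across bonded lanes with no framing overhead or lane skew. The function name, the 64-byte payload, and the 10 Gbit/s lane rate are hypothetical and do not come from the paper.

```python
def head_to_tail_latency_ns(payload_bytes, lane_gbps, lanes=1):
    """Time between the first and last bit of a payload arriving, in ns.

    Idealized: the payload is split evenly over `lanes` bonded channels,
    each running at `lane_gbps` (1 Gbit/s equals 1 bit per nanosecond).
    """
    if lanes < 1 or lane_gbps <= 0:
        raise ValueError("lanes must be >= 1 and lane_gbps must be positive")
    return payload_bytes * 8 / (lane_gbps * lanes)


# Hypothetical 64-byte memory transfer over one lane versus four bonded lanes.
single_lane = head_to_tail_latency_ns(64, lane_gbps=10, lanes=1)
four_lanes = head_to_tail_latency_ns(64, lane_gbps=10, lanes=4)
```

Under this model the serialization time falls in proportion to the number of bonded lanes, while propagation and switching delays are unaffected.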
MONet: Heterogeneous Memory over Optical Network for Large-Scale Data Centre Resource Disaggregation
The Memory over Optical Network (MONet) system is a disaggregated data center architecture where serial (HMC) / parallel (DDR4) memory resources can be accessed over optically switched interconnects within and between racks. An FPGA/ASIC-based custom hardware IP (ReMAT) supports heterogeneous memory pools, accommodates optical-to-electrical conversion for remote access, performs the required serial/parallel conversion and hosts the necessary local memory controller. An optically interconnected HMC-based (serial I/O type) memory card is accessed by a memory controller embedded in the compute card, simplifying the hardware near the memory modules. This substantially reduces overheads on latency, cost, power consumption and space. We characterize CPU-memory performance, by experimentally demonstrating the impact of distance, number of switching hops, transceivers, channel bonding and bit-rate per transceiver on bit-error rate, power consumption, additional latency, sustained remote memory bandwidth/throughput (using industry standard benchmark STREAMS) and cloud workload performance (such as operations per second, average added latency and retired instructions per second on memcached with YCSB cloud workloads). MONet pushes the CPU-memory operational limit from a few centimetres to 10s of metres, yet applications can experience as low as 10% performance penalty (at 36m) compared to a direct-attached equivalent. Using the proposed parallel topology, a system can support up to 100,000 disaggregated cards.
Silicon Photonic Subsystems for Inter-Chip Optical Networks
The continuous growth of electronic compute and memory nodes in terms of the number of I/O pins, bandwidth, and areal throughput poses major integration and packaging challenges associated with offloading multi-Tbit/s data rates within the few pJ/bit targets. While integrated photonics is already deployed over long and short distances, such as inter- and intra-data-center communications, the promising characteristics of the silicon photonic platform set it as the future technology for optical interconnects in ultra-short inter-chip distances. The high index contrast between the waveguide and the cladding together with strong thermo-optic and carrier effects in silicon allows developing a wide range of micro-scale and low power optical devices compatible with the CMOS fabrication processes. Furthermore, the availability of photonic foundries and new electrical and optical co-packaging techniques further pushes this platform for the next steps of commercial deployment.
The work in this dissertation presents the current trends in high-performance memory and processor nodes and motivates a disaggregated and reconfigurable inter-chip network enabled by a silicon photonic layer. A dense WDM transceiver and broadband switch architectures are discussed to support a bi-directional network of ten hybrid-memory cubes (HMC) interconnected to ten processor nodes with an overall aggregated bandwidth of 9.6Tbit/s. Latency and energy consumption are key performance parameters in the connectivity between processor and primary memory nodes. The transceiver design is based on energy-efficient micro-ring resonators, and the broadband switch is constructed with 2x2 Mach-Zehnder elements for nano-second reconfiguration. Each transceiver is based on hundreds of micro-rings to convert the native HMC electrical protocol to the optical domain and the switch is based on tens of hundreds of 2x2 elements to achieve non-blocking all-to-all connectivity.
The next chapters focus on developing methods for controlling and monitoring such complex and highly integrated silicon photonic subsystems. The thermo-optic effect is characterized and we show experimentally that the phase of the optical carrier can be reliably controlled with a pulse-width modulation (PWM) signal, ultimately relaxing the need for hundreds of digital-to-analog converters (DACs). We further show that doped waveguide heaters can be utilized as in-line optical power monitors by measuring photo-conductance current, which is an alternative to the conventional tapping and integration of photo-diodes.
The next part is concerned with a common cascaded micro-ring resonator in a WDM transceiver design. We develop an FPGA-based control algorithm that abstracts the physical layer and takes user-defined inputs to set the resonances to the desired wavelength in unicast and multicast transmission modes. The associated sensitivities of these silicon ring resonators are presented and addressed with three closed-loop solutions. We first show a closed-loop operation based on tapping the error signal from the drop port of the micro-ring. The second solution presents a resonance wavelength locking with a single digital I/O for control and feedback signals. Lastly, we leverage the photo-conductance effect and demonstrate the locking procedure using only the doped heater for both control and feedback purposes.
To achieve inter-chip reconfigurability, we discuss recent advances in high-port-count SiP broadband switches for reconfigurable inter-chip networks. To ensure optimal operation in terms of low insertion loss, low cross-talk and high signal integrity per routing path, hundreds of 2x2 Mach-Zehnder elements need to be biased precisely for the cross and bar states. We address this challenge with a tapless and design-agnostic calibration approach based on the photo-conductance effect. The automated algorithm returns a look-up table of the calibrated biases for each 2x2 element. Each routing scenario is then tested for insertion loss, crosstalk and bit-error rate of 25Gbit/s 4-level pulse amplitude modulation signals. The last part utilizes the Mach-Zehnder interferometers in WDM transceiver applications. We demonstrate a polarization insensitive four-channel WDM receiver with 40Gbit/s per channel and a transmitter design generating 8-level pulse amplitude modulation signals at 30Gbit/s.
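The idea of replacing a DAC with a PWM signal for thermo-optic phase control can be sketched numerically. The example below rests on two assumptions that are not stated in the abstract: the PWM period is much shorter than the heater's thermal response time, so the heater responds to the time-averaged power, and the phase shift is linear in heater power with a known power-for-pi value. The function names and all numeric values are hypothetical.

```python
import math


def average_heater_power_mw(duty, v_supply, r_ohm):
    """Time-averaged power (mW) of a resistive heater driven by a PWM signal."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty cycle must lie in [0, 1]")
    return duty * v_supply ** 2 / r_ohm * 1e3


def duty_for_phase(phase_rad, v_supply, r_ohm, p_pi_mw):
    """Duty cycle that yields the requested phase shift.

    Assumes phase = pi * P / P_pi, where P_pi is the heater power (mW)
    needed for a pi phase shift (a device-specific, measured quantity).
    """
    power_mw = phase_rad / math.pi * p_pi_mw
    full_scale_mw = v_supply ** 2 / r_ohm * 1e3
    duty = power_mw / full_scale_mw
    if not 0.0 <= duty <= 1.0:
        raise ValueError("requested phase is outside the reachable range")
    return duty


# Hypothetical heater: 5 V digital output, 1 kOhm resistance, P_pi = 20 mW.
duty = duty_for_phase(math.pi / 2, v_supply=5.0, r_ohm=1000.0, p_pi_mw=20.0)
power = average_heater_power_mw(duty, v_supply=5.0, r_ohm=1000.0)
```

With these made-up values, a pi/2 shift needs 10 mW of average power, which a single digital pin can deliver at a 40% duty cycle.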
Optical Technologies and Control Methods for Scalable Data Centre Networks
Owing to the increasing adoption of cloud services, video services and associated machine learning applications, the traffic demand inside data centers is increasing exponentially, which necessitates an innovative networking infrastructure with high scalability and cost-efficiency. As a promising candidate to provide high capacity, low latency, cost-effective and scalable interconnections, optical technologies have been introduced to data center networks (DCNs) for approximately a decade. To further improve the DCN performance to meet the increasing traffic demand by using photonic technologies, two current trends are a) increasing the bandwidth density of the transmission links and b) maximizing IT and network resources utilization through disaggregated topologies and architectures. Therefore, this PhD thesis focuses on introducing and applying advanced and efficient technologies in these two fields to DCNs to improve their performance. On the one hand, at the link level, since the traditional single-mode fiber (SMF) solutions based on wavelength division multiplexing (WDM) over C+L band may fall short in satisfying the capacity, front panel density, power consumption, and cost requirements of high-performance DCNs, a space division multiplexing (SDM) based DCN using homogeneous multi-core fibers (MCFs) is proposed. With the exploited bi-directional model and proposed spectrum allocation algorithms, the proposed DCN shows great benefits over the SMF solution in terms of network capacity and spatial efficiency. Meanwhile, it is found that the inter-core crosstalk (IC-XT) between the adjacent cores inside the MCF is dynamic rather than static; therefore, the behaviour of the IC-XT is experimentally investigated under different transmission conditions. On the other hand, an optically disaggregated DCN is developed and, to ensure its performance, different architectures, topologies, resource routing and allocation algorithms are proposed and compared.
Compared to the traditional server-based DCN, the resource utilization, scalability and the cost-efficiency are significantly improved.