Demonstrating Optically Interconnected Remote Serial and Parallel Memory in Disaggregated Data Centers
Remote serial and parallel memory access using a memory-over-network bridge and an optically switched interconnect is demonstrated. Remote memory bandwidth of 93% (HMC) and 66% (DDR4) of the local 3.2 and 3.7 GB/s bandwidth is showcased.
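As a quick reading aid, the percentages above can be turned into absolute remote bandwidths. This is only a sketch: it assumes the figures pair up in the order listed (HMC with 3.2 GB/s, DDR4 with 3.7 GB/s), which the abstract implies but does not state outright, and the names `LOCAL_GBPS`, `REMOTE_FRACTION` and `remote_bandwidth_gbps` are illustrative.

```python
# Local bandwidths (GB/s) and remote-to-local ratios quoted in the abstract.
# Assumption: values pair up in the order listed (HMC first, DDR4 second).
LOCAL_GBPS = {"HMC": 3.2, "DDR4": 3.7}
REMOTE_FRACTION = {"HMC": 0.93, "DDR4": 0.66}


def remote_bandwidth_gbps(memory_type: str) -> float:
    """Absolute remote bandwidth implied by the quoted percentage."""
    return LOCAL_GBPS[memory_type] * REMOTE_FRACTION[memory_type]


if __name__ == "__main__":
    for memory_type in LOCAL_GBPS:
        print(f"{memory_type}: {remote_bandwidth_gbps(memory_type):.2f} GB/s remote")
```

Under that pairing, the remote figures come out at roughly 3.0 GB/s for HMC and 2.4 GB/s for DDR4.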
MONet: Heterogeneous Memory over Optical Network for Large-Scale Data Centre Resource Disaggregation
Memory over Optical Network (MONet) system is a disaggregated data center architecture where serial (HMC) / parallel (DDR4) memory resources can be accessed over optically switched interconnects within and between racks. An FPGA/ASIC-based custom hardware IP (ReMAT) supports heterogeneous memory pools, accommodates optical-to-electrical conversion for remote access, performs the required serial/parallel conversion and hosts the necessary local memory controller. Optically interconnected HMC-based (serial I/O type) memory card is accessed by a memory controller embedded in the compute card, simplifying the hardware near the memory modules. This substantially reduces overheads on latency, cost, power consumption and space. We characterize CPU-memory performance, by experimentally demonstrating the impact of distance, number of switching hops, transceivers, channel bonding and bit-rate per transceiver on bit-error rate, power consumption, additional latency, sustained remote memory bandwidth/throughput (using industry standard benchmark STREAMS) and cloud workload performance (such as operations per second, average added latency and retired instructions per second on memcached with YCSB cloud workloads). MONet pushes the CPU-memory operational limit from a few centimetres to 10s of metres, yet applications can experience as low as 10% performance penalty (at 36m) compared to a direct-attached equivalent. Using the proposed parallel topology, a system can support up to 100,000 disaggregated cards
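To put the 36 m figure in perspective, the time of flight through the fibre alone can be estimated. The sketch below assumes a group index of about 1.47, typical of standard single-mode silica fibre; the abstract does not state the fibre type, and the estimate deliberately excludes transceiver, switching and serialisation delays. The names `GROUP_INDEX` and `one_way_delay_ns` are illustrative.

```python
# Time-of-flight estimate for light in optical fibre.
# Assumption: group index ~1.47 (typical single-mode silica fibre);
# the abstract does not specify the fibre or break down the latency.
SPEED_OF_LIGHT_M_PER_S = 299_792_458
GROUP_INDEX = 1.47


def one_way_delay_ns(distance_m: float, group_index: float = GROUP_INDEX) -> float:
    """Propagation delay in nanoseconds over `distance_m` metres of fibre."""
    return distance_m * group_index / SPEED_OF_LIGHT_M_PER_S * 1e9


if __name__ == "__main__":
    one_way = one_way_delay_ns(36.0)
    print(f"one way: {one_way:.0f} ns, round trip: {2 * one_way:.0f} ns")
```

That is about 5 ns per metre, so propagation alone adds on the order of a few hundred nanoseconds to a round trip at 36 m.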
High Performance Silicon Photonic Interconnected Systems
Advances in data-driven applications, particularly artificial intelligence and deep learning, are driving the explosive growth of computation and communication in today’s data centers and high-performance computing (HPC) systems. Increasingly, system performance is not constrained by the compute speed at individual nodes, but by the data movement between them. This calls for innovative architectures, smart connectivity, and extreme bandwidth densities in interconnect designs. Silicon photonics technology leverages mature complementary metal-oxide-semiconductor (CMOS) manufacturing infrastructure and is promising for low cost, high-bandwidth, and reconfigurable interconnects. Flexible and high-performance photonic switched architectures are capable of improving the system performance. The work in this dissertation explores various photonic interconnected systems and the associated optical switching functionalities, hardware platforms, and novel architectures. It demonstrates the capabilities of silicon photonics to enable efficient deep learning training.
We first present field programmable gate array (FPGA) based open-loop and closed-loop control for optical spectral-and-spatial switching of silicon photonic cascaded micro-ring resonator (MRR) switches. Our control achieves wavelength locking at the user-defined resonance of the MRR for optical unicast, multicast, and multiwavelength-select functionalities. Digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) are necessary for the control of the switch. We experimentally demonstrate the optical switching functionalities using an FPGA-based switch controller through both traditional multi-bit DAC/ADC and novel single-wired DAC/ADC circuits. For system-level integration, interfaces to the switch controller in a network control plane are developed. The successful control and the switching functionalities achieved are essential for system-level architectural innovations as presented in the following sections.
Next, this thesis presents two novel photonic switched architectures using the MRR-based switches. First, a photonic switched memory system architecture was designed to address memory challenges in deep learning. The reconfigurable photonic interconnects provide scalable solutions and enable efficient use of disaggregated memory resources for deep learning training. An experimental testbed was built with a processing system and two remote memory nodes using silicon photonic switch fabrics and system performance improvements were demonstrated. The collective results and existing high-bandwidth optical I/Os show the potential of integrating the photonic switched memory to state-of-the-art processing systems. Second, the scaling trends of deep learning models and distributed training workloads are challenging network capacities in today’s data centers and HPCs. A system architecture that leverages SiP switch-enabled server regrouping is proposed to tackle the challenges and accelerate distributed deep learning training. An experimental testbed with a SiP switch-enabled reconfigurable fat tree topology was built to evaluate the network performance of distributed ring all-reduce and parameter server workloads. We also present system-scale simulations. Server regrouping and bandwidth steering were performed on a large-scale tapered fat tree with 1024 compute nodes to show the benefits of using photonic switched architectures in systems at scale.
Finally, this dissertation explores high-bandwidth photonic interconnect designs for disaggregated systems. We first introduce and discuss two disaggregated architectures leveraging extreme high bandwidth interconnects with optically interconnected computing resources. We present the concept of rack-scale graphics processing unit (GPU) disaggregation with optical circuit switches and electrical aggregator switches. The architecture can leverage the flexibility of high bandwidth optical switches to increase hardware utilization and reduce application runtimes. A testbed was built to demonstrate resource disaggregation and defragmentation. In addition, we also present an extreme high-bandwidth optical interconnect accelerated low-latency communication architecture for deep learning training. The disaggregated architecture utilizes comb laser sources and MRR-based cross-bar switching fabrics to enable an all-to-all high bandwidth communication with a constant latency cost for distributed deep learning training. We discuss emerging technologies in the silicon photonics platform, including light source, transceivers, and switch architectures, to accommodate extreme high bandwidth requirements in HPC and data center environments. A prototype hardware innovation - Optical Network Interface Cards (comprised of FPGA, photonic integrated circuits (PIC), electronic integrated circuits (EIC), interposer, and high-speed printed circuit board (PCB)) is presented to show the path toward fast lanes for expedited execution at 10 terabits.
Taken together, the work in this dissertation demonstrates the capabilities of high-bandwidth silicon photonic interconnects and innovative architectural designs to accelerate deep learning training in optically connected data center and HPC systems
MCF-SMF Hybrid Low-Latency Circuit-Switched Optical Network for Disaggregated Data Centers
This paper proposes and experimentally evaluates a fully developed novel architecture with purpose-built low-latency communication protocols for next-generation disaggregated data centers (DDCs). To accommodate the capacity and latency needs of disaggregated IT elements (i.e. CPU, memory), this architecture makes use of a low-latency, high-capacity circuit-switched optical network for interconnecting various endpoints that are equipped with multi-channel silicon photonic based integrated transceivers. To further decrease the perceived latency between disaggregated IT elements, this paper proposes a) a novel network topology, which cuts down the latency over the optical network by 34% while enhancing system scalability, and b) channel bonding over multi-core fiber (MCF) switched links to reduce head-to-tail latency and in turn increase sustained memory bandwidth for disaggregated remote memory. Furthermore, to reduce power consumption and enhance space efficiency, the integration of novel MCF-based transceivers, fibers and optical switches is proposed and experimentally validated at the physical layer for this topology. It is shown that the integration of MCF-based subsystems in this topology can improve the energy efficiency of the optical switching layer by more than 60%. Finally, the performance of the proposed architecture and topology is evaluated experimentally at the application layer, where the perceived memory throughput for accessing remote and local resources is measured and compared using electrical circuit and packet switching. The results also highlight a multi-fold increase in application-perceived memory throughput over the proposed DDC topology through the use and bonding of multiple optical channels, which can be carried over MCF links, to interconnect disaggregated IT elements.
Optimisation for Optical Data Centre Switching and Networking with Artificial Intelligence
Cloud and cluster computing platforms have become standard across almost every domain of business, and their scale quickly approaches servers in a single warehouse. However, the tier-based opto-electronically packet switched network infrastructure that is standard across these systems gives rise to several scalability bottlenecks including resource fragmentation and high energy requirements. Experimental results show that optical circuit switched networks pose a promising alternative that could avoid these.
However, optimality challenges are encountered at realistic commercial scales. Where exhaustive optimisation techniques are not applicable for problems at the scale of Cloud-scale computer networks, and expert-designed heuristics are performance-limited and typically biased in their design, artificial intelligence can discover more scalable and better performing optimisation strategies.
This thesis demonstrates these benefits through experimental and theoretical work spanning the component, system and commercial optimisation problems which stand in the way of practical Cloud-scale computer network systems. Firstly, optical components are optimised to gate in and are demonstrated in a proof-of-concept switching architecture for optical data centres with better wavelength and component scalability than previous demonstrations. Secondly, network-aware resource allocation schemes for optically composable data centres are learnt end-to-end with deep reinforcement learning and graph neural networks, where fewer networking resources are required to achieve the same resource efficiency compared to conventional methods. Finally, a deep reinforcement learning based method for optimising PID-control parameters is presented which generates tailored parameters for unseen devices in . This method is demonstrated on a market leading optical switching product based on piezoelectric actuation, where switching speed is improved with no compromise to optical loss and the manufacturing yield of actuators is improved. This method was licensed to and integrated within the manufacturing pipeline of this company. As such, crucial public and private infrastructure utilising these products will benefit from this work.
Pluggable Optical Connector Interfaces for Electro-Optical Circuit Boards
A study is hereby presented on system embedded photonic interconnect technologies, which would address the communications bottleneck in modern exascale data centre systems driven by exponentially rising consumption of digital information and the associated complexity of intra-data centre network management along with dwindling data storage capacities. It is proposed that this bottleneck be addressed by adopting within the system electro-optical printed circuit boards (OPCBs), on which conventional electrical layers provide power distribution and static or low speed signaling, but high speed signals are conveyed by optical channels on separate embedded optical layers. One crucial prerequisite towards adopting OPCBs in modern data storage and switch systems is a reliable method of optically connecting peripheral cards and devices within the system to an OPCB backplane or motherboard in a pluggable manner. However, the large mechanical misalignment tolerances between connecting cards and devices inherent to such systems are at odds with the small sizes of optical waveguides required to support optical communication at the speeds defined by prevailing communication protocols. An innovative approach is therefore required to decouple the contrasting mechanical tolerances in the electrical and optical domains in the system in order to enable reliable pluggable optical connectivity.
This thesis presents the design, development and characterisation of a suite of new optical waveguide connector interface solutions for electro-optical printed circuit boards (OPCBs) based on embedded planar polymer waveguides and planar glass waveguides. The technologies described include waveguide receptacles allowing parallel fibre connectors to be connected directly to OPCB embedded planar waveguides and board-to-board connectors with embedded parallel optical transceivers allowing daughtercards to be orthogonally connected to an OPCB backplane.
For OPCBs based on embedded planar polymer waveguides and embedded planar glass waveguides, a complete demonstration platform was designed and developed to evaluate the connector interfaces and the associated embedded optical interconnect.
Furthermore, a large portfolio of intellectual property comprising 19 patents and patent applications was generated during the course of this study, spanning the field of OPCBs, optical waveguides, optical connectors, optical assembly and system embedded optical interconnects.
Dynamic Optical Networks for Data Centres and Media Production
This thesis explores all-optical networks for data centres, with a particular focus on network designs for live media production. A design for an all-optical data centre network is presented, with experimental verification of the feasibility of the network data plane. The design uses fast tunable (< 200 ns) lasers and coherent receivers across a passive optical star coupler core, forming a network capable of reaching over 1000 nodes. Experimental transmission of 25 Gb/s data across the network core, with combined wavelength switching and time division multiplexing (WS-TDM), is demonstrated. Enhancements to laser tuning time via current pre-emphasis are discussed, including experimental demonstration of fast wavelength switching (< 35 ns) of a single laser between all combinations of 96 wavelengths spaced at 50 GHz over a range wider than the optical C-band. Methods of increasing the overall network throughput by using a higher complexity modulation format are also described, along with designs for line codes to enable pulse amplitude modulation across the WS-TDM network core. The construction of an optical star coupler network core is investigated, by evaluating methods of constructing large star couplers from smaller optical coupler components. By using optical circuit switches to rearrange star coupler connectivity, the network can be partitioned, creating independent reserves of bandwidth and resulting in increased overall network throughput. Several topologies for constructing a star from optical couplers are compared, and algorithms for optimum construction methods are presented. All of the designs target strict criteria for the flexible and dynamic creation of multicast groups, which will enable future live media production workflows in data centres. 
The data throughput performance of the network designs is simulated under synthetic and practical media production traffic scenarios, showing improved throughput when reconfigurable star couplers are used compared to a single large star. An energy consumption evaluation shows reduced network power consumption compared to incumbent and other proposed data centre network technologies
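The claim that 96 wavelengths at 50 GHz spacing cover more than the C-band can be checked with simple arithmetic. The sketch below assumes the conventional C-band limits of 1530 to 1565 nm and reads "all combinations" as ordered source-to-destination wavelength pairs; neither detail is spelled out in the abstract, and the function names are illustrative.

```python
# Frequency span of a 96-channel, 50 GHz grid compared with the C-band.
# Assumptions: C-band taken as 1530-1565 nm (conventional definition);
# "all combinations" read as ordered (from, to) wavelength pairs.
SPEED_OF_LIGHT_M_PER_S = 299_792_458
NUM_CHANNELS = 96
SPACING_GHZ = 50.0


def grid_span_ghz(num_channels: int, spacing_ghz: float) -> float:
    """Span between the first and last channel centre frequencies."""
    return (num_channels - 1) * spacing_ghz


def c_band_width_ghz(short_nm: float = 1530.0, long_nm: float = 1565.0) -> float:
    """Width of the band in GHz, converting each wavelength edge to frequency."""
    high_hz = SPEED_OF_LIGHT_M_PER_S / (short_nm * 1e-9)
    low_hz = SPEED_OF_LIGHT_M_PER_S / (long_nm * 1e-9)
    return (high_hz - low_hz) / 1e9


def ordered_switch_pairs(num_channels: int) -> int:
    """Number of distinct (source, destination) wavelength transitions."""
    return num_channels * (num_channels - 1)


if __name__ == "__main__":
    print(f"grid span:   {grid_span_ghz(NUM_CHANNELS, SPACING_GHZ):.0f} GHz")
    print(f"C-band:      {c_band_width_ghz():.0f} GHz")
    print(f"transitions: {ordered_switch_pairs(NUM_CHANNELS)}")
```

With these assumptions the grid spans 4750 GHz against roughly 4.4 THz for the C-band, which is consistent with the abstract's wording.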
SCADA and related technologies for irrigation district modernization
Presented at SCADA and related technologies for irrigation district modernization: a USCID water management conference on October 26-29, 2005 in Vancouver, Washington. Includes bibliographical references. Overview of Supervisory Control and Data Acquisition (SCADA) -- Total Channel Control™ - The value of automation in irrigation distribution systems -- Design and implementation of an irrigation canal SCADA -- All American Canal Monitoring Project -- Taking closed piping flowmeters to the next level - new technologies support trends in data logging and SCADA systems -- Real-time model-based dam automation: a case study of the Piute Dam -- Effective implementation of algorithm theory into PLCs -- Optimal fuzzy control for canal control structures -- SCADA over Zigbee™ -- Synchronous radio modem technology for affordable irrigation SCADA systems -- A suggested criteria for the selection of RTUs and sensors -- Irrigation canals in Spain: the integral process of modernization -- Ten years of SCADA data quality control and utilization for system management and planning modernization -- Moderately priced SCADA implementation -- Increasing peak power generation using SCADA and automation: a case study of the Kaweah River Power Authority -- Eastern Irrigation District canal automation and Supervisory Control and Data Acquisition (SCADA) -- Case study on design and construction of a regulating reservoir pumping station -- Saving water with Total Channel Control® in the Macalister Irrigation District, Australia -- Leveraging SCADA to modernize operations in the Klamath Irrigation Project -- A 2005 update on the installation of a VFD/SCADA system at Sutter Mutual Water Company -- Truckee Carson Irrigation District Turnout Water Measurement Program -- The myth of a "Turnkey" SCADA system and other lessons learned -- Canal modernization in Central California Irrigation District - case study -- Remote monitoring and operation at the Colorado River Irrigation District -- Web-based GIS decision support system for irrigation districts -- Using RiverWare as a real time river systems management tool -- Submerged venturi flume -- Ochoco Irrigation District telemetry case study -- Uinta Basin Replacement Project: a SCADA case study in managing multiple interests and adapting to loss of storage -- Training SCADA operators with real-time simulation -- Demonstration of gate control with SCADA system in Lower Rio Grande Valley, in Texas -- Incorporating sharp-crested weirs into irrigation SCADA systems