3D-Hyper-FleX-LION: A Flat and Reconfigurable Hyper-X Network for Datacenters
We propose a flat datacenter network using silicon photonic switches. Simulations show up to 2× improvement in throughput-per-watt over a non-oversubscribed Fat-Tree, while providing a >2× reduction in the number of switching ASICs and transceivers.
A Software-defined SoC Memory Bus Bridge Architecture for Disaggregated Computing
Disaggregation and rack-scale systems have the potential to drastically
decrease TCO and increase utilization of cloud datacenters, while
maintaining performance. While the concept of organising resources in
separate pools and interconnecting them on demand is straightforward, its
materialisation can be radically different in terms of performance and scale
potential.
In this paper, we present a memory bus bridge architecture which enables
communication between 100s of masters and slaves in today's complex
multiprocessor SoCs that are physically integrated in different chips and
even different mainboards. The bridge tightly couples serial transceivers and a
circuit network for chip-to-chip transfers. A key property of the proposed
bridge architecture is that it is software-defined and thus can be configured
at runtime, via a software control plane, to prepare and steer memory access
transactions to remote slaves. This is particularly important because it
enables datacenter orchestration tools to manage the disaggregated resource
allocation. Moreover, we evaluate a bridge prototype we have built for the ARM
AXI4 memory bus interconnect and discuss the application-level performance
observed.
Comment: 3rd International Workshop on Advanced Interconnect Solutions and
Technologies for Emerging Computing Systems (AISTECS 2018, part of HiPEAC
2018)
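The software-defined steering described above can be pictured as an address map that the control plane installs at runtime. The sketch below is an illustrative toy, not the paper's implementation; the `RemoteSlave` fields and the `BridgeControlPlane` API are invented for the example:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RemoteSlave:
    board: str  # identifiers invented for this example
    chip: str
    lane: int   # serial-transceiver lane carrying the circuit

class BridgeControlPlane:
    """Toy runtime-configurable address map: the orchestrator installs
    (base, size) -> remote-slave routes, and the bridge resolves each
    memory transaction's address against them."""

    def __init__(self):
        self.routes = []  # list of (base, size, RemoteSlave)

    def install_route(self, base, size, slave):
        self.routes.append((base, size, slave))

    def resolve(self, addr):
        for base, size, slave in self.routes:
            if base <= addr < base + size:
                return slave
        return None  # address is local; no bridging needed

cp = BridgeControlPlane()
cp.install_route(0x8000_0000, 0x1000_0000, RemoteSlave("board1", "soc0", 3))
assert cp.resolve(0x8800_0000) == RemoteSlave("board1", "soc0", 3)
assert cp.resolve(0x1000) is None
```

A real bridge would resolve addresses in hardware; the point here is only that the route table itself is plain state that orchestration software can rewrite.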
Multicasting Optical Reconfigurable Switch
Artificial Intelligence (AI) demands large data flows within datacenters,
heavily relying on multicasting data transfers. As AI models scale, the
requirement for high-bandwidth and low-latency networking compounds. The common
use of electrical packet switching faces limitations due to
optical-electrical-optical conversion bottlenecks. Optical switches, while
bandwidth-agnostic and low-latency, suffer from having only unicast or
non-scalable multicasting capability. This paper introduces an optical
switching technique addressing this challenge. Our approach enables arbitrarily
programmable simultaneous unicast and multicast connectivity, eliminating the
need for optical splitters that hinder scalability due to optical power loss.
We use phase modulation in multiple layers, tailored to implement any multicast
connectivity map. Phase modulation also enables wavelength selectivity on top
of spatial selectivity, resulting in an optical switch that implements
space-wavelength routing. We conducted simulations and experiments to validate
our approach. Our results affirm the concept's feasibility, effectiveness, and
scalability as a multicasting switch by experimentally demonstrating 16
spatial ports with 2 wavelength channels. Numerically, 64 spatial ports with 4
wavelength channels each were simulated, with approximately constant efficiency
(< 3 dB) as ports and wavelength channels scale.
Comment: 17 pages, 4 figures, article
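At a purely logical level, the "arbitrarily programmable" space-wavelength connectivity can be modeled as a relation from (input port, wavelength) to sets of output ports. The following is a hypothetical consistency check for such a map, invented for illustration; it does not capture the paper's phase-modulation physics:

```python
# Toy model (not the paper's phase-mask math): a space-wavelength
# multicast map as {(in_port, wavelength): set_of_out_ports}.
def valid_multicast_map(route, n_ports, n_wavelengths):
    """Check that every entry addresses real ports/wavelengths and that
    no output port receives two signals on the same wavelength (which
    would collide in a wavelength-routed switch)."""
    seen = set()
    for (i, w), outs in route.items():
        if not (0 <= i < n_ports and 0 <= w < n_wavelengths):
            return False
        for o in outs:
            if not 0 <= o < n_ports:
                return False
            if (o, w) in seen:  # collision: same wavelength, same output
                return False
            seen.add((o, w))
    return True

route = {
    (0, 0): {1, 2, 3},  # multicast: input 0 fans out to three outputs
    (1, 0): {0},        # unicast on the same wavelength
    (1, 1): {2},        # wavelength selectivity: same input, other channel
}
assert valid_multicast_map(route, n_ports=4, n_wavelengths=2)
assert not valid_multicast_map({(0, 0): {1}, (2, 0): {1}}, 4, 2)  # collision
```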
Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics
The diversity of workload requirements and increasing hardware heterogeneity
in emerging high performance computing (HPC) systems motivate resource
disaggregation. Resource disaggregation allows compute and memory resources to
be allocated individually as required to each workload. However, it is unclear
how to efficiently realize this capability and cost-effectively meet the
stringent bandwidth and latency requirements of HPC applications. To that end,
we describe how modern photonics can be co-designed with modern HPC racks to
implement flexible intra-rack resource disaggregation while fully meeting the
bit error rate (BER) and escape-bandwidth requirements of all chip types in modern HPC
racks. Our photonic-based disaggregated rack provides an average application
speedup of 11% (46% maximum) for 25 CPU and 61% for 24 GPU benchmarks compared
to a similar system that instead uses modern electronic switches for
disaggregation. Using observed resource usage from a production system, we
estimate that an iso-performance intra-rack disaggregated HPC system using
photonics would require 4x fewer memory modules and 2x fewer NICs than a
non-disaggregated baseline.
Comment: 15 pages, 12 figures, 4 tables. Published in IEEE Cluster 202
Machine-learning-aided cognitive reconfiguration for flexible-bandwidth HPC and data center networks [Invited]
This paper proposes a machine-learning (ML)-aided cognitive approach for effective bandwidth reconfiguration in optically interconnected datacenter/high-performance computing (HPC) systems. The proposed approach relies on a Hyper-X-like architecture augmented with flexible-bandwidth photonic interconnections at large scales using a hierarchical intra/inter-POD photonic switching layout. We first formulate the problem of connectivity graph and routing scheme optimization as a mixed-integer linear programming model. A two-phase heuristic algorithm and a joint optimization approach are devised to solve the problem with low time complexity. Then, we propose an ML-based end-to-end performance estimator design to assist the network control plane with intelligent decision making for bandwidth reconfiguration. Numerical simulations using traffic distribution profiles extracted from HPC application traces as well as random traffic matrices verify the accuracy of the ML estimator (<9% error) and demonstrate up to 5× throughput gain from the proposed approach compared with the baseline Hyper-X network using fixed all-to-all intra/inter-POD interconnects. © 2021 Optical Society of America
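To give a flavour of bandwidth steering (this is not the paper's MILP or two-phase heuristic), a minimal greedy allocator might hand a budget of spare reconfigurable links to the most oversubscribed pod pairs; all names and numbers below are invented for the sketch:

```python
# Hedged sketch: greedily assign spare flexible-bandwidth links to
# the pod pairs carrying the most traffic, on top of one baseline
# link per pair. Not the paper's optimization model.
def steer_bandwidth(traffic, spare_links):
    """traffic: {(src_pod, dst_pod): demand}; returns extra links/pair."""
    extra = {pair: 0 for pair in traffic}
    for _ in range(spare_links):
        # Most oversubscribed pair = demand per currently allocated link.
        pair = max(traffic, key=lambda p: traffic[p] / (1 + extra[p]))
        extra[pair] += 1
    return extra

traffic = {("A", "B"): 90, ("A", "C"): 30, ("B", "C"): 10}
alloc = steer_bandwidth(traffic, spare_links=4)
assert alloc == {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 0}
```

The real problem also couples routing and connectivity, which is why the paper needs a MILP and an ML estimator rather than a one-line greedy rule.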
High Performance Silicon Photonic Interconnected Systems
Advances in data-driven applications, particularly artificial intelligence and deep learning, are driving the explosive growth of computation and communication in today’s data centers and high-performance computing (HPC) systems. Increasingly, system performance is not constrained by the compute speed at individual nodes, but by the data movement between them. This calls for innovative architectures, smart connectivity, and extreme bandwidth densities in interconnect designs. Silicon photonics technology leverages mature complementary metal-oxide-semiconductor (CMOS) manufacturing infrastructure and is promising for low cost, high-bandwidth, and reconfigurable interconnects. Flexible and high-performance photonic switched architectures are capable of improving the system performance. The work in this dissertation explores various photonic interconnected systems and the associated optical switching functionalities, hardware platforms, and novel architectures. It demonstrates the capabilities of silicon photonics to enable efficient deep learning training.
We first present field programmable gate array (FPGA) based open-loop and closed-loop control for optical spectral-and-spatial switching of silicon photonic cascaded micro-ring resonator (MRR) switches. Our control achieves wavelength locking at the user-defined resonance of the MRR for optical unicast, multicast, and multiwavelength-select functionalities. Digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) are necessary for the control of the switch. We experimentally demonstrate the optical switching functionalities using an FPGA-based switch controller through both traditional multi-bit DAC/ADC and novel single-wired DAC/ADC circuits. For system-level integration, interfaces to the switch controller in a network control plane are developed. The successful control and the switching functionalities achieved are essential for system-level architectural innovations as presented in the following sections.
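As an illustration of closed-loop resonance locking (a toy numerical model, not the FPGA firmware described here), a dither-style controller can nudge a ring's resonance toward the laser line by minimizing the through-port transmission of an assumed Lorentzian response:

```python
# Toy closed-loop lock: the Lorentzian line shape, the step size, and
# the wavelengths are all invented for this sketch.
def through_port(detuning, fwhm=0.1):
    """Lorentzian dip: through-port transmission is minimal on resonance."""
    return 1 - 1 / (1 + (2 * detuning / fwhm) ** 2)

def lock(resonance, laser, step=0.005, iters=200, eps=1e-3):
    """Bang-bang dither lock: each iteration, estimate which direction
    lowers the through-port power and nudge the heater that way."""
    for _ in range(iters):
        grad = (through_port(resonance + eps - laser)
                - through_port(resonance - eps - laser))
        resonance += -step if grad > 0 else step
    return resonance

locked = lock(resonance=1550.3, laser=1550.0)
assert abs(locked - 1550.0) < 0.01  # settles within the dither step
```

The fixed-step update keeps the loop stable regardless of the slope of the line shape; an integral controller with gain tuned to the resonance slope would converge faster but is harder to keep stable in a two-line sketch.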
Next, this thesis presents two novel photonic switched architectures using the MRR-based switches. First, a photonic switched memory system architecture was designed to address memory challenges in deep learning. The reconfigurable photonic interconnects provide scalable solutions and enable efficient use of disaggregated memory resources for deep learning training. An experimental testbed was built with a processing system and two remote memory nodes using silicon photonic switch fabrics and system performance improvements were demonstrated. The collective results and existing high-bandwidth optical I/Os show the potential of integrating the photonic switched memory to state-of-the-art processing systems. Second, the scaling trends of deep learning models and distributed training workloads are challenging network capacities in today’s data centers and HPCs. A system architecture that leverages SiP switch-enabled server regrouping is proposed to tackle the challenges and accelerate distributed deep learning training. An experimental testbed with a SiP switch-enabled reconfigurable fat tree topology was built to evaluate the network performance of distributed ring all-reduce and parameter server workloads. We also present system-scale simulations. Server regrouping and bandwidth steering were performed on a large-scale tapered fat tree with 1024 compute nodes to show the benefits of using photonic switched architectures in systems at scale.
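The ring all-reduce collective evaluated on the testbed above can be sketched in a few lines; this is a plain sequential simulation of its communication pattern (reduce-scatter followed by all-gather), not the dissertation's implementation:

```python
# Minimal simulation of ring all-reduce: 2(N-1) steps, each node
# exchanging one chunk per step with its ring neighbor.
def ring_allreduce(values):
    """values[i] is node i's input vector, chunked one element per node
    for simplicity; returns every node's final vector (all equal to the
    elementwise sum)."""
    n = len(values)
    chunks = [list(v) for v in values]
    # Reduce-scatter: after n-1 steps node i holds the full sum of
    # chunk (i + 1) % n.
    for t in range(n - 1):
        for i in range(n):
            s = (i - t) % n
            chunks[(i + 1) % n][s] += chunks[i][s]
    # All-gather: circulate the completed chunks around the ring.
    for t in range(n - 1):
        for i in range(n):
            s = (i + 1 - t) % n
            chunks[(i + 1) % n][s] = chunks[i][s]
    return chunks

out = ring_allreduce([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
assert out == [[12, 15, 18]] * 3
```

Because every node sends and receives exactly one chunk per step, the pattern keeps all links busy, which is why reconfigurable bandwidth steering along the ring's heavy links pays off.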
Finally, this dissertation explores high-bandwidth photonic interconnect designs for disaggregated systems. We first introduce and discuss two disaggregated architectures leveraging extremely high-bandwidth interconnects with optically interconnected computing resources. We present the concept of rack-scale graphics processing unit (GPU) disaggregation with optical circuit switches and electrical aggregator switches. The architecture can leverage the flexibility of high-bandwidth optical switches to increase hardware utilization and reduce application runtimes. A testbed was built to demonstrate resource disaggregation and defragmentation. In addition, we present an extremely high-bandwidth optical interconnect that accelerates low-latency communication for deep learning training. The disaggregated architecture utilizes comb laser sources and MRR-based crossbar switching fabrics to enable all-to-all high-bandwidth communication with a constant latency cost for distributed deep learning training. We discuss emerging technologies in the silicon photonics platform, including light sources, transceivers, and switch architectures, to accommodate extreme bandwidth requirements in HPC and data center environments. A prototype hardware innovation, an Optical Network Interface Card (comprising an FPGA, photonic integrated circuits (PICs), electronic integrated circuits (EICs), an interposer, and a high-speed printed circuit board (PCB)), is presented to show the path toward fast lanes for expedited execution at 10 terabits.
Taken together, the work in this dissertation demonstrates the capabilities of high-bandwidth silicon photonic interconnects and innovative architectural designs to accelerate deep learning training in optically connected data center and HPC systems.