Search CORE

389 research outputs found

Scaling PULSE Data Center Network Architecture and Scheduling Optical Circuits in Sub-Microseconds

Author: Benjamin JL
Zervas G
Publication venue: Optical Society of America
Publication date: 01/01/2020
Field of study

PULSE, an optical circuit switched data center network, employs custom ASIC schedulers to reconfigure circuits in 240 ns. The revised PULSE architecture scales to 10,000s blades, achieves >95% sustained throughput, with low median (1.23 µs) and tail (145 µs) latencies, while consuming 115 pJ/bit and costing $9.04/Gbps

Crossref

UCL Discovery

A Hybrid Beam Steering Free-Space and Fiber Based Optical Data Center Network

Author: Benjamin Joshua L
Liu Yanwu
Parsonson Christopher WF
Zervas Georgios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2023
Field of study

Wireless data center networks (DCNs) are promising solutions to mitigate the cabling complexity in traditional wired DCNs and potentially reduce the end-to-end latency with faster propagation speed in free space. Yet, physical architectures in wireless DCNs must be carefully designed regarding wireless link blockage, obstacle bypassing, path loss, interference and spatial efficiency in a dense deployment. This paper presents the physical layer design of a hybrid FSO/in-fiber DCN while guaranteeing an all-optical, single hop, non-oversubscribed and full-bisection bandwidth network. We propose two layouts and analyze their scalability: (1) A static network utilizing only tunable sources which can scale up to 43 racks, 15,609 nodes and 15,609 channels; and (2) a re-configurable network with both tunable sources and piezoelectric actuator (PZT) based beam-steering which can scale up to 8 racks, 2,904 nodes and 185,856 channels at millisecond PZT switching time. Based on a traffic generation framework and a dynamic wavelength-timeslot scheduling algorithm, the system-level network performance is simulated for a 363-node subnet, reaching >99% throughput and 1.23 μ s average scheduler latency at 90% load

UCL Discovery

RAMP: A Flat Nanosecond Optical Network and MPI Operations for Distributed Deep Learning Systems

Author: Benjamin Joshua
Ottino Alessandro
Zervas Georgios
Publication venue
Publication date: 28/11/2022
Field of study

Distributed deep learning (DDL) systems strongly depend on network performance. Current electronic packet switched (EPS) network architectures and technologies suffer from variable diameter topologies, low-bisection bandwidth and over-subscription affecting completion time of communication and collective operations. We introduce a near-exascale, full-bisection bandwidth, all-to-all, single-hop, all-optical network architecture with nanosecond reconfiguration called RAMP, which supports large-scale distributed and parallel computing systems (12.8~Tbps per node for up to 65,536 nodes). For the first time, a custom RAMP-x MPI strategy and a network transcoder is proposed to run MPI collective operations across the optical circuit switched (OCS) network in a schedule-less and contention-less manner. RAMP achieves 7.6-171

\times

speed-up in completion time across all MPI operations compared to realistic EPS and OCS counterparts. It can also deliver a 1.3-16

\times

and 7.8-58

\times

reduction in Megatron and DLRM training time respectively} while offering 42-53

\times

and 3.3-12.4

\times

improvement in energy consumption and cost respectively

arXiv.org e-Print Archive

UCL Discovery

Control Plane Hardware Design for Optical Packet Switched Data Centre Networks

Author: Andreades Paris
Publication venue: UCL (University College London)
Publication date: 28/01/2020
Field of study

Optical packet switching for intra-data centre networks is key to addressing traffic requirements. Photonic integration and wavelength division multiplexing (WDM) can overcome bandwidth limits in switching systems. A promising technology to build a nanosecond-reconfigurable photonic-integrated switch, compatible with WDM, is the semiconductor optical amplifier (SOA). SOAs are typically used as gating elements in a broadcast-and-select (B\&S) configuration, to build an optical crossbar switch. For larger-size switching, a three-stage Clos network, based on crossbar nodes, is a viable architecture. However, the design of the switch control plane, is one of the barriers to packet switching; it should run on packet timescales, which becomes increasingly challenging as line rates get higher. The scheduler, used for the allocation of switch paths, limits control clock speed. To this end, the research contribution was the design of highly parallel hardware schedulers for crossbar and Clos network switches. On a field-programmable gate array (FPGA), the minimum scheduler clock period achieved was 5.0~ns and 5.4~ns, for a 32-port crossbar and Clos switch, respectively. By using parallel path allocation modules, one per Clos node, a minimum clock period of 7.0~ns was achieved, for a 256-port switch. For scheduler application-specific integrated circuit (ASIC) synthesis, this reduces to 2.0~ns; a record result enabling scalable packet switching. Furthermore, the control plane was demonstrated experimentally. Moreover, a cycle-accurate network emulator was developed to evaluate switch performance. Results showed a switch saturation throughput at a traffic load 60\% of capacity, with sub-microsecond packet latency, for a 256-port Clos switch, outperforming state-of-the-art optical packet switches

UCL Discovery

Recommended from our members

Hardware-Software Integrated Silicon Photonic Systems

Author: Calhoun David Mark
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2017
Field of study

Fabrication of integrated photonic devices and circuits in a CMOS-compatible process or foundry is the essence of the silicon photonic platform. Optical devices in this platform are enabled by the high index contrast between silicon and silicon on insulator. These devices offer potential benefits when integrated with existing and emerging high performance microelectronics. Integration of silicon photonics with small footprints and power-efficient and high-bandwidth operation has long been cited as a solution to existing issues in high performance interconnects for telecommunications and data communication. Stemming from this historic application in communications, new applications in sensing arrays, biochemistry, and even entertainment continue to grow. However, for many technologies to successfully adopt silicon photonics and reap the perceived benefits, the silicon photonic platform must extend toward development of a full ecosystem. Such extension includes implementation of low cost and robust electronic-photonic packaging techniques for all applications. In an ecosystem implemented with services ranging from device fabrication all the way to packaged products, ease-of-use and ease-of-deployment in systems that require many hardware and software components becomes possible. With the onset of the Internet of Things (IoT), nearly all technologies—sensors, compute, communication devices, etc.—persist in systems with some level of localized or distributed software interaction. These interactions often require a level of networked communications. For silicon photonics to penetrate technologies comprising IoT, it is advantageous to implement such devices in a hardware-software integrated way. Meaning, all functionalities and interactions related to the silicon photonic devices are well defined in terms of the physicality of the hardware. This hardware is then abstracted into various levels of software as needed in the system. The power of hardware-software integration allows many of the piece-wise demonstrated functionalities of silicon photonics to easily translate to commercial implementation. This work begins by briefly highlighting the challenges and solutions for transforming existing silicon photonic platforms to a full-fledged silicon photonic ecosystem. The highlighted solutions in development consist of tools for fabrication, testing, subsystem packaging, and system validation. Building off the knowledge of a silicon photonic ecosystem in development, this work continues by demonstrating various levels of hardware-software integration. These are primarily focused on silicon photonic interconnects. The first hardware-software integration-focused portion of this work explores silicon microring-based devices as a key building block for greater silicon photonic subsystems. The microring’s sensitivity to thermal fluctuations is identified not as a flaw, but as a tool for functionalization. A logical control system is implemented to mitigate thermal effects that would normally render a microring resonator inoperable. The mechanism to control the microring is extended and abstracted with software programmability to offer wavelength routing as a network primitive. This functionality, available through hardware-software integration, offers the possibility for ubiquitous deployment of such microring devices in future photonic interconnection networks. The second hardware-software integration-focused portion of this work explores dynamic silicon photonic switching devices and circuits. Specifically, interactions with and implications of high-speed data propagation and link layer control are demonstrated. The characteristics of photonic link setup include transients due to physical layer optical effects, latencies involved with initializing burst mode links, and optical link quality. The impacts on the functionalities and performance offered by photonic devices are explored. An optical network interface platform is devised using FPGAs to encapsulate hardware and software for controlling these characteristics using custom hardware description language, firmware, and software. A basic version of a silicon photonic network controller using FPGAs is used as a tool to demonstrate a highly scalable switch architecture using microring resonators. This architecture would not be possible without some semblance of this controller, combined with advanced electronic-photonic packaging. A more advanced deployment of the network interface platform is used to demonstrate a method for accelerating photonic links using out-of-band arbitration. A first demonstration of this platform is performed on a silicon photonic microring router network. A second demonstration is used to further explore the feasibility of full hardware-software integrated photonic device actuation, link layer control, and out-of-band arbitration. The demonstration is performed on a complete silicon photonic network with both spatial switching and wavelength routing functionalities. The aforementioned hardware-software integration mechanisms are rigorously tested for data communications applications. Capabilities are shown for very reliable, low latency, and dynamic high-speed data delivery using silicon photonic devices. Applying these mechanisms to complete electronic-photonic packaged subsystems provides a strong path to commercial manifestations of functional silicon photonic devices

Columbia University Academic Commons

Clock Synchronisation Assisted Clock and Data Recovery for Sub-Nanosecond Data Centre Optical Switching

Author: Clark Kari Aaron
Publication venue: UCL (University College London)
Publication date: 28/01/2021
Field of study

In current `Cloud' data centres, switching of data between servers is performed using deep hierarchies of interconnected electronic packet switches. Demand for network bandwidth from emerging data centre workloads, combined with the slowing of silicon transistor scaling, is leading to a widening gap between data centre traffic demand and electronically-switched data centre network capacity. All-optical switches could offer a future-proof alternative, with potentially under a third of the power consumption and cost of electronically-switched networks. However, the effective bandwidth of optical switches depends on their overall switching time. This is dominated by the clock and data recovery (CDR) locking time, which takes hundreds of nanoseconds in commercial receivers. Current data centre traffic is dominated by small packets that transmit in tens of nanoseconds, leading to low effective bandwidth, as a high proportion of receiver time is spent performing CDR locking instead of receiving data, removing the benefits of optical switching. High-performance optical switching requires sub-nanosecond CDR locking time to overcome this limitation. This thesis proposes, models, and demonstrates clock synchronisation assisted CDR, which can achieve this. This approach uses clock synchronisation to simplify the complexity of CDR versus previous asynchronous approaches. An analytical model of the technique is first derived that establishes its potential viability. Following this, two approaches to clock synchronisation assisted CDR are investigated: 1. Clock phase caching, which uses clock phase storage and regular updates in a 2km intra-building scale data centre network interconnected by single-mode optical fibre. 2. Single calibration clock synchronisation assisted CDR}, which leverages the 20 times lower thermal sensitivity of hollow core optical fibre versus single-mode fibre to synchronise a 100m cluster scale data centre network, with a single initial phase calibration step. Using a real-time FPGA-based optical switch testbed, sub-nanosecond CDR locking time was demonstrated for both approaches

UCL Discovery

Optical Switching for Scalable Data Centre Networks

Author: Gerard Thomas
Publication venue: UCL (University College London)
Publication date: 28/05/2021
Field of study

This thesis explores the use of wavelength tuneable transmitters and control systems within the context of scalable, optically switched data centre networks. Modern data centres require innovative networking solutions to meet their growing power, bandwidth, and scalability requirements. Wavelength routed optical burst switching (WROBS) can meet these demands by applying agile wavelength tuneable transmitters at the edge of a passive network fabric. Through experimental investigation of an example WROBS network, the transmitter is shown to determine system performance, and must support ultra-fast switching as well as power efficient transmission. This thesis describes an intelligent optical transmitter capable of wideband sub-nanosecond wavelength switching and low-loss modulation. A regression optimiser is introduced that applies frequency-domain feedback to automatically enable fast tuneable laser reconfiguration. Through simulation and experiment, the optimised laser is shown to support 122×50 GHz channels, switching in less than 10 ns. The laser is deployed as a component within a new wavelength tuneable source (WTS) composed of two time-interleaved tuneable lasers and two semiconductor optical amplifiers. Switching over 6.05 THz is demonstrated, with stable switch times of 547 ps, a record result. The WTS scales well in terms of chip-space and bandwidth, constituting the first demonstration of scalable, sub-nanosecond optical switching. The power efficiency of the intelligent optical transmitter is further improved by introduction of a novel low-loss split-carrier modulator. The design is evaluated using 112 Gb/s/λ intensity modulated, direct-detection signals and a single-ended photodiode receiver. The split-carrier transmitter is shown to achieve hard decision forward error correction ready performance after 2 km of transmission using a laser output power of just 0 dBm; a 5.2 dB improvement over the conventional transmitter. The results achieved in the course of this research allow for ultra-fast, wideband, intelligent optical transmitters that can be applied in the design of all-optical data centres for power efficient, scalable networking

UCL Discovery

Design of a fault tolerant airborne digital computer. Volume 2: Computational requirements and technology

Author: Clark C. B.
Goldberg J.
Ratner R. S.
Shapiro E. B.
Wahlstrom S. E.
Zeidler H. M.
Publication venue
Publication date
Field of study

This final report summarizes the work on the design of a fault tolerant digital computer for aircraft. Volume 2 is composed of two parts. Part 1 is concerned with the computational requirements associated with an advanced commercial aircraft. Part 2 reviews the technology that will be available for the implementation of the computer in the 1975-1985 period. With regard to the computation task 26 computations have been categorized according to computational load, memory requirements, criticality, permitted down-time, and the need to save data in order to effect a roll-back. The technology part stresses the impact of large scale integration (LSI) on the realization of logic and memory. Also considered was module interconnection possibilities so as to minimize fault propagation

NASA Technical Reports Server

A Cost and Power Feasibility Analysis of Quantum Annealing for NextG Cellular Wireless Networks

Author: Jamieson. Kyle
Kaewell John
Kasi Srikar
Warburton PA
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/01/2022
Field of study

In order to meet mobile cellular users' ever-increasing data demands, today's 4 G and 5 G wireless networks are designed mainly with the goal of maximizing spectral efficiency. While they have made progress in this regard, controlling the carbon footprint and operational costs of such networks remains a long-standing problem among network designers. This paper takes a long view on this problem, envisioning a NextG scenario where the network leverages quantum annealing for cellular baseband processing. We gather and synthesize insights on power consumption, computational throughput and latency, spectral efficiency, operational cost, and feasibility timelines surrounding quantum annealing technology. Armed with these data, we project the quantitative performance targets future quantum annealing hardware must meet in order to provide a computational and power advantage over CMOS hardware, while matching its whole-network spectral efficiency. Our quantitative analysis predicts that with 82.32 μ s problem latency and 2.68 M qubits, quantum annealing will achieve a spectral efficiency equal to CMOS while reducing power consumption by 41 kW (45% lower) in a Large MIMO base station with 400 MHz bandwidth and 64 antennas, and a 160 kW power reduction (55% lower) using 8.04 M qubits in a CRAN setting with three Large MIMO base stations

arXiv.org e-Print Archive

UCL Discovery