360 research outputs found

    Machine-learning-aided cognitive reconfiguration for flexible-bandwidth HPC and data center networks [Invited]

    This paper proposes a machine-learning (ML)-aided cognitive approach for effective bandwidth reconfiguration in optically interconnected datacenter/high-performance computing (HPC) systems. The proposed approach relies on a Hyper-X-like architecture augmented with flexible-bandwidth photonic interconnections at large scales using a hierarchical intra/inter-POD photonic switching layout. We first formulate the problem of the connectivity graph and routing scheme optimization as a mixed-integer linear programming model. A two-phase heuristic algorithm and a joint optimization approach are devised to solve the problem with low time complexity. Then, we propose an ML-based end-to-end performance estimator design to assist the network control plane with intelligent decision making for bandwidth reconfiguration. Numerical simulations using traffic distribution profiles extracted from HPC application traces as well as random traffic matrices verify the accuracy of the ML-based estimator (<9% error) and demonstrate up to 5x throughput gain from the proposed approach compared with the baseline Hyper-X network using fixed all-to-all intra/inter-portable data center interconnects. © 2021 Optical Society of America

    An Application-aware SDN Controller for Hybrid Optical-electrical DC Networks

    The adoption of optical switching technologies in Data Centre Networks (DCNs) offers a solution for high speed traffic and energy efficiency in Data Centre (DC) operational management, enabling an easy scaling of DC infrastructures. Flexible, slotted allocation of optical resources is fundamental to efficiently support the dynamicity of DC traffic. In this context, the NEPHELE project proposes a Time Division Multiple Access approach for optical resource allocation, orchestrated through a Software Defined Networking controller which coordinates the DCN configuration based on real-time cloud application requests

    Optics and virtualization as data center network infrastructure

    Emerging cloud services have motivated a fresh look at the design of data center network infrastructure in multiple layers. To transfer the huge amount of data generated by many data-intensive applications, the data center network has to be fast, scalable and power efficient. To support flexible and efficient sharing in cloud services, service providers deploy a virtualization layer as part of the data center infrastructure. This thesis explores the design and performance analysis of data center network infrastructure in both the physical network and the virtualization layer. On the physical network design front, we present a hybrid packet/circuit switched network architecture which uses circuit-switched optics to augment traditional packet-switched Ethernet in modern data centers. We show that this technique has substantial potential to improve bisection bandwidth and application performance in a cost-effective manner. To push the adoption of optical circuits in real cloud data centers, we further explore and address the circuit control issues in shared data center environments. On the virtualization layer, we present an analytical study of the network performance of virtualized data centers. Using Amazon EC2 as an experiment platform, we quantify the impact of virtualization on network performance in a commercial cloud. Our findings provide valuable insights both to cloud users moving legacy applications into the cloud and to service providers improving the virtualization infrastructure to support better cloud services

    Machine-Learning-Aided Dynamic Reconfiguration in Optical DC/HPC Networks (Invited)

    The high bandwidth and low latency requirements of modern computing applications with their dynamic and nonuniform traffic patterns impose severe challenges to current data center (DC) and high performance computing (HPC) networks. Therefore, we present a dynamic network reconfiguration mechanism that could satisfy the time-varying applications' demands in an optical DC/HPC network. We propose direct and indirect topology extraction methods based on a machine-learning-aided traffic prediction approach under a multi-application scenario. The direct approach, traffic prediction for topology extraction and bandwidth reconfiguration (PredicTER), could lead to frequent topology and bandwidth reconfiguration. In contrast, the indirect approach, namely traffic prediction with clustering for topology extraction and bandwidth reconfiguration (PrediCLUSTER), utilizes an unsupervised learning-based clustering model to first associate the predicted traffic to one of the possible traffic clusters, and then extracts a common topology for the cluster. This restricts the reconfigured topology set to the number of traffic clusters. Our simulation results show that the time-average of mean packet latencies (and total dropped packets) over 60 seconds of time-varying traffic under the PredicTER, PrediCLUSTER and a static topology are 37.7 μs, 41.2 μs, and 50.2 μs (and 37,967, 12,305, and 36,836), respectively. Overall, the PredicTER (and PrediCLUSTER) method(s) can improve the end-to-end packet latency by 24.9% (and 17.8%), and the packet loss rate by -3.1% (and 66.6%), as compared to the static flat Hyper-X-like topology
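The percentages in the abstract above can be checked against the raw figures it reports. The following is a minimal sketch, assuming each improvement is computed as the relative reduction against the static topology; the function and variable names are illustrative and do not come from the paper.

```python
def relative_improvement(baseline: float, value: float) -> float:
    """Percentage reduction of `value` relative to `baseline` (positive means better)."""
    return (baseline - value) / baseline * 100.0


# Figures as reported in the abstract; the static topology is the assumed baseline.
latency_us = {"PredicTER": 37.7, "PrediCLUSTER": 41.2, "static": 50.2}
dropped_packets = {"PredicTER": 37_967, "PrediCLUSTER": 12_305, "static": 36_836}


def summarize(metric: dict) -> dict:
    """Improvement of each method over the static baseline, rounded to one decimal."""
    base = metric["static"]
    return {
        name: round(relative_improvement(base, value), 1)
        for name, value in metric.items()
        if name != "static"
    }


latency_gain = summarize(latency_us)
loss_gain = summarize(dropped_packets)
print("latency improvement (%):", latency_gain)
print("packet loss improvement (%):", loss_gain)
```

Under this assumption the PredicTER latency gain and both packet-loss figures match the abstract, including the negative value, which reflects slightly more dropped packets than the static topology. The PrediCLUSTER latency gain comes out near 17.9% rather than the reported 17.8%, presumably because the abstract's latencies are themselves rounded.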

    Scalable electro-optical solutions for data center networks

    Switching gears towards efficient datacenters with photonic

    Intra- Datacenter Challenges; System Perspective

    Invited presentation at ECOC Sunday workshop with title: Data Center Networks: Meeting the emerging requirements for capacity, cost, energy consumption and reac

    HFOS_L: hyper scale fast optical switch-based data center network with L-level sub-network

    The ever-expanding growth of internet traffic enforces the deployment of massive Data Center Networks (DCNs) supporting high performance communications. Optical switching is being studied as a promising approach to fulfill the surging requirements of large scale data centers. The tree-based optical topology limits the scalability of the interconnected network due to the limitations in the port count of optical switches and the lack of optical buffers. Alternatively, the buffer-less Fast Optical Switch (FOS) was proposed to realize nanosecond switching in optical DCNs. Although FOSs provide nanosecond optical switching, they still suffer from port count limitations when scaling the DCN. To address the issue of scaling DCNs to more than two million servers, we propose the hyper scale FOS-based L-level DCNs (HFOS_L), which are capable of building large networks with small radix switches. The numerical analysis shows that L = 4 is the optimal level for HFOS_L to obtain the lowest cost and power consumption. Specifically, under a network size of 160,000 servers, HFOS_4 saves 36.2% in cost compared with the 2-level FOS-based DCN, while achieving a 60% improvement in cost and a 26.7% improvement in power consumption compared with Fat tree. Moreover, a wide range of simulations and analyses demonstrate that HFOS_4 outperforms state-of-the-art FOS-based DCNs by up to 40% in end-to-end latency under a DCN size of 81,920 servers.