402 research outputs found
An Overview on Application of Machine Learning Techniques in Optical Networks
Today's telecommunication networks have become sources of enormous amounts of
widely heterogeneous data. This information can be retrieved from network
traffic traces, network alarms, signal quality indicators, users' behavioral
data, etc. Advanced mathematical tools are required to extract meaningful
information from these data and take decisions pertaining to the proper
functioning of the networks from the network-generated data. Among these
mathematical tools, Machine Learning (ML) is regarded as one of the most
promising methodological approaches to perform network-data analysis and enable
automated network self-configuration and fault management. The adoption of ML
techniques in the field of optical communication networks is motivated by the
unprecedented growth of network complexity faced by optical networks in the
last few years. Such complexity increase is due to the introduction of a huge
number of adjustable and interdependent system parameters (e.g., routing
configurations, modulation format, symbol rate, coding schemes, etc.) that are
enabled by the usage of coherent transmission/reception technologies, advanced
digital signal processing and compensation of nonlinear effects in optical
fiber propagation. In this paper we provide an overview of the application of
ML to optical communications and networking. We classify and survey relevant
literature dealing with the topic, and we also provide an introductory tutorial
on ML for researchers and practitioners interested in this field. Although a
good number of research papers have recently appeared, the application of ML to
optical networks is still in its infancy: to stimulate further work in this
area, we conclude the paper proposing new possible research directions
On Incentive-Driven VNF Service Chaining in Inter-Datacenter Elastic Optical Networks: A Hierarchical Game-Theoretic Mechanism
In this paper, we propose an incentive-driven virtual network function service chaining (VNF-SC) framework for optimizing the cross-stratum resource provisioning in multi-broker orchestrated inter-datacenter elastic optical networks (IDC-EONs). The proposed framework employs a non-cooperative hierarchical game-theoretic mechanism, where the resource brokers and the VNF-SC users play the leader and the follower games, respectively. In the leader game, the brokers calculate VNF-SC service schemes for users and compete for the provisioning tasks. While in the follower game, the users compete for VNF-SC services for jointly optimizing the resource cost and the received quality-of-service. We first elaborate on the modeling of the follower game, discuss the existence of Nash equilibrium and propose a mixed-strategy gaming approach enabled by an auxiliary graph-based algorithm to facilitate users selecting the most appropriate service schemes. Then, under the assumption that the brokers are aware of the principle of the follower game, we present the model for the leader game and develop a time-efficient heuristic algorithm for brokers to compete for the provisioning tasks. Simulations show that the proposed incentive-driven VNF-SC framework significantly improves the network throughput (i.e., >4.8× blocking reduction) while assisting users and brokers in achieving higher utilities compared with existing solutions
Machine-learning-aided cognitive reconfiguration for flexible-bandwidth HPC and data center networks [Invited]
This paper proposes a machine-learning (ML)-aided cognitive approach for effective bandwidth reconfiguration in optically interconnected datacenter/high-performance computing (HPC) systems. The proposed approach relies on a Hyper-X-like architecture augmented with flexible-bandwidth photonic interconnections at large scales using a hierarchical intra/inter-POD photonic switching layout. We first formulate the problem of the connectivity graph and routing scheme optimization as a mixed-integer linear programming model. A two-phase heuristic algorithm and a joint optimization approach are devised to solve the problem with low time complexity. Then, we propose an ML-based end-to-end performance estimator design to assist the network control plane with intelligent decision making for bandwidth reconfiguration. Numerical simulations using traffic distribution profiles extracted from HPC applications traces as well as random traffic matrices verify the accuracy performance of the ML design estimator (<9% error) and demonstrate up to 5 x throughput gain from the proposed approach compared with the baseline Hyper-X network using fixed all-to-all intra/inter-portable data center interconnects. (C) 2021 Optical Society of Americ
A Study on Efficient Service Function Chain Placement in Network Function Virtualization Environment
電気通信大å¦202
Recommended from our members
Measurement-Driven Algorithm and System Design for Wireless and Datacenter Networks
The growing number of mobile devices and data-intensive applications pose unique challenges for wireless access networks as well as datacenter networks that enable modern cloud-based services. With the enormous increase in volume and complexity of traffic from applications such as video streaming and cloud computing, the interconnection networks have become a major performance bottleneck. In this thesis, we study algorithms and architectures spanning several layers of the networking protocol stack that enable and accelerate novel applications and that are easily deployable and scalable. The design of these algorithms and architectures is motivated by measurements and observations in real world or experimental testbeds.
In the first part of this thesis, we address the challenge of wireless content delivery in crowded areas. We present the AMuSe system, whose objective is to enable scalable and adaptive WiFi multicast. AMuSe is based on accurate receiver feedback and incurs a small control overhead. This feedback information can be used by the multicast sender to optimize multicast service quality, e.g., by dynamically adjusting transmission bitrate. Specifically, we develop an algorithm for dynamic selection of a subset of the multicast receivers as feedback nodes which periodically send information about the channel quality to the multicast sender. Further, we describe the Multicast Dynamic Rate Adaptation (MuDRA) algorithm that utilizes AMuSe's feedback to optimally tune the physical layer multicast rate. MuDRA balances fast adaptation to channel conditions and stability, which is essential for multimedia applications.
We implemented the AMuSe system on the ORBIT testbed and evaluated its performance in large groups with approximately 200 WiFi nodes. Our extensive experiments demonstrate that AMuSe can provide accurate feedback in a dense multicast environment. It outperforms several alternatives even in the case of external interference and changing network conditions. Further, our experimental evaluation of MuDRA on the ORBIT testbed shows that MuDRA outperforms other schemes and supports high throughput multicast flows to hundreds of nodes while meeting quality requirements. As an example application, MuDRA can support multiple high quality video streams, where 90% of the nodes report excellent or very good video quality.
Next, we specifically focus on ensuring high Quality of Experience (QoE) for video streaming over WiFi multicast. We formulate the problem of joint adaptation of multicast transmission rate and video rate for ensuring high video QoE as a utility maximization problem and propose an online control algorithm called DYVR which is based on Lyapunov optimization techniques. We evaluated the performance of DYVR through analysis, simulations, and experiments using a testbed composed of Android devices and o the shelf APs. Our evaluation shows that DYVR can ensure high video rates while guaranteeing a low but acceptable number of segment losses, buffer underflows, and video rate switches.
We leverage the lessons learnt from AMuSe for WiFi to address the performance issues with LTE evolved Multimedia Broadcast/Multicast Service (eMBMS). We present the Dynamic Monitoring (DyMo) system which provides low-overhead and real-time feedback about eMBMS performance. DyMo employs eMBMS for broadcasting instructions which indicate the reporting rates as a function of the observed Quality of Service (QoS) for each UE. This simple feedback mechanism collects very limited QoS reports which can be used for network optimization. We evaluated the performance of DyMo analytically and via simulations. DyMo infers the optimal eMBMS settings with extremely low overhead, while meeting strict QoS requirements under different UE mobility patterns and presence of network component failures.
In the second part of the thesis, we study datacenter networks which are key enablers of the end-user applications such as video streaming and storage. Datacenter applications such as distributed file systems, one-to-many virtual machine migrations, and large-scale data processing involve bulk multicast flows. We propose a hardware and software system for enabling physical layer optical multicast in datacenter networks using passive optical splitters. We built a prototype and developed a simulation environment to evaluate the performance of the system for bulk multicasting. Our evaluation shows that the optical multicast architecture can achieve higher throughput and lower latency than IP multicast and peer-to-peer multicast schemes with lower switching energy consumption.
Finally, we study the problem of congestion control in datacenter networks. Quantized Congestion Control (QCN), a switch-supported standard, utilizes direct multi-bit feedback from the network for hardware rate limiting. Although QCN has been shown to be fast-reacting and effective, being a Layer-2 technology limits its adoption in IP-routed Layer 3 datacenters. We address several design challenges to overcome QCN feedback's Layer- 2 limitation and use it to design window-based congestion control (QCN-CC) and load balancing (QCN-LB) schemes. Our extensive simulations, based on real world workloads, demonstrate the advantages of explicit, multi-bit congestion feedback, especially in a typical environment where intra-datacenter traffic with short Round Trip Times (RTT: tens of s) run in conjunction with web-facing traffic with long RTTs (tens of milliseconds)
Recommended from our members
High Performance Silicon Photonic Interconnected Systems
Advances in data-driven applications, particularly artificial intelligence and deep learning, are driving the explosive growth of computation and communication in today’s data centers and high-performance computing (HPC) systems. Increasingly, system performance is not constrained by the compute speed at individual nodes, but by the data movement between them. This calls for innovative architectures, smart connectivity, and extreme bandwidth densities in interconnect designs. Silicon photonics technology leverages mature complementary metal-oxide-semiconductor (CMOS) manufacturing infrastructure and is promising for low cost, high-bandwidth, and reconfigurable interconnects. Flexible and high-performance photonic switched architectures are capable of improving the system performance. The work in this dissertation explores various photonic interconnected systems and the associated optical switching functionalities, hardware platforms, and novel architectures. It demonstrates the capabilities of silicon photonics to enable efficient deep learning training.
We first present field programmable gate array (FPGA) based open-loop and closed-loop control for optical spectral-and-spatial switching of silicon photonic cascaded micro-ring resonator (MRR) switches. Our control achieves wavelength locking at the user-defined resonance of the MRR for optical unicast, multicast, and multiwavelength-select functionalities. Digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) are necessary for the control of the switch. We experimentally demonstrate the optical switching functionalities using an FPGA-based switch controller through both traditional multi-bit DAC/ADC and novel single-wired DAC/ADC circuits. For system-level integration, interfaces to the switch controller in a network control plane are developed. The successful control and the switching functionalitiesachieved are essential for system-level architectural innovations as presented in the following sections.
Next, this thesis presents two novel photonic switched architectures using the MRR-based switches. First, a photonic switched memory system architecture was designed to address memory challenges in deep learning. The reconfigurable photonic interconnects provide scalable solutions and enable efficient use of disaggregated memory resources for deep learning training. An experimental testbed was built with a processing system and two remote memory nodes using silicon photonic switch fabrics and system performance improvements were demonstrated. The collective results and existing high-bandwidth optical I/Os show the potential of integrating the photonic switched memory to state-of-the-art processing systems. Second, the scaling trends of deep learning models and distributed training workloads are challenging network capacities in today’s data centers and HPCs. A system architecture that leverages SiP switch-enabled server regrouping is proposed to tackle the challenges and accelerate distributed deep learning training. An experimental testbed with a SiP switch-enabled reconfigurable fat tree topology was built to evaluate the network performance of distributed ring all-reduce and parameter server workloads. We also present system-scale simulations. Server regrouping and bandwidth steering were performed on a large-scale tapered fat tree with 1024 compute nodes to show the benefits of using photonic switched architectures in systems at scale.
Finally, this dissertation explores high-bandwidth photonic interconnect designs for disaggregated systems. We first introduce and discuss two disaggregated architectures leveraging extreme high bandwidth interconnects with optically interconnected computing resources. We present the concept of rack-scale graphics processing unit (GPU) disaggregation with optical circuit switches and electrical aggregator switches. The architecture can leverage the flexibility of high bandwidth optical switches to increase hardware utilization and reduce application runtimes. A testbed was built to demonstrate resource disaggregation and defragmentation. In addition, we also present an extreme high-bandwidth optical interconnect accelerated low-latency communication architecture for deep learning training. The disaggregated architecture utilizes comb laser sources and MRR-based cross-bar switching fabrics to enable an all-to-all high bandwidth communication with a constant latency cost for distributed deep learning training. We discuss emerging technologies in the silicon photonics platform, including light source, transceivers, and switch architectures, to accommodate extreme high bandwidth requirements in HPC and data center environments. A prototype hardware innovation - Optical Network Interface Cards (comprised of FPGA, photonic integrated circuits (PIC), electronic integrated circuits (EIC), interposer, and high-speed printed circuit board (PCB)) is presented to show the path toward fast lanes for expedited execution at 10 terabits.
Taken together, the work in this dissertation demonstrates the capabilities of high-bandwidth silicon photonic interconnects and innovative architectural designs to accelerate deep learning training in optically connected data center and HPC systems
Self-Taught Anomaly Detection With Hybrid Unsupervised/Supervised Machine Learning in Optical Networks
This paper proposes a self-taught anomaly detection framework for optical networks. The proposed framework makes use of a hybrid unsupervised and supervised machine learning scheme. First, it employs an unsupervised data clustering module (DCM) to analyze the patterns of monitoring data. The DCM enables a self-learning capability that eliminates the requirement of prior knowledge of abnormal network behaviors and therefore can potentially detect unforeseen anomalies. Second, we introduce a self-taught mechanism that transfers the patterns learned by the DCM to a supervised data regression and classification module (DRCM). The DRCM, whose complexity is mainly related to the scale of the applied supervised learning model, can potentially facilitate more scalable and time-efficient online anomaly detection by avoiding excessively traversing the original dataset. We designed the DCM and DRCM based on the density-based clustering algorithm and the deep neural network structure, respectively. Evaluations with experimental data from two use cases (i.e., single-point detection and end-to-end detection) demonstrate that up to 99% anomaly detection accuracy can be achieved with a false positive rate below 1%
- …