7,137 research outputs found

    Link weight optimization models against link failures in Internet Protocol networks

    Get PDF
    As Internet traffic grows while generating little additional revenue for service providers, keeping the same service level agreements (SLAs) under limited capital expenditures (CAPEX) is challenging. Backbone traffic at major Internet service providers grows exponentially while revenue grows only logarithmically. Under such conditions, both CAPEX reduction and more efficient use of the existing infrastructure are needed. Link failures are common in Internet Protocol (IP) backbone networks and are an impediment to meeting the required quality of service (QoS). After a failure occurs, the affected traffic is rerouted onto adjacent links. The resulting increase in network congestion reduces the traffic that can be admitted and can increase the packet drop rate. In this thesis, network congestion refers to the highest link utilization over all links in the network. An increase in network congestion may disrupt services with critical SLAs as the admissible traffic becomes restricted and the packet drop rate increases. Therefore, from a network operator's point of view, keeping congestion manageable even under failure is desirable. A possible approach to congestion increase is to augment link capacity until the manageable congestion threshold is met; however, since CAPEX must be reduced, the additional capacity has to be minimized. In IP networks, where OSPF is widely used as the routing protocol, traffic paths are determined by link weights that are configured in advance. Because link weights decide the traffic paths, they also decide which links become congested and hence determine the network congestion. Link weights can therefore be optimized to minimize the additional capacity needed under the worst-case failure, i.e., the single-link failure that generates the highest congestion in the network. In the basic model of link weight optimization, a preventive start-time optimization (PSO) scheme was presented that determines a link weight set minimizing the worst congestion under any single-link failure. Unfortunately, when there is no link failure, that link weight set may lead to a congestion higher than the manageable congestion. This penalty is carried at all times and becomes a burden, especially in networks with few failures. The first part of this thesis proposes a penalty-aware (PA) model that determines a link weight set which reduces this penalty while also reducing the worst congestion, by considering both failure and non-failure scenarios. Within the PA model we present two simple and effective schemes: preventive start-time optimization without penalty (PSO-NP) and strengthened preventive start-time optimization (S-PSO). PSO-NP suppresses the penalty for the no-failure case while reducing the worst congestion under failure; S-PSO minimizes the worst congestion under failure and then tries to minimize the penalty relative to PSO for the no-failure case. Simulation results show that, in several networks, PSO-NP and S-PSO achieve substantial penalty reduction while yielding a congestion close to that of PSO under the worst-case failure. Nevertheless, PSO-NP and S-PSO do not guarantee an improvement of both the penalty and the worst congestion at the same time, because they rely on fixed optimization conditions that restrict the emergence of better solutions.
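
In generic shorthand of our own (the thesis may define these quantities differently): let $U_0(w)$ be the network congestion (highest link utilization) with no failure under link weight set $w$, $U_f(w)$ the congestion after single-link failure $f$, and $L$ the set of links. The schemes above can then be read roughly as:

```latex
% Illustrative shorthand only; U_0, U_f and the penalty P are our notation.
\begin{align*}
\text{PSO:}     \quad & w^{\mathrm{PSO}} = \arg\min_{w} \, \max_{f \in L} U_f(w) \\
\text{Penalty:} \quad & P(w) = U_0(w) - \min_{w'} U_0(w') \\
\text{PSO-NP:}  \quad & \min_{w} \, \max_{f \in L} U_f(w) \quad \text{s.t. } P(w) = 0 \\
\text{S-PSO:}   \quad & \min_{w} \, P(w) \quad \text{s.t. } \max_{f \in L} U_f(w) = \max_{f \in L} U_f(w^{\mathrm{PSO}})
\end{align*}
```

The GPSO scheme introduced in the next paragraph can be read as replacing the PSO-NP constraint with a tunable bound $P(w) \le P_{\max}$ chosen by the operator.
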
Relaxing these fixed conditions may yield sub-optimal link weight sets that reduce the worst congestion under failure to nearly match that of PSO while keeping a controlled penalty for the no-failure case. To determine such sets we extend the penalty-aware model of link weight optimization. We design a scheme in which the network operator sets a manageable penalty and obtains the link weight set that most reduces the worst congestion while respecting that penalty. This enables network operators to choose link weight sets more flexibly, according to their requirements under failure and non-failure scenarios. Since setting the penalty to zero gives the same results as PSO-NP, and setting no penalty condition gives S-PSO, this scheme covers both; for this reason we call it general preventive start-time optimization (GPSO). Simulation results show that GPSO determines link weight sets whose worst-congestion reduction is equivalent to that of PSO with a reduced penalty for the no-failure case. GPSO is thus effective in finding a link weight set that reduces congestion under both failure and non-failure cases; however, because it tolerates a penalty, it does not guarantee the manageable congestion. In the second part of this thesis we propose a link-duplication (LD) model that aims to suppress link failures in the first place so that the manageable congestion is always met. For this purpose we consider the duplication, or reinforcement, of links, which is broadly used to make networks reliable. Link duplication provides fast recovery, as simply switching from the failed link to its backup hides the failure from upper layers. However, due to capital expenditure constraints, not every link can be duplicated, so it makes sense to give priority to selected links. As mentioned above, traffic routes are determined by link weights configured in advance; choosing an appropriate set of link weights may therefore reduce the number of links that actually need to be duplicated to keep a manageable congestion under any single-link failure scenario. PSO also identifies the link whose failure creates the worst congestion. Since duplicating this link lets us assume it no longer fails, PSO can be used to find the smallest number of links to protect so as to guarantee a manageable congestion under any single-link failure. The LD model considers multiple protection scenarios before optimizing link weights to reduce the overall number of protected links under the constraint of keeping the congestion below the manageable threshold. Simulation results show that the LD model delivers a link weight set requiring few link protections to keep the manageable congestion under any single-link failure scenario, at the cost of a computation time on the order of L times that of PSO, where L is the number of links in the network. Since the LD model considers additional resources, a fair comparison with the PA model requires considering additional capacity in the PA model as well. In the third part of this thesis we incorporate additional capacity into the PA model: we introduce a mathematical formulation that determines the minimal additional capacity needed to maintain the manageable congestion under any single-link failure scenario. We then compare the LD model with the PA model extended with these additional-capacity features.
Evaluation results show that the difference between the LD model and the PA model in terms of required additional capacity depends on the network characteristics. Latency and continuity requirements of the traffic, as well as geographical restrictions on services, should be taken into consideration when deciding which model to use.
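
All of the schemes above revolve around the same evaluation primitive: route the demands for a candidate link weight set and measure congestion with and without failures. The sketch below is our own, not the thesis implementation; it assumes a networkx-style graph, single-shortest-path routing without ECMP splitting, and illustrative demands, capacities, and threshold, and it adds a greedy protection loop in the spirit of the LD model for a fixed weight set.

```python
# Minimal sketch: score a link weight set by its no-failure congestion and its
# worst congestion over single-link failures, then greedily protect the
# worst-case failing link until congestion stays within a manageable threshold.
import networkx as nx

def max_utilization(graph, demands):
    """Highest link utilization when each demand follows one shortest (weight) path."""
    load = {e: 0.0 for e in graph.edges}
    for (src, dst), volume in demands.items():
        path = nx.shortest_path(graph, src, dst, weight="weight")
        for u, v in zip(path, path[1:]):
            edge = (u, v) if (u, v) in load else (v, u)
            load[edge] += volume
    return max(load[e] / graph.edges[e]["capacity"] for e in load)

def worst_case_congestion(graph, demands, protected=frozenset()):
    """Worst congestion over single-link failures, skipping protected links."""
    worst, worst_edge = max_utilization(graph, demands), None
    for edge in list(graph.edges):
        if edge in protected:
            continue
        data = graph.edges[edge]
        graph.remove_edge(*edge)
        if nx.is_connected(graph):              # ignore failures that split the network
            util = max_utilization(graph, demands)
            if util > worst:
                worst, worst_edge = util, edge
        graph.add_edge(*edge, **data)
    return worst, worst_edge

def greedy_protection(graph, demands, threshold):
    """LD-style idea for a fixed weight set: protect (treat as unfailable) the
    worst-case failing link until the remaining worst congestion is manageable."""
    protected = set()
    while True:
        worst, worst_edge = worst_case_congestion(graph, demands, protected)
        if worst <= threshold or worst_edge is None:
            return protected, worst
        protected.add(worst_edge)

# Illustrative 4-node network: unit weights, 10-unit capacities, two demands.
G = nx.Graph()
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    G.add_edge(u, v, weight=1, capacity=10.0)
demands = {(0, 2): 4.0, (1, 3): 3.0}
print("no-failure congestion:", max_utilization(G, demands))
print("worst single-link failure:", worst_case_congestion(G, demands))
print("links to protect:", greedy_protection(G, demands, threshold=0.6))
```

A weight optimization such as PSO or GPSO would wrap a search over the link weights (e.g., a local search) around these evaluations, while the LD model additionally optimizes the weights to shrink the set of links that must be protected.
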

    QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

    Full text link
    The paper considers a class of multi-agent Markov decision processes (MDPs), in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. The paper investigates a distributed reinforcement learning setup with no prior information on the global state transition and local agent cost statistics. Specifically, with the agents' objective consisting of minimizing a network-averaged infinite-horizon discounted cost, the paper proposes a distributed version of Q-learning, QD-learning, in which the network agents collaborate by means of local processing and mutual information exchange over a sparse (possibly stochastic) communication network to achieve the network goal. Under the assumption that each agent is only aware of its local online cost data and the inter-agent communication network is weakly connected, the proposed distributed scheme is shown to yield, almost surely (a.s.), the desired value function and the optimal stationary control policy at each network agent asymptotically. The analytical techniques developed in the paper to address the mixed time-scale stochastic dynamics of the consensus + innovations form, which arise as a result of the proposed interactive distributed scheme, are of independent interest. (Comment: Submitted to the IEEE Transactions on Signal Processing, 33 pages.)
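
As a rough illustration, a consensus + innovations update of the kind analyzed in the paper can be written as below, where $\mathcal{N}_i$ is agent $i$'s neighborhood, $c_i$ its local one-stage cost, $\gamma$ the discount factor, and $\alpha_t$, $\beta_t$ step sizes decaying at different rates; the exact form, including the restriction of the innovation term to the currently visited state-action pair and the step-size conditions, is given in the paper.

```latex
% Generic consensus + innovations shorthand, not copied verbatim from the paper.
Q^{i}_{t+1}(x,u) \;=\; Q^{i}_{t}(x,u)
  \;-\; \beta_{t} \sum_{j \in \mathcal{N}_i} \bigl( Q^{i}_{t}(x,u) - Q^{j}_{t}(x,u) \bigr)
  \;+\; \alpha_{t} \Bigl( c_{i}(x_t,u_t) + \gamma \min_{v} Q^{i}_{t}(x_{t+1},v) - Q^{i}_{t}(x,u) \Bigr)
```

The consensus term pulls each agent's estimate toward those of its neighbors, while the innovation term is the usual temporal-difference correction computed from the agent's own observed cost; running them on different time scales produces the mixed time-scale dynamics mentioned above.
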

    Deep Reinforcement Learning for Real-Time Optimization in NB-IoT Networks

    Get PDF
    NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based technology that offers a range of flexible configurations for massive IoT radio access from groups of devices with heterogeneous requirements. A configuration specifies the amount of radio resource allocated to each group of devices for random access and for data transmission. Assuming no knowledge of the traffic statistics, there exists an important challenge in "how to determine the configuration that maximizes the long-term average number of served IoT devices at each Transmission Time Interval (TTI) in an online fashion". Given the complexity of searching for the optimal configuration, we first develop real-time configuration selection based on tabular Q-learning (tabular-Q), Linear Approximation based Q-learning (LA-Q), and Deep Neural Network based Q-learning (DQN) in the single-parameter single-group scenario. Our results show that the proposed reinforcement learning based approaches considerably outperform the conventional heuristic approaches based on load estimation (LE-URC) in terms of the number of served IoT devices. This result also indicates that LA-Q and DQN can be good alternatives to tabular-Q, achieving almost the same performance with much less training time. We further advance LA-Q and DQN via Actions Aggregation (AA-LA-Q and AA-DQN) and via Cooperative Multi-Agent learning (CMA-DQN) for the multi-parameter multi-group scenario, thereby solving the problem that Q-learning agents do not converge in high-dimensional configurations. In this scenario, the superiority of the proposed Q-learning approaches over the conventional LE-URC approach improves significantly as the configuration dimensions increase, and the CMA-DQN approach outperforms the other approaches in both throughput and training efficiency.
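
A hedged, self-contained sketch of the tabular-Q baseline described above is given below; the configuration space, state, reward, and environment step are hypothetical stand-ins, not the paper's NB-IoT simulator.

```python
# Tabular Q-learning sketch: pick a resource configuration each TTI, observe
# how many devices were served (the reward), and update the Q-table.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
CONFIGS = range(8)                    # candidate configurations (assumption)
Q = defaultdict(lambda: [0.0] * len(CONFIGS))

def simulate_tti(state, config):
    """Hypothetical environment step: returns (served_devices, next_state)."""
    served = random.randint(0, 10)    # placeholder for an NB-IoT traffic model
    return served, (state + config) % 5

state = 0
for tti in range(10_000):
    # epsilon-greedy configuration selection
    if random.random() < EPSILON:
        config = random.choice(list(CONFIGS))
    else:
        config = max(CONFIGS, key=lambda a: Q[state][a])
    reward, next_state = simulate_tti(state, config)
    # one-step Q-learning update toward reward + discounted best next value
    best_next = max(Q[next_state])
    Q[state][config] += ALPHA * (reward + GAMMA * best_next - Q[state][config])
    state = next_state
```

LA-Q and DQN replace this table with linear and neural-network function approximators, and the AA and CMA variants are what make the much larger multi-parameter, multi-group action space tractable.
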

    A survey of machine learning techniques applied to self organizing cellular networks

    Get PDF
    In this paper, a survey of the literature of the past fifteen years involving Machine Learning (ML) algorithms applied to self-organizing cellular networks is performed. In order for future networks to overcome current limitations and address the issues of current cellular systems, it is clear that more intelligence needs to be deployed so that a fully autonomous and flexible network can be enabled. This paper focuses on the learning perspective of Self Organizing Networks (SON) solutions and provides not only an overview of the most common ML techniques encountered in cellular networks but also a classification of each paper in terms of its learning solution, along with some examples. The authors also classify each paper in terms of its self-organizing use case and discuss how each proposed solution performed. In addition, a comparison between the most commonly found ML algorithms in terms of certain SON metrics is performed, and general guidelines on when to choose each ML algorithm for each SON function are proposed. Lastly, this work also provides future research directions and new paradigms that the use of more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain and thus fully enable the concept of SON in the near future.

    Enabling knowledge-defined networks: deep reinforcement learning, graph neural networks and network analytics

    Get PDF
    Significant breakthroughs in the last decade in the Machine Learning (ML) field have ushered in a new era of Artificial Intelligence (AI). Particularly, recent advances in Deep Learning (DL) have enabled the development of a new breed of modeling and optimization tools with a plethora of applications in different fields such as natural language processing or computer vision. In this context, the Knowledge-Defined Networking (KDN) paradigm highlights the lack of adoption of AI techniques in computer networks and – as a result – proposes a novel architecture that relies on Software-Defined Networking (SDN) and modern network analytics techniques to facilitate the deployment of ML-based solutions for efficient network operation. This dissertation aims to be a step forward in the realization of Knowledge-Defined Networks. In particular, we focus on the application of AI techniques to control and optimize networks more efficiently and automatically. To this end, we identify two components within the KDN context whose development may be crucial to achieving self-operating networks in the future: (i) the automatic control module, and (ii) the network analytics platform. The first part of this thesis is devoted to the construction of efficient automatic control modules. First, we explore the application of Deep Reinforcement Learning (DRL) algorithms to optimize the routing configuration in networks. DRL has recently demonstrated an outstanding capability to solve decision-making problems efficiently in other fields. However, early DRL-based attempts to optimize routing in networks have failed to achieve good results, often under-performing traditional heuristics. In contrast to previous DRL-based solutions, we propose a more elaborate network representation that makes it easier for DRL agents to learn efficient routing strategies. Our evaluation results show that DRL agents using the proposed representation achieve better performance and learn more quickly how to route traffic in an Optical Transport Network (OTN) use case. Second, we lay the foundations for the use of Graph Neural Networks (GNN) to build ML-based network optimization tools. GNNs are a newly proposed family of DL models specifically tailored to operate and generalize over graphs of variable size and structure. In this thesis, we posit that GNNs are well suited to model the relationships between different network elements inherently represented as graphs (e.g., topology, routing). Particularly, we use a custom GNN architecture to build a routing optimization solution that – unlike previous ML-based proposals – is able to generalize well to topologies, routing configurations, and traffic never seen during the training phase. The second part of this thesis investigates the design of practical and efficient network analytics solutions in the KDN context. Network analytics tools are crucial to provide the control plane with a rich and timely view of the network state. However, this is not a trivial task, considering that all this information typically turns into big data in real-world networks. In this context, we analyze the main aspects that should be considered when measuring and classifying traffic in SDN (e.g., scalability, accuracy, cost). As a result, we propose a practical solution that produces flow-level measurement reports similar to those of NetFlow/IPFIX in traditional networks.
The proposed system relies only on native features of OpenFlow – currently among the most established standards in SDN – and incorporates mechanisms to efficiently maintain flow-level statistics in commodity switches and report them asynchronously to the control plane. Additionally, a system that combines ML and Deep Packet Inspection (DPI) identifies the applications that generate each traffic flow.
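
As a rough, self-contained illustration of the message-passing idea behind the GNN-based routing optimization described earlier in this abstract (this is not the dissertation's architecture; the topology is a toy example and the weights are random placeholders rather than learned parameters):

```python
# Minimal message passing over links: each link keeps a hidden state, exchanges
# messages with links sharing an endpoint, and updates its state for T rounds.
# A trained model would learn W_msg/W_upd and add a readout (e.g., a per-path
# delay estimate or a routing score).
import numpy as np

rng = np.random.default_rng(0)
H = 8                                              # hidden state size (assumption)
links = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # toy topology
state = {l: rng.normal(size=H) for l in links}
W_msg = rng.normal(size=(H, H))
W_upd = rng.normal(size=(H, 2 * H))

def neighbors(link):
    """Links sharing an endpoint with `link`."""
    return [m for m in links if m != link and set(m) & set(link)]

for _ in range(4):                                 # T message-passing iterations
    new_state = {}
    for l in links:
        # aggregate messages from adjacent links
        agg = sum((np.tanh(W_msg @ state[m]) for m in neighbors(l)), np.zeros(H))
        # update combines the link's own state with the aggregated message
        new_state[l] = np.tanh(W_upd @ np.concatenate([state[l], agg]))
    state = new_state

print({l: np.round(s[:3], 2) for l, s in state.items()})
```

Because the update rules are shared across links and iterations, the same trained model can be applied to graphs of different size and structure, which is the property the dissertation exploits to generalize to unseen topologies and routing configurations.
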

    Feature engineering for deep reinforcement learning based routing

    Get PDF
    Recent advances in Deep Reinforcement Learning (DRL) techniques are providing a dramatic improvement in decision-making and automated control problems. As a result, we are witnessing a growing number of research works proposing ways of applying DRL techniques to network-related problems such as routing. However, such proposals have failed to achieve good results, often under-performing traditional routing techniques. We argue that successfully applying DRL-based techniques to networking requires finding good representations of the network parameters: feature engineering. DRL agents need to represent both the state (e.g., link utilization) and the action space (e.g., changes to the routing policy). In this paper, we show that existing approaches use straightforward representations that lead to poor performance. We propose a novel representation of the state and action that outperforms existing ones and that is flexible enough to be applied to many networking use cases. We test our representation in two different scenarios: (i) routing in optical transport networks and (ii) QoS-aware routing in IP networks. Our results show that the DRL agent achieves significantly better performance compared to existing state/action representations. This work has been supported by the Spanish MINECO under contract TEC2017-90034-C2-1-R (ALLIANCE) and the Catalan Institution for Research and Advanced Studies (ICREA).
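
A hedged sketch of what such a representation could look like (illustrative only; the paper's exact state/action encoding may differ): the state concatenates per-link utilization with the traffic demand's endpoints and size, and the action selects one of K precomputed candidate paths.

```python
# Illustrative state/action encoding for DRL-based routing.
import numpy as np

NUM_NODES = 6
LINKS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]  # toy topology

def encode_state(link_util, src, dst, demand):
    """link_util: utilization in [0, 1] per link, ordered as in LINKS."""
    src_onehot = np.eye(NUM_NODES)[src]
    dst_onehot = np.eye(NUM_NODES)[dst]
    return np.concatenate([link_util, src_onehot, dst_onehot, [demand]])

# Action space: index into K precomputed candidate paths for the (src, dst) pair.
candidate_paths = {(0, 3): [[0, 1, 2, 3], [0, 5, 4, 3], [0, 1, 4, 3]]}

state = encode_state(np.array([0.2, 0.7, 0.1, 0.4, 0.3, 0.0, 0.5]), 0, 3, 0.8)
action = 1                              # a DRL agent would output this index
print(state.shape, candidate_paths[(0, 3)][action])
```

The design point the paper argues for is that the encoding should expose enough of the network state and keep the action space small and structured enough that the agent can actually learn and generalize.
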

    AdapINT: A Flexible and Adaptive In-Band Network Telemetry System Based on Deep Reinforcement Learning

    Full text link
    In-band Network Telemetry (INT) has emerged as a promising network measurement technology. However, existing network telemetry systems lack the flexibility to meet diverse telemetry requirements and are also difficult to adapt to dynamic network environments. In this paper, we propose AdapINT, a versatile and adaptive in-band network telemetry framework assisted by dual-timescale probes, including long-period auxiliary probes (APs) and short-period dynamic probes (DPs). Technically, the APs collect basic network status information, which is used for the path planning of the DPs. To achieve full network coverage, we propose an auxiliary probes path deployment (APPD) algorithm based on Depth-First Search (DFS). The DPs collect specific network information for telemetry tasks. To ensure that the DPs can meet diverse telemetry requirements and adapt to dynamic network environments, we apply deep reinforcement learning (DRL) and transfer learning to design the dynamic probes path deployment (DPPD) algorithm. The evaluation results show that AdapINT can redesign the telemetry system according to telemetry requirements and network environments. AdapINT can reduce telemetry latency by 75% in online games and video conferencing scenarios. For overhead-aware networks, AdapINT can reduce control overheads by 34% in cloud computing services. (Comment: 14 pages, 19 figures.)
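
As a hedged illustration of the DFS-based coverage idea behind the auxiliary probes (the paper's APPD algorithm is more elaborate and may, for example, split coverage across several probes and bound probe length), a single DFS walk that traverses every link of a connected topology at least once can be sketched as:

```python
# DFS-based coverage walk: cross each unused edge, explore from the far end,
# then come back, so the recorded node sequence covers every edge at least once.
def dfs_cover_walk(adj, start):
    """Return a node sequence traversing each edge of `start`'s component."""
    used = set()                      # undirected edges already traversed
    walk = [start]

    def visit(u):
        for v in adj[u]:
            e = frozenset((u, v))
            if e in used:
                continue
            used.add(e)
            walk.append(v)            # cross the edge ...
            visit(v)                  # ... explore from the far end ...
            walk.append(u)            # ... and come back
    visit(start)
    return walk

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}   # toy topology (assumption)
print(dfs_cover_walk(adj, 0))
```

The dynamic probes are then planned on top of the information gathered by such auxiliary probes, which is where the DRL-based DPPD algorithm comes in.
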