152 research outputs found

    Fatias de rede fim-a-fim : da extração de perfis de funções de rede a SLAs granulares

    Get PDF
    Orientador: Christian Rodolfo Esteve RothenbergTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: Nos últimos dez anos, processos de softwarização de redes vêm sendo continuamente diversi- ficados e gradativamente incorporados em produção, principalmente através dos paradigmas de Redes Definidas por Software (ex.: regras de fluxos de rede programáveis) e Virtualização de Funções de Rede (ex.: orquestração de funções virtualizadas de rede). Embasado neste processo o conceito de network slice surge como forma de definição de caminhos de rede fim- a-fim programáveis, possivelmente sobre infrastruturas compartilhadas, contendo requisitos estritos de desempenho e dedicado a um modelo particular de negócios. Esta tese investiga a hipótese de que a desagregação de métricas de desempenho de funções virtualizadas de rede impactam e compõe critérios de alocação de network slices (i.e., diversas opções de utiliza- ção de recursos), os quais quando realizados devem ter seu gerenciamento de ciclo de vida implementado de forma transparente em correspondência ao seu caso de negócios de comu- nicação fim-a-fim. A verificação de tal assertiva se dá em três aspectos: entender os graus de liberdade nos quais métricas de desempenho de funções virtualizadas de rede podem ser expressas; métodos de racionalização da alocação de recursos por network slices e seus re- spectivos critérios; e formas transparentes de rastrear e gerenciar recursos de rede fim-a-fim entre múltiplos domínios administrativos. Para atingir estes objetivos, diversas contribuições são realizadas por esta tese, dentre elas: a construção de uma plataforma para automatização de metodologias de testes de desempenho de funções virtualizadas de redes; a elaboração de uma metodologia para análises de alocações de recursos de network slices baseada em um algoritmo classificador de aprendizado de máquinas e outro algoritmo de análise multi- critério; e a construção de um protótipo utilizando blockchain para a realização de contratos inteligentes envolvendo acordos de serviços entre domínios administrativos de rede. Por meio de experimentos e análises sugerimos que: métricas de desempenho de funções virtualizadas de rede dependem da alocação de recursos, configurações internas e estímulo de tráfego de testes; network slices podem ter suas alocações de recursos coerentemente classificadas por diferentes critérios; e acordos entre domínios administrativos podem ser realizados de forma transparente e em variadas formas de granularidade por meio de contratos inteligentes uti- lizando blockchain. Ao final deste trabalho, com base em uma ampla discussão as perguntas de pesquisa associadas à hipótese são respondidas, de forma que a avaliação da hipótese proposta seja realizada perante uma ampla visão das contribuições e trabalhos futuros desta teseAbstract: In the last ten years, network softwarisation processes have been continuously diversified and gradually incorporated into production, mainly through the paradigms of Software Defined Networks (e.g., programmable network flow rules) and Network Functions Virtualization (e.g., orchestration of virtualized network functions). Based on this process, the concept of network slice emerges as a way of defining end-to-end network programmable paths, possibly over shared network infrastructures, requiring strict performance metrics associated to a par- ticular business case. This thesis investigate the hypothesis that the disaggregation of network function performance metrics impacts and composes a network slice footprint incurring in di- verse slicing feature options, which when realized should have their Service Level Agreement (SLA) life cycle management transparently implemented in correspondence to their fulfilling end-to-end communication business case. The validation of such assertive takes place in three aspects: the degrees of freedom by which performance of virtualized network functions can be expressed; the methods of rationalizing the footprint of network slices; and transparent ways to track and manage network assets among multiple administrative domains. In order to achieve such goals, a series of contributions were achieved by this thesis, among them: the construction of a platform for automating methodologies for performance testing of virtual- ized network functions; an elaboration of a methodology for the analysis of footprint features of network slices based on a machine learning classifier algorithm and a multi-criteria analysis algorithm; and the construction of a prototype using blockchain to carry out smart contracts involving service level agreements between administrative systems. Through experiments and analysis we suggest that: performance metrics of virtualized network functions depend on the allocation of resources, internal configurations and test traffic stimulus; network slices can have their resource allocations consistently analyzed/classified by different criteria; and agree- ments between administrative domains can be performed transparently and in various forms of granularity through blockchain smart contracts. At the end of his thesis, through a wide discussion we answer all the research questions associated to the investigated hypothesis in such way its evaluation is performed in face of wide view of the contributions and future work of this thesisDoutoradoEngenharia de ComputaçãoDoutor em Engenharia ElétricaFUNCAM

    Dynamic service chain composition in virtualised environment

    Get PDF
    Network Function Virtualisation (NFV) has contributed to improving the flexibility of network service provisioning and reducing the time to market of new services. NFV leverages the virtualisation technology to decouple the software implementation of network appliances from the physical devices on which they run. However, with the emergence of this paradigm, providing data centre applications with an adequate network performance becomes challenging. For instance, virtualised environments cause network congestion, decrease the throughput and hurt the end user experience. Moreover, applications usually communicate through multiple sequences of virtual network functions (VNFs), aka service chains, for policy enforcement and performance and security enhancement, which increases the management complexity at to the network level. To address this problematic situation, existing studies have proposed high-level approaches of VNFs chaining and placement that improve service chain performance. They consider the VNFs as homogenous entities regardless of their specific characteristics. They have overlooked their distinct behaviour toward the traffic load and how their underpinning implementation can intervene in defining resource usage. Our research aims at filling this gap by finding out particular patterns on production and widely used VNFs. And proposing a categorisation that helps in reducing network latency at the chains. Based on experimental evaluation, we have classified firewalls, NAT, IDS/IPS, Flow monitors into I/O- and CPU-bound functions. The former category is mainly sensitive to the throughput, in packets per second, while the performance of the latter is primarily affected by the network bandwidth, in bits per second. By doing so, we correlate the VNF category with the traversing traffic characteristics and this will dictate how the service chains would be composed. We propose a heuristic called Natif, for a VNF-Aware VNF insTantIation and traFfic distribution scheme, to reconcile the discrepancy in VNF requirements based on the category they belong to and to eventually reduce network latency. We have deployed Natif in an OpenStack-based environment and have compared it to a network-aware VNF composition approach. Our results show a decrease in latency by around 188% on average without sacrificing the throughput

    Management And Security Of Multi-Cloud Applications

    Get PDF
    Single cloud management platform technology has reached maturity and is quite successful in information technology applications. Enterprises and application service providers are increasingly adopting a multi-cloud strategy to reduce the risk of cloud service provider lock-in and cloud blackouts and, at the same time, get the benefits like competitive pricing, the flexibility of resource provisioning and better points of presence. Another class of applications that are getting cloud service providers increasingly interested in is the carriers\u27 virtualized network services. However, virtualized carrier services require high levels of availability and performance and impose stringent requirements on cloud services. They necessitate the use of multi-cloud management and innovative techniques for placement and performance management. We consider two classes of distributed applications – the virtual network services and the next generation of healthcare – that would benefit immensely from deployment over multiple clouds. This thesis deals with the design and development of new processes and algorithms to enable these classes of applications. We have evolved a method for optimization of multi-cloud platforms that will pave the way for obtaining optimized placement for both classes of services. The approach that we have followed for placement itself is predictive cost optimized latency controlled virtual resource placement for both types of applications. To improve the availability of virtual network services, we have made innovative use of the machine and deep learning for developing a framework for fault detection and localization. Finally, to secure patient data flowing through the wide expanse of sensors, cloud hierarchy, virtualized network, and visualization domain, we have evolved hierarchical autoencoder models for data in motion between the IoT domain and the multi-cloud domain and within the multi-cloud hierarchy

    Management and orchestration of virtual network functions via deep reinforcement learning

    Get PDF
    Management and orchestration (MANO) of re-sources by virtual network functions (VNFs) represents one of thekey challenges towards a fully virtualized network architectureas envisaged by 5G standards. Current threshold-based policiesinefficiently over-provision network resources and under-utilizeavailable hardware, incurring high cost for network operators,and consequently, the users. In this work, we present a MANOalgorithm for VNFs allowing a central unit (CU) to learnto autonomously re-configure resources (processing power andstorage), deploy new VNF instances, or offload them to the cloud,depending on the network conditions, available pool of resources,and the VNF requirements, with the goal of minimizing a costfunction that takes into account the economical cost as wellas latency and the quality-of-service (QoS) experienced by theusers. First, we formulate the stochastic resource optimizationproblem as a parameterized action Markov decision process(PAMDP). Then, we propose a solution based on deep reinforce-ment learning (DRL). More precisely, we present a novel RLapproach, called parameterized action twin (PAT) deterministicpolicy gradient, which leverages anactor-critic architecturetolearn to provision resources to the VNFs in an online manner.Finally, we present numerical performance results, and map themto 5G key performance indicators (KPIs). To the best of ourknowledge, this is the first work that considers DRL for MANOof VNFs’ physical resources

    Analysis of scaling policies for NFV providing 5G/6G reliability levels with fallible servers

    Get PDF
    The softwarization of mobile networks enables an efficient use of resources, by dynamically scaling and re-assigning them following variations in demand. Given that the activation of additional servers is not immediate, scaling up resources should anticipate traffic demands to prevent service disruption. At the same time, the activation of more servers than strictly necessary results in a waste of resources, and thus should be avoided. Given the stringent reliability requirements of 5G applications (up to 6 nines) and the fallible nature of servers, finding the right trade-off between efficiency and service disruption is particularly critical. In this paper, we analyze a generic auto-scaling mechanism for communication services, used to de(activate) servers in a cluster, based on occupation thresholds. We model the impact of the activation delay and the finite lifetime of the servers on performance, in terms of power consumption and failure probability. Based on this model, we derive an algorithm to optimally configure the thresholds. Simulation results confirm the accuracy of the model both under synthetic and realistic traffic patterns as well as the effectiveness of the configuration algorithm. We also provide some insights on the best strategy to support an energy-efficient highly-reliable service: deploying a few powerful and reliable machines versus deploying many machines, but less powerful and reliable.The work of Jorge Ortin was funded in part by the Spanish Ministry of Science under Grant RTI2018-099063-B-I00, in part by the Gobierno de Aragon through Research Group under Grant T31_20R, in part by the European Social Fund (ESF), and in part by Centro Universitario de la Defensa under Grant CUD-2021_11. The work of Pablo Serrano was partly funded by the European Commission (EC) through the H2020 project Hexa-X (Grant Agreement no. 101015956), and in part by Spanish State Research Agency (TRUE5G project, PID2019-108713RB-C52PID2019-108713RB-C52/AEI/ 10.13039/501100011033). The work of Jaime Garcia-Reinoso was partially supported by the EC in the framework of H2020-EU.2.1.1. 5G EVE project (Grant agreement no. 815074). The work of Albert Banchs was partially supported by the EC in the framework of H2020-EU.2.1.1. 5G-TOURS project (Grant agreement no. 856950) also partially supported by the Spanish State Research Agency (TRUE5G project, PID2019-108713RB-C52PID2019- 108713RB-C52/AEI/10.13039/501100011033)

    Definition and specification of connectivity and QoE/QoS management mechanisms – final report

    Get PDF
    This document summarizes the WP5 work throughout the project, describing its functional architecture and the solutions that implement the WP5 concepts on network control and orchestration. For this purpose, we defined 3 innovative controllers that embody the network slicing and multi tenancy: SDM-C, SDM-X and SDM-O. The functionalities of each block are detailed with the interfaces connecting them and validated through exemplary network processes, highlighting thus 5G NORMA innovations. All the proposed modules are designed to implement the functionality needed to provide the challenging KPIs required by future 5G networks while keeping the largest possible compatibility with the state of the art

    Data-Driven Methods for Data Center Operations Support

    Get PDF
    During the last decade, cloud technologies have been evolving at an impressive pace, such that we are now living in a cloud-native era where developers can leverage on an unprecedented landscape of (possibly managed) services for orchestration, compute, storage, load-balancing, monitoring, etc. The possibility to have on-demand access to a diverse set of configurable virtualized resources allows for building more elastic, flexible and highly-resilient distributed applications. Behind the scenes, cloud providers sustain the heavy burden of maintaining the underlying infrastructures, consisting in large-scale distributed systems, partitioned and replicated among many geographically dislocated data centers to guarantee scalability, robustness to failures, high availability and low latency. The larger the scale, the more cloud providers have to deal with complex interactions among the various components, such that monitoring, diagnosing and troubleshooting issues become incredibly daunting tasks. To keep up with these challenges, development and operations practices have undergone significant transformations, especially in terms of improving the automations that make releasing new software, and responding to unforeseen issues, faster and sustainable at scale. The resulting paradigm is nowadays referred to as DevOps. However, while such automations can be very sophisticated, traditional DevOps practices fundamentally rely on reactive mechanisms, that typically require careful manual tuning and supervision from human experts. To minimize the risk of outages—and the related costs—it is crucial to provide DevOps teams with suitable tools that can enable a proactive approach to data center operations. This work presents a comprehensive data-driven framework to address the most relevant problems that can be experienced in large-scale distributed cloud infrastructures. These environments are indeed characterized by a very large availability of diverse data, collected at each level of the stack, such as: time-series (e.g., physical host measurements, virtual machine or container metrics, networking components logs, application KPIs); graphs (e.g., network topologies, fault graphs reporting dependencies among hardware and software components, performance issues propagation networks); and text (e.g., source code, system logs, version control system history, code review feedbacks). Such data are also typically updated with relatively high frequency, and subject to distribution drifts caused by continuous configuration changes to the underlying infrastructure. In such a highly dynamic scenario, traditional model-driven approaches alone may be inadequate at capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from having robust data-driven methods to support their decisions based on historical information. For instance, effective anomaly detection capabilities may also help in conducting more precise and efficient root-cause analysis. Also, leveraging on accurate forecasting and intelligent control strategies would improve resource management. Given their ability to deal with high-dimensional, complex data, Deep Learning-based methods are the most straightforward option for the realization of the aforementioned support tools. On the other hand, because of their complexity, this kind of models often requires huge processing power, and suitable hardware, to be operated effectively at scale. These aspects must be carefully addressed when applying such methods in the context of data center operations. Automated operations approaches must be dependable and cost-efficient, not to degrade the services they are built to improve. i
    corecore