9,680 research outputs found

    Data analytics 2016: proceedings of the fifth international conference on data analytics


    Performance Analytics of Cloud Networks

    As the world becomes more interconnected and dependent on the Internet, networks become ever more pervasive and the stresses placed upon them more demanding. Expectations that networks maintain a high level of performance have increased as well. Network performance matters to any business that operates online, depends on web traffic, runs any part of its infrastructure in a cloud environment, or hosts its own network infrastructure. Depending on its exact nature, whether local or wide-area, 10 or 100 Gigabit, a network will have distinct performance characteristics, and it is important for a business or individual operating on the network to understand those characteristics and how they affect operations. To better understand our networks, we need to test them to measure their performance capabilities and track these metrics over time. In our work, we provide an in-depth analysis of how best to run cloud benchmarks to increase our network intelligence, and of how the results of those benchmarks can be used to predict future performance and identify performance anomalies. To achieve this, we explain how to run cloud benchmarks effectively and propose a scheduling algorithm for running large numbers of cloud benchmarks daily. We then use the performance data gathered with this method to conduct a thorough analysis of the performance characteristics of a cloud network, train neural networks to forecast future throughput from historical results, and detect performance anomalies as they occur.
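The forecast-then-compare idea in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' method: it swaps the neural-network forecaster for a moving-average forecast and flags a measurement when it deviates from that forecast by more than a multiple of the recent standard deviation. The function name, `window`, and `threshold` are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(throughput, window=5, threshold=3.0):
    """Return indices whose value deviates strongly from a moving-average forecast.

    Assumption: a moving average stands in for the trained forecaster.
    """
    anomalies = []
    for i in range(window, len(throughput)):
        history = throughput[i - window:i]
        forecast = mean(history)   # naive forecast of the next measurement
        spread = stdev(history)    # recent variability of the benchmark results
        # Skip perfectly flat histories, where any deviation would look infinite.
        if spread > 0 and abs(throughput[i] - forecast) > threshold * spread:
            anomalies.append(i)
    return anomalies
```

A trained forecaster would replace only the `forecast` line; the comparison against recent variability stays the same.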

    Load shedding in network monitoring applications

    Monitoring and mining real-time network data streams are crucial operations for managing and operating data networks. The information that network operators want to extract from network traffic differs in size, granularity and accuracy depending on the measurement task (e.g., the data relevant to capacity planning and to intrusion detection are very different). To satisfy these different demands, a new class of monitoring systems is emerging to handle multiple and arbitrary monitoring applications. Such systems must inevitably cope with continuous overload situations caused by the large volumes, high data rates and bursty nature of network traffic. These overload situations can severely compromise the accuracy and effectiveness of monitoring systems precisely when their results are most valuable to network operators. In this thesis, we propose a technique called load shedding as an effective and low-cost alternative to over-provisioning in network monitoring systems. It allows these systems to handle overload situations efficiently in the presence of multiple, arbitrary and competing monitoring applications. We present the design and evaluation of a predictive load shedding scheme that can shed excess load in the face of extreme traffic conditions and maintain the accuracy of the monitoring applications within bounds defined by end users, while ensuring a fair allocation of computing resources to non-cooperative applications. The main novelty of our scheme is that it treats monitoring applications as black boxes, with arbitrary (and highly variable) input traffic and processing cost. Without any explicit knowledge of the application internals, the proposed scheme extracts a set of features from the traffic streams to build an on-line prediction model of the resource requirements of each monitoring application, which is used to anticipate overload situations and control the overall resource usage by sampling the input packet streams. 
This way, the monitoring system preserves a high degree of flexibility, increasing the range of applications and network scenarios where it can be used. Since not all monitoring applications are robust against sampling, we then extend our load shedding scheme to support custom load shedding methods defined by end users, in order to provide a generic solution for arbitrary monitoring applications. Our scheme allows the monitoring system to safely delegate the task of shedding excess load to the applications and still guarantee fairness of service with non-cooperative users. We implemented our load shedding scheme in an existing network monitoring system and deployed it in a research ISP network. We present experimental evidence of the performance and robustness of our system with several concurrent monitoring applications during long-lived executions and using real-world traffic traces. Postprint (published version)
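The predict-then-sample loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the only traffic feature is the packet count of a batch, the prediction model is an exponentially weighted moving average of the observed cost per packet, and `LoadShedder`, `cycle_budget`, `alpha` and `shed` are hypothetical names.

```python
import random


class LoadShedder:
    """Toy predictive load shedder for one black-box monitoring application."""

    def __init__(self, cycle_budget, alpha=0.2):
        self.cycle_budget = cycle_budget  # cycles available per batch (assumed unit)
        self.alpha = alpha                # EWMA smoothing factor (assumed value)
        self.cost_per_packet = None       # learned online, no knowledge of internals

    def sampling_rate(self, n_packets):
        """Predict the cost of the next batch and return the fraction to keep."""
        if self.cost_per_packet is None or n_packets == 0:
            return 1.0                    # nothing learned yet: keep everything
        predicted = self.cost_per_packet * n_packets
        if predicted <= self.cycle_budget:
            return 1.0                    # no overload anticipated
        return self.cycle_budget / predicted

    def observe(self, packets_processed, cycles_used):
        """Update the cost model with the measured cost of the last batch."""
        if packets_processed == 0:
            return
        sample = cycles_used / packets_processed
        if self.cost_per_packet is None:
            self.cost_per_packet = sample
        else:
            self.cost_per_packet = (
                (1 - self.alpha) * self.cost_per_packet + self.alpha * sample
            )


def shed(packets, rate, rng=random):
    """Keep each packet independently with probability `rate` (uniform sampling)."""
    return [p for p in packets if rng.random() < rate]
```

Per batch, a caller would ask for `sampling_rate(len(batch))`, pass the result to `shed`, run the application on the surviving packets, and report the measured cost through `observe`. Uniform packet sampling is only one shedding method; as the abstract notes, some applications are not robust against it.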

    Role based behavior analysis

    Master's thesis, Information Security, Universidade de Lisboa, Faculdade de Ciências, 2009. Today, the success of a corporation hinges on its agility and its ability to adapt to fast-changing conditions. Proactive workers and an agile IT/IS infrastructure that can support them are requirements for this success. Unfortunately, these requirements are not always met. Users' network requirements may not be fully understood, which slows down relocation and reorganization. Also, without a grasp of the real requirements, the IT/IS infrastructure may be used inefficiently, with waste in some areas and deficiencies in others. Finally, enabling proactivity does not mean full unrestricted access, since this may leave the systems vulnerable to outsider and insider threats. The purpose of the work described in this thesis is to develop a system that can characterize user network behavior. We propose a modular system architecture to extract information from tagged network flows. The process begins by creating user profiles from their network flow information. Then, similar profiles are automatically grouped into clusters, creating role profiles. Finally, the individual profiles are compared against the roles, and those that differ significantly are flagged as anomalies for further inspection. Considering this architecture, we propose a model to describe user and role network behavior. We also propose visualization methods to quickly inspect all the information contained in the model. The system and model were evaluated using a real dataset from a large telecommunications operator. 
The results confirm that the roles accurately map similar behavior. The anomaly results were also as expected, considering the underlying population. With the knowledge that the system can extract from the raw data, users' network needs can be better fulfilled and anomalous users flagged for inspection, giving an edge in agility to any company that uses it.
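The three steps in this abstract (build user profiles, cluster them into roles, flag profiles that differ from their role) can be sketched as below. The abstract does not name the clustering algorithm or the anomaly criterion, so this sketch assumes plain k-means with a deterministic start and flags a user whose distance to the role centroid exceeds `factor` times the median distance within that role. The profile features and all function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two profile vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(profiles, k, iterations=20):
    """Plain k-means; centroids start at the first k profiles (assumed init)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign every profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in profiles
        ]
        # Move each centroid to the mean of its members (the role profile).
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = centroid(members)
    return centroids, labels


def flag_anomalies(user_profiles, k, factor=3.0):
    """Return users whose profile is unusually far from its role profile."""
    users = list(user_profiles)
    profiles = [user_profiles[u] for u in users]
    centroids, labels = kmeans(profiles, k)
    dists = [distance(p, centroids[label]) for p, label in zip(profiles, labels)]
    thresholds = {}
    for c in range(k):
        role = sorted(d for d, label in zip(dists, labels) if label == c)
        if role:
            thresholds[c] = factor * role[len(role) // 2]  # median-based cutoff
    return [u for u, d, label in zip(users, dists, labels) if d > thresholds[label]]
```

Here `user_profiles` maps a user identifier to a numeric vector, for example flow counts per traffic class. A median-based cutoff is used because a single extreme user would otherwise inflate a mean-based one.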

    Distributed collaborative knowledge management for optical network

    Network automation has long been envisioned. In fact, the Telecommunications Management Network (TMN), defined by the International Telecommunication Union (ITU), is a hierarchy of management layers (network element, network, service, and business management) in which high-level operational goals propagate from upper to lower layers. The network management architecture has evolved with the development of the Software Defined Networking (SDN) concept, which brings programmability to simplify configuration (it breaks down high-level service abstractions into lower-level device abstractions), orchestrates operation, and automatically reacts to changes or events. In addition, the development and deployment of solutions based on Artificial Intelligence (AI) and Machine Learning (ML) that make decisions (control loop) from the collected monitoring data enables network automation, which aims to reduce operational costs. AI/ML approaches usually require large datasets for training, which are difficult to obtain. The lack of data can be compensated for with a collective self-learning approach. In this PhD thesis, we go beyond the aforementioned traditional control loop to achieve an efficient knowledge management (KM) process that enhances network intelligence while reducing complexity. We propose a general architecture to support the KM process based on four main pillars, which enable creating, sharing, assimilating and using knowledge. Next, two alternative strategies are proposed, one based on model inaccuracies and one on combining models. To highlight the capacity of KM to adapt to different applications, two use cases are considered, implementing KM in a purely centralized and in a distributed optical network architecture. Along with them, various policies are considered for evaluating KM in data-based and model-based strategies. The results aim to minimize the amount of data that needs to be shared and to reduce the convergence error. 
We apply KM to multilayer networks and propose the PILOT methodology for modeling connectivity services in a sandbox domain. PILOT uses active probes deployed in Central Offices (COs) to obtain real measurements that are used to tune a simulation scenario reproducing the real deployment with high accuracy. A simulator is then used to generate large amounts of realistic synthetic data for ML training and validation. We also apply the KM process to a more complex network system consisting of several domains, where intra-domain controllers assist a broker plane in accurately estimating inter-domain delay. In addition, the broker identifies and corrects intra-domain model inaccuracies and computes an accurate compound model. Such models can be used for quality of service (QoS) and accurate end-to-end delay estimation. Finally, we investigate the application of KM in the context of Intent-based Networking (IBN). Knowledge in terms of traffic models and/or traffic perturbations is transferred among agents in a hierarchical architecture. This architecture can support autonomous network operation, such as capacity management. Postprint (published version)

    An Overview on Application of Machine Learning Techniques in Optical Networks

    Today's telecommunication networks have become sources of enormous amounts of widely heterogeneous data. This information can be retrieved from network traffic traces, network alarms, signal quality indicators, users' behavioral data, etc. Advanced mathematical tools are required to extract meaningful information from these network-generated data and to make decisions pertaining to the proper functioning of the networks. Among these mathematical tools, Machine Learning (ML) is regarded as one of the most promising methodological approaches to perform network-data analysis and enable automated network self-configuration and fault management. The adoption of ML techniques in the field of optical communication networks is motivated by the unprecedented growth of network complexity faced by optical networks in the last few years. This increase in complexity is due to the introduction of a huge number of adjustable and interdependent system parameters (e.g., routing configurations, modulation format, symbol rate, coding schemes, etc.) that are enabled by the use of coherent transmission/reception technologies, advanced digital signal processing and compensation of nonlinear effects in optical fiber propagation. In this paper we provide an overview of the application of ML to optical communications and networking. We classify and survey relevant literature dealing with the topic, and we also provide an introductory tutorial on ML for researchers and practitioners interested in this field. Although a good number of research papers have recently appeared, the application of ML to optical networks is still in its infancy: to stimulate further work in this area, we conclude the paper by proposing possible new research directions.