
    Fault Diagnosis in DSL Networks using Support Vector Machines

    The adequate operation of many service distribution networks relies on the effective maintenance and fault management of their underlay DSL infrastructure. New tools are therefore required to monitor and diagnose anomalies that other segments of the DSL network cannot identify because of hardware or software misconfigurations. In this work we present a fundamentally new approach for classifying known DSL-level anomalies by exploiting the properties of novelty detection through one-class Support Vector Machines (SVMs). Because the imbalance in the training samples leads to problematic prediction outcomes in two-class formulations, we adopt one-class classification and construct models that each independently identify and classify a single type of DSL-level anomaly. Most of the Digital Subscriber Line Access Multiplexers (DSLAMs) installed in the DSL network of a large European ISP were misconfigured and thus unable to flag anomalous events accurately. We therefore build one-class SVM models from the known labels flagged by the much smaller number of correctly configured DSLAMs in the same network, and use those models to classify the monitored unlabelled events. By reaching an average of over 95% on classification metrics such as precision, recall and F-score, we show that one-class SVM classifiers overcome the biased classification outcomes of traditional two-class formulations and may constitute viable and promising components in the design of future network fault management strategies. In addition, we demonstrate their superiority over commonly used two-class machine learning approaches such as Decision Trees and Bayesian Networks that have been used in the same context in past solutions. Keywords: Network management, Support Vector Machines, supervised learning, one-class classifiers, DSL anomalie
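
The idea of building one independent model per anomaly type, each trained only on samples of that type, can be sketched roughly as follows. This is a toy stand-in and not the method of the paper: a centroid-plus-radius rule replaces the one-class SVM, and the class name, the two numeric features and the anomaly labels are all hypothetical.

```python
import math


class CentroidOneClass:
    """Toy one-class model: accept events within a learned radius of the
    class centroid. A one-class SVM would learn a more flexible boundary."""

    def fit(self, samples):
        n, dim = len(samples), len(samples[0])
        self.centroid = [sum(s[i] for s in samples) / n for i in range(dim)]
        # Radius that encloses every training sample of this class.
        self.radius = max(math.dist(s, self.centroid) for s in samples)
        return self

    def accepts(self, event):
        return math.dist(event, self.centroid) <= self.radius


def train_per_type(labelled):
    """Fit one independent model per anomaly type, using only its own samples."""
    return {label: CentroidOneClass().fit(samples)
            for label, samples in labelled.items()}


def classify(models, event):
    """Return every anomaly type whose model accepts the event (empty if none)."""
    return sorted(label for label, model in models.items() if model.accepts(event))


# Hypothetical labelled events from correctly configured devices.
labelled = {
    "type_a": [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2)],
    "type_b": [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2)],
}
models = train_per_type(labelled)

# Unlabelled events, as they might arrive from misconfigured devices.
for event in [(1.05, 1.0), (5.0, 5.1), (3.0, 3.0)]:
    print(event, classify(models, event))
```

Because each model sees only its own class, a rare anomaly type is not outvoted by a dominant class during training, which is the property the abstract relies on.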

    Fault diagnosis for IP-based network with real-time conditions

    BACKGROUND: Fault diagnosis techniques have been based on many paradigms, which derive from diverse areas and have different purposes: obtaining a representation model of the network for fault localization, selecting optimal probe sets for monitoring network devices, reducing fault detection time, and detecting faulty components in the network. Although there are several solutions for diagnosing network faults, challenges remain: a fault diagnosis solution must always be available and able to process data in a timely manner, because stale results impair the quality and speed of informed decision-making. Also, there is no non-invasive technique to continuously diagnose the network symptoms without leaving the system vulnerable to failures, nor a technique that is resilient to the network's dynamic changes, which can cause new failures with different symptoms. AIMS: This thesis aims to propose a model for the continuous and timely diagnosis of IP-based network faults, independent of the network structure, and based on data analytics techniques. METHOD(S): This research's point of departure was the hypothesis of a fault propagation phenomenon that allows the observation of failure symptoms at a higher network level than the fault origin. Thus, for the model's construction, monitoring data was collected from an extensive campus network in which impact link failures were induced at different instants of time and with different durations. These data correspond to parameters widely used in the actual management of a network. The collected data allowed us to understand the faults' behavior and how they manifest at a peripheral level. Based on this understanding and a data analytics process, the first three modules of our model, named PALADIN, were proposed (Identify, Collection and Structuring), which define the peripheral data collection and the data pre-processing necessary to obtain a description of the network's state at a given moment. 
These modules give the model the ability to structure the data considering the delays of the multiple responses that the network delivers to a single monitoring probe and the multiple network interfaces that a peripheral device may have. Thus, a structured data stream is obtained that is ready to be analyzed. For this analysis, it was necessary to implement an incremental learning framework that respects the dynamic nature of networks. It comprises three elements: an incremental learning algorithm, a data rebalancing strategy, and a concept drift detector. This framework is the fourth module of the PALADIN model, named Diagnosis. In order to evaluate the PALADIN model, the Diagnosis module was implemented with 25 different incremental algorithms, ADWIN as the concept-drift detector and SMOTE (adapted to the streaming scenario) as the rebalancing strategy. In addition, a dataset (the SOFI dataset) was built through the first modules of the PALADIN model, so these data form the incoming data stream of the Diagnosis module and are used to evaluate its performance. The PALADIN Diagnosis module performs an online classification of network failures, so it is a learning model that must be evaluated in a stream context. Prequential evaluation is the most widely used method for this task, so we adopt it to evaluate the model's performance over time through several stream evaluation metrics. RESULTS: This research first provides evidence of the phenomenon of impact fault propagation, which makes it possible to detect fault symptoms at a monitored network's peripheral level. This translates into non-invasive monitoring of the network. Second, the PALADIN model is the major contribution in the fault detection context because it covers two aspects: an online learning model that continuously processes the network symptoms and detects internal failures, and concept-drift detection and data stream rebalancing components that make the model resilient to dynamic network changes. 
Third, it is well known that the number of available real-world datasets for imbalanced stream classification is still too small. That number is further reduced for the networking context. The SOFI dataset obtained with the first modules of the PALADIN model adds to that number and encourages work on unbalanced data streams and on network fault diagnosis. CONCLUSIONS: The proposed model contains the necessary elements for the continuous and timely diagnosis of IP-based network faults; it introduces the idea of periodic monitoring of peripheral network elements and uses data analytics techniques to process the collected data. Based on the analysis, processing, and classification of peripherally collected data, it can be concluded that PALADIN achieves the objective. The results indicate that peripheral monitoring allows faults in the internal network to be diagnosed; in addition, the diagnosis process needs an incremental learning process, concept-drift detection elements, and a rebalancing strategy. The results of the experiments showed that PALADIN makes it possible to learn from the network manifestations and diagnose internal network failures. The latter was verified with 25 different incremental algorithms, ADWIN as the concept-drift detector and SMOTE (adapted to the streaming scenario) as the rebalancing strategy. This research illustrates that it is unnecessary to monitor all the internal network elements to detect a network's failures; instead, it is enough to choose the peripheral elements to be monitored. Furthermore, with proper processing of the collected status and traffic descriptors, it is possible to learn from the arriving data using incremental learning in cooperation with data rebalancing and concept drift approaches. 
This proposal continuously diagnoses the network symptoms without leaving the system vulnerable to failures while being resilient to the network's dynamic changes. Doctoral programme: Programa de Doctorado en Ciencia y Tecnología Informática, Universidad Carlos III de Madrid. Committee chair: José Manuel Molina López. Secretary: Juan Carlos Dueñas López. Member: Juan Manuel Corchado Rodrígue
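
The prequential (test-then-train) evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the PALADIN Diagnosis module: the learner is a deliberately trivial majority-class predictor, there is no ADWIN or SMOTE component, and the stream contents and all names are hypothetical.

```python
from collections import Counter


class MajorityClassLearner:
    """Trivial incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, features):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.counts[label] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: each instance is first used to test the model,
    then to update it. Returns the running accuracy after every instance."""
    correct = 0
    history = []
    for seen, (features, label) in enumerate(stream, start=1):
        if learner.predict(features) == label:
            correct += 1
        learner.learn(features, label)
        history.append(correct / seen)
    return history


# Hypothetical stream of (features, label) pairs.
stream = [((0,), "ok"), ((0,), "ok"), ((1,), "fault"), ((0,), "ok")]
print(prequential_accuracy(stream, MajorityClassLearner()))
```

Since every instance is scored before the model sees its label, no separate hold-out set is needed, which suits a stream whose distribution may drift.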

    Automated network optimisation using data mining as support for economic decision systems

    The evolution from wired voice communications to wireless and cloud computing services has led to the rapid growth of wireless communication companies attempting to meet consumer needs. While these companies have generally been able to achieve quality of service (QoS) high enough to meet most consumer demands, the recent growth in data-hungry services, in addition to wireless voice communication, has placed significant stress on the infrastructure and begun to translate into increased QoS issues. As a result, wireless providers are finding it difficult to meet demand and to deal with an overwhelming volume of mobile data. Many telecommunication service providers have turned to data analytics techniques to discover hidden insights for fraud detection, customer churn detection and credit risk analysis. However, most are ill-equipped to prioritise expansion decisions and optimise network faults and costs to ensure customer satisfaction and optimal profitability. This thesis contributes to the decision-making process by proposing a network optimisation scheme that uses data mining algorithms to develop a monitoring framework capable of troubleshooting network faults while optimising costs based on financial evaluations. All the data mining experiments contribute to the development of a super-framework that has been tested using real data to demonstrate that data mining techniques play a crucial role in the prediction of network optimisation actions. Finally, the insights extracted from the super-framework demonstrate that machine learning mechanisms can draw out promising solutions for network optimisation decisions, customer segmentation, customer churn prediction and revenue management. The outputs of the thesis seek to help wireless providers determine the QoS factors that should be addressed for an efficient network optimisation plan, and also present the academic contribution of this research.

    Active self-diagnosis in telecommunication networks

    While modern networks and services are continuously growing in scale, complexity and heterogeneity, the management of such systems is reaching the limits of human capabilities. Technically and economically, more automation of the classical management tasks is needed. This has triggered a significant research effort, gathered under the terms self-management and autonomic networking. The aim of this thesis is to contribute to the realization of some self-management properties in telecommunication networks. We propose an approach to automate the management of faults, covering the different segments of a network and the end-to-end services deployed over them. This is a model-based approach addressing the two weaknesses of model-based diagnosis, namely: a) how to derive such a model, suited to a given network at a given time, in particular if one wishes to capture several network layers and segments, and b) how to reason on a potentially huge model, if one wishes to manage a nation-wide network for example. To address the first point, we propose a new concept called self-modeling that formulates generic patterns of the model off-line, and identifies on-line the instances of these patterns that are deployed in the managed network. The second point is addressed by an active self-diagnosis engine, based on a Bayesian network formalism, that reasons on a progressively growing fragment of the network model, relying on the self-modeling ability: more observations are collected and new tests are performed until the faults are localized with sufficient confidence. This active diagnosis approach has been tested by performing cross-layer and cross-segment alarm management on an IMS network. RENNES1-Bibl. électronique (352382106) / Sudoc, France
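
The loop described in the abstract above, in which tests are performed until a fault is localized with sufficient confidence, can be sketched as sequential Bayesian updating. This is a strong simplification and not the engine of the thesis: it assumes exactly one fault, treats tests as conditionally independent, runs them in a fixed order, and uses hypothetical fault names, test names and probabilities.

```python
def bayes_update(posterior, fail_likelihood, failed):
    """Update P(fault) after one test.
    fail_likelihood[f] is the assumed P(test fails | fault f is present)."""
    unnormalised = {}
    for fault, prob in posterior.items():
        like = fail_likelihood[fault] if failed else 1.0 - fail_likelihood[fault]
        unnormalised[fault] = prob * like
    total = sum(unnormalised.values())
    return {fault: value / total for fault, value in unnormalised.items()}


def active_diagnosis(prior, tests, run_test, threshold=0.9):
    """Run tests one by one and stop as soon as one fault hypothesis
    reaches the confidence threshold."""
    posterior = dict(prior)
    performed = []
    for name, fail_likelihood in tests:
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            break  # confident enough, no further tests needed
        posterior = bayes_update(posterior, fail_likelihood, run_test(name))
        performed.append(name)
    best = max(posterior, key=posterior.get)
    return best, posterior[best], performed


# Hypothetical fault candidates with a uniform prior.
prior = {"access_link": 1 / 3, "core_router": 1 / 3, "ims_server": 1 / 3}
tests = [
    ("ping_gateway", {"access_link": 0.95, "core_router": 0.05, "ims_server": 0.05}),
    ("traceroute_core", {"access_link": 0.95, "core_router": 0.95, "ims_server": 0.05}),
    ("sip_probe", {"access_link": 0.95, "core_router": 0.95, "ims_server": 0.95}),
]
# Simulated observations: True means the test failed.
observed = {"ping_gateway": False, "traceroute_core": True, "sip_probe": True}

print(active_diagnosis(prior, tests, observed.get))
```

With these numbers the third test is never run, because the second observation already pushes one hypothesis past the threshold.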

    Modelling of reliable service based operations support system (MORSBOSS)

    Philosophiae Doctor - PhD. The underlying theme of this thesis is the identification, classification, detection and prediction of cellular network faults using state-of-the-art technologies, methods and algorithms.

    Machine Learning-based Predictive Maintenance for Optical Networks

    Optical networks provide the backbone of modern telecommunications by connecting the world faster than ever before. However, such networks are susceptible to several failures (e.g., optical fiber cuts, malfunctioning optical devices), which might result in degradation in the network operation, massive data loss, and network disruption. It is challenging to accurately and quickly detect and localize such failures due to the complexity of such networks, the time required to identify the fault and pinpoint it using conventional approaches, and the lack of proactive efficient fault management mechanisms. Therefore, it is highly beneficial to perform fault management in optical communication systems in order to reduce the mean time to repair, to meet service level agreements more easily, and to enhance the network reliability. In this thesis, the aforementioned challenges and needs are tackled by investigating the use of machine learning (ML) techniques for implementing efficient proactive fault detection, diagnosis, and localization schemes for optical communication systems. In particular, the adoption of ML methods is explored for the following problems: degradation prediction of semiconductor lasers; lifetime (mean time to failure) prediction of semiconductor lasers; remaining useful life (the length of time a machine is likely to operate before it requires repair or replacement) prediction of semiconductor lasers; optical fiber fault detection, localization, characterization, and identification for different optical network architectures; and anomaly detection in optical fiber monitoring. Such ML approaches outperform the conventionally employed methods for all the investigated use cases by achieving better prediction accuracy and earlier prediction or detection capability.

    Toward Automated Network Management and Operations.

    Network management plays a fundamental role in the operation and well-being of today's networks. Despite the best effort of existing support systems and tools, management operations in large service provider and enterprise networks remain mostly manual. Due to the larger scale of modern networks, more complex network functionalities, and higher network dynamics, human operators are increasingly short-handed. As a result, network misconfigurations are frequent, and can result in violated service-level agreements and degraded user experience. In this dissertation, we develop various tools and systems to understand, automate, augment, and evaluate network management operations. Our thesis is that by introducing formal abstractions, like deterministic finite automata, Petri nets and databases, we can build new support systems that systematically capture domain knowledge, automate network management operations, enforce network-wide properties to prevent misconfigurations, and simultaneously reduce manual effort. The theme for our systems is to build a knowledge plane based on the proposed abstractions, allowing network-wide reasoning and guidance for network operations. More importantly, the proposed systems require no modification to the existing Internet infrastructure and network devices, simplifying adoption. We show that our systems improve both timeliness and correctness in performing realistic and large-scale network operations. Finally, to address the current limitations and difficulty of evaluating novel network management systems, we have designed a distributed network testing platform that relies on network and device virtualization to provide realistic environments and isolation from production networks. Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/78837/1/chenxu_1.pd

    Logging Statements Analysis and Automation in Software Systems with Data Mining and Machine Learning Techniques

    Log files are widely used to record runtime information of software systems, such as the timestamp of an event, the name or ID of the component that generated the log, and parts of the state of a task execution. The rich information of logs enables system developers (and operators) to monitor the runtime behavior of their systems and further track down system problems in development and production settings. With the ever-increasing scale and complexity of modern computing systems, the volume of logs is rapidly growing. For example, eBay reported that the rate of log generation on their servers was in the order of several petabytes per day in 2018 [17]. Therefore, the traditional way of log analysis, which largely relies on manual inspection (e.g., searching for error/warning keywords or using grep), has become inefficient, labor-intensive, error-prone, and outdated. The growth of the logs has initiated the emergence of automated tools and approaches for log mining and analysis. In parallel, the embedding of logging statements in the source code is a manual and error-prone task, and developers may forget to add a logging statement in the software's source code. To address the logging challenge, many efforts have aimed to automate logging statements in the source code, and in addition, many tools have been proposed to perform large-scale log file analysis by use of machine learning and data mining techniques. However, the current logging process is still mostly manual, and thus the proper placement and content of logging statements remain challenges. To overcome these challenges, methods that aim to automate log placement and content prediction, i.e., 'where and what to log', are of high interest. In addition, approaches that can automatically mine and extract insight from large-scale logs are also well sought after. 
Thus, in this research, we focus on predicting log statements, and for this purpose, we perform an experimental study on open-source Java projects. We introduce a log-aware code-clone detection method to predict the location and description of logging statements. Additionally, we incorporate natural language processing (NLP) and deep learning methods to further enhance the performance of the log statements' description prediction. We also introduce deep learning based approaches for automated analysis of software logs. In particular, we analyze execution logs and extract natural language characteristics of logs to enable the application of natural language models for automated log file analysis. Then, we propose automated tools for analyzing log files and measuring the information gain from logs for different log analysis tasks such as anomaly detection. We then continue our NLP-enabled approach by leveraging state-of-the-art language models, i.e., Transformers, to perform automated log parsing.
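
Log parsing, as mentioned at the end of the abstract above, usually means reducing raw messages to templates by separating constant text from variable fields. A minimal rule-based sketch is shown below; it is not the Transformer-based approach of the thesis, and the masking rules, placeholder tokens and sample messages are illustrative assumptions.

```python
import re
from collections import Counter

# Masking rules, applied in order: most specific pattern first.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message):
    """Replace variable fields with placeholder tokens, keeping constant text."""
    for pattern, token in MASKS:
        message = pattern.sub(token, message)
    return message


def count_templates(messages):
    """Group messages by template and count how often each template occurs."""
    return Counter(to_template(message) for message in messages)


# Hypothetical log messages.
logs = [
    "Connection from 10.0.0.5 port 5022 failed after 3 retries",
    "Connection from 10.0.0.9 port 22 failed after 1 retries",
    "Block 0x1f written",
]
for template, count in count_templates(logs).items():
    print(count, template)
```

Hand-written rules like these break as soon as a new field format appears, which is one motivation for the learned parsers the abstract describes.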

    Autonomous Database Management at Scale: Automated Tuning, Performance Diagnosis, and Resource Decentralization

    Database administration has always been a challenging task, and is becoming even more difficult with the rise of public and private clouds. Today, many enterprises outsource their database operation to cloud service providers (CSPs) in order to reduce operating costs. CSPs, now tasked with managing an extremely large number of database instances, cannot simply rely on database administrators. In fact, humans have become a bottleneck in the scalability and profitability of cloud offerings. This has created a massive demand for building autonomous databases: systems that operate with little or zero human supervision. While autonomous databases have gained much attention in recent years in both academia and industry, many of the existing techniques remain limited to automating parameter tuning, backup/recovery, and monitoring. Consequently, there is much to be done before realizing a fully autonomous database. This dissertation examines and offers new automation techniques for three specific areas of modern database management. 1. Automated Tuning: We propose a new generation of physical database designers that are robust against uncertainty in future workloads. Given the rising popularity of approximate databases, we also develop an optimal, hybrid sampling strategy that enables efficient join processing on offline samples, a long-standing open problem in approximate query processing. 2. Performance Diagnosis: We design practical tools and algorithms for assisting database administrators in quickly and reliably diagnosing performance problems in their transactional databases. 3. Resource Decentralization: To achieve autonomy among database components in a shared environment, we propose a highly efficient, starvation-free, and fully decentralized distributed lock manager for distributed database clusters. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/153349/1/dyoon_1.pd