
    Executing Online Anomaly Detection in Complex Dynamic Systems

    System failure prediction through rare-events elastic-net logistic regression

    Predicting failures in a distributed system from previous events through logistic regression is a standard approach in the literature. This technique is not reliable, though, in two situations: in the prediction of rare events, which do not appear in sufficient proportion for the algorithm to capture, and in environments with too many variables, where logistic regression tends to overfit and manually selecting a subset of variables to build the model is error-prone. In this paper, we solve an industrial research case that presented this situation with a combination of elastic net logistic regression, a method that allows us to automatically select useful variables; a process of cross-validation on top of it; and the application of a rare-events prediction technique to reduce computation time. This process provides two layers of cross-validation that automatically obtain the optimal model complexity and the optimal model parameter values, while ensuring that even rare events will be correctly predicted with a small number of training instances. We tested this method against real industrial data, obtaining a total of 60 out of 80 possible models with a 90% average model accuracy
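The variable-selection behaviour of elastic net logistic regression mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dataset, the names `fit_elastic_net_logistic` and `soft_threshold`, the proximal-gradient solver, and the penalty values are all assumptions made for illustration, and in practice the penalty strength and mixing ratio would be chosen by cross-validation rather than fixed by hand.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_elastic_net_logistic(X, y, lam, alpha, lr=0.5, epochs=500):
    """Proximal gradient descent for logistic loss plus
    lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2). The bias is not penalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Residuals p_i - y_i, computed once per epoch with the current parameters.
        residuals = [
            sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        for j in range(d):
            grad_j = sum(r * row[j] for r, row in zip(residuals, X)) / n
            grad_j += lam * (1.0 - alpha) * w[j]  # ridge part of the penalty
            # Gradient step on the smooth part, then the L1 shrinkage step.
            w[j] = soft_threshold(w[j] - lr * grad_j, lr * lam * alpha)
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: feature 0 determines the label, feature 1 is an
# alternating pattern that carries no information about it.
X = [[(i - 19.5) / 20.0, 0.5 if i % 2 == 0 else -0.5] for i in range(40)]
y = [1 if row[0] > 0 else 0 for row in X]

weights, bias = fit_elastic_net_logistic(X, y, lam=0.1, alpha=0.5)
```

On this toy data the L1 part of the penalty keeps the weight of the uninformative feature at exactly zero while the informative feature receives a positive weight, which is the sense in which the method selects variables automatically.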

    Classification in sparse, high dimensional environments applied to distributed systems failure prediction

    Network failures are still one of the main causes of distributed systems' lack of reliability. To overcome this problem, we present an improvement over a failure prediction system, based on Elastic Net Logistic Regression and the application of rare-events prediction techniques, that is able to work with sparse, high-dimensional datasets. Specifically, we prove its stability, fine-tune its hyperparameter, and improve its industrial utility by showing that, with a slight change in dataset creation, it can also predict the location of a failure, a key asset when trying to take a proactive approach to failure management

    States Prediction of Web Services Using Hidden Markov Model

    Over the last few decades, service-oriented architectures, in particular web services, have grown in popularity in the context of enterprise-level application integration. As a result, most enterprise-level software systems have tended to be developed with a flavor of web service components. However, like all other distributed software technologies, web services also fail. Therefore, proper mechanisms and tools to handle system failures are vital to avoid such exceptional behaviors. To address that problem, this paper investigates a state prediction mechanism for web services using a Hidden Markov Model (HMM). This approach is capable of predicting the future exceptional behaviors of a web service by analyzing and identifying the error patterns generated by long-running web services. This research can be further extended with an automated system input to determine the system state
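As a rough illustration of HMM-based state prediction of the kind this abstract describes, the sketch below runs the standard forward (filtering) recursion over a sequence of observed symbols and then projects the belief one step ahead. The two hidden states, the two observation symbols, and every probability are hypothetical values chosen for the example; none of them come from the paper.

```python
# Hypothetical two-state model of a web service; all numbers are assumptions.
STATES = ("healthy", "degraded")
START = {"healthy": 0.9, "degraded": 0.1}
TRANSITION = {
    "healthy": {"healthy": 0.9, "degraded": 0.1},
    "degraded": {"healthy": 0.2, "degraded": 0.8},
}
EMISSION = {
    "healthy": {"ok": 0.95, "error": 0.05},
    "degraded": {"ok": 0.4, "error": 0.6},
}

def normalize(dist):
    total = sum(dist.values())
    return {state: p / total for state, p in dist.items()}

def step_forward(belief):
    """Push a state distribution through the transition matrix once."""
    return {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in STATES)
        for nxt in STATES
    }

def filter_states(observations):
    """Forward algorithm: P(current hidden state | observations so far)."""
    belief = dict(START)
    for index, symbol in enumerate(observations):
        if index > 0:
            belief = step_forward(belief)
        # Weight each state by how well it explains the observed symbol.
        belief = normalize({s: belief[s] * EMISSION[s][symbol] for s in STATES})
    return belief

def predict_next_state(observations):
    """Distribution over the hidden state one step after the last observation."""
    return step_forward(filter_states(observations))

after_errors = predict_next_state(["ok", "error", "error"])
after_ok = predict_next_state(["ok", "ok", "ok"])
```

With these assumed parameters, two consecutive error observations shift most of the predicted probability mass to the degraded state, while a run of normal responses keeps it on the healthy state.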

    A Novel System Anomaly Prediction System Based on Belief Markov Model and Ensemble Classification

    Computer systems are becoming extremely complex, while system anomalies dramatically influence the availability and usability of systems. Online anomaly prediction is an important approach to managing imminent anomalies, and its accuracy relies on precise system monitoring data. However, precise monitoring data is not easily achievable because of widespread noise. In this paper, we present a method which integrates an improved Evidential Markov model and ensemble classification to predict anomalies in noisy systems. Traditional Markov models use explicit state boundaries to build the Markov chain and then make predictions of different measurement metrics. A problem arises when data comes with noise, because even a slight oscillation around the true value will lead to very different predictions. The Evidential Markov chain method is able to deal with noisy data but is not suitable for complex data stream scenarios. The Belief Markov chain that we propose extends the Evidential Markov chain and can cope with noisy data streams. This study further applies ensemble classification to identify system anomalies based on the predicted metrics. Extensive experiments on anomaly data collected from 66 metrics in PlanetLab have confirmed that our approach can achieve high prediction accuracy and time efficiency
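The sensitivity of a traditional Markov chain to explicit state boundaries, which this abstract names as the motivating problem, can be shown with a small sketch. It covers only the baseline chain, not the Belief or Evidential Markov chain, and the single boundary at 50, the training series, and the function names are assumptions made for the example.

```python
import bisect
from collections import Counter, defaultdict

# Hypothetical hard boundary: values below 50 are state 0, the rest are state 1.
BOUNDARIES = [50.0]

def to_state(value):
    """Map a metric value to a discrete state using explicit boundaries."""
    return bisect.bisect_right(BOUNDARIES, value)

def fit_transitions(series):
    """Count first-order state transitions observed in a metric series."""
    counts = defaultdict(Counter)
    states = [to_state(v) for v in series]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return counts

def predict_next_state(counts, value):
    """Return the most frequently observed successor of the value's state."""
    return counts[to_state(value)].most_common(1)[0][0]

# Hypothetical utilisation readings (percent) used to estimate the chain.
history = [10, 20, 30, 60, 70, 80, 90, 60, 20, 10, 30, 70, 80]
transitions = fit_transitions(history)

# Two readings that differ only by measurement noise around the boundary.
slightly_below = predict_next_state(transitions, 49.9)
slightly_above = predict_next_state(transitions, 50.1)
```

The two readings differ by 0.2 yet fall on opposite sides of the boundary, so the chain predicts a different next state for each.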

    Seer: a lightweight online failure prediction approach

    Online failure prediction aims to predict the manifestation of failures at runtime before the failures actually occur. Existing online failure prediction approaches typically operate on data which is either directly reported by the system under test or directly observable from outside system executions. These approaches generally refrain from collecting internal execution data that could further improve the prediction quality. One reason behind this general trend is the runtime overhead cost incurred by the measurement instruments that are required to collect the data. In this work we conjecture that large cost reductions in collecting internal execution data for online failure prediction can derive from reducing the cost of the measurement instruments, while still supporting acceptable levels of prediction quality. To evaluate this conjecture, we present a lightweight online failure prediction approach, called Seer. Seer uses fast hardware performance counters to perform most of the data collection work. The data is augmented with further data collected by a minimal amount of software instrumentation that is added to the systems software. We refer to the data collected in this manner as hybrid spectra. We applied the proposed approach to three widely used open source subject applications and evaluated it by comparing and contrasting three types of hybrid spectra and two types of traditional software spectra. At the lowest level of runtime overhead attained in the experiments, the hybrid spectra predicted the failures about halfway through the executions with an F-measure of 0.77 and a runtime overhead of 1.98%, on average. Comparing hybrid spectra to software spectra, we observed that, for comparable runtime overhead levels, the hybrid spectra provided significantly better prediction accuracies and earlier warnings for failures than the software spectra. Alternatively, for comparable accuracy levels, the hybrid spectra incurred significantly lower runtime overheads and provided earlier warnings
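For readers unfamiliar with the F-measure reported in this abstract, the short sketch below shows how it is computed from prediction counts. The counts are hypothetical and are not taken from the Seer experiments; the function name and the `beta` parameter are likewise assumptions for illustration.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)

# Hypothetical counts: 70 failures predicted correctly, 20 false alarms,
# and 30 failures that were missed.
score = f_measure(true_positives=70, false_positives=20, false_negatives=30)
```

With `beta=1.0` this is the balanced F1 score, which penalizes false alarms and missed failures equally.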

    Data Driven Device Failure Prediction

    As society becomes more dependent upon computer systems to perform increasingly critical tasks, ensuring those systems do not fail also becomes more important. Many organizations depend heavily on desktop computers for day-to-day operations. Unfortunately, the software that runs on these computers is still written by humans and, as such, is still subject to human error and consequent failure. A natural solution is to use statistical machine learning to predict failure. However, since failure is still a relatively rare event, obtaining labeled training data to train these models is not trivial. This work presents new simulated fault loads with an automated framework to predict failure in the Microsoft enterprise authentication service and Apache web server in an effort to increase uptime and improve mission effectiveness. These new fault loads were successful in creating realistic failure conditions that are accurately identified by statistical learning models