5,414 research outputs found

    Automatic Hyperparameter Tuning Method for Local Outlier Factor, with Applications to Anomaly Detection

    Full text link
    In recent years, there have been many practical applications of anomaly detection such as in predictive maintenance, detection of credit fraud, network intrusion, and system failure. The goal of anomaly detection is to identify in the test data anomalous behaviors that are either rare or unseen in the training data. This is a common goal in predictive maintenance, which aims to forecast the imminent faults of an appliance given abundant samples of normal behaviors. Local outlier factor (LOF) is one of the state-of-the-art models used for anomaly detection, but the predictive performance of LOF depends greatly on the selection of hyperparameters. In this paper, we propose a novel, heuristic methodology to tune the hyperparameters in LOF. A tuned LOF model that uses the proposed method shows good predictive performance in both simulations and real data sets.Comment: 15 pages, 5 figure

    Randomizing Ensemble-based approaches for Outlier

    Get PDF
    The data size is increasing dramatically every day, therefore, it has emerged the need of detecting abnormal behaviors, which can harm seriously our systems. Outlier detection refers to the process of identifying outlying activities, which diverge from the remaining group of data. This process, an integral part of data mining field, has experienced recently a substantial interest from the data mining community. An outlying activity or an outlier refers to a data point, which significantly deviates and appears to be inconsistent compared to other data members. Ensemble-based outlier detection is a line of research employed in order to reduce the model dependence from datasets or data locality by raising the robustness of the data mining procedures. The key principle of an ensemble approach is using the combination of individual detection results, which do not contain the same list of outliers in order to come up with a consensus finding. In this paper, we propose a novel strategy of constructing randomized ensemble outlier detection. This approach is an extension of the heuristic greedy ensemble construction previously built by the research community. We will focus on the core components of constructing an ensemble –based algorithm for outlier detection. The randomization will be performed by intervening into the pseudo code of greedy ensemble and implementing randomization in the respective java code through the ELKI data-mining platform. The key purpose of our approach is to improve the greedy ensemble and to overcome its local maxima problem. In order to induce diversity, it is performed randomization by initializing the search with a random outlier detector from the pool of detectors. Finally, the paper provides strong insights regarding the ongoing work of our randomized ensemble-based approach for outlier detection. Empirical results indicate that due to inducing diversity by employing various outlier detection algorithms, the randomized ensemble approach performs better than using only one outlier detector
    • …
    corecore