17,865 research outputs found

    Learning to locate relative outliers

    Full text link
    Outliers usually spread across regions of low density. However, due to the absence or scarcity of outliers, designing a robust detector to sift outliers from a given dataset is still very challenging. In this paper, we consider to identify relative outliers from the target dataset with respect to another reference dataset of normal data. Particularly, we employ Maximum Mean Discrepancy (MMD) for matching the distribution between these two datasets and present a novel learning framework to learn a relative outlier detector. The learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. To this end, we propose an effective procedure to find a largely violated labeling vector for identifying relative outliers from abundant normal patterns, and its convergence is also presented. Then, a set of largely violated labeling vectors are combined by multiple kernel learning methods to robustly locate relative outliers. Comprehensive empirical studies on real-world datasets verify that our proposed relative outlier detection outperforms existing methods. © 2011 S. Li & I.W. Tsang

    A Local Density-Based Approach for Local Outlier Detection

    Full text link
    This paper presents a simple but effective density-based outlier detection approach with the local kernel density estimation (KDE). A Relative Density-based Outlier Score (RDOS) is introduced to measure the local outlierness of objects, in which the density distribution at the location of an object is estimated with a local KDE method based on extended nearest neighbors of the object. Instead of using only kk nearest neighbors, we further consider reverse nearest neighbors and shared nearest neighbors of an object for density distribution estimation. Some theoretical properties of the proposed RDOS including its expected value and false alarm probability are derived. A comprehensive experimental study on both synthetic and real-life data sets demonstrates that our approach is more effective than state-of-the-art outlier detection methods.Comment: 22 pages, 14 figures, submitted to Pattern Recognition Letter
    corecore