9,929 research outputs found

    Steganographer Identification

    Full text link
    Conventional steganalysis detects the presence of steganography within single objects. In the real-world, we may face a complex scenario that one or some of multiple users called actors are guilty of using steganography, which is typically defined as the Steganographer Identification Problem (SIP). One might use the conventional steganalysis algorithms to separate stego objects from cover objects and then identify the guilty actors. However, the guilty actors may be lost due to a number of false alarms. To deal with the SIP, most of the state-of-the-arts use unsupervised learning based approaches. In their solutions, each actor holds multiple digital objects, from which a set of feature vectors can be extracted. The well-defined distances between these feature sets are determined to measure the similarity between the corresponding actors. By applying clustering or outlier detection, the most suspicious actor(s) will be judged as the steganographer(s). Though the SIP needs further study, the existing works have good ability to identify the steganographer(s) when non-adaptive steganographic embedding was applied. In this chapter, we will present foundational concepts and review advanced methodologies in SIP. This chapter is self-contained and intended as a tutorial introducing the SIP in the context of media steganography.Comment: A tutorial with 30 page

    A Parametric Framework for the Comparison of Methods of Very Robust Regression

    Full text link
    There are several methods for obtaining very robust estimates of regression parameters that asymptotically resist 50% of outliers in the data. Differences in the behaviour of these algorithms depend on the distance between the regression data and the outliers. We introduce a parameter λ\lambda that defines a parametric path in the space of models and enables us to study, in a systematic way, the properties of estimators as the groups of data move from being far apart to close together. We examine, as a function of λ\lambda, the variance and squared bias of five estimators and we also consider their power when used in the detection of outliers. This systematic approach provides tools for gaining knowledge and better understanding of the properties of robust estimators.Comment: Published in at http://dx.doi.org/10.1214/13-STS437 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A Local Density-Based Approach for Local Outlier Detection

    Full text link
    This paper presents a simple but effective density-based outlier detection approach with the local kernel density estimation (KDE). A Relative Density-based Outlier Score (RDOS) is introduced to measure the local outlierness of objects, in which the density distribution at the location of an object is estimated with a local KDE method based on extended nearest neighbors of the object. Instead of using only kk nearest neighbors, we further consider reverse nearest neighbors and shared nearest neighbors of an object for density distribution estimation. Some theoretical properties of the proposed RDOS including its expected value and false alarm probability are derived. A comprehensive experimental study on both synthetic and real-life data sets demonstrates that our approach is more effective than state-of-the-art outlier detection methods.Comment: 22 pages, 14 figures, submitted to Pattern Recognition Letter

    A Near-linear Time Approximation Algorithm for Angle-based Outlier Detection in High-dimensional Data

    Get PDF
    Outlier mining in d-dimensional point sets is a fundamental and well studied data mining task due to its variety of ap-plications. Most such applications arise in high-dimensional domains. A bottleneck of existing approaches is that implicit or explicit assessments on concepts of distance or nearest neighbor are deteriorated in high-dimensional data. Follow-ing up on the work of Kriegel et al. (KDD ’08), we inves-tigate the use of angle-based outlier factor in mining high-dimensional outliers. While their algorithm runs in cubic time (with a quadratic time heuristic), we propose a novel random projection-based technique that is able to estimate the angle-based outlier factor for all data points in time near-linear in the size of the data. Also, our approach is suitable to be performed in parallel environment to achieve a parallel speedup. We introduce a theoretical analysis of the quality of approximation to guarantee the reliability of our estima-tion algorithm. The empirical experiments on synthetic and real world data sets demonstrate that our approach is effi-cient and scalable to very large high-dimensional data sets

    Finding an unknown number of multivariate outliers

    Get PDF
    We use the forward search to provide robust Mahalanobis distances to detect the presence of outliers in a sample of multivariate normal data. Theoretical results on order statistics and on estimation in truncated samples provide the distribution of our test statistic. We also introduce several new robust distances with associated distributional results. Comparisons of our procedure with tests using other robust Mahalanobis distances show the good size and high power of our procedure. We also provide a unification of results on correction factors for estimation from truncated samples
    • …
    corecore