6,619 research outputs found

    In Search of Non-Gaussian Components of a High-Dimensional Distribution

    Get PDF
    Finding non-Gaussian components of high-dimensional data is an important preprocessing step for effcient information processing. This article proposes a new linear method to identify the ``non-Gaussian subspace´´ within a very general semi-parametric framework. Our proposed method, called NGCA (Non-Gaussian Component Analysis), is essentially based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector which belongs to the low dimensional non-Gaussian target subspace up to an estimation error. By applying this operator to a family of different nonlinear functions, one obtains a family of different vectors lying in a vicinity of the target space. As a final step, the target space itself is estimated by applying PCA to this family of vectors. We show that this procedure is consistent in the sense that the estimaton error tends to zero at a parametric rate, uniformly over the family, Numerical examples demonstrate the usefulness of our method.non-Gaussian components, dimension reduction

    FRIOD: a deeply integrated feature-rich interactive system for effective and efficient outlier detection

    Get PDF
    In this paper, we propose an novel interactive outlier detection system called feature-rich interactive outlier detection (FRIOD), which features a deep integration of human interaction to improve detection performance and greatly streamline the detection process. A user-friendly interactive mechanism is developed to allow easy and intuitive user interaction in all the major stages of the underlying outlier detection algorithm which includes dense cell selection, location-aware distance thresholding, and final top outlier validation. By doing so, we can mitigate the major difficulty of the competitive outlier detection methods in specifying the key parameter values, such as the density and distance thresholds. An innovative optimization approach is also proposed to optimize the grid-based space partitioning, which is a critical step of FRIOD. Such optimization fully considers the high-quality outliers it detects with the aid of human interaction. The experimental evaluation demonstrates that FRIOD can improve the quality of the detected outliers and make the detection process more intuitive, effective, and efficient
    • …
    corecore