    Robust statistical methods for automated outlier detection

    Automating the detection of outliers, or blunder points, in radio metric data requires nonstandard statistical methods, because outliers have a deleterious effect on standard least squares methods. The nonstandard methods most applicable to the task are the robust statistical techniques that have undergone intense development since the 1960s. These methods are by design more resistant to the effects of outliers than standard methods. Because the topic may be unfamiliar, a brief introduction to the philosophy and methods of robust statistics is presented. The application of these methods to the automated outlier detection problem is then detailed for some specific examples encountered in practice.
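
The contrast between standard and robust estimates can be illustrated with a minimal sketch. It is not taken from the paper: the function name `mad_outlier_mask`, the threshold of 3, and the sample residuals are all assumptions chosen for illustration. It compares a classical z-score rule (mean and standard deviation) with a rule based on the median and the median absolute deviation (MAD).

```python
import numpy as np

def mad_outlier_mask(values, threshold=3.0):
    """Flag points whose robust z-score (median/MAD based) exceeds `threshold`.

    Hypothetical helper for illustration; not the method of the paper.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    # The factor 1.4826 makes the MAD a consistent estimate of the
    # standard deviation when the clean data are Gaussian.
    mad = 1.4826 * np.median(np.abs(values - median))
    if mad == 0.0:
        # More than half of the points are identical: no usable scale.
        return np.zeros(values.shape, dtype=bool)
    return np.abs(values - median) / mad > threshold

# Made-up residuals with one gross blunder at index 5.
residuals = np.array([0.1, -0.2, 0.05, 0.15, -0.1, 8.0, 0.0, -0.05])

# Classical rule: the blunder inflates both the mean and the standard
# deviation, so its own z-score stays below 3 (masking).
classical_mask = np.abs(residuals - residuals.mean()) / residuals.std() > 3.0

# Robust rule: the median and MAD are barely affected by the blunder.
robust_mask = mad_outlier_mask(residuals)

print("classical:", np.flatnonzero(classical_mask))
print("robust:   ", np.flatnonzero(robust_mask))
```

With only eight points, a single blunder can never reach a classical z-score of 3, which is one concrete way outliers defeat standard methods.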

    Kernel Ellipsoidal Trimming

    Ellipsoid estimation is of primary importance in many practical areas such as control, system identification, visual/audio tracking, experimental design, data mining, robust statistics and novelty/outlier detection. This paper presents a new method of kernel information matrix ellipsoid estimation (KIMEE) that finds an ellipsoid in a kernel-defined feature space based on a centered information matrix. Although the method is very general and can be applied to many of the aforementioned problems, the main focus in this paper is the problem of novelty or outlier detection associated with fault detection. A simple iterative algorithm based on Titterington's minimum volume ellipsoid method is proposed for practical implementation. The KIMEE method demonstrates very good performance on a set of real-life and simulated datasets compared with support vector machine methods.

    Provable Self-Representation Based Outlier Detection in a Union of Subspaces

    Many computer vision tasks involve processing large amounts of data contaminated by outliers, which need to be detected and rejected. While outlier detection methods based on robust statistics have existed for decades, only recently have methods based on sparse and low-rank representation been developed along with guarantees of correct outlier detection when the inliers lie in one or more low-dimensional subspaces. This paper proposes a new outlier detection method that combines tools from sparse representation with random walks on a graph. By exploiting the property that data points can be expressed as sparse linear combinations of each other, we obtain an asymmetric affinity matrix among data points, which we use to construct a weighted directed graph. By defining a suitable Markov chain from this graph, we establish a connection between inliers/outliers and essential/inessential states of the Markov chain, which allows us to detect outliers by using random walks. We provide a theoretical analysis that justifies the correctness of our method under geometric and connectivity assumptions. Experimental results on image databases demonstrate its superiority with respect to state-of-the-art sparse and low-rank outlier detection methods. Comment: 16 pages. CVPR 2017 spotlight oral presentation

    Robust Identification of Target Genes and Outliers in Triple-negative Breast Cancer Data

    Correct classification of breast cancer sub-types is of high importance as it directly affects the therapeutic options. We focus on triple-negative breast cancer (TNBC), which has the worst prognosis among breast cancer types. Using cutting-edge methods from the field of robust statistics, we analyze Breast Invasive Carcinoma (BRCA) transcriptomic data publicly available from The Cancer Genome Atlas (TCGA) data portal. Our analysis identifies statistical outliers that may correspond to misdiagnosed patients. Furthermore, it is illustrated that classical statistical methods may fail in the presence of these outliers, prompting the need for robust statistics. Using robust sparse logistic regression we obtain 36 relevant genes, of which ca. 60% have been previously reported as biologically relevant to TNBC, reinforcing the validity of the method. The remaining 14 genes identified are new potential biomarkers for TNBC. Out of these, JAM3, SFT2D2 and PAPSS1 were previously associated with breast tumors or other types of cancer. The relevance of these genes is confirmed by the new DetectDeviatingCells (DDC) outlier detection technique. A comparison of gene networks on the selected genes showed significant differences between TNBC and non-TNBC data. The individual role of FOXA1 in TNBC and non-TNBC, and the strong FOXA1-AGR2 connection in TNBC stand out. Not only will our results contribute to the understanding of breast cancer/TNBC and ultimately its management, they also show that robust regression and outlier detection constitute key strategies to cope with high-dimensional clinical data such as omics data.

    Letter to the Editor

    The paper by Alfons, Croux and Gelper (2013), Sparse least trimmed squares regression for analyzing high-dimensional large data sets, considered a combination of least trimmed squares (LTS) and lasso penalty for robust and sparse high-dimensional regression. In a recent paper [She and Owen (2011)], a method for outlier detection based on a sparsity penalty on the mean shift parameter was proposed (designated by "SO" in the following). This work is mentioned in Alfons et al. as being an "entirely different approach." Certainly the problem studied by Alfons et al. is novel and interesting. Comment: Published at http://dx.doi.org/10.1214/13-AOAS640 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
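
The mean shift idea mentioned above can be sketched as follows, under stated assumptions. The model is taken to be y = Xβ + γ + ε, where a nonzero γ_i marks observation i as an outlier. The alternating scheme, the hard-thresholding rule, the function name `mean_shift_outliers` and the parameter `lam` are illustrative choices, not a reproduction of the procedure in She and Owen (2011) or of sparse LTS.

```python
import numpy as np

def mean_shift_outliers(X, y, lam, n_iter=50):
    """Alternate least squares for beta with hard thresholding for gamma.

    Illustrative sketch of a mean shift outlier model y = X @ beta + gamma;
    the thresholding rule and iteration count are assumptions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    gamma = np.zeros_like(y)
    for _ in range(n_iter):
        # Fit the regression to the response with current shifts removed.
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # Residuals of the raw response; large ones become mean shifts.
        resid = y - X @ beta
        gamma = np.where(np.abs(resid) > lam, resid, 0.0)
    return beta, gamma

# Made-up noise-free line y = 1 + 2 t with one shifted observation.
t = np.arange(10, dtype=float)
X = np.column_stack([np.ones_like(t), t])
y = 1.0 + 2.0 * t
y[4] += 10.0

beta, gamma = mean_shift_outliers(X, y, lam=3.0)
print("coefficients:", beta)
print("flagged observations:", np.flatnonzero(gamma))
```

In this toy setting the sparsity of γ is what identifies the outlier: only the shifted observation ends up with a nonzero entry, and the coefficients are then fitted as if that point had been corrected.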

    RES-Q: Robust Outlier Detection Algorithm for Fundamental Matrix Estimation

    Detecting outliers in noisy images for accurate fundamental matrix estimation is an important research topic in 3-D computer vision. Although much research has been conducted in this domain, few studies have used robust statistics to build effective outlier detection algorithms. This paper proposes a reprojection residual error-based technique for outlier detection. Given a noisy stereo image pair obtained from a pair of stereo cameras and a set of initial point correspondences between them, the reprojection residual error and the 3-sigma principle are combined with the robust Qn estimator (RES-Q) to efficiently detect the outliers and estimate the fundamental matrix with superior accuracy. The proposed RES-Q algorithm demonstrates greater precision and lower reprojection residual error than the state-of-the-art techniques. Moreover, in contrast to the Gaussian or symmetric noise model assumed by most previous approaches, RES-Q is found to be robust under both symmetric and asymmetric random noise. The proposed algorithm is experimentally tested on both synthetic and real image data sets, and the experiments show that RES-Q is more effective and efficient than the classical outlier detection algorithms.
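
As a rough illustration of how a Qn scale estimate can drive a 3-sigma style rule, the sketch below computes the Rousseeuw-Croux Qn estimator naively in O(n^2) and flags residuals far from the median. The residual values, the function names `qn_scale` and `flag_residuals`, and the two-sided rule around the median are assumptions; the abstract does not specify how RES-Q combines these pieces, and no fundamental matrix estimation is performed here.

```python
import numpy as np

def qn_scale(values):
    """Naive O(n^2) Qn scale estimate (Rousseeuw and Croux).

    Qn is 2.2219 times the k-th smallest pairwise absolute difference,
    with h = n // 2 + 1 and k = h * (h - 1) / 2. The constant gives
    consistency for Gaussian data; finite-sample corrections are omitted.
    """
    x = np.asarray(values, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    i, j = np.triu_indices(n, k=1)          # all pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    h = n // 2 + 1
    k = h * (h - 1) // 2
    return 2.2219 * diffs[k - 1]

def flag_residuals(residuals, n_sigma=3.0):
    """Flag residuals more than n_sigma * Qn away from the median.

    Hypothetical decision rule used only for this illustration.
    """
    r = np.asarray(residuals, dtype=float)
    return np.abs(r - np.median(r)) > n_sigma * qn_scale(r)

# Made-up reprojection residuals (pixels); index 6 is a bad correspondence.
reproj = np.array([0.5, 0.6, 0.4, 0.55, 0.45, 0.5, 6.0, 0.52])
outlier_idx = np.flatnonzero(flag_residuals(reproj))
print("Qn scale:", qn_scale(reproj))
print("flagged correspondences:", outlier_idx)
```

Because Qn is built from pairwise differences rather than deviations from a location estimate, it does not presuppose a symmetric noise distribution, which fits the abstract's remark about asymmetric noise.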