1,185 research outputs found

    Bayesian Lower Bounds for Dense or Sparse (Outlier) Noise in the RMT Framework

    Robust estimation is an important and timely research subject. In this paper, we investigate lower bounds on the mean-square error (MSE) of any estimator for the Bayesian linear model corrupted by noise distributed according to an i.i.d. Student's t-distribution. This class of priors, parametrized by its degrees of freedom, is suited to modeling either dense or sparse (outlier-prone) noise. Using the hierarchical Normal-Gamma representation of the Student's t-distribution, the Van Trees Bayesian Cramér-Rao bound (BCRB) on the amplitude parameters is derived. Furthermore, the random matrix theory (RMT) framework is assumed, i.e., the number of measurements and the number of unknown parameters grow jointly to infinity with a finite asymptotic ratio. Using some powerful results from RMT, closed-form expressions of the BCRB are derived and studied. Finally, we propose a framework to fairly compare two models corrupted by noise with different degrees of freedom for a fixed common target signal-to-noise ratio (SNR). In particular, we focus on comparing the BCRBs associated with two models corrupted, respectively, by a sparse noise promoting outliers and a dense (Gaussian) noise.
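
The hierarchical Normal-Gamma representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's code: the function name `sample_student_t_noise`, the `scale` and `seed` parameters, and the sample sizes are assumptions chosen for illustration. The sketch uses the standard fact that a zero-mean Gaussian whose precision is drawn from Gamma(shape = ν/2, rate = ν/2) is marginally Student's t with ν degrees of freedom.

```python
import numpy as np

def sample_student_t_noise(nu, size, scale=1.0, seed=0):
    """Draw Student's t noise via a Normal-Gamma scale mixture (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # Precision mixing variable: Gamma(shape=nu/2, rate=nu/2).
    # NumPy parametrizes the Gamma by scale = 1/rate, hence scale = 2/nu.
    tau = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)
    # Conditionally on tau, the noise is Gaussian with variance scale**2 / tau.
    return scale * rng.standard_normal(size) / np.sqrt(tau)

# Small nu gives heavy tails (sparse, outlier-prone noise);
# large nu approaches the dense Gaussian case.
heavy = sample_student_t_noise(nu=3.0, size=200_000)
light = sample_student_t_noise(nu=100.0, size=200_000)

outlier_rate_heavy = np.mean(np.abs(heavy) > 4.0)
outlier_rate_light = np.mean(np.abs(light) > 4.0)
```

Comparing `outlier_rate_heavy` with `outlier_rate_light` shows how the degrees of freedom control how often large deviations occur, which is the dense versus sparse distinction the abstract draws.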

    A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

    The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. Outliers may be instances of error or may indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and to discover interesting and useful knowledge about unusual events within numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees to select the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.

    Review of the mathematical foundations of data fusion techniques in surface metrology

    The recent proliferation of engineered surfaces, including freeform and structured surfaces, is challenging current metrology techniques. Measurement using multiple sensors has been proposed to achieve enhanced benefits, mainly in terms of spatial frequency bandwidth, which a single sensor cannot provide. When using data from different sensors, a process of data fusion is required, and there is much active research in this area. In this paper, current data fusion methods and applications are reviewed, with a focus on the mathematical foundations of the subject. Common research questions in the fusion of surface metrology data are raised and potential fusion algorithms are discussed.

    Discovering transcriptional modules by Bayesian data integration

    Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent, and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.