1,008 research outputs found

    A proximal iteration for deconvolving Poisson noisy images using sparse representations

    Get PDF
    We propose an image deconvolution algorithm when the data is contaminated by Poisson noise. The image to restore is assumed to be sparsely represented in a dictionary of waveforms such as the wavelet or curvelet transforms. Our key contributions are: First, we handle the Poisson noise properly by using the Anscombe variance stabilizing transform leading to a {\it non-linear} degradation equation with additive Gaussian noise. Second, the deconvolution problem is formulated as the minimization of a convex functional with a data-fidelity term reflecting the noise properties, and a non-smooth sparsity-promoting penalties over the image representation coefficients (e.g. â„“1\ell_1-norm). Third, a fast iterative backward-forward splitting algorithm is proposed to solve the minimization problem. We derive existence and uniqueness conditions of the solution, and establish convergence of the iterative algorithm. Finally, a GCV-based model selection procedure is proposed to objectively select the regularization parameter. Experimental results are carried out to show the striking benefits gained from taking into account the Poisson statistics of the noise. These results also suggest that using sparse-domain regularization may be tractable in many deconvolution applications with Poisson noise such as astronomy and microscopy

    Sparse Linear Identifiable Multivariate Modeling

    Full text link
    In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling) and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for the temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/.Comment: 45 pages, 17 figure

    Bayesian Nonparametric Unmixing of Hyperspectral Images

    Full text link
    Hyperspectral imaging is an important tool in remote sensing, allowing for accurate analysis of vast areas. Due to a low spatial resolution, a pixel of a hyperspectral image rarely represents a single material, but rather a mixture of different spectra. HSU aims at estimating the pure spectra present in the scene of interest, referred to as endmembers, and their fractions in each pixel, referred to as abundances. Today, many HSU algorithms have been proposed, based either on a geometrical or statistical model. While most methods assume that the number of endmembers present in the scene is known, there is only little work about estimating this number from the observed data. In this work, we propose a Bayesian nonparametric framework that jointly estimates the number of endmembers, the endmembers itself, and their abundances, by making use of the Indian Buffet Process as a prior for the endmembers. Simulation results and experiments on real data demonstrate the effectiveness of the proposed algorithm, yielding results comparable with state-of-the-art methods while being able to reliably infer the number of endmembers. In scenarios with strong noise, where other algorithms provide only poor results, the proposed approach tends to overestimate the number of endmembers slightly. The additional endmembers, however, often simply represent noisy replicas of present endmembers and could easily be merged in a post-processing step

    Automatic Extraction of Ordinary Differential Equations from Data: Sparse Regression Tools for System Identification

    Get PDF
    Studying nonlinear systems across engineering, physics, economics, biology, and chemistry often hinges upon successfully discovering their underlying dynamics. However, despite the abundance of data in today's world, a complete comprehension of these governing equations often remains elusive, posing a significant challenge. Traditional system identification methods for building mathematical models to describe these dynamics can be time-consuming, error-prone, and limited by data availability. This thesis presents three comprehensive strategies to address these challenges and automate model discovery. The procedures outlined here employ classic statistical and machine learning methods, such as signal filtering, sparse regression, bootstrap sampling, Bayesian inference, and unsupervised learning algorithms, to capture complex and nonlinear relationships in data. Building on these foundational techniques, the proposed processes offer a reliable and efficient approach to identifying models of ordinary differential equations from data, differing from and complementing existing frameworks. The results presented here provide rigorous benchmarking against state-of-the-art algorithms, demonstrating the proposed methods' effectiveness in model discovery and highlighting the potential for discovering governing equations across applications such as weather forecasting, chemical reaction and electrical circuit modelling, and predator-prey dynamics. These methods can aid in solving critical decision-making problems, including optimising resource allocation, predicting system failures, and facilitating adaptive control in various domains. Ultimately, the strategies developed in this thesis are designed to integrate seamlessly into current workflows, thereby promoting data-driven decision-making and enhancing understanding of complex system dynamics

    MetaPalette: a k-mer Painting Approach for Metagenomic Taxonomic Profiling and Quantification of Novel Strain Variation

    Get PDF
    Metagenomic profiling is challenging in part because of the highly uneven sampling of the tree of life by genome sequencing projects and the limitations imposed by performing phylogenetic inference at fixed taxonomic ranks. We present the algorithm MetaPalette, which uses long k-mer sizes (k = 30, 50) to fit a k-mer “palette” of a given sample to the k-mer palette of reference organisms. By modeling the k-mer palettes of unknown organisms, the method also gives an indication of the presence, abundance, and evolutionary relatedness of novel organisms present in the sample. The method returns a traditional, fixed-rank taxonomic profile which is shown on independently simulated data to be one of the most accurate to date. Tree figures are also returned that quantify the relatedness of novel organisms to reference sequences, and the accuracy of such figures is demonstrated on simulated spike-ins and a metagenomic soil sample. The software implementing MetaPalette is available at: https://github.com/dkoslicki/MetaPalette. Pretrained databases are included for Archaea, Bacteria, Eukaryota, and viruses
    • …
    corecore