
    MDL Denoising Revisited

    We refine and extend an earlier MDL denoising criterion for wavelet-based denoising. We start by showing that the denoising problem can be reformulated as a clustering problem, where the goal is to obtain separate clusters for informative and non-informative wavelet coefficients. This suggests two refinements: adding a code length for the model index, and extending the model to account for subband-dependent coefficient distributions. A third refinement is the derivation of a soft-thresholding rule inspired by predictive universal coding with weighted mixtures. We propose a practical method incorporating all three refinements, which is shown to achieve good performance and robustness in denoising both artificial and natural signals. (Comment: Submitted to IEEE Transactions on Information Theory, June 200)
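    The wavelet soft-thresholding step described above can be illustrated with a generic shrinkage sketch; this is not the paper's refined MDL criterion, and `haar_dwt`, `soft_threshold`, and `denoise` are hypothetical helper names assuming a one-level Haar transform and the standard universal threshold with a median-based noise estimate:

```python
import numpy as np

def haar_dwt(x):
    # One-level Haar transform: approximation (a) and detail (d) coefficients.
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def haar_idwt(a, d):
    # Inverse of haar_dwt (perfect reconstruction for even-length signals).
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def soft_threshold(c, t):
    # Shrink each coefficient toward zero by t, zeroing those below t.
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def denoise(x):
    a, d = haar_dwt(x)
    # Noise level from the median absolute deviation of the detail band,
    # then the universal threshold sigma * sqrt(2 log n).
    sigma = np.median(np.abs(d)) / 0.6745
    t = sigma * np.sqrt(2 * np.log(len(x)))
    return haar_idwt(a, soft_threshold(d, t))
```

    A subband-dependent variant, as in the abstract, would estimate a separate threshold per decomposition level rather than one global value.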

    Robust denoising of electrophoresis and mass spectrometry signals with minimum description length principle

    The need for high-throughput assays in molecular biology places increasing requirements on the applied signal-processing and modelling methods. To extract useful information from the measurements, undesirable signal characteristics such as random noise must be removed. This can be done in an elegant and efficient way with the minimum description length (MDL) principle, which separates `noise' from useful information as the part of the data that cannot be compressed. In its current form the MDL denoising method assumes a Gaussian noise model but does not require any ad hoc parameter settings. It provides a basis for high-speed automated processing systems that, unlike conventional signal-processing methods, do not require continual user intervention to validate the results. Our analysis of the denoising problem in mass spectrometry, capillary electrophoresis genotyping, and sequencing signals suggests that the MDL denoising method produces robust and intuitively appealing results, sometimes even in situations where competing approaches perform poorly.
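    The core idea above, that noise is the incompressible residual, can be sketched with a simple two-part code-length criterion that picks how many large coefficients to keep; this is an illustrative stand-in, not the exact criterion used in the paper:

```python
import numpy as np

def mdl_select(coeffs):
    """Pick the number k of largest-magnitude coefficients that minimizes a
    two-part MDL-style code length: a Gaussian code for the residual (treated
    as incompressible noise) plus ~(k/2) log n bits for the retained model."""
    c = np.sort(np.abs(coeffs))[::-1]   # magnitudes, descending
    n = len(c)
    energy = np.cumsum(c ** 2)
    total = energy[-1]
    best_k, best_len = 0, np.inf
    for k in range(1, n):
        noise_var = (total - energy[k - 1]) / (n - k)
        if noise_var <= 0:
            break
        L = 0.5 * (n - k) * np.log(noise_var) + 0.5 * k * np.log(n)
        if L < best_len:
            best_len, best_k = L, k
    return best_k
```

    Note that no threshold or regularization parameter is supplied by the user; the code-length trade-off alone determines the cutoff, which is what makes such methods attractive for automated pipelines.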

    Denoising using local projective subspace methods

    In this paper we present denoising algorithms for enhancing noisy signals based on Local ICA (LICA), Delayed AMUSE (dAMUSE) and Kernel PCA (KPCA). The algorithm LICA relies on applying ICA locally to clusters of signals embedded in a high-dimensional feature space of delayed coordinates. The components resembling the signals can be detected by various criteria, such as estimators of kurtosis or the variance of autocorrelations, depending on the statistical nature of the signal. The proposed algorithm can be applied favorably to the problem of denoising multi-dimensional data. Another projective subspace denoising method using delayed coordinates has been proposed recently with the algorithm dAMUSE. It combines the solution of blind source separation problems with denoising in an elegant way and proves to be very efficient and fast. Finally, KPCA represents a non-linear projective subspace method that is also well suited for denoising. Besides illustrative applications to toy examples and images, we provide an application of all the algorithms considered to the analysis of protein NMR spectra.
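    The shared ingredient of these methods, projecting a delay-coordinate embedding onto a low-dimensional signal subspace, can be sketched with plain PCA/SVD standing in for the LICA/dAMUSE/KPCA variants; `delay_embed` and `subspace_denoise` are illustrative names, not the paper's implementations:

```python
import numpy as np

def delay_embed(x, m):
    """Embed a 1-D signal in an m-dimensional space of delayed coordinates:
    row t of the result is (x[t], x[t+1], ..., x[t+m-1])."""
    n = len(x) - m + 1
    return np.stack([x[i:i + n] for i in range(m)], axis=1)

def subspace_denoise(x, m=10, k=2):
    """Project the trajectory matrix onto its k leading principal directions,
    then average the entries that correspond to each original sample."""
    X = delay_embed(x, m)
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    Xd = (U[:, :k] * s[:k]) @ Vt[:k] + mean   # rank-k reconstruction
    out = np.zeros(len(x))
    cnt = np.zeros(len(x))
    for i in range(m):
        out[i:i + len(Xd)] += Xd[:, i]
        cnt[i:i + len(Xd)] += 1
    return out / cnt
```

    A narrowband signal such as a sinusoid occupies a two-dimensional subspace in delay coordinates, so k=2 already removes most broadband noise; the kernel and local-ICA variants generalize the projection step to non-linear or locally adapted subspaces.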

    Recovery from Linear Measurements with Complexity-Matching Universal Signal Estimation

    We study the compressed sensing (CS) signal estimation problem, where an input signal is measured via a linear matrix multiplication under additive noise. While this setup usually assumes sparsity or compressibility in the input signal during recovery, the signal structure that can be leveraged is often not known a priori. In this paper, we consider universal CS recovery, where the statistics of a stationary ergodic signal source are estimated simultaneously with the signal itself. Inspired by Kolmogorov complexity and minimum description length, we focus on a maximum a posteriori (MAP) estimation framework that leverages universal priors to match the complexity of the source. Our framework can also be applied to general linear inverse problems where more measurements than in CS might be needed. We provide theoretical results that support the algorithmic feasibility of universal MAP estimation using a Markov chain Monte Carlo implementation, which is computationally challenging. We incorporate some techniques to accelerate the algorithm while providing comparable and in many cases better reconstruction quality than existing algorithms. Experimental results show the promise of universality in CS, particularly for low-complexity sources that do not exhibit standard sparsity or compressibility. (Comment: 29 pages, 8 figures)
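    The MCMC-based MAP estimation can be caricatured with a toy Metropolis sampler over a discrete alphabet, using an empirical-entropy code length as a crude proxy for a universal prior; everything here (`code_length`, `universal_map`, the parameter choices) is an illustrative assumption, far simpler than the paper's algorithm:

```python
import numpy as np
from collections import Counter

def code_length(x):
    """Empirical-entropy proxy (in bits) for a universal code length of a
    discrete signal: low-complexity signals compress to fewer bits."""
    counts = Counter(x.tolist())
    n = len(x)
    return -sum(c * np.log2(c / n) for c in counts.values())

def universal_map(y, A, alphabet, sigma=0.1, iters=5000, beta=1.0, seed=0):
    """Metropolis sampler targeting the MAP objective
    ||y - A x||^2 / (2 sigma^2) + code_length(x) over alphabet-valued x."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = rng.choice(alphabet, size=n)

    def energy(x):
        r = y - A @ x
        return r @ r / (2 * sigma ** 2) + code_length(x)

    E = energy(x)
    for _ in range(iters):
        prop = x.copy()
        prop[rng.integers(n)] = rng.choice(alphabet)   # resample one entry
        Ep = energy(prop)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if Ep < E or rng.random() < np.exp(beta * (E - Ep)):
            x, E = prop, Ep
    return x
```

    The complexity term is what makes the estimator "universal": it penalizes candidate signals by how poorly they compress, instead of assuming sparsity in a fixed basis.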

    Determining Principal Component Cardinality through the Principle of Minimum Description Length

    PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective, compression-based criteria for model selection, but it is difficult to analytically apply its modern definition - NML (Normalized Maximum Likelihood) - to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA in terms of the NML of linear regression, which is known. (Comment: LOD 201)
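    To make the component-cardinality problem concrete, here is a classical eigenvalue-based MDL criterion in the spirit of Wax and Kailath, used as a simple stand-in for the paper's NML bound (the function name and criterion are illustrative assumptions, not the paper's method):

```python
import numpy as np

def mdl_num_components(X):
    """Choose the number of principal components for data X (N samples x p
    features) by minimizing a Wax-Kailath-style MDL score: the likelihood of
    the trailing eigenvalues being equal (pure noise) plus a parameter-count
    penalty of 0.5 * k * (2p - k) * log N."""
    N, p = X.shape
    lam = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]  # descending
    lam = np.maximum(lam, 1e-12)
    scores = []
    for k in range(p):
        tail = lam[k:]
        geo = np.exp(np.mean(np.log(tail)))   # geometric mean of tail
        arith = np.mean(tail)                 # arithmetic mean of tail
        loglik = N * (p - k) * np.log(geo / arith)
        penalty = 0.5 * k * (2 * p - k) * np.log(N)
        scores.append(-loglik + penalty)
    return int(np.argmin(scores))
```

    The geometric/arithmetic-mean ratio equals 1 exactly when the trailing eigenvalues are all equal, so the score rewards splitting the spectrum into "signal" and "isotropic noise" parts, with the log N penalty discouraging overfitting the cardinality.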