MDL Denoising Revisited
We refine and extend an earlier MDL denoising criterion for wavelet-based
denoising. We start by showing that the denoising problem can be reformulated
as a clustering problem, where the goal is to obtain separate clusters for
informative and non-informative wavelet coefficients. This
suggests two refinements, adding a code-length for the model index, and
extending the model in order to account for subband-dependent coefficient
distributions. A third refinement is derivation of soft thresholding inspired
by predictive universal coding with weighted mixtures. We propose a practical
method incorporating all three refinements, which is shown to achieve good
performance and robustness in denoising both artificial and natural signals.
Comment: Submitted to IEEE Transactions on Information Theory, June 200
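The soft-thresholding step the abstract alludes to can be illustrated with a minimal sketch. This is not the paper's MDL-derived rule: it uses a one-level Haar transform, a median-based noise estimate, and the classical universal threshold sigma * sqrt(2 ln n), all standard ingredients assumed here for illustration.

```python
import math, random

def haar_forward(x):
    # One level of the orthonormal Haar transform (len(x) must be even).
    s = 1.0 / math.sqrt(2.0)
    approx = [(a + b) * s for a, b in zip(x[::2], x[1::2])]
    detail = [(a - b) * s for a, b in zip(x[::2], x[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def soft(c, t):
    # Soft thresholding: shrink the coefficient toward zero by t.
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x):
    approx, detail = haar_forward(x)
    # Robust noise-level estimate: median absolute detail coefficient / 0.6745.
    sigma = sorted(abs(d) for d in detail)[len(detail) // 2] / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))  # universal threshold
    return haar_inverse(approx, [soft(d, t) for d in detail])

# Demo: a noisy piecewise-constant signal.
random.seed(0)
clean = [1.0] * 32 + [5.0] * 32
noisy = [c + random.gauss(0.0, 0.5) for c in clean]
denoised = denoise(noisy)
```

Soft thresholding (as opposed to hard) shrinks the surviving detail coefficients as well as zeroing the small ones, which is what makes it a natural companion to the weighted-mixture view mentioned in the abstract.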
Robust denoising of electrophoresis and mass spectrometry signals with minimum description length principle
The need for high-throughput assays in molecular biology places increasing requirements on the applied signal processing and modelling methods. In order to extract useful information from the measurements, undesirable signal characteristics such as random noise must be removed. This can be done in a quite elegant and efficient way by the minimum description length (MDL) principle, which separates 'noise' from the useful information by treating noise as the part of the data that cannot be compressed. In its current form the MDL denoising method assumes a Gaussian noise model but does not require any ad hoc parameter settings. It provides a basis for high-speed automated processing systems that do not require continual user intervention to validate the results, as conventional signal processing methods do. Our analysis of the denoising problem in mass spectrometry, capillary electrophoresis genotyping, and sequencing signals suggests that the MDL denoising method produces robust and intuitively appealing results, sometimes even in situations where competing approaches perform poorly.
Denoising using local projective subspace methods
In this paper we present denoising algorithms for enhancing noisy signals based on Local ICA (LICA), Delayed AMUSE (dAMUSE)
and Kernel PCA (KPCA). The algorithm LICA relies on applying ICA locally to clusters of signals embedded in a high-dimensional
feature space of delayed coordinates. The components resembling the signals can be detected by various criteria like estimators of
kurtosis or the variance of autocorrelations depending on the statistical nature of the signal. The algorithm proposed can be applied
favorably to the problem of denoising multi-dimensional data. Another projective subspace denoising method using delayed coordinates
has been proposed recently with the algorithm dAMUSE. It combines the solution of blind source separation problems with denoising
efforts in an elegant way and proofs to be very efficient and fast. Finally, KPCA represents a non-linear projective subspace method that
is well suited for denoising also. Besides illustrative applications to toy examples and images, we provide an application of all algorithms
considered to the analysis of protein NMR spectra.info:eu-repo/semantics/publishedVersio
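None of LICA, dAMUSE, or KPCA fits in a few lines, but their common ingredient, embedding a signal in a feature space of delayed coordinates and projecting onto a dominant subspace, can be sketched. The window length m, rank r, and the power-iteration eigensolver below are illustrative choices, not the algorithms of the paper.

```python
import math, random

def embed(x, m):
    # Trajectory matrix: rows are delayed-coordinate vectors x[i:i+m].
    return [x[i:i + m] for i in range(len(x) - m + 1)]

def top_eigvecs(C, r, iters=200):
    # Power iteration with deflation for the r leading eigenvectors of C.
    m = len(C)
    random.seed(0)
    vecs = []
    for _ in range(r):
        v = [random.gauss(0.0, 1.0) for _ in range(m)]
        for _ in range(iters):
            w = [sum(C[i][j] * v[j] for j in range(m)) for i in range(m)]
            for u in vecs:  # orthogonalize against eigenvectors found so far
                d = sum(wi * ui for wi, ui in zip(w, u))
                w = [wi - d * ui for wi, ui in zip(w, u)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            v = [wi / norm for wi in w]
        vecs.append(v)
    return vecs

def subspace_denoise(x, m=8, r=2):
    X = embed(x, m)
    n = len(X)
    # Covariance of the embedded vectors.
    C = [[sum(row[i] * row[j] for row in X) / n for j in range(m)]
         for i in range(m)]
    V = top_eigvecs(C, r)
    # Project each embedded vector onto the estimated signal subspace.
    P = []
    for row in X:
        coeffs = [sum(ri * vi for ri, vi in zip(row, v)) for v in V]
        P.append([sum(c * v[j] for c, v in zip(coeffs, V)) for j in range(m)])
    # Diagonal averaging back to a one-dimensional signal.
    out, cnt = [0.0] * len(x), [0] * len(x)
    for i, row in enumerate(P):
        for j, val in enumerate(row):
            out[i + j] += val
            cnt[i + j] += 1
    return [o / c for o, c in zip(out, cnt)]

# Demo: a sinusoid, whose delayed-coordinate vectors span a 2-D subspace.
random.seed(1)
clean = [math.sin(2.0 * math.pi * k / 16.0) for k in range(128)]
noisy = [c + random.gauss(0.0, 0.3) for c in clean]
denoised = subspace_denoise(noisy)
```

Projecting onto a rank-2 subspace keeps the sinusoid essentially intact while discarding the noise energy spread over the remaining m - r directions.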
Recovery from Linear Measurements with Complexity-Matching Universal Signal Estimation
We study the compressed sensing (CS) signal estimation problem where an input
signal is measured via a linear matrix multiplication under additive noise.
While this setup usually assumes sparsity or compressibility in the input
signal during recovery, the signal structure that can be leveraged is often not
known a priori. In this paper, we consider universal CS recovery, where the
statistics of a stationary ergodic signal source are estimated simultaneously
with the signal itself. Inspired by Kolmogorov complexity and minimum
description length, we focus on a maximum a posteriori (MAP) estimation
framework that leverages universal priors to match the complexity of the
source. Our framework can also be applied to general linear inverse problems
where more measurements than in CS might be needed. We provide theoretical
results that support the algorithmic feasibility of universal MAP estimation
using a Markov chain Monte Carlo implementation, which is computationally
challenging. We incorporate some techniques to accelerate the algorithm while
providing comparable and in many cases better reconstruction quality than
existing algorithms. Experimental results show the promise of universality in
CS, particularly for low-complexity sources that do not exhibit standard
sparsity or compressibility.
Comment: 29 pages, 8 figures
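The MCMC implementation studied in the paper is out of scope for a sketch, but the complexity-penalized MAP objective itself can be demonstrated at toy scale, where exhaustive enumeration replaces sampling. The binary alphabet, the support-based code length, and the problem sizes below are illustrative assumptions, not the paper's universal prior.

```python
import itertools, math, random

random.seed(0)
n, m, sigma = 8, 5, 0.05
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [1, 0, 0, 1, 0, 0, 0, 0]

def measure(x):
    # Linear measurement y = A x (one inner product per row of A).
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

y = [v + random.gauss(0.0, sigma) for v in measure(x_true)]

def code_length(x):
    # Toy complexity prior: describe the support, log2(n) bits per nonzero.
    return sum(x) * math.log2(n)

def neg_log_posterior(x):
    # Gaussian likelihood term plus complexity penalty (in nats).
    r = sum((yi - zi) ** 2 for yi, zi in zip(y, measure(x)))
    return r / (2.0 * sigma ** 2) + math.log(2.0) * code_length(x)

# Exhaustive MAP over all binary candidates (feasible only for tiny n).
x_map = min(itertools.product([0, 1], repeat=n), key=neg_log_posterior)
```

With more measurements than the support size and low noise, the complexity term breaks ties in favor of the simplest signal consistent with y, which is the essence of complexity-matching estimation; the paper replaces both the enumeration and the toy prior with universal components.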
Determining Principal Component Cardinality through the Principle of Minimum Description Length
PCA (Principal Component Analysis) and its variants are ubiquitous techniques
for matrix dimension reduction and reduced-dimension latent-factor extraction.
One significant challenge in using PCA is the choice of the number of principal
components. The information-theoretic MDL (Minimum Description Length) principle
gives objective compression-based criteria for model selection, but it is
difficult to analytically apply its modern definition, NML (Normalized Maximum
Likelihood), to the problem of PCA. This work shows a general reduction of NML
problems to lower-dimension problems. Applying this reduction, it bounds the
NML of PCA in terms of the NML of linear regression, which is known.
Comment: LOD 201
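As a concrete, if much simpler, instance of MDL-based choice of the number of principal components, the classical Wax-Kailath criterion scores each candidate k from the sample eigenvalues alone; it is not the NML bound derived in this work. The eigenvalues below are synthetic.

```python
import math

def mdl_order(eigvals, N):
    # Wax & Kailath (1985) MDL criterion: pick the number of components k
    # minimizing a fit term (how equal the trailing eigenvalues are, via the
    # geometric/arithmetic mean ratio) plus a parameter-count penalty in k.
    p = len(eigvals)
    scores = []
    for k in range(p):
        tail = eigvals[k:]
        arith = sum(tail) / len(tail)
        geo = math.exp(sum(math.log(v) for v in tail) / len(tail))
        fit = -N * len(tail) * math.log(geo / arith)
        penalty = 0.5 * k * (2 * p - k) * math.log(N)
        scores.append(fit + penalty)
    return min(range(p), key=scores.__getitem__)

# Demo: three dominant eigenvalues over a flat noise floor, N = 500 samples.
eigvals = [10.0, 8.0, 6.0, 1.01, 0.99, 1.0, 1.02, 0.98]
k = mdl_order(eigvals, N=500)
```

The fit term vanishes exactly when the trailing eigenvalues are equal (a pure noise floor), so the penalty term pushes k down to the smallest value at which that approximately holds.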