46 research outputs found

    Information Theoretic Principles of Universal Discrete Denoising

    Today, the internet makes tremendous amounts of data widely available. Often, the same information is behind multiple different available data sets. This lends growing importance to latent variable models that try to learn the hidden information from the available imperfect versions. For example, social media platforms can contain an abundance of pictures of the same person or object, all taken from different perspectives. In a simplified scenario, one may consider pictures taken from the same perspective, which are distorted by noise. This latter application allows for a rigorous mathematical treatment, which is the content of this contribution. We apply a recently developed method of dependent component analysis to image denoising when multiple distorted copies of one and the same image are available, each being corrupted by a different and unknown noise process. In a simplified scenario, we assume that the distorted image is corrupted by noise that acts independently on each pixel. We answer completely the question of how to perform optimal denoising when at least three distorted copies are available: first we define optimality of an algorithm in the presented scenario, and then we describe an asymptotically optimal universal discrete denoising algorithm (UDDA). In the case of binary data and binary symmetric noise, we develop a simplified variant of the algorithm, dubbed BUDDA, which we prove to attain universal denoising uniformly. Comment: 10 pages, 6 figures

    Discrete denoising of heterogenous two-dimensional data

    We consider discrete denoising of two-dimensional data with characteristics that may be varying abruptly between regions. Using a quadtree decomposition technique and space-filling curves, we extend the recently developed S-DUDE (Shifting Discrete Universal DEnoiser), which was tailored to one-dimensional data, to the two-dimensional case. Our scheme competes with a genie that has access, in addition to the noisy data, also to the underlying noiseless data, and can employ $m$ different two-dimensional sliding window denoisers along $m$ distinct regions obtained by a quadtree decomposition with $m$ leaves, in a way that minimizes the overall loss. We show that, regardless of what the underlying noiseless data may be, the two-dimensional S-DUDE performs essentially as well as this genie, provided that the number of distinct regions satisfies $m = o(n)$, where $n$ is the total size of the data. The resulting algorithm complexity is still linear in both $n$ and $m$, as in the one-dimensional case. Our experimental results show that the two-dimensional S-DUDE can be effective when the characteristics of the underlying clean image vary across different regions in the data. Comment: 16 pages, submitted to IEEE Transactions on Information Theory

    Thermodynamics of the Binary Symmetric Channel

    We study a hidden Markov process which is the result of a transmission of the binary symmetric Markov source over the memoryless binary symmetric channel. This process has been studied extensively in Information Theory and is often used as a benchmark case for so-called denoising algorithms. Exploiting the link between this process and the 1D Random Field Ising Model (RFIM), we are able to identify the Gibbs potential of the resulting hidden Markov process. Moreover, we obtain a stronger bound on the memory decay rate. We conclude with a discussion of the implications of our results for the development of denoising algorithms.
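As a rough illustration of the benchmark process described in this abstract, the sketch below simulates a binary symmetric Markov source observed through a memoryless binary symmetric channel. The function name `simulate_bsc_markov` and the parameter names `p` (source flip probability) and `eps` (channel crossover probability) are assumptions made for this example and do not come from the paper.

```python
import random


def simulate_bsc_markov(n, p, eps, seed=0):
    """Return (x, z): a clean binary Markov sequence and its noisy copy.

    n    -- sequence length (n >= 1)
    p    -- probability that the source switches state between steps (assumed name)
    eps  -- crossover probability of the binary symmetric channel (assumed name)
    seed -- seed for the local random generator, for reproducibility
    """
    rng = random.Random(seed)
    # Symmetric Markov source: uniform start, then flip with probability p.
    x = [rng.randint(0, 1)]
    for _ in range(n - 1):
        flip = int(rng.random() < p)
        x.append(x[-1] ^ flip)
    # Memoryless channel: each symbol is flipped independently with probability eps.
    z = [xi ^ int(rng.random() < eps) for xi in x]
    return x, z


if __name__ == "__main__":
    clean, noisy = simulate_bsc_markov(n=20, p=0.1, eps=0.2, seed=1)
    print("clean:", "".join(map(str, clean)))
    print("noisy:", "".join(map(str, noisy)))
```

The observed sequence `z` is the hidden Markov process in question: `x` is Markov, but `z` on its own generally is not.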

    Discrete Denoising with Shifts

    We introduce S-DUDE, a new algorithm for denoising DMC-corrupted data. The algorithm, which generalizes the recently introduced DUDE (Discrete Universal DEnoiser) of Weissman et al., aims to compete with a genie that has access, in addition to the noisy data, also to the underlying clean data, and can choose to switch, up to $m$ times, between sliding window denoisers in a way that minimizes the overall loss. When the underlying data form an individual sequence, we show that the S-DUDE performs essentially as well as this genie, provided that $m$ is sub-linear in the size of the data. When the clean data is emitted by a piecewise stationary process, we show that the S-DUDE achieves the optimum distribution-dependent performance, provided that the same sub-linearity condition is imposed on the number of switches. To further substantiate the universal optimality of the S-DUDE, we show that when the number of switches is allowed to grow linearly with the size of the data, \emph{any} (sequence of) scheme(s) fails to compete in the above senses. Using dynamic programming, we derive an efficient implementation of the S-DUDE, which has complexity (time and memory) growing only linearly with the data size and the number of switches $m$. Preliminary experimental results are presented, suggesting that S-DUDE has the capacity to significantly improve on the performance attained by the original DUDE in applications where the nature of the data abruptly changes in time (or space), as is often the case in practice. Comment: 30 pages, 3 figures, submitted to IEEE Trans. Inform. Theory
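For orientation, the sketch below shows the kind of sliding window denoiser that this abstract builds on: a two-pass, DUDE-style rule for the special case of binary data, a binary symmetric channel with known crossover probability, and Hamming loss. It is not S-DUDE (there is no switching between denoisers and no dynamic programming), and the function name `dude_binary` and parameters `delta` and `k` are assumptions chosen for this example.

```python
from collections import defaultdict


def dude_binary(z, delta, k):
    """Two-pass DUDE-style denoiser for a binary sequence (illustrative sketch).

    z     -- list of 0/1 noisy symbols
    delta -- assumed known BSC crossover probability, 0 < delta < 0.5
    k     -- one-sided context length (k symbols on each side)
    """
    n = len(z)

    def context(i):
        # Two-sided context: k symbols to the left and k to the right of i.
        return (tuple(z[i - k:i]), tuple(z[i + 1:i + k + 1]))

    # Pass 1: for each context, count how often the centre symbol is 0 or 1.
    counts = defaultdict(lambda: [0, 0])
    for i in range(k, n - k):
        counts[context(i)][z[i]] += 1

    # Threshold for the binary/BSC/Hamming case, written as numer/denom
    # so that the comparison below avoids division by zero.
    numer = 2 * delta * (1 - delta)
    denom = (1 - delta) ** 2 + delta ** 2

    # Pass 2: flip a symbol when it is too rare within its own context.
    out = list(z)  # boundary symbols without a full context stay unchanged
    for i in range(k, n - k):
        m = counts[context(i)]
        keep, flip = m[z[i]], m[1 - z[i]]
        if keep * denom < flip * numer:
            out[i] = 1 - z[i]
    return out


if __name__ == "__main__":
    noisy = [0] * 50
    noisy[25] = 1  # a single isolated symbol in an otherwise constant sequence
    print(dude_binary(noisy, delta=0.1, k=1))
```

A single fixed rule like this is applied along the whole sequence; the abstract's contribution is to let the rule change up to $m$ times, which this sketch does not attempt.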