Autonomous Cleaning of Corrupted Scanned Documents - A Generative Modeling Approach
We study the task of cleaning scanned text documents that are strongly
corrupted by dirt such as manual line strokes, spilled ink, etc. We aim to
autonomously remove dirt from a single letter-size page based only on the
information the page contains. Our approach therefore has to learn character
representations without supervision and requires a mechanism to distinguish
learned representations from irregular patterns. To learn character
representations, we use a probabilistic generative model parameterizing pattern
features, feature variances, the features' planar arrangements, and pattern
frequencies. The latent variables of the model describe pattern class, pattern
position, and the presence or absence of individual pattern features. The model
parameters are optimized using a novel variational EM approximation. After
learning, the parameters represent, independently of their absolute position,
planar feature arrangements and their variances. A quality measure defined
based on the learned representation then allows for an autonomous
discrimination between regular character patterns and the irregular patterns
making up the dirt. The irregular patterns can thus be removed to clean the
document. For a full Latin alphabet we found that a single page does not
contain sufficiently many character examples. However, we show that a page
containing a smaller number of character types can, even when heavily corrupted
by dirt, be cleaned efficiently and autonomously based solely on the structural
regularity of the characters it contains. In examples using characters from
several alphabets, we demonstrate the generality of the
approach and discuss its implications for future developments.
Comment: oral presentation and Google Student Travel Award; IEEE Conference on Computer Vision and Pattern Recognition 2012
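To make the learning scheme concrete, the following is a minimal sketch in the spirit of the abstract, not the paper's actual model: EM for a mixture of translation-invariant Bernoulli templates over binary patches, plus a likelihood-based quality score that flags irregular patches as dirt. Exact enumeration over a small latent grid stands in for the paper's variational EM approximation, and the class count K, patch size P, shift grid, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions throughout): EM for a mixture of
# translatable Bernoulli templates; a per-patch quality score then separates
# regular character patterns from irregular "dirt".
import numpy as np

rng = np.random.default_rng(0)
P, K = 12, 3                                    # assumed patch size and class count
shifts = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def log_lik(patch, tmpl):
    """Bernoulli log-likelihood of a binary patch under a template of pixel probabilities."""
    t = np.clip(tmpl, 1e-4, 1 - 1e-4)
    return float((patch * np.log(t) + (1 - patch) * np.log(1 - t)).sum())

def em(patches, iters=20):
    tmpl = rng.uniform(0.3, 0.7, size=(K, P, P))    # random template initialization
    log_pi = np.full(K, -np.log(K))                 # uniform class prior
    for _ in range(iters):
        # E-step: posterior over (class k, shift s) for every patch
        ll = np.array([[[log_lik(x, np.roll(tmpl[k], s, axis=(0, 1)))
                         for s in shifts] for k in range(K)] for x in patches])
        ll += log_pi[None, :, None]
        post = np.exp(ll - ll.max(axis=(1, 2), keepdims=True))
        post /= post.sum(axis=(1, 2), keepdims=True)
        # M-step: update templates from shift-aligned, responsibility-weighted patches
        for k in range(K):
            num = sum(post[n, k, j] * np.roll(patches[n], (-s[0], -s[1]), axis=(0, 1))
                      for n in range(len(patches)) for j, s in enumerate(shifts))
            tmpl[k] = num / (post[:, k, :].sum() + 1e-8)
        log_pi = np.log(post.sum(axis=(0, 2)) / len(patches) + 1e-8)
    return tmpl, log_pi

def quality(patch, tmpl):
    """Best achievable log-likelihood; low values indicate irregular patterns (dirt)."""
    return max(log_lik(patch, np.roll(tmpl[k], s, axis=(0, 1)))
               for k in range(K) for s in shifts)

# Toy usage: in practice the patches would be binary P-by-P crops of the page
patches = (rng.uniform(size=(40, P, P)) < 0.2).astype(float)
tmpl, log_pi = em(patches, iters=5)
scores = [quality(x, tmpl) for x in patches]    # low score => candidate dirt
```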
The Statistical Inefficiency of Sparse Coding for Images (or, One Gabor to Rule them All)
Sparse coding is a proven principle for learning compact representations of
images. However, sparse coding by itself often leads to very redundant
dictionaries. With images, this often takes the form of similar edge detectors
which are replicated many times at various positions, scales and orientations.
An immediate consequence of this observation is that the estimation of the
dictionary components is not statistically efficient. We propose a factored
model in which factors of variation (e.g. position, scale and orientation) are
untangled from the underlying Gabor-like filters. There is so much redundancy
in sparse codes for natural images that our model requires only a single
dictionary element (a Gabor-like edge detector) to outperform standard sparse
coding. Our model scales naturally to arbitrary-sized images while achieving
much greater statistical efficiency during learning. We validate this claim
with a number of experiments showing, in part, superior compression of
out-of-sample data using a sparse coding dictionary learned with only a single
image.
Comment: 9 pages, 8 figures
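As a concrete illustration of the factored idea, here is a toy sketch, not the authors' implementation: a single Gabor "mother" element is expanded over assumed orientation and scale grids into a dictionary, and a signal is sparse-coded with greedy matching pursuit; in the factored setting, position would be handled by applying the element convolutionally rather than by enumerating shifted atoms.

```python
# Toy sketch of a factored dictionary: every atom is a transformed copy of one
# Gabor element. Grid sizes and Gabor parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def gabor(size, sigma, theta, freq):
    """Gabor filter: an oriented sinusoid under a Gaussian envelope, unit-normalized."""
    r = np.arange(size) - size // 2
    X, Y = np.meshgrid(r, r)
    xr = X * np.cos(theta) + Y * np.sin(theta)      # coordinate along the orientation
    env = np.exp(-(X ** 2 + Y ** 2) / (2 * sigma ** 2))
    g = env * np.cos(2 * np.pi * freq * xr)
    return (g / np.linalg.norm(g)).ravel()

# Factored dictionary: columns are (orientation, scale) transforms of one element
thetas = np.linspace(0, np.pi, 8, endpoint=False)
sigmas = [2.0, 3.0, 4.5]
D = np.stack([gabor(16, s, t, freq=1.0 / (2.0 * s))  # frequency tied to scale
              for s in sigmas for t in thetas], axis=1)

def matching_pursuit(x, D, n_atoms=5):
    """Greedy sparse coding: repeatedly pick the atom most correlated with the residual."""
    residual, code = x.copy(), np.zeros(D.shape[1])
    for _ in range(n_atoms):
        c = D.T @ residual
        j = np.argmax(np.abs(c))
        code[j] += c[j]
        residual -= c[j] * D[:, j]
    return code, residual

# Toy usage: encode a random 16x16 patch with 8 atoms
x = rng.normal(size=16 * 16)
code, residual = matching_pursuit(x, D, n_atoms=8)
```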
A stochastic algorithm for probabilistic independent component analysis
The decomposition of a sample of images onto a relevant subspace is a recurrent
problem in many different fields, from Computer Vision to medical image
analysis. We propose in this paper a new learning principle and implementation
of the generative decomposition model generally known as noisy ICA (for
independent component analysis) based on the SAEM algorithm, which is a
versatile stochastic approximation of the standard EM algorithm. We demonstrate
the applicability of the method on a large range of decomposition models and
illustrate the developments with experimental results on various data sets.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS499 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
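For intuition about the algorithmic idea, a minimal sketch of SAEM for noisy ICA follows, under assumptions not taken from the paper: Laplace source priors, a random-walk Metropolis simulation step, a 1/t step-size schedule, and the observation model x = A s + Gaussian noise.

```python
# Minimal SAEM sketch for noisy ICA (illustrative assumptions, not the paper's
# exact scheme): simulate sources from their posterior, smooth the sufficient
# statistics with a decreasing step size, then update parameters in closed form.
import numpy as np

rng = np.random.default_rng(1)

def log_post(s, x, A, sig2):
    """Unnormalized log posterior of sources: Gaussian likelihood + Laplace prior."""
    return -np.sum((x - A @ s) ** 2) / (2 * sig2) - np.sum(np.abs(s))

def saem(X, n_src, iters=200):
    n_obs, N = X.shape
    A = rng.normal(size=(n_obs, n_src))
    sig2 = 1.0
    S = rng.normal(size=(n_src, N))     # current posterior samples of the sources
    Sxs = np.zeros((n_obs, n_src))      # smoothed statistic for E[x s^T]
    Sss = np.eye(n_src)                 # smoothed statistic for E[s s^T]
    for t in range(1, iters + 1):
        # Simulation step: one random-walk Metropolis sweep over all samples
        for n in range(N):
            prop = S[:, n] + 0.3 * rng.normal(size=n_src)
            if np.log(rng.uniform()) < (log_post(prop, X[:, n], A, sig2)
                                        - log_post(S[:, n], X[:, n], A, sig2)):
                S[:, n] = prop
        # Stochastic approximation: blend fresh statistics into the running ones
        gamma = 1.0 / t
        Sxs = (1 - gamma) * Sxs + gamma * (X @ S.T) / N
        Sss = (1 - gamma) * Sss + gamma * (S @ S.T) / N
        # M-step: closed-form mixing-matrix update; crude plug-in noise estimate
        A = Sxs @ np.linalg.inv(Sss + 1e-8 * np.eye(n_src))
        sig2 = max(float(np.mean((X - A @ S) ** 2)), 1e-6)
    return A, sig2

# Toy usage: 4 observed channels mixing 2 Laplace sources
S_true = rng.laplace(size=(2, 300))
A_true = rng.normal(size=(4, 2))
X = A_true @ S_true + 0.1 * rng.normal(size=(4, 300))
A_hat, sig2_hat = saem(X, n_src=2, iters=50)
```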
Invariance of visual operations at the level of receptive fields
Receptive field profiles registered by cell recordings have shown that
mammalian vision has developed receptive fields tuned to different sizes and
orientations in the image domain as well as to different image velocities in
space-time. This article presents a theoretical model by which families of
idealized receptive field profiles can be derived mathematically from a small
set of basic assumptions that correspond to structural properties of the
environment. The article also presents a theory for how basic invariance
properties to variations in scale, viewing direction and relative motion can be
obtained from the output of such receptive fields, using complementary
selection mechanisms that operate over the output of families of receptive
fields tuned to different parameters. The theory thereby shows how basic
invariance properties of a visual system can already be obtained at the level
of receptive fields, and it explains the different shapes of receptive field
profiles found in biological vision from the requirement that the visual system
should be invariant to the natural types of image transformations that occur in
its environment.
Comment: 40 pages, 17 figures
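To make the selection mechanism concrete, the sketch below (with an assumed scale grid and a toy image, not the article's full theory) implements the classic scale-selection idea over a family of Gaussian-derivative receptive fields: per pixel, pick the scale that maximizes the gamma-normalized Laplacian response, which yields a scale-covariant feature.

```python
# Illustrative sketch: scale selection over Gaussian-derivative receptive fields.
# The normalized Laplacian sigma^2 * (Lxx + Lyy) peaks at a scale proportional
# to the local structure size, so selecting its argmax gives scale invariance.
import numpy as np
from scipy.ndimage import gaussian_laplace

def scale_selection(image, sigmas):
    """Return per-pixel selected scale and the scale-normalized response there."""
    stack = np.stack([(s ** 2) * gaussian_laplace(image, s) for s in sigmas])
    idx = np.abs(stack).argmax(axis=0)             # scale index of strongest response
    sel = np.take(np.asarray(sigmas), idx)
    resp = np.take_along_axis(stack, idx[None], axis=0)[0]
    return sel, resp

# Toy usage: a bright disc of radius r is selected near sigma ~ r / sqrt(2)
img = np.zeros((64, 64))
yy, xx = np.mgrid[:64, :64]
img[(yy - 32) ** 2 + (xx - 32) ** 2 < 8 ** 2] = 1.0
sel, resp = scale_selection(img, np.geomspace(1.0, 12.0, 12))
print("selected scale at the disc centre:", sel[32, 32])
```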