Neural Expectation Maximization
Many real world tasks such as reasoning and physical interaction require
identification and manipulation of conceptual entities. A first step towards
solving these tasks is the automated discovery of distributed symbol-like
representations. In this paper, we explicitly formalize this problem as
inference in a spatial mixture model where each component is parametrized by a
neural network. Based on the Expectation Maximization framework we then derive
a differentiable clustering method that simultaneously learns how to group and
represent individual entities. We evaluate our method on the (sequential)
perceptual grouping task and find that it is able to accurately recover the
constituent objects. We demonstrate that the learned representations are useful
for next-step prediction.
Comment: Accepted to NIPS 2017
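To illustrate the underlying recursion that the paper makes differentiable, here is a plain (non-neural) EM loop for a toy one-dimensional mixture. The neural-network parametrization of each component is replaced by a simple per-component mean, and all names and settings below are illustrative, not from the paper:

```python
import numpy as np

def em_mixture(x, K=2, iters=20, sigma=0.5, seed=0):
    """Toy EM for a 1-D Gaussian mixture with fixed variance.

    In Neural EM each component's parameters come from a neural
    network; here a component is just a scalar mean (a simplification)."""
    rng = np.random.default_rng(seed)
    mu = rng.uniform(x.min(), x.max(), size=K)   # component parameters
    for _ in range(iters):
        # E-step: responsibilities gamma[k, i] ∝ N(x_i | mu_k, sigma^2)
        log_p = -(x[None, :] - mu[:, None]) ** 2 / (2 * sigma ** 2)
        log_p -= log_p.max(axis=0, keepdims=True)
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=0, keepdims=True)
        # M-step: re-estimate each component from its soft assignment
        mu = (gamma * x[None, :]).sum(axis=1) / gamma.sum(axis=1)
    return mu, gamma

# Two well-separated "entities": the means should converge to ~0 and ~5.
x = np.concatenate([np.full(50, 0.0), np.full(50, 5.0)])
mu, gamma = em_mixture(x, K=2)
print(np.sort(mu))  # means near 0 and 5
```

Unrolling a fixed number of such E/M updates and backpropagating through them is what turns this clustering procedure into a trainable module.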
Mixtures of conditional Gaussian scale mixtures applied to multiscale image representations
We present a probabilistic model for natural images which is based on
Gaussian scale mixtures and a simple multiscale representation. In contrast to
the dominant approach to modeling whole images, which focuses on Markov random
fields, we formulate our model in terms of a directed graphical model. We show
that it is able to generate images with interesting higher-order correlations
when trained on natural images or samples from an occlusion based model. More
importantly, the directed model enables us to perform a principled evaluation.
While it is easy to generate visually appealing images, we demonstrate that our
model also yields the best performance reported to date when evaluated with
respect to the cross-entropy rate, a measure tightly linked to the average
log-likelihood.
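The Gaussian scale mixture at the heart of such models is simple to simulate: a sample is a Gaussian draw multiplied by the square root of a random positive scale. The sketch below uses an arbitrarily chosen lognormal scale distribution (an assumption for illustration, not the paper's choice) and shows the resulting heavy tails via excess kurtosis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Gaussian scale mixture: x = sqrt(z) * u, with u ~ N(0, 1) and a
# random positive scale z (lognormal here, purely for illustration).
z = rng.lognormal(mean=0.0, sigma=0.5, size=n)
u = rng.standard_normal(n)
x = np.sqrt(z) * u

def excess_kurtosis(v):
    v = v - v.mean()
    return (v ** 4).mean() / (v ** 2).mean() ** 2 - 3.0

print(excess_kurtosis(rng.standard_normal(n)))  # ≈ 0 for a Gaussian
print(excess_kurtosis(x))                       # > 0: heavier tails
```

The positive excess kurtosis is the signature property that makes GSMs a better fit to wavelet-domain image statistics than a plain Gaussian.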
Natural Image Coding in V1: How Much Use is Orientation Selectivity?
Orientation selectivity is the most striking feature of simple cell coding in
V1 which has been shown to emerge from the reduction of higher-order
correlations in natural images in a large variety of statistical image models.
The most parsimonious one among these models is linear Independent Component
Analysis (ICA), whereas second-order decorrelation transformations such as
Principal Component Analysis (PCA) do not yield oriented filters. Because of
this finding it has been suggested that the emergence of orientation
selectivity may be explained by higher-order redundancy reduction. In order to
assess the tenability of this hypothesis, it is an important empirical question
how much more redundancy can be removed with ICA in comparison to PCA, or
other second-order decorrelation methods. This question has not yet been
settled, as over the last ten years contradicting results have been reported
ranging from less than five to more than one hundred percent extra gain for ICA.
Here, we aim at resolving this conflict by presenting a very careful and
comprehensive analysis using three evaluation criteria related to redundancy
reduction: In addition to the multi-information and the average log-loss we
compute, for the first time, complete rate-distortion curves for ICA in
comparison with PCA. Without exception, we find that the advantage of the ICA
filters is surprisingly small. Furthermore, we show that a simple spherically
symmetric distribution with only two parameters can fit the data even better
than the probabilistic model underlying ICA. Since spherically symmetric models
are agnostic with respect to the specific filter shapes, we conclude that
orientation selectivity is unlikely to play a critical role for redundancy
reduction.
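A toy numerical illustration of why this comparison hinges on higher-order statistics: a rotation of independent Laplacian sources is already decorrelated, so second-order methods such as PCA cannot identify it, while maximizing kurtosis (a brute-force 2-D stand-in for ICA) recovers the mixing angle. This sketch is illustrative only and is not the paper's evaluation methodology:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Independent heavy-tailed (Laplacian) sources, mixed by a 45° rotation.
# The mixture has identity covariance, so PCA/decorrelation cannot undo it.
s = rng.laplace(size=(2, n))
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = R @ s

def kurt(v):
    v = (v - v.mean()) / v.std()
    return (v ** 4).mean() - 3.0   # excess kurtosis

# Brute-force 2-D "ICA": pick the rotation maximizing total kurtosis.
angles = np.linspace(0.0, np.pi / 2, 181)
scores = []
for a in angles:
    W = np.array([[np.cos(a), np.sin(a)],
                  [-np.sin(a), np.cos(a)]])   # inverse rotation by a
    y = W @ x
    scores.append(abs(kurt(y[0])) + abs(kurt(y[1])))
best = angles[int(np.argmax(scores))]
print(np.degrees(best))  # ≈ 45°: higher-order statistics find the mixing
```

The redundancy question in the abstract is then how much extra coding gain this higher-order unmixing buys on natural images, beyond second-order whitening.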
Cortical Surround Interactions and Perceptual Salience via Natural Scene Statistics
Spatial context in images induces perceptual phenomena associated with salience and modulates the responses of neurons in primary visual cortex (V1). However, the computational and ecological principles underlying contextual effects are incompletely understood. We introduce a model of natural images that includes grouping and segmentation of neighboring features based on their joint statistics, and we interpret the firing rates of V1 neurons as performing optimal recognition in this model. We show that this leads to a substantial generalization of divisive normalization, a computation that is ubiquitous in many neural areas and systems. A main novelty in our model is that the influence of the context on a target stimulus is determined by their degree of statistical dependence. We optimized the parameters of the model on natural image patches, and then simulated neural and perceptual responses on stimuli used in classical experiments. The model reproduces some rich and complex response patterns observed in V1, such as the contrast dependence, orientation tuning and spatial asymmetry of surround suppression, while also allowing for surround facilitation under conditions of weak stimulation. It also mimics the perceptual salience produced by simple displays, and leads to readily testable predictions. Our results provide a principled account of orientation-based contextual modulation in early vision and its sensitivity to the homogeneity and spatial arrangement of inputs, and lend statistical support to the theory that V1 computes visual salience.
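The divisive normalization that the model generalizes can be sketched in its standard form: each unit's drive is divided by a constant plus a weighted sum of the population's drives. The numbers below are arbitrary and only illustrate surround suppression, not the paper's statistically derived, dependence-gated weighting:

```python
import numpy as np

def divisive_normalization(drive, weights, sigma=1.0):
    """Standard divisive normalization:
    r_i = drive_i / (sigma^2 + sum_j w_ij * drive_j),
    where `drive` holds each unit's (e.g. squared) filter response and
    the denominator pools over the contextual population."""
    pool = sigma ** 2 + weights @ drive
    return drive / pool

# Four units with uniform pooling weights. Stimulating the "surround"
# units (1-3) suppresses the response of the "center" unit (0).
W = np.full((4, 4), 0.25)
drive_center_only = np.array([4.0, 0.0, 0.0, 0.0])
drive_with_surround = np.array([4.0, 4.0, 4.0, 4.0])

r_alone = divisive_normalization(drive_center_only, W)[0]
r_surround = divisive_normalization(drive_with_surround, W)[0]
print(r_alone, r_surround)  # prints 2.0 0.8: the surround suppresses
```

In the paper's model, the effective pooling weights are not fixed as here but depend on the inferred statistical dependence between center and surround, which is what allows facilitation as well as suppression.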
Sparse and low-rank techniques for the efficient restoration of images
Image reconstruction is a key problem in numerous applications of computer vision and medical imaging. By removing noise and artifacts from corrupted images, or by enhancing the quality of low-resolution images, reconstruction methods are essential to provide high-quality images for these applications. Over the years, extensive research efforts have been invested toward the development of accurate and efficient approaches for this problem.
Recently, considerable improvements have been achieved by exploiting the principles of sparse representation and nonlocal self-similarity. However, techniques based on these principles often suffer from important limitations that impede their use in high-quality and large-scale applications. For example, sparse representation approaches consider local patches during reconstruction, but ignore the global structure of the image. Likewise, because they average over groups of similar patches, nonlocal self-similarity methods tend to over-smooth images. Such methods can also be computationally expensive, requiring an hour or more to reconstruct a single image. Furthermore, existing reconstruction approaches consider either local patch-based regularization or global structure regularization, due to the complexity of combining both regularization strategies in a single model. Yet, such a combined model could improve upon existing techniques by removing noise or reconstruction artifacts, while preserving both local details and global structure in the image. Similarly, current approaches rarely consider external information during the reconstruction process. When the structure to reconstruct is known, external information like statistical atlases or geometrical priors could also improve performance by guiding the reconstruction.
This thesis addresses limitations of the prior art through three distinct contributions. The first contribution investigates the histogram of image gradients as a powerful prior for image reconstruction. Due to the trade-off between noise removal and smoothing, image reconstruction techniques based on global or local regularization often over-smooth the image, leading to the loss of edges and textures. To alleviate this problem, we propose a novel prior for preserving the distribution of image gradients modeled as a histogram. This prior is combined with low-rank patch regularization in a single efficient model, which is then shown to improve reconstruction accuracy for the problems of denoising and deblurring.
The second contribution explores the joint modeling of local and global structure regularization for image restoration. Toward this goal, groups of similar patches are reconstructed simultaneously using an adaptive regularization technique based on the weighted nuclear norm. An innovative strategy, which decomposes the image into a smooth component and a sparse residual, is proposed to preserve global image structure. This strategy is shown to better exploit the property of structure sparsity than standard techniques like total variation. The proposed model is evaluated on the problems of completion and super-resolution, outperforming state-of-the-art approaches for these tasks.
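The low-rank patch regularization running through these contributions reduces, in its simplest uniformly weighted form, to soft-thresholding the singular values of a stacked group of similar patches (the weighted nuclear norm shrinks each singular value by its own threshold instead). A minimal sketch with synthetic data and an arbitrary threshold:

```python
import numpy as np

def svt(Y, tau):
    """Singular-value soft-thresholding: the proximal operator of the
    (uniformly weighted) nuclear norm."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (U * s) @ Vt

rng = np.random.default_rng(0)
# A "group" of 20 similar 16-pixel patches: rank-1 structure plus noise.
patch = rng.standard_normal(16)
clean = np.outer(np.ones(20), patch)
group = clean + 0.3 * rng.standard_normal((20, 16))

denoised = svt(group, tau=3.0)
err_noisy = np.linalg.norm(group - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(err_denoised < err_noisy)  # low-rank shrinkage reduces the error
```

Because similar patches make the stacked matrix approximately low-rank, thresholding away the small singular values removes noise while retaining the shared structure, which is the intuition behind the adaptive weighted-nuclear-norm variant used in the thesis.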
Lastly, the third contribution of this thesis proposes an atlas-based prior for the efficient reconstruction of MR data. Although popular, image priors based on total variation and nonlocal patch similarity often over-smooth edges and textures in the image due to the uniform regularization of gradients. Unlike natural images, the spatial characteristics of medical images are often restricted by the target anatomical structure and imaging modality. Based on this principle, we propose a novel MRI reconstruction method that leverages external information in the form of a probabilistic atlas. This atlas controls the level of gradient regularization at each image location, via a weighted total-variation prior. The proposed method also exploits the redundancy of nonlocal similar patches through a sparse representation model. Experiments on a large-scale dataset of T1-weighted images show this method to be highly competitive with the state-of-the-art.
Recent Advances in Signal Processing
Signal processing is a critical component of most new technological developments and poses challenges in a wide variety of applications across science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.
Image denoising using mixtures of Gaussian scale mixtures
The local statistical properties of photographic images, when represented in a multi-scale basis, have been described using Gaussian scale mixtures (GSMs). In that model, each spatial neighborhood of coefficients is described as a Gaussian random vector modulated by a random hidden positive scaling variable. Here, we introduce a more powerful model in which neighborhoods of each subband are described as a finite mixture of GSMs. We develop methods to learn the mixing densities and covariance matrices associated with each of the GSM components from a single image, and show that this process naturally segments the image into regions of similar content. The model parameters can also be learned in the presence of additive Gaussian noise, and the resulting fitted model may be used as a prior for Bayesian noise removal. Simulations demonstrate this model substantially outperforms the original GSM model. Index Terms — Image denoising, Image modelling, Gaussian scale mixture
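The Bayesian noise-removal step for a GSM prior has a simple form once the hidden multiplier is discretized: conditioned on the multiplier, the optimal estimate is Wiener shrinkage, and the final estimate averages these over the multiplier's posterior. The scalar, single-component sketch below (with an arbitrary multiplier grid and a crude flat prior; the paper's model works on coefficient neighborhoods with covariances and mixes several GSM components) illustrates the computation:

```python
import numpy as np

def gsm_denoise(y, z_grid, p_z, var_u=1.0, var_n=0.25):
    """Bayesian least-squares estimate under a scalar GSM prior:
    x = sqrt(z) * u, y = x + n. Conditioned on z the estimate is a
    Wiener shrinkage of y; the answer averages these over p(z | y).
    A mixture of GSMs would additionally average over components."""
    zv = z_grid * var_u                  # prior variance of x given z
    s2 = zv + var_n                      # p(y | z) = N(0, z*var_u + var_n)
    log_like = -0.5 * (np.log(s2)[None, :] + (y[:, None] ** 2) / s2[None, :])
    log_like -= log_like.max(axis=1, keepdims=True)
    post = np.exp(log_like) * p_z[None, :]
    post /= post.sum(axis=1, keepdims=True)   # posterior over z, per sample
    wiener = zv / s2                          # shrinkage factor given z
    return (post * wiener[None, :]).sum(axis=1) * y

rng = np.random.default_rng(0)
z = rng.gamma(shape=1.0, size=5000)           # hidden multipliers
x = np.sqrt(z) * rng.standard_normal(5000)    # GSM samples
y = x + 0.5 * rng.standard_normal(5000)       # noisy observations

z_grid = np.linspace(0.05, 8.0, 60)           # illustrative grid
p_z = np.ones_like(z_grid) / z_grid.size      # crude flat prior on z
x_hat = gsm_denoise(y, z_grid, p_z, var_n=0.25)
print(np.mean((x_hat - x) ** 2) < np.mean((y - x) ** 2))  # denoising helps
```

The mixture-of-GSMs model in the paper adds an outer sum over components, with responsibilities learned per image, which is what lets a single model adapt to regions of differing content.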