245 research outputs found
A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity
The richness of natural images makes the quest for optimal representations in
image processing and computer vision challenging. The latter observation has
not prevented the design of image representations, which trade off between
efficiency and complexity, while achieving accurate rendering of smooth regions
as well as reproducing faithful contours and textures. The most recent ones,
proposed in the past decade, share a hybrid heritage highlighting the
multiscale and oriented nature of edges and patterns in images. This paper
presents a panorama of the aforementioned literature on decompositions in
multiscale, multi-orientation bases or dictionaries. They typically exhibit
redundancy to improve sparsity in the transformed domain and sometimes its
invariance with respect to simple geometric deformations (translation,
rotation). Oriented multiscale dictionaries extend traditional wavelet
processing and may offer rotation invariance. Highly redundant dictionaries
require specific algorithms to simplify the search for an efficient (sparse)
representation. We also discuss the extension of multiscale geometric
decompositions to non-Euclidean domains such as the sphere or arbitrary meshed
surfaces. The etymology of panorama suggests an overview, based on a choice of
partially overlapping "pictures". We hope that this paper will contribute to
the appreciation and apprehension of a stream of current research directions in
image understanding.
Comment: 65 pages, 33 figures, 303 references
A Generative Model of Natural Texture Surrogates
Natural images can be viewed as patchworks of different textures, where the
local image statistics are roughly stationary within a small neighborhood but
otherwise vary from region to region. In order to model this variability, we
first applied the parametric texture algorithm of Portilla and Simoncelli to
image patches of 64×64 pixels in a large database of natural images, so that
each image patch is described by 655 texture parameters which specify
certain statistics, such as variances and covariances of wavelet coefficients
or coefficient magnitudes within that patch.
To model the statistics of these texture parameters, we then developed
suitable nonlinear transformations of the parameters that allowed us to fit
their joint statistics with a multivariate Gaussian distribution. We find that
the first 200 principal components contain more than 99% of the variance and
are sufficient to generate textures that are perceptually extremely close to
those generated with all 655 components. We demonstrate the usefulness of the
model in several ways: (1) We sample ensembles of texture patches that can be
directly compared to samples of patches from the natural image database and can
to a high degree reproduce their perceptual appearance. (2) We further
developed an image compression algorithm which generates surprisingly accurate
images at bit rates as low as 0.14 bits/pixel. Finally, (3) we demonstrate how
our approach can be used for an efficient and objective evaluation of samples
generated with probabilistic models of natural images.
Comment: 34 pages, 9 figures
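The modeling pipeline this abstract describes (fit a multivariate Gaussian to the transformed texture parameters, then keep the leading principal components carrying at least 99% of the variance and sample from the reduced model) can be sketched as follows. The data here is synthetic, standing in for the 655 Portilla-Simoncelli parameters; the nonlinear parameter transformations of the paper are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the (already nonlinearly transformed) 655
# texture parameters measured on many patches; in the paper these come
# from Portilla-Simoncelli analysis of 64x64 natural-image patches.
n_patches, n_params = 2000, 655
A = rng.standard_normal((n_params, n_params)) * 0.05
X = rng.standard_normal((n_patches, n_params)) @ A

# Fit a multivariate Gaussian: sample mean plus PCA of the centered data.
mu = X.mean(axis=0)
Xc = X - mu
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
var = s ** 2 / (n_patches - 1)  # per-component variances

# Keep the leading components that explain at least 99% of the variance.
k = int(np.searchsorted(np.cumsum(var) / var.sum(), 0.99)) + 1

# Sample new parameter vectors from the reduced Gaussian model.
z = rng.standard_normal((10, k)) * np.sqrt(var[:k])
samples = mu + z @ Vt[:k]
print(samples.shape, k)
```

With real texture parameters, each sampled row would be fed back through the inverse parameter transformation and texture synthesis to produce a new image patch.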
A novel fast and reduced redundancy structure for multiscale directional filter banks
Natural Image Statistics for Natural Image Segmentation
Building on recent progress in modeling filter response statistics of natural images, we integrate a statistical model into a variational framework for image segmentation. Incorporated in a sound probabilistic distance measure, the model drives level sets toward meaningful segmentations of complex textures and natural scenes. Despite its enhanced descriptive power, our approach preserves the efficiency of level-set-based segmentation, since each connected region comprises only two model parameters. We validate the statistical basis of our model on thousands of natural images and demonstrate that our approach outperforms recent variational segmentation methods based on second-order statistics.
A polar prediction model for learning to represent visual transformations
All organisms make temporal predictions, and their evolutionary fitness level
depends on the accuracy of these predictions. In the context of visual
perception, the motions of both the observer and objects in the scene structure
the dynamics of sensory signals, allowing for partial prediction of future
signals based on past ones. Here, we propose a self-supervised
representation-learning framework that extracts and exploits the regularities
of natural videos to compute accurate predictions. We motivate the polar
architecture by appealing to the Fourier shift theorem and its group-theoretic
generalization, and we optimize its parameters on next-frame prediction.
Through controlled experiments, we demonstrate that this approach can discover
the representation of simple transformation groups acting in the data. When trained
on natural video datasets, our framework achieves better prediction performance
than traditional motion compensation and rivals conventional deep networks,
while maintaining interpretability and speed. Furthermore, the polar
computations can be restructured into components resembling normalized simple
and direction-selective complex cell models of primate V1 neurons. Thus, polar
prediction offers a principled framework for understanding how the visual
system represents sensory inputs in a form that simplifies temporal prediction.
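The Fourier shift theorem that motivates the polar architecture can be illustrated in one dimension: for a purely translating signal, each frequency's phase advances by a fixed amount per frame, so the next frame is predictable by repeating the observed per-frequency phase change. A minimal sketch of that intuition (not the paper's learned model):

```python
import numpy as np

# A bump that translates by 2 samples per frame (circular shift).
n = 64
x = np.exp(-0.5 * ((np.arange(n) - 20) / 3.0) ** 2)
frame0, frame1 = x, np.roll(x, 2)
frame2_true = np.roll(x, 4)

# In the frequency domain, translation is a pure phase rotation
# (Fourier shift theorem): estimate the per-frequency phase advance
# between the two observed frames and apply it once more.
F0, F1 = np.fft.fft(frame0), np.fft.fft(frame1)
dphi = np.angle(F1) - np.angle(F0)
F2_pred = np.abs(F1) * np.exp(1j * (np.angle(F1) + dphi))
frame2_pred = np.fft.ifft(F2_pred).real

# For a pure shift the phase extrapolation is essentially exact.
err = np.max(np.abs(frame2_pred - frame2_true))
print(err)
```

The paper's contribution is learning a representation in which natural video dynamics, not just global translations, become such phase advances; this sketch only shows the base case the architecture generalizes.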
Surface reflectance estimation from spatio-temporal subband statistics of moving object videos
Ankara: The Department of Electrical and Electronics Engineering and the Graduate School of Engineering and Science of Bilkent University, 2012. Thesis (Master's), Bilkent University, 2012. Includes bibliographical references.
Image motion can convey a broad range of object properties, including 3D structure
(structure from motion, SfM), animacy (biological motion), and surface material.
Our understanding of how the visual system may estimate complex properties
such as surface reflectance or object rigidity from image motion is still limited. In
order to reveal the neural mechanisms underlying surface material understanding,
a natural point to begin with is to study the output of filters that mimic the
response properties of low-level visual neurons to different classes of moving
textures, such as patches of shiny and matte surfaces. To this end, we designed
spatio-temporal bandpass filters whose frequency response is the second-order
derivative of the Gaussian function. These filters were generated at eight
orientations and three scales in the frequency domain. We computed responses of
these filters to dynamic specular and matte textures. Specifically, we assessed
the statistics of the resultant filter output histograms and calculated the mean,
standard deviation, skewness and kurtosis of those histograms. We found that
there were substantial differences in standard deviation and skewness of specular
and matte texture subband histograms. To formally test whether these simple
measurements can in fact predict surface material from image motion we developed
a computer-assisted classifier based on these statistics. The results of the
classification showed that 75% of all movies were classified correctly, with a
rate of around 77% for shiny object movies and around 71% for matte object
movies. Next, we synthesized
dynamic textures which resembled the subband statistics of videos of moving
shiny and matte objects. Interestingly, these synthesized textures appeared
neither shiny nor matte. Taken together, our results indicate that
there are differences in the spatio-temporal subband statistics of image motion
generated by rotating matte and specular objects. While these differences may
be utilized by the human brain during the perceptual process, our results on the
synthesized textures suggest that the statistics may not be sufficient to judge the
material qualities of an object.
Külçe, Onur. M.S.
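The subband summary statistics used throughout this thesis (mean, standard deviation, skewness, and kurtosis of each filter-response histogram) can be computed directly from a response map. A minimal sketch; the filter design itself (second derivative of a Gaussian at eight orientations and three scales) is omitted, and the example inputs are random stand-ins for real subband responses:

```python
import numpy as np

def subband_stats(response):
    """Mean, std, skewness, and excess kurtosis of a filter-response map,
    the summary statistics compared across specular and matte subbands."""
    r = np.asarray(response, dtype=float).ravel()
    mu = r.mean()
    sigma = r.std()
    z = (r - mu) / sigma
    skewness = (z ** 3).mean()        # asymmetry of the histogram
    kurtosis = (z ** 4).mean() - 3.0  # excess kurtosis; 0 for a Gaussian
    return mu, sigma, skewness, kurtosis

# A symmetric response map has near-zero skewness, while a one-sided
# (e.g. exponentially distributed) map is strongly right-skewed --
# the kind of difference the thesis reports between matte and specular
# texture subbands.
rng = np.random.default_rng(0)
print(subband_stats(rng.standard_normal((64, 64, 16))))
print(subband_stats(rng.exponential(size=(64, 64, 16))))
```

Feeding these four numbers per subband into a simple classifier reproduces the kind of material-classification experiment described above.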
Estimating the Material Properties of Fabric from Video
Passively estimating the intrinsic material properties of deformable objects moving in a natural environment is essential for scene understanding. We present a framework to automatically analyze videos of fabrics moving under various unknown wind forces, and recover two key material properties of the fabric: stiffness and area weight. We extend features previously developed to compactly represent static image textures to describe video textures, such as fabric motion. A discriminatively trained regression model is then used to predict the physical properties of fabric from these features. The success of our model is demonstrated on a new, publicly available database of fabric videos with corresponding measured ground-truth material properties. We show that our predictions are well correlated with ground-truth measurements of stiffness and density for the fabrics. Our contributions include: (a) a database that can be used for training and testing algorithms for passively predicting fabric properties from video, (b) an algorithm for predicting the material properties of fabric from a video, and (c) a perceptual study of humans' ability to estimate the material properties of fabric from videos and images.
National Science Foundation (U.S.) (CGV-1111415); National Science Foundation (U.S.) (CGV-1212928); National Science Foundation (U.S.) Graduate Research Fellowship; Massachusetts Institute of Technology (Intelligent Initiative Postdoctoral Fellowship); United States Intelligence Advanced Research Projects Activity (D10PC20023)