
    A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity

    The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. The latter observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain and, sometimes, invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance. Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding. Comment: 65 pages, 33 figures, 303 references

    A Generative Model of Natural Texture Surrogates

    Natural images can be viewed as patchworks of different textures, where the local image statistics are roughly stationary within a small neighborhood but otherwise vary from region to region. In order to model this variability, we first applied the parametric texture algorithm of Portilla and Simoncelli to image patches of 64×64 pixels in a large database of natural images, such that each image patch is then described by 655 texture parameters which specify certain statistics, such as variances and covariances of wavelet coefficients or coefficient magnitudes within that patch. To model the statistics of these texture parameters, we then developed suitable nonlinear transformations of the parameters that allowed us to fit their joint statistics with a multivariate Gaussian distribution. We find that the first 200 principal components contain more than 99% of the variance and are sufficient to generate textures that are perceptually extremely close to those generated with all 655 components. We demonstrate the usefulness of the model in several ways: (1) We sample ensembles of texture patches that can be directly compared to samples of patches from the natural image database and can to a high degree reproduce their perceptual appearance. (2) We further developed an image compression algorithm which generates surprisingly accurate images at bit rates as low as 0.14 bits/pixel. Finally, (3) We demonstrate how our approach can be used for an efficient and objective evaluation of samples generated with probabilistic models of natural images. Comment: 34 pages, 9 figures
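    The abstract's core modeling step — fitting a multivariate Gaussian to parameter vectors and keeping the principal components that explain 99% of the variance — can be sketched with plain NumPy. The random matrix below is a hypothetical stand-in for the 655-dimensional Portilla–Simoncelli parameter vectors; it is not the paper's data or code.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical stand-in for 655-dimensional texture-parameter vectors,
    # one row per image patch (the paper derives these from 64x64 patches).
    params = rng.standard_normal((1000, 655)) @ rng.standard_normal((655, 655))

    # Fit the multivariate Gaussian: mean plus principal axes of the data.
    mu = params.mean(axis=0)
    centered = params - mu
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    var = s**2 / (len(params) - 1)  # variance along each principal component

    # Keep the leading components that together explain 99% of the variance.
    k = int(np.searchsorted(np.cumsum(var) / var.sum(), 0.99)) + 1

    # Sample new parameter vectors from the reduced Gaussian model.
    z = rng.standard_normal((5, k))
    samples = mu + (z * np.sqrt(var[:k])) @ Vt[:k]
    print(samples.shape)  # (5, 655)
    ```

    In the paper the sampled vectors would then be fed back through the Portilla–Simoncelli synthesis to render texture patches.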

    A novel fast and reduced redundancy structure for multiscale directional filter banks

    2007-2008 > Academic research: refereed > Publication in refereed journal. Version of Record. Published.

    Natural Image Statistics for Natural Image Segmentation

    Building on recent progress in modeling filter response statistics of natural images, we integrate a statistical model into a variational framework for image segmentation. Incorporated in a sound probabilistic distance measure, the model drives level sets toward meaningful segmentations of complex textures and natural scenes. Despite its enhanced descriptive power, our approach preserves the efficiency of level set based segmentation, since each connected region comprises two model parameters only. We validate the statistical basis of our model on thousands of natural images and demonstrate that our approach outperforms recent variational segmentation methods based on second-order statistics.

    A polar prediction model for learning to represent visual transformations

    All organisms make temporal predictions, and their evolutionary fitness level depends on the accuracy of these predictions. In the context of visual perception, the motions of both the observer and objects in the scene structure the dynamics of sensory signals, allowing for partial prediction of future signals based on past ones. Here, we propose a self-supervised representation-learning framework that extracts and exploits the regularities of natural videos to compute accurate predictions. We motivate the polar architecture by appealing to the Fourier shift theorem and its group-theoretic generalization, and we optimize its parameters on next-frame prediction. Through controlled experiments, we demonstrate that this approach can discover the representation of simple transformation groups acting on data. When trained on natural video datasets, our framework achieves better prediction performance than traditional motion compensation and rivals conventional deep networks, while maintaining interpretability and speed. Furthermore, the polar computations can be restructured into components resembling normalized simple and direction-selective complex cell models of primate V1 neurons. Thus, polar prediction offers a principled framework for understanding how the visual system represents sensory inputs in a form that simplifies temporal prediction.
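    The Fourier shift theorem that motivates the polar architecture is easy to verify numerically: circularly shifting a signal by d samples multiplies each DFT coefficient by a unit-magnitude phase factor, so translation becomes a pure phase rotation. The sketch below demonstrates the theorem itself, not the paper's model.

    ```python
    import numpy as np

    # Shift theorem: if y[n] = x[n - d] (circular), then
    # Y[k] = X[k] * exp(-2*pi*i*k*d/N). Magnitudes are untouched,
    # so advancing a translating signal in time amounts to rotating
    # phases at a constant rate per frequency.
    N, d = 64, 5
    rng = np.random.default_rng(1)
    x = rng.standard_normal(N)

    shifted_direct = np.roll(x, d)
    phase = np.exp(-2j * np.pi * np.arange(N) * d / N)
    shifted_fourier = np.fft.ifft(np.fft.fft(x) * phase).real

    print(np.allclose(shifted_direct, shifted_fourier))  # True
    ```

    A learned polar predictor generalizes this idea: instead of a known shift d, it extrapolates the per-frequency phase advance observed between past frames.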

    Surface reflectance estimation from spatio-temporal subband statistics of moving object videos

    Ankara: The Department of Electrical and Electronics Engineering and the Graduate School of Engineering and Science of Bilkent University, 2012. Thesis (Master's) -- Bilkent University, 2012. Includes bibliographical references. Image motion can convey a broad range of object properties including 3D structure (structure from motion, SfM), animacy (biological motion), and its material. Our understanding of how the visual system may estimate complex properties such as surface reflectance or object rigidity from image motion is still limited. In order to reveal the neural mechanisms underlying surface material understanding, a natural point to begin with is to study the output of filters that mimic response properties of low level visual neurons to different classes of moving textures, such as patches of shiny and matte surfaces. To this end we designed spatio-temporal bandpass filters whose frequency response is the second order derivative of the Gaussian function. These filters are generated for eight orientations at three scales in the frequency domain. We computed responses of these filters to dynamic specular and matte textures. Specifically, we assessed the statistics of the resultant filter output histograms and calculated the mean, standard deviation, skewness and kurtosis of those histograms. We found that there were substantial differences in standard deviation and skewness of specular and matte texture subband histograms. To formally test whether these simple measurements can in fact predict surface material from image motion, we developed a computer-assisted classifier based on these statistics. The results of the classification showed that 75% of all movies are classified correctly, where the correct classification rate of shiny object movies is around 77% and the correct classification rate of matte object movies is around 71%. Next, we synthesized dynamic textures which resembled the subband statistics of videos of moving shiny and matte objects. Interestingly, the appearance of these synthesized textures was neither shiny nor matte. Taken together, our results indicate that there are differences in the spatio-temporal subband statistics of image motion generated by rotating matte and specular objects. While these differences may be utilized by the human brain during the perceptual process, our results on the synthesized textures suggest that the statistics may not be sufficient to judge the material qualities of an object. Külçe, Onur. M.S.
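    The feature pipeline described above — bandpass filtering with a second-derivative-of-Gaussian kernel, then summarizing the subband responses by their first four moments — can be sketched as follows. The random frame is a hypothetical stand-in for a video frame of a rotating object; the thesis uses spatio-temporal filters at eight orientations and three scales, whereas this sketch applies a single spatial filter along one axis.

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter
    from scipy.stats import kurtosis, skew

    rng = np.random.default_rng(2)
    # Hypothetical stand-in for one frame of a moving-object video.
    frame = rng.standard_normal((128, 128))

    # Second-derivative-of-Gaussian bandpass filter along the horizontal
    # axis (order=(0, 2) differentiates twice along axis 1).
    subband = gaussian_filter(frame, sigma=2.0, order=(0, 2))

    # The four histogram statistics used to discriminate shiny from matte:
    responses = subband.ravel()
    stats = {
        "mean": responses.mean(),
        "std": responses.std(),
        "skewness": skew(responses),
        "kurtosis": kurtosis(responses),
    }
    print(sorted(stats))
    ```

    In the thesis, vectors of such statistics across all orientation/scale subbands feed a classifier that separates specular from matte movies.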

    3D Steerable Wavelets in Practice


    Estimating the Material Properties of Fabric from Video

    Passively estimating the intrinsic material properties of deformable objects moving in a natural environment is essential for scene understanding. We present a framework to automatically analyze videos of fabrics moving under various unknown wind forces, and recover two key material properties of the fabric: stiffness and area weight. We extend features previously developed to compactly represent static image textures to describe video textures, such as fabric motion. A discriminatively trained regression model is then used to predict the physical properties of fabric from these features. The success of our model is demonstrated on a new, publicly available database of fabric videos with corresponding measured ground truth material properties. We show that our predictions are well correlated with ground truth measurements of stiffness and density for the fabrics. Our contributions include: (a) a database that can be used for training and testing algorithms for passively predicting fabric properties from video, (b) an algorithm for predicting the material properties of fabric from a video, and (c) a perceptual study of humans' ability to estimate the material properties of fabric from videos and images. National Science Foundation (U.S.) (CGV-1111415); National Science Foundation (U.S.) (CGV-1212928); National Science Foundation (U.S.) Graduate Research Fellowship; Massachusetts Institute of Technology (Intelligent Initiative Postdoctoral Fellowship); United States Intelligence Advanced Research Projects Activity (D10PC20023)