771 research outputs found

    A generative model for separating illumination and reflectance from images

    Get PDF
    It is well known that even slight changes in nonuniform illumination lead to a large image variability and are crucial for many visual tasks. This paper presents a new ICA related probabilistic model where the number of sources exceeds the number of sensors to perform an image segmentation and illumination removal, simultaneously. We model illumination and reflectance in log space by a generalized autoregressive process and Hidden Gaussian Markov random field, respectively. The model ability to deal with segmentation of illuminated images is compared with a Canny edge detector and homomorphic filtering. We apply the model to two problems: synthetic image segmentation and sea surface pollution detection from intensity images

    The Visual Centrifuge: Model-Free Layered Video Representations

    Full text link
    True video understanding requires making sense of non-lambertian scenes where the color of light arriving at the camera sensor encodes information about not just the last object it collided with, but about multiple mediums -- colored windows, dirty mirrors, smoke or rain. Layered video representations have the potential of accurately modelling realistic scenes but have so far required stringent assumptions on motion, lighting and shape. Here we propose a learning-based approach for multi-layered video representation: we introduce novel uncertainty-capturing 3D convolutional architectures and train them to separate blended videos. We show that these models then generalize to single videos, where they exhibit interesting abilities: color constancy, factoring out shadows and separating reflections. We present quantitative and qualitative results on real world videos.Comment: Appears in: 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019). This arXiv contains the CVPR Camera Ready version of the paper (although we have included larger figures) as well as an appendix detailing the model architectur

    Programmable Spectrometry -- Per-pixel Classification of Materials using Learned Spectral Filters

    Full text link
    Many materials have distinct spectral profiles. This facilitates estimation of the material composition of a scene at each pixel by first acquiring its hyperspectral image, and subsequently filtering it using a bank of spectral profiles. This process is inherently wasteful since only a set of linear projections of the acquired measurements contribute to the classification task. We propose a novel programmable camera that is capable of producing images of a scene with an arbitrary spectral filter. We use this camera to optically implement the spectral filtering of the scene's hyperspectral image with the bank of spectral profiles needed to perform per-pixel material classification. This provides gains both in terms of acquisition speed --- since only the relevant measurements are acquired --- and in signal-to-noise ratio --- since we invariably avoid narrowband filters that are light inefficient. Given training data, we use a range of classical and modern techniques including SVMs and neural networks to identify the bank of spectral profiles that facilitate material classification. We verify the method in simulations on standard datasets as well as real data using a lab prototype of the camera

    Efficient illumination independent appearance-based face tracking

    Get PDF
    One of the major challenges that visual tracking algorithms face nowadays is being able to cope with changes in the appearance of the target during tracking. Linear subspace models have been extensively studied and are possibly the most popular way of modelling target appearance. We introduce a linear subspace representation in which the appearance of a face is represented by the addition of two approxi- mately independent linear subspaces modelling facial expressions and illumination respectively. This model is more compact than previous bilinear or multilinear ap- proaches. The independence assumption notably simplifies system training. We only require two image sequences. One facial expression is subject to all possible illumina- tions in one sequence and the face adopts all facial expressions under one particular illumination in the other. This simple model enables us to train the system with no manual intervention. We also revisit the problem of efficiently fitting a linear subspace-based model to a target image and introduce an additive procedure for solving this problem. We prove that Matthews and Baker’s Inverse Compositional Approach makes a smoothness assumption on the subspace basis that is equiva- lent to Hager and Belhumeur’s, which worsens convergence. Our approach differs from Hager and Belhumeur’s additive and Matthews and Baker’s compositional ap- proaches in that we make no smoothness assumptions on the subspace basis. In the experiments conducted we show that the model introduced accurately represents the appearance variations caused by illumination changes and facial expressions. We also verify experimentally that our fitting procedure is more accurate and has better convergence rate than the other related approaches, albeit at the expense of a slight increase in computational cost. Our approach can be used for tracking a human face at standard video frame rates on an average personal computer

    Recovering Intrinsic Images from a Single Image

    Get PDF
    We present an algorithm that uses multiple cues to recover shading and reflectance intrinsic images from a single image. Using both color information and a classifier trained to recognize gray-scale patterns, each image derivative is classified as being caused by shading or a change in the surface's reflectance. Generalized Belief Propagation is then used to propagate information from areas where the correct classification is clear to areas where it is ambiguous. We also show results on real images
    • …
    corecore