9 research outputs found

    Learning Active Basis Models by EM-Type Algorithms

    The EM algorithm is a convenient tool for maximum likelihood model fitting when the data are incomplete or when there are latent variables or hidden states. In this review article we explain that the EM algorithm is a natural computational scheme for learning image templates of object categories when the learning is not fully supervised. We represent an image template by an active basis model, which is a linear composition of a selected set of localized, elongated and oriented wavelet elements that are allowed to slightly perturb their locations and orientations to account for the deformations of object shapes. The model can be easily learned when the objects in the training images are of the same pose and appear at the same location and scale. This is often called supervised learning. When the objects may appear at different unknown locations, orientations and scales in the training images, we have to incorporate the unknown locations, orientations and scales as latent variables into the image generation process, and learn the template by EM-type algorithms. The E-step imputes the unknown locations, orientations and scales based on the currently learned template. This step can be considered self-supervision, which involves using the current template to recognize the objects in the training images. The M-step then relearns the template based on the imputed locations, orientations and scales, and this is essentially the same as supervised learning. So the EM learning process iterates between recognition and supervised learning. We illustrate this scheme by several experiments. Comment: Published at http://dx.doi.org/10.1214/09-STS281 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org
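
    The alternation between recognition (E-step) and supervised relearning (M-step) described in this abstract can be illustrated with a toy sketch. This is not the active basis model: it assumes 1-D signals, a single latent variable (an unknown circular shift), a hard assignment in the E-step, and plain averaging in the M-step. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def impute_shifts(signals, template):
    """E-type step: for each signal, pick the circular shift that best
    matches the current template (recognition with the current template)."""
    n = signals.shape[1]
    shifts = []
    for x in signals:
        # Score every candidate shift by correlation with the template.
        scores = [np.dot(np.roll(x, -s), template) for s in range(n)]
        shifts.append(int(np.argmax(scores)))
    return np.array(shifts)

def relearn_template(signals, shifts):
    """M-type step: undo the imputed shifts and average the aligned
    signals, as one would with fully supervised (pre-aligned) data."""
    aligned = np.stack([np.roll(x, -s) for x, s in zip(signals, shifts)])
    return aligned.mean(axis=0)

def learn_template(signals, n_iter=10):
    """Alternate the two steps, starting from the first signal as template."""
    template = signals[0].copy()
    for _ in range(n_iter):
        shifts = impute_shifts(signals, template)
        template = relearn_template(signals, shifts)
    return template, shifts

def make_toy_data(n_signals=20, length=32, noise=0.1, seed=0):
    """Hypothetical data: one short pattern placed at random circular shifts."""
    rng = np.random.default_rng(seed)
    true_template = np.zeros(length)
    true_template[4:8] = [1.0, 2.0, -1.0, 0.5]
    true_shifts = rng.integers(0, length, size=n_signals)
    signals = np.stack([np.roll(true_template, s) for s in true_shifts])
    signals = signals + noise * rng.normal(size=signals.shape)
    return signals, true_template, true_shifts

signals, true_template, true_shifts = make_toy_data()
template, shifts = learn_template(signals)
```

    Because the template is initialized from the first signal, it can only be recovered up to a global shift; the imputed shifts are meaningful relative to one another rather than in absolute terms.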

    I2T: Image Parsing to Text Description


    Model-based image analysis for forensic shoe print recognition

    This thesis is about automated forensic shoe print recognition. Recognizing a shoe print in an image is an inherently difficult task. Shoe prints vary in their pose, shape and appearance. They are surrounded and partially occluded by other objects and may be left on a wide range of diverse surfaces. We propose to formulate this task in a model-based image analysis framework. Our framework is based on the Active Basis Model. A shoe print is represented as a hierarchical composition of basis filters. The individual filters encode local information about the geometry and appearance of the shoe print pattern. The hierarchical composition encodes mid- and long-range geometric properties of the object. A statistical distribution is imposed on the parameters of this representation in order to account for the variation in a shoe print's geometry and appearance. Our work extends the Active Basis Model in various ways in order to make it robustly applicable to the analysis of shoe print images. We propose an algorithm that automatically infers an efficient hierarchical dependency structure between the basis filters. The learned hierarchical dependencies are beneficial for our further extensions, while at the same time permitting an efficient optimization process. We introduce an occlusion model and propose to leverage the hierarchical dependencies to integrate contextual information efficiently into the reasoning process about occlusions. Finally, we study the effect of the basis filter on the discrimination of the object from the background. In this context, we highlight the role of the hierarchical model structure in terms of combining the locally ambiguous filter response into a sophisticated discriminator. The main contribution of this work is a model-based image analysis framework which represents a planar object's variation in shape and appearance, its partial occlusion, and background clutter. The model parameters are optimized jointly in an efficient optimization scheme. Our extensions to the Active Basis Model lead to an improved discriminative ability and permit coherent occlusions and hierarchical deformations. The experimental results demonstrate new state-of-the-art performance at the task of forensic shoe print recognition.

    Pattern Recognition in High-Throughput Zebrafish Imaging

    High Throughput (HT) methods are high-volume experimental approaches that are common in the life sciences. The instrumentation for these methods differs per application. We will focus on the HT methods that are concerned with imaging. The aim of this thesis is to find robust methods for object extraction and analysis. We focus on the Computer Science aspects of such analysis, namely pattern recognition. Pattern Recognition can be seen in the context of object recognition and data mining. Both aspects will be described in this thesis. We present a framework for segmenting and recognizing the objects of interest based on Template Matching. This approach was designed for an application in the HT screening of zebrafish embryos. All proposed methods are fully automated. We further elaborate on the segmentation algorithms to apply these in software that can be used in an HT context to derive measurements. Then we apply the software to a real-life problem involving zebrafish infected with Mycobacterium marinum.

    Synthesizing and Editing Photo-realistic Visual Objects

    In this thesis we investigate novel methods of synthesizing new images of a deformable visual object using a collection of images of the object. We investigate both parametric and non-parametric methods, as well as a combination of the two, for the problem of image synthesis. Our main focus is complex visual objects, specifically deformable objects and objects with varying numbers of visible parts. We first introduce a sketch-driven image synthesis system, which allows the user to draw ellipses and outlines in order to sketch a rough shape of animals as a constraint on the synthesized image. This system interactively provides feedback in the form of ellipse and contour suggestions for the user's partial sketch. The user's sketch guides the non-parametric synthesis algorithm, which blends patches from two exemplar images in a coarse-to-fine fashion to create a final image. We evaluate the method and synthesized images through two user studies. Instead of non-parametric blending of patches, a parametric model of the appearance is more desirable, as its appearance representation is shared between all images of the dataset. Hence, we propose Context-Conditioned Component Analysis (C-CCA), a probabilistic generative parametric model, which describes images with a linear combination of basis functions. The basis functions are evaluated for each pixel using a context vector computed from the local shape information. We evaluate C-CCA qualitatively and quantitatively on inpainting, appearance transfer and reconstruction tasks. Drawing samples from C-CCA generates novel, globally coherent images, which, unfortunately, lack high-frequency details due to dimensionality reduction and misalignment. We develop a non-parametric model that enhances the samples of C-CCA with locally coherent, high-frequency details. The non-parametric model efficiently finds patches from the dataset that match the C-CCA sample and blends the patches together. We analyze the results of the combined method on datasets of horse and elephant images.