475 research outputs found

    Manifold Constrained Low-Rank Decomposition

    Low-rank decomposition (LRD) is a state-of-the-art method for visual data reconstruction and modelling. However, LRD becomes very challenging when the image data contain significant occlusion, noise, illumination variation, and misalignment from rotation or viewpoint changes. We leverage the specific structure of the data to improve the performance of LRD when the data are not ideal. To this end, we propose a new framework that embeds manifold priors into LRD. To implement the framework, we design an alternating direction method of multipliers (ADMM) scheme that efficiently integrates the manifold constraints during the optimization process. The proposed approach is successfully used to calculate low-rank models from face images, hand-written digits and planar surface images. The results show a consistent increase in performance compared to the state of the art over a wide range of realistic image misalignments and corruptions.

    Automatic facial landmark labeling with minimal supervision.

    Landmark labeling of training images is essential for many learning tasks in computer vision, such as object detection, tracking, an

    Unsupervised Face Alignment by Robust Nonrigid Mapping

    We propose a novel approach to unsupervised facial image alignment. Unlike previous approaches, which are confined to affine transformations on either the entire face or separate patches, we extract a nonrigid mapping between facial images. Based on a regularized face model, we frame unsupervised face alignment within the Lucas-Kanade image registration approach. We propose a robust optimization scheme to handle appearance variations. The method is fully automatic and can cope with pose variations and expressions, all in an unsupervised manner. Experiments on a large set of images showed that the approach is effective.

    Deformable face ensemble alignment with robust grouped-L1 anchors

    Many methods currently exist for deformable face fitting. A drawback of nearly all these approaches is that (i) their landmark positions are noisy, and (ii) the noise is biased across frames (i.e. the misalignment is toward common directions across all frames). In this paper we propose a grouped L1-norm anchored method for simultaneously aligning an ensemble of deformable face images stemming from the same subject, given noisy heterogeneous landmark estimates. Impressive alignment performance improvement and refinement is obtained using very weak initialization as "anchors".

    Learning from one example in machine vision by sharing probability densities

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 125-130). Human beings exhibit rapid learning when presented with a small number of images of a new object. A person can identify an object under a wide variety of visual conditions after having seen only a single example of that object. This ability can be partly explained by the application of previously learned statistical knowledge to a new setting. This thesis presents an approach to acquiring knowledge in one setting and using it in another. Specifically, we develop probability densities over common image changes. Given a single image of a new object and a model of change learned from a different object, we form a model of the new object that can be used for synthesis, classification, and other visual tasks. We start by modeling spatial changes. We develop a framework for learning statistical knowledge of spatial transformations in one task and using that knowledge in a new task. By sharing a probability density over spatial transformations learned from a sample of handwritten letters, we develop a handwritten digit classifier that achieves 88.6% accuracy using only a single hand-picked training example from each class. The classification scheme includes a new algorithm, congealing, for the joint alignment of a set of images using an entropy minimization criterion. We investigate properties of this algorithm and compare it to other methods of addressing spatial variability in images. We illustrate its application to binary images, gray-scale images, and a set of 3-D neonatal magnetic resonance brain volumes. Next, we extend the method of change modeling from spatial transformations to color transformations. By measuring statistically common joint color changes of a scene in an office environment, and then applying standard statistical techniques such as principal components analysis, we develop a probabilistic model of color change. We show that these color changes, which we call color flows, can be shared effectively between certain types of scenes. That is, a probability density over color change developed by observing one scene can provide useful information about the variability of another scene. We demonstrate a variety of applications including image synthesis, image matching, and shadow detection. By Erik G. Miller. Ph.D.
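The entropy minimization criterion behind joint alignment can be illustrated with a toy sketch. It is not the thesis's implementation: it assumes binary images, replaces the learned spatial transformations with cyclic column shifts, and uses a simple coordinate descent; the names `pixel_stack_entropy`, `shift_columns`, and `congeal_shifts` are hypothetical.

```python
import math

def pixel_stack_entropy(images):
    """Sum of per-pixel binary entropies (bits) across a stack of binary images."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            p = sum(img[r][c] for img in images) / n  # fraction of "on" pixels
            if 0.0 < p < 1.0:
                total -= p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)
    return total

def shift_columns(image, k):
    """Cyclically shift every row k pixels to the right (toy transformation)."""
    cols = len(image[0])
    return [[row[(c - k) % cols] for c in range(cols)] for row in image]

def congeal_shifts(images, max_shift, sweeps=3):
    """Coordinate descent: re-pick one image's shift at a time to lower the stack entropy."""
    shifts = [0] * len(images)

    def entropy_with(i, k):
        # Stack entropy if image i used shift k and all others kept their shifts.
        trial = shifts[:i] + [k] + shifts[i + 1:]
        return pixel_stack_entropy(
            [shift_columns(img, s) for img, s in zip(images, trial)])

    for _ in range(sweeps):
        for i in range(len(images)):
            shifts[i] = min(range(-max_shift, max_shift + 1),
                            key=lambda k: entropy_with(i, k))
    return shifts

# Assumed toy data: the same vertical bar at three horizontal offsets.
bar = [[0, 1, 0, 0],
       [0, 1, 0, 0],
       [0, 1, 0, 0]]
stack = [bar, shift_columns(bar, 1), shift_columns(bar, 2)]
shifts = congeal_shifts(stack, max_shift=2)
aligned = [shift_columns(img, s) for img, s in zip(stack, shifts)]
print(pixel_stack_entropy(stack), pixel_stack_entropy(aligned))
```

On this toy stack the summed entropy drops to zero once the bars coincide; real images would generally leave residual entropy after alignment.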

    Unsupervised alignment of objects in images

    With the advent of computer vision, various applications have sought to apply it to interpret 3D and 2D scenes. At the core of computer vision is visual object detection, which deals with detecting and representing objects in an image. Visual object detection requires learning a model of each class (e.g. car, cat) in order to detect objects belonging to that class. Class learning benefits from a method that automatically aligns class examples, making learning more straightforward. The objective of this thesis is to further develop the state-of-the-art feature-based alignment method, which rigidly and automatically aligns object class images to a manually selected seed image. We try to compensate for this weakness by providing a method that automatically selects the best seed from the dataset. Our method first extracts features using dense sampling, and then the scale invariant feature transform (SIFT) descriptor is used to find the best matches as initial local feature matches. The final alignment is based on a spatial scoring procedure in which the initial matches are refined to a set of spatially verified matches. The spatial score is then used to calculate similarity scores. We propose an algorithm that operates on the spatial and similarity scores and finally selects the best seed. We also investigate the performance of step-wise alignment using a minimum spanning tree (MST) and Dijkstra shortest paths instead of direct alignment to a single seed. We conduct our experiments using classes of Caltech-101, for which our unsupervised seed selection and step-wise alignment achieve state-of-the-art performance.
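The idea of step-wise alignment along a minimum spanning tree can be sketched as follows. This is an illustration under assumptions, not the thesis's method: the pairwise `cost` matrix is a hypothetical dissimilarity (the abstract derives its scores from spatially verified SIFT matches), and `minimum_spanning_tree` and `alignment_chain` are names introduced here.

```python
import heapq

def minimum_spanning_tree(cost, root=0):
    """Prim's algorithm on a dense symmetric cost matrix; returns each node's tree parent."""
    n = len(cost)
    parent = [-1] * n              # -1 marks the seed (root) image
    in_tree = [False] * n
    heap = [(0.0, root, -1)]       # entries are (edge cost, node, parent)
    while heap:
        _, node, par = heapq.heappop(heap)
        if in_tree[node]:
            continue
        in_tree[node] = True
        parent[node] = par
        for nxt in range(n):
            if not in_tree[nxt]:
                heapq.heappush(heap, (cost[node][nxt], nxt, node))
    return parent

def alignment_chain(parent, image):
    """Sequence of images to align through, from `image` back to the seed."""
    chain = [image]
    while parent[chain[-1]] != -1:
        chain.append(parent[chain[-1]])
    return chain

# Assumed toy costs: neighbouring images are similar, distant ones are not.
cost = [[0, 1, 4, 9],
        [1, 0, 1, 4],
        [4, 1, 0, 1],
        [9, 4, 1, 0]]
parent = minimum_spanning_tree(cost, root=0)
print(alignment_chain(parent, 3))
```

Here image 3 would be aligned to the seed through images 2 and 1 rather than directly, so each step only bridges a pair of similar images.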

    A Deep Moving-camera Background Model

    In video analysis, background models have many applications such as background/foreground separation, change detection, anomaly detection, tracking, and more. However, while learning such a model from a video captured by a static camera is a fairly well-solved task, in the case of a Moving-camera Background Model (MCBM) the success has been far more modest, owing to algorithmic and scalability challenges that arise from the camera motion. Thus, existing MCBMs are limited in their scope and in their supported camera-motion types. These hurdles have also impeded the use, in this unsupervised task, of end-to-end solutions based on deep learning (DL). Moreover, existing MCBMs usually model the background either on the domain of a typically large panoramic image or in an online fashion. Unfortunately, the former creates several problems, including poor scalability, while the latter prevents the recognition and leveraging of cases where the camera revisits previously seen parts of the scene. This paper proposes a new method, called DeepMCBM, that eliminates all the aforementioned issues and achieves state-of-the-art results. Concretely, we first identify the difficulties associated with joint alignment of video frames in general and in a DL setting in particular. Next, we propose a new strategy for joint alignment that lets us use a spatial transformer net with neither a regularization nor any form of specialized (and non-differentiable) initialization. Coupled with an autoencoder conditioned on unwarped robust central moments (obtained from the joint alignment), this yields an end-to-end regularization-free MCBM that supports a broad range of camera motions and scales gracefully. We demonstrate DeepMCBM's utility on a variety of videos, including ones beyond the scope of other methods. Our code is available at https://github.com/BGU-CS-VIL/DeepMCBM. Comment: 26 pages, 5 figures. To be published in ECCV 202