48 research outputs found

    Unsupervised Face Alignment by Robust Nonrigid Mapping

    We propose a novel approach to unsupervised facial image alignment. Unlike previous approaches, which are confined to affine transformations on either the entire face or separate patches, we extract a nonrigid mapping between facial images. Based on a regularized face model, we frame unsupervised face alignment as a Lucas-Kanade image registration problem. We propose a robust optimization scheme to handle appearance variations. The method is fully automatic and can cope with pose variations and expressions, all in an unsupervised manner. Experiments on a large set of images showed that the approach is effective.
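
    Schematically, robust Lucas-Kanade registration of this kind minimizes a robust penalty over warp parameters (our notation, not necessarily the paper's):

        \min_{p} \sum_{x} \rho\big( T(x) - I(W(x; p)) \big)

    where T is the template, I the input image, W(x; p) the (here nonrigid) warp, and \rho a robust loss such as Huber. Iteratively reweighted Gauss-Newton then updates

        \Delta p = H^{-1} \sum_{x} w(x) \, J(x)^{\top} \big( T(x) - I(W(x; p)) \big), \quad H = \sum_{x} w(x) \, J(x)^{\top} J(x), \quad J(x) = \nabla I \, \partial W / \partial p

    with per-pixel weights w(x) derived from \rho, which is how appearance outliers such as occlusions and specularities get down-weighted.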

    Unsupervised alignment of objects in images

    With the advent of computer vision, many applications have become interested in interpreting 3D and 2D scenes. At the core of computer vision is visual object detection, which deals with detecting and representing objects in an image. Visual object detection requires learning a model of each class type (e.g. car, cat) in order to detect objects belonging to that class. Class learning benefits from a method that automatically aligns class examples, making learning more straightforward. The objective of this thesis is to further develop the state-of-the-art feature-based alignment method, which rigidly and automatically aligns object class images to a manually selected seed image. We compensate for this weakness by providing a method that automatically selects the best seed from the dataset. Our method first extracts features by dense sampling, and the scale-invariant feature transform (SIFT) descriptor is then used to find best matches as initial local feature matches. The final alignment is based on a spatial scoring procedure in which the initial matches are refined to a set of spatially verified matches. The spatial score is then used to calculate similarity scores. We propose an algorithm that operates on the spatial and similarity scores and finally selects the best seed. We also investigate the performance of step-wise alignment using a minimum spanning tree (MST) and Dijkstra's shortest path instead of direct alignment to a single seed. We conduct our experiments on classes of Caltech-101, for which our unsupervised seed selection and step-wise alignment achieve state-of-the-art performance.
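
    A minimal sketch of this pipeline in Python with OpenCV, under the assumption that matches are spatially verified with a RANSAC homography and the seed is the image with the highest total verified-match score (function names, thresholds, and the exact scoring rule are illustrative, not the thesis's published parameters):

    import cv2
    import numpy as np

    def dense_sift(img, step=8, size=8):
        """SIFT descriptors on a dense grid (img: 8-bit grayscale)."""
        sift = cv2.SIFT_create()
        kps = [cv2.KeyPoint(float(x), float(y), float(size))
               for y in range(0, img.shape[0], step)
               for x in range(0, img.shape[1], step)]
        kps, desc = sift.compute(img, kps)
        return kps, desc

    def verified_matches(kp1, d1, kp2, d2, ratio=0.75):
        """Ratio-test matches refined by RANSAC; returns the inlier count."""
        pairs = cv2.BFMatcher().knnMatch(d1, d2, k=2)
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        if len(good) < 4:
            return 0
        src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return 0 if mask is None else int(mask.sum())

    def select_seed(images):
        """Pick the image whose verified matches to all others score highest."""
        feats = [dense_sift(im) for im in images]
        scores = [sum(verified_matches(*feats[i], *feats[j])
                      for j in range(len(images)) if j != i)
                  for i in range(len(images))]
        return int(np.argmax(scores))

    The all-pairs matching is what makes automatic seed selection expensive; the step-wise MST/Dijkstra variants can reuse these pairwise scores as graph edge weights.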

    Automatic facial landmark labeling with minimal supervision.

    Landmark labeling of training images is essential for many learning tasks in computer vision, such as object detection, tracking, and …

    X-Metric: An N-Dimensional Information-Theoretic Framework for Groupwise Registration and Deep Combined Computing

    This paper presents a generic probabilistic framework for estimating the statistical dependency and finding the anatomical correspondences among an arbitrary number of medical images. The method builds on a novel formulation of the N-dimensional joint intensity distribution by representing the common anatomy as latent variables and estimating the appearance model with nonparametric estimators. Through connection to maximum likelihood and the expectation-maximization algorithm, an information-theoretic metric called the X-metric and a co-registration algorithm named X-CoReg are induced, allowing groupwise registration of the N observed images with computational complexity of O(N). Moreover, the method naturally extends to a weakly-supervised scenario where anatomical labels of certain images are provided. This leads to a combined-computing framework implemented with deep learning, which performs registration and segmentation simultaneously and collaboratively in an end-to-end fashion. Extensive experiments were conducted to demonstrate the versatility and applicability of our model, including multimodal groupwise registration, motion correction for dynamic contrast-enhanced magnetic resonance images, and deep combined computing for multimodal medical images. Results show the superiority of our method in various applications in terms of both accuracy and efficiency, highlighting the advantage of the proposed representation of the imaging process.
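
    Schematically (our notation; the paper's exact factorization may differ), with the common anatomy as a latent variable z and a spatial transformation \phi_n per image, the joint intensity model factorizes as

        p(I_1, \dots, I_N) = \sum_{z} p(z) \prod_{n=1}^{N} p\big( I_n \circ \phi_n \mid z \big)

    so each image couples only to the shared latent anatomy, not to every other image. Maximizing this likelihood over the \phi_n via expectation-maximization touches each image once per iteration, which is what yields the O(N) complexity instead of the O(N^2) of pairwise groupwise schemes.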

    Flexible Bayesian Modelling for Nonlinear Image Registration

    We describe a diffeomorphic registration algorithm that allows groups of images to be accurately aligned to a common space, which we intend to incorporate into the SPM software. The idea is to perform inference in a probabilistic graphical model that accounts for variability in both shape and appearance. The resulting framework is general and entirely unsupervised. The model is evaluated on inter-subject registration of 3D human brain scans. Here, the main modeling assumption is that individual anatomies can be generated by deforming a latent 'average' brain. The method is agnostic to imaging modality and can be applied with no prior processing. We evaluate the algorithm using freely available, manually labelled datasets. In this validation we achieve state-of-the-art results, within reasonable runtimes, against widely used inter-subject registration algorithms. On the unprocessed dataset, the increase in overlap score is over 17%. These results demonstrate the benefits of using informative computational anatomy frameworks for nonlinear registration.
    Comment: Accepted for MICCAI 202…
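
    One schematic reading of the generative assumption (our notation): each subject's scan I_n is a deformed, noisy rendering of a latent average brain \mu,

        p(\{I_n\}, \mu, \{\phi_n\}) = p(\mu) \prod_{n=1}^{N} p(\phi_n) \, p\big( I_n \mid \mu \circ \phi_n^{-1} \big)

    where the \phi_n are diffeomorphic deformations. Inference alternates between refining the template \mu and the per-subject deformations, with the shape prior p(\phi_n) regularizing the registration and the appearance model absorbing modality differences.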

    Learning from one example in machine vision by sharing probability densities

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002, by Erik G. Miller. Includes bibliographical references (p. 125-130).
    Human beings exhibit rapid learning when presented with a small number of images of a new object. A person can identify an object under a wide variety of visual conditions after having seen only a single example of that object. This ability can be partly explained by the application of previously learned statistical knowledge to a new setting. This thesis presents an approach to acquiring knowledge in one setting and using it in another. Specifically, we develop probability densities over common image changes. Given a single image of a new object and a model of change learned from a different object, we form a model of the new object that can be used for synthesis, classification, and other visual tasks. We start by modeling spatial changes. We develop a framework for learning statistical knowledge of spatial transformations in one task and using that knowledge in a new task. By sharing a probability density over spatial transformations learned from a sample of handwritten letters, we develop a handwritten digit classifier that achieves 88.6% accuracy using only a single hand-picked training example from each class. The classification scheme includes a new algorithm, congealing, for the joint alignment of a set of images using an entropy minimization criterion. We investigate properties of this algorithm and compare it to other methods of addressing spatial variability in images. We illustrate its application to binary images, gray-scale images, and a set of 3-D neonatal magnetic resonance brain volumes.
    Next, we extend the method of change modeling from spatial transformations to color transformations. By measuring statistically common joint color changes of a scene in an office environment, and then applying standard statistical techniques such as principal components analysis, we develop a probabilistic model of color change. We show that these color changes, which we call color flows, can be shared effectively between certain types of scenes. That is, a probability density over color change developed by observing one scene can provide useful information about the variability of another scene. We demonstrate a variety of applications including image synthesis, image matching, and shadow detection.
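
    A minimal sketch of congealing on binary images, under the assumptions that each image gets its own affine parameters and that a greedy coordinate descent accepts any perturbation that lowers the summed per-pixel entropy of the stack (step sizes, schedules, and the drift prevention used in practice are omitted):

    import numpy as np
    from scipy.ndimage import affine_transform

    def pixel_stack_entropy(stack, eps=1e-9):
        """Sum over pixels of the binary entropy of values across the stack."""
        p = stack.mean(axis=0)                  # per-pixel mean of binary values
        h = -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))
        return h.sum()

    def congeal(images, n_iters=20, step=0.02):
        """Greedy coordinate-descent congealing over per-image affine params."""
        n = len(images)
        # 6 affine parameters per image, initialized to the identity warp
        params = np.tile(np.array([1., 0., 0., 1., 0., 0.]), (n, 1))

        def warp(img, p):
            A = np.array([[p[0], p[1]], [p[2], p[3]]])
            return affine_transform(img, A, offset=p[4:], order=1)

        stack = np.stack([warp(im, p) for im, p in zip(images, params)])
        best = pixel_stack_entropy(stack)
        for _ in range(n_iters):
            for i in range(n):                  # one image at a time
                for j in range(6):              # perturb each parameter
                    for delta in (step, -step):
                        trial = params[i].copy()
                        trial[j] += delta
                        old = stack[i].copy()
                        stack[i] = warp(images[i], trial)
                        e = pixel_stack_entropy(stack)
                        if e < best:
                            best, params[i] = e, trial   # keep improving move
                        else:
                            stack[i] = old               # revert
        return params, stack

    Entropy minimization drives each pixel's value distribution across the stack toward agreement, which is why the images end up jointly aligned without any labels.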

    Incorporating Boltzmann Machine Priors for Semantic Labeling in Images and Videos

    Semantic labeling is the task of assigning category labels to regions in an image. For example, a scene may consist of regions corresponding to categories such as sky, water, and ground, or parts of a face such as eyes, nose, and mouth. Semantic labeling is an important mid-level vision task for grouping and organizing image regions into coherent parts. Labeling these regions allows us to better understand the scene itself as well as properties of the objects in the scene, such as their parts, location, and interaction within the scene. Typical approaches for this task include the conditional random field (CRF), which is well-suited to modeling local interactions among adjacent image regions. However, the CRF is limited in dealing with complex, global (long-range) interactions between regions in an image, and between frames in a video. This thesis presents approaches to modeling long-range interactions within images and videos, for use in semantic labeling. In order to model these long-range interactions, we incorporate priors based on the restricted Boltzmann machine (RBM). The RBM is a generative model which has demonstrated the ability to learn the shape of an object, and the conditional RBM (CRBM) is a temporal extension which can learn the motion of an object. Although the CRF is a good baseline labeler, we show how the RBM and CRBM can be added to the architecture to model both the global object shape within an image and the temporal dependencies of the object from previous frames in a video. We demonstrate the labeling performance of our models for the parts of complex face images from the Labeled Faces in the Wild database (for images) and the YouTube Faces Database (for videos). Our hybrid models produce results that are both quantitatively and qualitatively better than the baseline CRF alone for both images and videos.
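
    Schematically (our notation; the thesis's exact potentials differ in detail), the hybrid labeler augments the CRF energy over a label map y for image x with the free energy of an RBM shape prior on y:

        E(y | x) = \sum_{i} \psi_i(y_i, x) + \sum_{(i,j)} \psi_{ij}(y_i, y_j) - \log \sum_{h} \exp\big( y^{\top} W h + b^{\top} y + c^{\top} h \big)

    where the unary terms \psi_i score local appearance, the pairwise terms \psi_{ij} encourage smoothness between adjacent regions, and the RBM term (hidden units h, parameters W, b, c) lowers the energy of label maps with globally plausible shape; a CRBM additionally conditions on the label maps of previous frames to capture motion.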