
    Transformation-Invariant Analysis of Visual Signals with Parametric Models

    The analysis of collections of visual data, e.g., their classification, modeling, and clustering, has become a problem of high importance in a variety of applications. At the same time, image data captured in uncontrolled environments by arbitrary users is very likely to be exposed to geometric transformations. Efficient methods are therefore needed for analyzing high-dimensional visual data sets that can cope with geometric transformations of the visual content of interest. In this thesis, we study parametric models for transformation-invariant analysis of geometrically transformed image data, which provide low-dimensional image representations that capture the relevant information efficiently. We focus on transformation manifolds: image sets created by parametrizable geometric transformations of a reference image model. Transformation manifolds provide a geometric interpretation of several image analysis problems. In particular, image registration corresponds to computing the projection of the target image onto the transformation manifold of the reference image. Similarly, in classification, the class label of a query image can be estimated in a transformation-invariant way by comparing its distances to transformation manifolds that represent different image classes.

    In this thesis, we explore several problems related to the registration, modeling, and classification of images with transformation manifolds. First, we address the problem of sampling transformation manifolds of known parameterization, where the sampling is driven by the target applications of image registration and classification. We first propose an iterative algorithm for sampling a manifold such that the selected set of samples gives an accurate estimate of the distance of a query image to the manifold. We then extend this method to a classification setting with several transformation manifolds representing different image classes, and develop an algorithm that jointly samples multiple transformation manifolds such that the class label of query images can be estimated accurately by comparing their distances to the class-representative manifold samples. The proposed methods outperform baseline sampling schemes in image registration and classification.

    Next, we study the problem of learning transformation manifolds that are good models of a given set of geometrically transformed image data. We first learn a representative pattern whose transformation manifold fits the input images well, and then generalize the problem to a supervised classification setting, where we jointly learn multiple class-representative pattern transformation manifolds from training images with known class labels. The proposed manifold learning methods exploit knowledge of the type of geometric transformation present in the data, which previous manifold learning algorithms ignore, to compute an accurate data model.

    Finally, we focus on the use of transformation manifolds in multiscale image registration. We consider two registration methods, namely the tangent distance method and the minimization of the image intensity difference with gradient descent, and present a multiscale performance analysis of both. We derive upper bounds on the alignment errors yielded by the two methods and analyze how these bounds vary with noise and low-pass filtering, which helps in understanding the performance of these methods in image registration. To the best of our knowledge, these are the first such studies in multiscale registration settings.

    Geometrically transformed image sets have a particular structure, and classical image analysis methods are not always well suited to such data. This thesis is motivated by this observation and proposes new techniques and insights for handling geometric transformations in image analysis and processing.
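    As a toy illustration of the distance-to-manifold idea above (a minimal sketch, not the thesis's algorithms), the following Python snippet classifies a query image by comparing it against sampled points of each class's transformation manifold. Rotation stands in for the generic geometric transformation, uniform angle sampling stands in for the optimized sampling schemes, and all function names are our own.

```python
import numpy as np
from scipy.ndimage import rotate

def sample_rotation_manifold(pattern, angles):
    # Sample the transformation manifold of a reference pattern:
    # one image per value of the rotation parameter.
    return [rotate(pattern, angle, reshape=False) for angle in angles]

def distance_to_manifold(query, samples):
    # Approximate the projection of the query onto the manifold by its
    # nearest sample; the distance estimates the registration error.
    return min(np.linalg.norm(query - s) for s in samples)

def classify(query, class_patterns, angles=np.linspace(-30.0, 30.0, 13)):
    # Transformation-invariant classification: assign the label of the
    # class whose transformation manifold lies closest to the query.
    return min(class_patterns,
               key=lambda label: distance_to_manifold(
                   query, sample_rotation_manifold(class_patterns[label], angles)))
```

    Registration falls out of the same computation: the parameter of the nearest manifold sample is a coarse estimate of the aligning transformation, which denser or better-placed samples refine.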

    3-D scene reconstruction from image sequences

    In this paper we describe a technique for reconstructing a static 3-D scene from a monocular image sequence. The problem is formulated as a stochastic filtering problem in which the state variables describe the scene and the camera parameters, and the images are observations of this state. This rather general formulation allows us to drop requirements that are otherwise imposed, such as knowledge of the camera motion path or the availability of a binocular or trinocular image sequence. The scene is represented as a set of 3-D line segments, edge contours, etc. The computation time of each state update is linear in the number of scene line segments. The paper presents the results achieved and derives conclusions from them.
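    The paper's filtering equations are not reproduced here; as a minimal sketch of the stochastic-filtering formulation under standard (linearized) Kalman assumptions, one measurement update of a state holding the camera parameters and the 3-D line-segment geometry could look as follows. All names are our own, not the paper's.

```python
import numpy as np

def filter_update(x, P, z, H, R):
    # One linear(ized) measurement update of the scene/camera state.
    # x: state vector (camera parameters + 3-D line-segment geometry)
    # P: state covariance
    # z: image features measured in the new frame (the observation)
    # H: linearized projection of the state into the image
    # R: measurement noise covariance
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x + K @ (z - H @ x)       # corrected state estimate
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```

    If each image feature couples to only one line segment, the update decomposes into per-segment blocks, which would be consistent with the stated linear computation time; whether the paper uses exactly this structure is an assumption here.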

    Computing the Field-of-View of a Stitched Panorama to Create FoV Sensitive Virtual Environments

    The representation of a visual scene is a key issue in navigating through image-based virtual environments. Using affine and projective invariants, sequential/multiple images from a pivoting camera can easily be mosaiced into a planar sheet. We discuss the importance of the field of view for sharing the sensation of the scene observer (i.e., the camera person) in a virtual environment, and propose a method for obtaining the total field-of-view angle of image mosaics. Extensions to environments with moving objects, as well as applications, are also presented.

    1. Introduction
    Recent progress in computer graphics technology, especially in the texture-mapping (e.g., environment-mapping) of pixel images, is closing the gap between 3-D scene reconstruction from image sequences and the creation of geometrically modeled virtual environments. Owing to the visual perception abilities of humans, we can "feel" reality by mapping scene images (texture) onto roughly reconstructed geometric models such as human faces an..
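    As a rough illustration of the quantity being computed (our own pinhole-geometry reading, not necessarily the paper's invariant-based derivation): for a camera pivoting about its optical center, the total horizontal field of view of a planar mosaic follows from the mosaic width and the focal length, both measured in pixels.

```python
import math

def total_fov_deg(mosaic_width_px, focal_length_px):
    # Total horizontal FoV angle subtended by a planar mosaic built from a
    # pinhole camera pivoting about its optical center (our assumption).
    return math.degrees(2.0 * math.atan(mosaic_width_px / (2.0 * focal_length_px)))

# e.g., a 2400-pixel-wide mosaic from a camera with f = 800 px spans ~112.6 degrees
print(f"{total_fov_deg(2400, 800):.1f} deg")
```

    A planar sheet cannot represent a field of view of 180 degrees or more, which is one reason the total FoV angle matters when deciding how to render the mosaic in a virtual environment.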