Contributions to image-based object reconstruction: geometric and photometric aspects
EThOS - Electronic Theses Online Service, United Kingdom
Practical Euclidean reconstruction of buildings.
Chou Yun-Sum, Bailey. Thesis (M.Phil.), Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 89-92). Abstracts in English and Chinese.

List of Symbols
Chapter 1 - Introduction
  1.1 The Goal: Euclidean Reconstruction
  1.2 Historical Background
  1.3 Scope of the Thesis
  1.4 Thesis Outline
Chapter 2 - An Introduction to Stereo Vision and 3D Shape Reconstruction
  2.1 Homogeneous Coordinates
  2.2 Camera Model
    2.2.1 Pinhole Camera Model
  2.3 Camera Calibration
  2.4 Geometry of the Binocular System
  2.5 Stereo Matching
    2.5.1 Accuracy of Corresponding Points
    2.5.2 The Stereo Matching Approach
      2.5.2.1 Intensity-Based Stereo Matching
      2.5.2.2 Feature-Based Stereo Matching
    2.5.3 Matching Constraints
  2.6 3D Reconstruction
  2.7 Recent Developments in Self-Calibration
  2.8 Summary of the Chapter
Chapter 3 - Camera Calibration
  3.1 Introduction
  3.2 Camera Self-Calibration
  3.3 Self-Calibration under General Camera Motion
    3.3.1 Absolute-Conic-Based Techniques
    3.3.2 A Stratified Approach to Self-Calibration by Pollefeys
    3.3.3 Pollefeys' Self-Calibration with the Absolute Quadric
    3.3.4 Newsam's Self-Calibration with a Linear Algorithm
  3.4 Camera Self-Calibration under Specially Designed Motion Sequences
    3.4.1 Hartley's Self-Calibration by Pure Rotations
      3.4.1.1 Summary of the Algorithm
    3.4.2 Pollefeys' Self-Calibration with Varying Focal Length
      3.4.2.1 Summary of the Algorithm
    3.4.3 Faugeras' Self-Calibration of a 1D Projective Camera
  3.5 Summary of the Chapter
Chapter 4 - Self-Calibration under Planar Motions
  4.1 Introduction
  4.2 1D Projective Camera Self-Calibration
    4.2.1 1D Camera Model
    4.2.2 1D Projective Camera Self-Calibration Algorithms
    4.2.3 Planar Motion Detection
    4.2.4 Self-Calibration under Horizontal Planar Motions
    4.2.5 Self-Calibration under Three Different Planar Motions
    4.2.6 Result Analysis of the Self-Calibration Experiments
  4.3 Essential Matrix and Triangulation
  4.4 Merging of Partial 3D Models
  4.5 Summary of the Reconstruction Algorithms
  4.6 Experimental Results
    4.6.1 Experiment 1: A Simulated Box
    4.6.2 Experiment 2: A Real Building
    4.6.3 Experiment 3: A Sunflower
  4.7 Conclusion
Chapter 5 - Building Reconstruction Using a Linear Camera Self-Calibration Technique
  5.1 Introduction
  5.2 Metric Reconstruction from Partially Calibrated Images
    5.2.1 Partially Calibrated Camera
    5.2.2 Optimal Computation of the Fundamental Matrix (F)
    5.2.3 Linearly Recovering Two Focal Lengths from F
    5.2.4 Essential Matrix and Triangulation
  5.3 Experiments and Discussions
  5.4 Conclusion
Chapter 6 - Refining the Basic Model with Detailed Depth Information by a Model-Based Stereo Technique
  6.1 Introduction
  6.2 Model-Based Epipolar Geometry
    6.2.1 Overview
    6.2.2 Warped Offset Image Preparation
    6.2.3 Epipolar Line Calculation
    6.2.4 Finding Actual Corresponding Points by Stereo Matching
    6.2.5 Generating Actual 3D Points by Triangulation
  6.3 Summary of the Algorithms
  6.4 Experiments and Discussions
  6.5 Conclusion
Chapter 7 - Conclusions
  7.1 Summary
  7.2 Future Work
Bibliography
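The pinhole camera model and homogeneous coordinates covered in Chapter 2 can be sketched in a few lines: a 3D world point is mapped to a pixel by a 3x4 projection matrix built from the intrinsics K and the pose (R, t). The intrinsic values below are illustrative, not taken from the thesis.

```python
import numpy as np

def pinhole_project(K, R, t, X):
    """Project a 3D world point X to pixel coordinates via the
    pinhole model: x ~ K [R | t] X in homogeneous coordinates."""
    P = K @ np.hstack([R, t.reshape(3, 1)])  # 3x4 projection matrix
    x = P @ np.append(X, 1.0)                # homogeneous image point
    return x[:2] / x[2]                      # dehomogenise

# Illustrative intrinsics: focal length 800 px, principal point (320, 240)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
# A point on the optical axis lands on the principal point
print(pinhole_project(K, R, t, np.array([0.0, 0.0, 2.0])))  # -> [320. 240.]
```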
Robust surface modelling of visual hull from multiple silhouettes
Reconstructing depth information from images is one of the most actively researched themes in computer vision, and its applications span most areas of vision research, from object recognition to realistic visualisation. Amongst other useful vision-based reconstruction techniques, this thesis extensively investigates the visual hull (VH) concept for volume approximation and its robust surface modelling when various views of an object are available. Assuming that multiple images are captured under circular motion, the projection matrices are generally parameterised in terms of a rotation angle from a reference position in order to simplify the multi-camera calibration. However, this assumption is often violated in practice: a pure rotation in a planar motion with accurately known rotation angles is hard to realise. To address this problem, this thesis first proposes a calibration method for approximate circular motion.
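The circular-motion parameterisation described above can be sketched as a projection matrix that depends only on a rotation angle, as in a toy turntable model. The function name and the choice of rotation axis are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

def turntable_projection(K, theta, radius):
    """Projection matrix for one view of a turntable sequence: the object
    rotates by theta about the vertical axis while the camera stays at a
    fixed distance, so each view differs only in the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, 0.0, -s],
                  [0.0, 1.0, 0.0],
                  [s, 0.0, c]])             # rotation about the y-axis
    t = np.array([[0.0], [0.0], [radius]])  # fixed camera distance
    return K @ np.hstack([R, t])
```

Whatever the angle, the rotation centre projects to the same pixel, which is what makes the single-angle parameterisation convenient for multi-camera calibration.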
With these modified projection matrices, the resulting VH is represented by a hierarchical tree structure of voxels from which surfaces are extracted by the Marching Cubes (MC) algorithm. However, the surfaces may exhibit unexpected artefacts caused by coarse volume reconstruction, the topological ambiguity of the MC algorithm, and imperfect image processing or calibration results. To avoid this sensitivity, this thesis proposes a robust surface construction algorithm which first classifies local convex regions from imperfect MC vertices and then aggregates local surfaces constructed by the 3D convex hull algorithm. Furthermore, the thesis explores the use of wide-baseline images to refine a coarse VH using an affine-invariant region descriptor, which improves the quality of the VH when only a small number of initial views is given.
In conclusion, the proposed methods achieve a 3D model with enhanced accuracy, and robust surface modelling is retained even when the silhouette images are degraded by practical noise.
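The VH idea underlying this work, the intersection of back-projected silhouette cones, can be sketched with a brute-force voxel test. This is a minimal sketch under assumed inputs; the thesis's hierarchical voxel tree and robust surface extraction are not reproduced here.

```python
import numpy as np

def carve_visual_hull(grid, silhouettes, projections):
    """Keep a grid point only if it projects inside every silhouette mask;
    the surviving points approximate the visual hull volume."""
    kept = []
    for X in grid:
        X_h = np.append(X, 1.0)
        inside_all = True
        for sil, P in zip(silhouettes, projections):
            x = P @ X_h                       # project into this view
            u = int(round(x[0] / x[2]))
            v = int(round(x[1] / x[2]))
            # carve the point if it falls outside the image or silhouette
            if not (0 <= v < sil.shape[0] and 0 <= u < sil.shape[1]
                    and sil[v, u]):
                inside_all = False
                break
        if inside_all:
            kept.append(X)
    return np.array(kept)
```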
Visual detection of independently moving objects by a moving monocular observer (Visuelle Detektion unabhängig bewegter Objekte durch einen bewegten monokularen Beobachter)
The development of a driver assistance system supporting drivers in complex intersection situations would be a major achievement for traffic safety, since many traffic accidents happen in such situations. While building a complete system of this kind remains a highly complex, unsolved task, this thesis focuses on one important and obligatory aspect of such systems: the visual detection of independently moving objects. Information about moving objects can, for example, be used in an attention guidance system, which is a central component of any complete intersection assistant system. The decision to base such a system on visual input had two reasons: (i) humans gather their information to a large extent visually, and (ii) cameras are inexpensive and already widely used in luxury and professional vehicles for specific applications. Mimicking the articulated human head and eyes, agile camera systems are desirable; to avoid heavy and sensitive stereo rigs, a small and lightweight monocular camera mounted on a pan-tilt unit was chosen as the input device. Using the information about moving objects, this thesis develops a prototype of an attention guidance system. It is based on the analysis of sequences from a single freely moving camera and on measurements from inertial sensors rigidly coupled with the camera system.
Generalising the ideal pinhole model to multi-pupil imaging for depth recovery
This thesis investigates the applicability of computer vision camera models in recovering depth information from images, and presents a novel camera model incorporating a modified pupil plane capable of performing this task accurately from a single image. Standard models, such
as the ideal pinhole, suffer a loss of depth information when projecting from the world to an image plane. Recovery of this data enables reconstruction of the original scene as well as object and 3D motion reconstruction. The major contributions of this thesis are the complete characterisation of the ideal pinhole model calibration and the development of a new multi-pupil imaging model which enables depth recovery. A comprehensive analysis of the calibration sensitivity of the ideal pinhole model is presented along with a novel method of capturing calibration
images which avoid singularities in image space. Experimentation reveals a higher degree of accuracy using the new calibration images. A novel camera model employing multiple pupils is proposed which, in contrast to the ideal pinhole model, recovers scene depth. The accuracy of the multi-pupil model is demonstrated and validated through rigorous experimentation. An integral property of any camera model is the location of its pupil. To this end, the new model is expanded by generalising the location of the multi-pupil plane, thus enabling superior flexibility over traditional camera models which are confined to positioning
the pupil plane to negate particular aberrations in the lens. A key step in the development of the multi-pupil model is the treatment of optical aberrations in the imaging system. The unconstrained location and configuration of the pupil plane enables the determination of optical distortions in the multi-pupil imaging model. A calibration algorithm is proposed which corrects for the optical aberrations. This allows the multi-pupil model to be applied to a multitude of imaging systems regardless of the optical quality of the lens. Experimentation validates the multi-pupil model's accuracy in accounting for the aberrations and estimating accurate depth information from a single image. Results for object reconstruction are presented, establishing the capabilities of the proposed multi-pupil imaging model.
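The depth loss of the ideal pinhole model discussed above is easy to demonstrate: every point on a ray through the centre of projection maps to the same pixel, so a single image cannot separate near from far. The intrinsic values below are illustrative.

```python
import numpy as np

# Illustrative intrinsics; identity pose, so the camera frame is the world frame.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(X):
    """Ideal pinhole projection of a camera-frame point to pixels."""
    x = K @ X
    return x[:2] / x[2]

near = np.array([0.2, 0.1, 1.0])
far = 3.0 * near   # same viewing ray, three times the depth
# Both points land on exactly the same pixel: depth is unrecoverable
# from a single ideal pinhole image.
print(project(near), project(far))
```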
Intelligent image cropping and scaling
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 2011.

Nowadays there are a huge number of end devices with different screen properties for watching television content, which is either broadcast or transmitted over the internet. To allow the best viewing conditions on each of these devices, different image formats have to be provided by the broadcaster. Producing content separately for every single format is, however, not feasible for the broadcaster, as it would be far too laborious and costly.
The most obvious solution for providing multiple image formats is to produce one high-resolution format and derive formats of lower resolution from it. One possibility is to simply scale video images to the resolution of the target image format. Two significant drawbacks are the loss of image detail through downscaling and possibly unused image areas due to letterboxes or pillarboxes. A preferable solution is first to find the contextually most important region in the high-resolution format and then crop this area with the aspect ratio of the target image format. On the other hand, defining the contextually most important region manually is very time-consuming, and applying that to live productions would be nearly impossible. Therefore, some approaches exist that define cropping areas automatically. To do so, they extract visual features, like moving areas in a video, and define regions of interest (ROIs) based on those. The ROIs are finally used to define an enclosing cropping area. The extraction of features is done without any knowledge about the type of content. Hence, these approaches are not able to distinguish between features that might be important in a given context and those that are not.
The work presented within this thesis tackles the problem of extracting visual features based on prior knowledge about the content. Such knowledge is fed into the system in the form of metadata that is available from TV production environments. Based on the extracted features, ROIs are then defined and filtered depending on the analysed content. As a proof of concept, the application finally adapts SDTV (Standard Definition Television) sports productions automatically to image formats of lower resolution through intelligent cropping and scaling. If no content information is available, the system can still be applied to any type of content through a default mode. The presented approach is based on the principle of a plug-in system. Each plug-in represents a method for analysing video content information, either on a low level by extracting image features or on a higher level by processing extracted ROIs. The combination of plug-ins is determined by the incoming descriptive production metadata and hence can be adapted to each type of sport individually. The application has been comprehensively evaluated by comparing the results of the system against alternative cropping methods. This evaluation used videos that were manually cropped by a professional video editor, statically cropped videos, and simply scaled, non-cropped videos. In addition to purely subjective evaluations, the gaze positions of subjects watching sports videos have been measured and compared to the ROI positions extracted by the system.
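The final cropping step, enclosing an ROI in a window with the target aspect ratio before scaling, might look like the sketch below. The function name and the centre-then-clamp policy are assumptions for illustration, not the thesis's implementation.

```python
def crop_to_aspect(roi, img_w, img_h, target_w, target_h):
    """Smallest crop window with the target aspect ratio that contains the
    ROI (x, y, w, h), centred on the ROI and clamped to the image bounds."""
    x, y, w, h = roi
    aspect = target_w / target_h
    if w / h < aspect:          # ROI too narrow: widen to the target aspect
        w, h = h * aspect, h
    else:                       # ROI too wide: grow the height instead
        w, h = w, w / aspect
    cx = roi[0] + roi[2] / 2.0  # ROI centre
    cy = roi[1] + roi[3] / 2.0
    x = min(max(cx - w / 2.0, 0.0), img_w - w)  # clamp to image bounds
    y = min(max(cy - h / 2.0, 0.0), img_h - h)
    return x, y, w, h
```

The returned window would then be scaled to the target resolution, avoiding both letterboxing and the detail loss of whole-frame downscaling.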
Trajectory based video analysis in multi-camera setups
This thesis presents an automated framework for activity analysis in multi-camera
setups. We start with the calibration of the cameras, in particular cameras without overlapping views. An algorithm is presented that exploits trajectory observations in each view and works iteratively on camera pairs. First, outliers are identified and removed from the observations of each camera. Next, spatio-temporal information derived from the available trajectories is used to estimate unobserved trajectory segments in areas not covered by the cameras. The unobserved trajectory estimates are used to estimate the relative position of each camera pair, whereas the exit-entrance direction of each object is used to estimate their relative orientation. The process continues and iteratively approximates the configuration of all cameras with respect to each other. Finally, we refine the initial configuration estimates with bundle adjustment, based on the observed and estimated trajectory segments. For cameras with overlapping views, state-of-the-art homography-based approaches are used for calibration.
Next we establish object correspondence across multiple views. Our algorithm
consists of three steps, namely association, fusion and linkage. For association,
local trajectory pairs corresponding to the same physical object are estimated using
multiple spatio-temporal features on a common ground plane. To disambiguate
spurious associations, we employ a hybrid approach that utilises the matching results
on the image plane and ground plane. The trajectory segments after association
are fused by adaptive averaging. Trajectory linkage then integrates segments and generates a single trajectory of an object across the entire observed area.
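The fusion step can be sketched as confidence-weighted averaging of two temporally aligned segments. This is a sketch under assumptions: the thesis does not specify its adaptive weighting here, so the per-sample confidence weights below are hypothetical.

```python
import numpy as np

def fuse_segments(seg_a, seg_b, w_a, w_b):
    """Fuse two temporally aligned trajectory estimates (N x 2 arrays of
    ground-plane positions) of the same object by weighted averaging;
    w_a and w_b are per-sample confidences for each view."""
    seg_a = np.asarray(seg_a, dtype=float)
    seg_b = np.asarray(seg_b, dtype=float)
    w_a = np.asarray(w_a, dtype=float)
    w_b = np.asarray(w_b, dtype=float)
    total = (w_a + w_b)[:, None]            # normaliser per sample
    return (seg_a * w_a[:, None] + seg_b * w_b[:, None]) / total
```

With equal weights this reduces to a plain midpoint average; unequal weights pull the fused track toward the more reliable view.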
Finally, for activity analysis, clustering is applied to the complete trajectories. Our clustering algorithm is based on four main steps, namely the extraction of a set of representative trajectory features, non-parametric clustering, cluster merging and information fusion for the identification of normal and rare object motion patterns. First we transform the trajectories into a set of feature spaces in which Mean-shift identifies the modes and the corresponding clusters. Furthermore, a merging procedure is devised to refine these results by combining similar adjacent clusters. The final common patterns are estimated by fusing the clustering results across all feature spaces. Clusters corresponding to recurring trajectories are considered normal, whereas sparse trajectories are associated with abnormal and rare events.
The performance of the proposed framework is evaluated on standard datasets and compared with state-of-the-art techniques. Experimental results show that the proposed framework outperforms state-of-the-art algorithms in terms of both accuracy and robustness.
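The mode-seeking step of the clustering stage can be sketched with a flat-kernel Mean-shift on 2D feature points. This is a toy version: the bandwidth, the flat kernel, and the fixed iteration count are assumptions for illustration, not the thesis's configuration.

```python
import numpy as np

def mean_shift(points, bandwidth, iters=30):
    """Shift each seed to the mean of the data points within `bandwidth`
    until it settles on a density mode; seeds that converge to the same
    mode belong to the same cluster."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()                    # seed one mode per point
    for _ in range(iters):
        for i, p in enumerate(modes):
            # flat kernel: unweighted mean of neighbours within the bandwidth
            near = points[np.linalg.norm(points - p, axis=1) < bandwidth]
            modes[i] = near.mean(axis=0)
    return modes
```

Trajectories whose feature points converge to a densely populated mode would be labelled normal patterns; those settling on sparsely supported modes correspond to the rare events described above.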