43 research outputs found

    Control de robots móviles mediante visión omnidireccional utilizando la geometría de tres vistas

    This work addresses the visual control of mobile robots. Within this broad field of research, we pay special attention to two elements: omnidirectional vision and multi-view geometric models. Omnidirectional cameras provide very precise angular information, although they exhibit significant distortion in the radial direction. Their wide field of view makes these cameras well suited to robot navigation tasks. In turn, geometric models that relate different views of a scene make it possible to reject erroneous matches of visual features between images, and thereby make the vision-based control process more robust. Our work presents two visual control techniques for a robot moving on the ground plane. First, we propose a new method for visual homing that uses the information provided by a set of reference images previously acquired in the environment, together with the images the robot takes along its motion. To exploit the strengths of omnidirectional vision, our homing method is purely angular and uses no distance information at all. This characteristic, together with the fact that the motion takes place in a plane, motivates the use of the geometric model given by the 1D trifocal tensor. In particular, the geometric constraints imposed by this tensor, which can be computed from point correspondences among three images, improve the robustness of the control in the presence of matching errors. The interest of our proposal lies in the fact that the control method computes the robot velocities from purely angular information, which is very precise in omnidirectional cameras.
In addition, we present a procedure that computes the angular relations between the available views indirectly, without requiring visual information shared among all of them. The described technique can be classified as image-based, since it neither requires estimating the robot's location nor uses 3D information. The robot converges to the target position without knowing metric information about the trajectory followed. For some applications, such as obstacle avoidance, more information about the 3D motion performed may be needed. With this idea in mind, we present a new visual control method based on sinusoidal inputs. Sinusoids are smoothly varying functions with well-known mathematical properties, which makes them suitable for vehicle parking maneuvers. From the sinusoidally varying velocities defined in our design, we obtain analytical expressions for the evolution of the robot's state variables. Moreover, based on these expressions, we propose a state-feedback control method. The robot state is estimated from the 1D trifocal tensor computed among the target view, the initial view, and the robot's current view. Through this sinusoidal control, the robot becomes aligned with the target position. In a second step, we correct the depth by means of a control law defined directly in terms of the 1D trifocal tensor. The performance of the two proposed controllers is illustrated through simulations, and their viability is supported by stability analyses and by results from simulations and experiments with real images.
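
The abstract above relies on the fact that the angular (bearing) information in an omnidirectional image is unaffected by radial distortion. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a central omnidirectional camera with a vertical axis and a known image centre; the function names, the image-centre parameters, and the (sin, cos) convention for the homogeneous 1D point are all hypothetical choices made for illustration.

```python
import math


def pixel_to_bearing(u, v, cu, cv):
    """Bearing in radians of pixel (u, v) about the assumed image centre (cu, cv)."""
    return math.atan2(v - cv, u - cu)


def bearing_to_1d_point(theta):
    """Homogeneous 1D point for a bearing (the (sin, cos) order is an assumed convention)."""
    return (math.sin(theta), math.cos(theta))


def relative_bearing(theta_a, theta_b):
    """Signed difference theta_b - theta_a, wrapped to the range [-pi, pi]."""
    d = theta_b - theta_a
    return math.atan2(math.sin(d), math.cos(d))


if __name__ == "__main__":
    centre = (320.0, 240.0)  # hypothetical image centre
    near = pixel_to_bearing(350.0, 280.0, *centre)
    far = pixel_to_bearing(380.0, 320.0, *centre)  # same ray, twice the radius
    print(near, far)  # the two bearings agree up to rounding
```

Moving a feature along its image ray, which is what purely radial distortion does, leaves the bearing unchanged, so only the angle needs to be passed on to a 1D three-view model.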

    Rank classification of linear line structure in determining trifocal tensor.

    Zhao, Ming. Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. Includes bibliographical references (p. 111-117) and index. Abstracts in English and Chinese.
    Contents:
    Chapter 1. Introduction
        1.1 Motivation
        1.2 Objective of the study
        1.3 Challenges and our approach
        1.4 Original contributions
        1.5 Organization of this dissertation
    Chapter 2. Related Work
        2.1 Critical configuration for motion estimation and projective reconstruction
            2.1.1 Point feature
            2.1.2 Line feature
        2.2 Camera motion estimation
            2.2.1 Line tracking
            2.2.2 Determining camera motion
    Chapter 3. Preliminaries on Three-View Geometry and Trifocal Tensor
        3.1 Projective spaces P3 and transformations
        3.2 The trifocal tensor
        3.3 Computation of the trifocal tensor: normalized linear algorithm
    Chapter 4. Linear Line Structures
        4.1 Models of line space
        4.2 Line structures
            4.2.1 Linear line space
            4.2.2 Ruled surface
            4.2.3 Line congruence
            4.2.4 Line complex
    Chapter 5. Critical Configurations of Three Views Revealed by Line Correspondences
        5.1 Two-view degeneracy
        5.2 Three-view degeneracy
            5.2.1 Introduction
            5.2.2 Linear line space
            5.2.3 Linear ruled surface
            5.2.4 Linear line congruence
            5.2.5 Linear line complex
        5.3 Retrieving tensor in critical configurations
        5.4 Rank classification of non-linear line structures
    Chapter 6. Camera Motion Estimation Framework
        6.1 Line extraction
        6.2 Line tracking
            6.2.1 Preliminary geometric tracking
            6.2.2 Experimental results
        6.3 Camera motion estimation framework using EKF
    Chapter 7. Experimental Results
        7.1 Simulated data experiments
        7.2 Real data experiments
            7.2.1 Linear line space
            7.2.2 Linear ruled surface
            7.2.3 Linear line congruence
            7.2.4 Linear line complex
        7.3 Empirical observation: ruled plane for line transfer
        7.4 Simulation for non-linear line structures
    Chapter 8. Conclusions and Future Work
        8.1 Summary
        8.2 Future work
    Chapter A. Notations
    Chapter B. Tensor
    Chapter C. Matrix Decomposition and Estimation Techniques
    Chapter D. MATLAB Files
        D.1 Estimation matrix
        D.2 Line transfer
        D.3 Simulation

    Image Based View Synthesis

    This dissertation deals with the image-based approach to synthesize a virtual scene using sparse images or a video sequence without the use of 3D models. In our scenario, a real dynamic or static scene is captured by a set of un-calibrated images from different viewpoints. After automatically recovering the geometric transformations between these images, a series of photo-realistic virtual views can be rendered and a virtual environment covered by these several static cameras can be synthesized. This image-based approach has applications in object recognition, object transfer, video synthesis and video compression. In this dissertation, I have contributed to several sub-problems related to image based view synthesis. Before image-based view synthesis can be performed, images need to be segmented into individual objects. Assuming that a scene can approximately be described by multiple planar regions, I have developed a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, correctly detect the occlusion pixels over multiple consecutive frames, and accurately segment the scene into several motion layers. First, a number of seed regions using correspondences in two frames are determined, and the seed regions are expanded and outliers are rejected employing the graph cuts method integrated with level set representation. Next, these initial regions are merged into several initial layers according to the motion similarity. Third, the occlusion order constraints on multiple frames are explored, which guarantee that the occlusion area increases with the temporal order in a short period and effectively maintains segmentation consistency over multiple consecutive frames. Then the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined. 
Several experimental results show that our approach is effective and robust. Recovering the geometrical transformations among images of a scene is a prerequisite step for image-based view synthesis. I have developed a wide baseline matching algorithm to identify the correspondences between two un-calibrated images, and to further determine the geometric relationship between images, such as epipolar geometry or projective transformation. In our approach, a set of salient features, edge-corners, are detected to provide robust and consistent matching primitives. Then, based on the Singular Value Decomposition (SVD) of an affine matrix, we effectively quantize the search space into two independent subspaces for rotation angle and scaling factor, and then we use a two-stage affine matching algorithm to obtain robust matches between these two frames. The experimental results on a number of wide baseline images demonstrate that our matching method outperforms the state-of-the-art algorithms even under significant camera motion, illumination variation, occlusion, and self-similarity. Given the wide baseline matches among images, I have developed a novel method for dynamic view morphing. Dynamic view morphing deals with scenes containing moving objects in the presence of camera motion. The objects can be rigid or non-rigid, and each of them can move in any orientation or direction. The proposed method can generate a series of continuous and physically accurate intermediate views from only two reference images without any knowledge about 3D. The procedure consists of three steps: segmentation, morphing and post-warping. Given a boundary connection constraint, the source and target scenes are segmented into several layers for morphing. Based on the decomposition of affine transformation between corresponding points, we uniquely determine a physically correct path for post-warping by the least distortion method. 
I have successfully generalized the dynamic scene synthesis problem from the simple scene with only rotation to the dynamic scene containing non-rigid objects. My method can handle dynamic rigid or non-rigid objects, including complicated objects such as humans. Finally, I have also developed a novel algorithm for tri-view morphing. This is an efficient image-based method to navigate a scene based on only three wide-baseline un-calibrated images without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images using our wide baseline matching method, an accurate trifocal plane is extracted from the trifocal tensor implied in these three images. Next, employing a trinocular-stereo algorithm and barycentric blending technique, we generate an arbitrary novel view to navigate the scene in a 2D space. Furthermore, after self-calibration of the cameras, a 3D model can also be correctly augmented into this virtual environment synthesized by the tri-view morphing algorithm. We have applied our view morphing framework to several interesting applications: 4D video synthesis, automatic target recognition, multi-view morphing
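
The abstract above names barycentric blending as the step that combines three reference views into a novel view, but does not give its formulation. The following is a minimal sketch of generic barycentric weighting, not the dissertation's algorithm. It assumes the virtual viewpoint is a 2D point expressed relative to a triangle whose vertices stand for the three reference views; the function names and the sample intensities are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle: the three vertices are collinear")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def blend(values, weights):
    """Weighted sum of one value per reference view (e.g. a pixel intensity)."""
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Hypothetical layout: the three reference views sit at the triangle vertices.
    views = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    weights = barycentric_weights((0.25, 0.25), *views)
    # Blend the intensities that one corresponding pixel has in the three views.
    print(weights, blend((30.0, 60.0, 90.0), weights))
```

The weights sum to one and are all non-negative exactly when the virtual viewpoint lies inside the triangle, which is why this kind of weighting gives a smooth transition as the viewpoint moves between the three reference views.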

    The Geometry of Dynamic Scenes - On Coplanar and Convergent Linear Motions Embedded in 3D Static Scenes

    In this paper, we consider structure and motion recovery for scenes consisting of static and dynamic features. More particularly, we consider a single moving uncalibrated camera observing a scene consisting of points moving along straight lines converging to a unique point and lying on a motion plane. This scenario may describe a roadway observed by a moving camera whose motion is unknown. We show that there exist matching tensors similar to fundamental matrices. We derive the link between dynamic and static structure and motion and show how the equation of the motion plane (or equivalently the plane homographies it induces between images) may be recovered from dynamic features only. Experimental results on real images are provided, in particular on a 60-frame video sequence.

    Multiple-camera capture system implementation

    The project consists of studying and analyzing different techniques for acquiring 3D scenes using a set of cameras that observe the scene from multiple views. Algorithms for camera calibration will also be considered and implemented. Moreover, algorithms for estimating the depth of the objects in the scene, using the information provided by two, three, or more cameras, will also be developed.

    04251 -- Imaging Beyond the Pinhole Camera

    From 13.06.04 to 18.06.04, the Dagstuhl Seminar 04251 "Imaging Beyond the Pin-hole Camera. 12th Seminar on Theoretical Foundations of Computer Vision" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.



    Pointing, Acquisition, and Tracking Systems for Free-Space Optical Communication Links

    Pointing, acquisition, and tracking (PAT) systems have been widely applied in many applications, from short-range (e.g. human motion tracking) to long-haul (e.g. missile guidance) systems. This dissertation extends the PAT system into new territory: free space optical (FSO) communication system alignment, the most important missing ingredient for practical deployment. Exploring embedded geometric invariances intrinsic to the rigidity of actuators and sensors is a key design feature. Once the configuration of the actuator and sensor is determined, the geometric invariance is fixed and can therefore be calibrated in advance. This calibrated invariance further serves as a transformation for converting the sensor measurement to actuator action. The challenge of the FSO alignment problem lies in how to point to a 3D target using only a 2D sensor. Two solutions are proposed. The first exploits the invariance, known as the linear homography, embedded in FSO applications that involve a long link length between transceivers or have planar trajectories. The second employs either an additional 2D or 1D sensor, which results in invariances known as the trifocal tensor and radial trifocal tensor, respectively. Since these invariances have been developed upon the assumption that the measurements from sensors are free from noise, including the uncertainty resulting from aberrations, a robust calibration algorithm is required to retrieve the optimal invariance from noisy measurements. The first solution is sufficient for most of the PAT systems used for FSO alignment, since a long link length constraint is generally the case. Although PAT systems are normally categorized into coarse and fine subsystems to deal with different requirements, they are proven to be governed by a linear homography. Robust calibration algorithms have been developed during this work and further verified by simulations. 
Two prototype systems have been developed: one serves as a fine pointing subsystem, which consists of a beam steerer and an angular resolver, while the other serves as a coarse pointing subsystem, which consists of a rotary gimbal and a camera. The average pointing errors of the two prototypes were less than 170 and 700 microradians, respectively. PAT systems based on the second solution are capable of pointing to any target within the intersected field of view of both sensors, because two sensors provide stereo vision to determine the depth of the target, the missing information that cannot be determined by a single 2D sensor. They are only required when short-distance FSO communication links must be established. Two simulations were conducted to show the robustness of the calibration procedures and the pointing accuracy with respect to random noise.
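
The abstract above describes a pre-calibrated linear homography that converts a 2D sensor measurement into an actuator action. The following is a minimal sketch of applying such a 3x3 homography to one measurement, not the dissertation's calibration or pointing code. The function name and the matrix values are hypothetical; a real matrix would come from the calibration step the abstract mentions.

```python
def apply_homography(H, x, y):
    """Map a 2D point (x, y) through a 3x3 homography H given as nested lists."""
    X = H[0][0] * x + H[0][1] * y + H[0][2]
    Y = H[1][0] * x + H[1][1] * y + H[1][2]
    W = H[2][0] * x + H[2][1] * y + H[2][2]
    if W == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to 2D.
    return X / W, Y / W


if __name__ == "__main__":
    # Hypothetical calibrated matrix: scale by 2, then shift by (1, -1).
    H_calibrated = [
        [2.0, 0.0, 1.0],
        [0.0, 2.0, -1.0],
        [0.0, 0.0, 1.0],
    ]
    sensor_measurement = (3.0, 4.0)
    actuator_command = apply_homography(H_calibrated, *sensor_measurement)
    print(actuator_command)
```

Because the matrix is fixed once the actuator and sensor are rigidly mounted, the per-measurement cost at run time is a single matrix-vector product and one division.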



    Practical Euclidean reconstruction of buildings.

    Chou Yun-Sum, Bailey. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 89-92). Abstracts in English and Chinese.
    Contents:
    List of Symbol
    Chapter 1. Introduction
        1.1 The Goal: Euclidean Reconstruction
        1.2 Historical background
        1.3 Scope of the thesis
        1.4 Thesis Outline
    Chapter 2. An introduction to stereo vision and 3D shape reconstruction
        2.1 Homogeneous Coordinates
        2.2 Camera Model
            2.2.1 Pinhole Camera Model
        2.3 Camera Calibration
        2.4 Geometry of Binocular System
        2.5 Stereo Matching
            2.5.1 Accuracy of Corresponding Point
            2.5.2 The Stereo Matching Approach
                Intensity-based stereo matching
                Feature-based stereo matching
            2.5.3 Matching Constraints
        2.6 3D Reconstruction
        2.7 Recent development on self calibration
        2.8 Summary of the Chapter
    Chapter 3. Camera Calibration
        3.1 Introduction
        3.2 Camera Self-calibration
        3.3 Self-calibration under general camera motion
            3.3.1 The absolute Conic Based Techniques
            3.3.2 A Stratified approach for self-calibration by Pollefeys
            3.3.3 Pollefeys self-calibration with Absolute Quadric
            3.3.4 Newsam's self-calibration with linear algorithm
        3.4 Camera Self-calibration under specially designed motion sequence
            3.4.1 Hartley's self-calibration by pure rotations
                Summary of the Algorithm
            3.4.2 Pollefeys self-calibration with variant focal length
                Summary of the Algorithm
            3.4.3 Faugeras self-calibration of a 1D Projective Camera
        3.5 Summary of the Chapter
    Chapter 4. Self-calibration under Planar motions
        4.1 Introduction
        4.2 1D Projective Camera Self-calibration
            4.2.1 1-D camera model
            4.2.2 1-D Projective Camera Self-calibration Algorithms
            4.2.3 Planar motion detection
            4.2.4 Self-calibration under horizontal planar motions
            4.2.5 Self-calibration under three different planar motions
            4.2.6 Result analysis on self-calibration Experiments
        4.3 Essential Matrix and Triangulation
        4.4 Merge of Partial 3D models
        4.5 Summary of the Reconstruction Algorithms
        4.6 Experimental Results
            4.6.1 Experiment 1: A Simulated Box
            4.6.2 Experiment 2: A Real Building
            4.6.3 Experiment 3: A Sun Flower
        4.7 Conclusion
    Chapter 5. Building Reconstruction using a linear camera self-calibration technique
        5.1 Introduction
        5.2 Metric Reconstruction from Partially Calibrated image
            5.2.1 Partially Calibrated Camera
            5.2.2 Optimal Computation of Fundamental Matrix (F)
            5.2.3 Linearly Recovering Two Focal Lengths from F
            5.2.4 Essential Matrix and Triangulation
        5.3 Experiments and Discussions
        5.4 Conclusion
    Chapter 6. Refine the basic model with detail depth information by a Model-Based Stereo technique
        6.1 Introduction
        6.2 Model Based Epipolar Geometry
            6.2.1 Overview
            6.2.2 Warped offset image preparation
            6.2.3 Epipolar line calculation
            6.2.4 Actual corresponding point finding by stereo matching
            6.2.5 Actual 3D point generated by Triangulation
        6.3 Summary of the Algorithms
        6.4 Experiments and discussions
        6.5 Conclusion
    Chapter 7. Conclusions
        7.1 Summary
        7.2 Future Work
    Bibliography