Depth Recovery with Rectification Using a Single-Lens Prism-Based Stereovision System
Ph.D. (Doctor of Philosophy)
Plenoptic Signal Processing for Robust Vision in Field Robotics
This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications.
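The closed-form estimation of motion from first-order derivatives mentioned in the abstract can be illustrated in its simplest 2D form, solving a small linear system rather than iterating. This is a minimal sketch of that general idea, not the thesis's plenoptic formulation; the derivative arrays `Ix`, `Iy`, `It` are assumed to be precomputed.

```python
import numpy as np

def flow_from_derivatives(Ix, Iy, It):
    """Closed-form least-squares flow estimate from first-order image
    derivatives: the 2D analogue of the non-iterative, constant-runtime
    approach the abstract describes.  Solves the normal equations

        [sum Ix^2   sum IxIy] [u]   [-sum IxIt]
        [sum IxIy   sum Iy^2] [v] = [-sum IyIt]
    """
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)  # (u, v); one solve, no iteration
```

Because the solution is a single linear solve over precomputed sums, the runtime is constant and predictable, which is the property the abstract highlights for embedded use.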
EXTRACTING DEPTH INFORMATION FROM A STEREO VISION SYSTEM USING CORRELATION-BASED AND FEATURE-BASED METHODS
This thesis presents a new method to extract depth information from stereo-vision acquisitions using feature-based and correlation-based approaches. The main application of the proposed method is in the area of autonomous pick-and-place using a robotic manipulator. Current vision-guided robotic systems are still based on a priori training and teaching steps, and still suffer from long response times. The study uses a stereo triangulation setup in which two charge-coupled devices (CCDs) are arranged to acquire the scene from two different perspectives. Two methods to calculate depth are detailed. First, a correlation matching routine is programmed using a sum-of-squared-differences (SSD) algorithm to search for corresponding points in the left and right images. The SSD is further modified using an adjustable region of interest (ROI) along with center-of-gravity-based calculations. Furthermore, the two perspective images are rectified to reduce the required processing time. Second, a feature-based approach is proposed to match objects across the two perspectives; the proposed method implements a search kernel based on the 8-connected-neighbor principle. The reported error in depth using the feature method is found to be around 1.2 m.
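The SSD correlation matching and triangulation steps described above can be sketched as follows. This is a generic illustration rather than the thesis's exact routine; the window size, disparity range, focal length and baseline below are hypothetical parameters.

```python
import numpy as np

def ssd_match(left, right, row, x_left, win=5, max_disp=32):
    """Find the disparity of pixel (row, x_left) by sliding a window
    along the same row of the right image and minimising the sum of
    squared differences (SSD).  Assumes a rectified image pair, so the
    search is restricted to a single scanline."""
    h = win // 2
    patch = left[row - h:row + h + 1, x_left - h:x_left + h + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp):
        x = x_left - d
        if x - h < 0:
            break
        cand = right[row - h:row + h + 1, x - h:x + h + 1].astype(float)
        cost = np.sum((patch - cand) ** 2)
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def depth_from_disparity(d, focal_px, baseline_m):
    """Stereo triangulation: Z = f * B / d."""
    return focal_px * baseline_m / d
```

Rectification is what makes the search one-dimensional here: corresponding points lie on the same scanline, which is why the abstract notes that rectifying the two perspective images reduces processing time.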
Reconstruction of 3D scenes from pairs of uncalibrated images. Creation of an interactive system for extracting 3D data points and investigation of automatic techniques for generating dense 3D data maps from pairs of uncalibrated images for remote sensing applications.
Much research effort has been devoted to producing algorithms that contribute directly or indirectly to the extraction of 3D information from a wide variety of types of scenes and conditions of image capture. The research work presented in this thesis is aimed at three distinct applications in this area: interactively extracting 3D points from a pair of uncalibrated images in a flexible way; finding corresponding points automatically in high resolution images, particularly those of archaeological scenes captured from a freely moving light aircraft; and improving a correlation approach to dense disparity mapping leading to 3D surface reconstructions.
The fundamental concepts required to describe the principles of stereo vision, the camera models, and the epipolar geometry described by the fundamental matrix are introduced, followed by a detailed literature review of existing methods.
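To make the role of the fundamental matrix concrete: it encodes the epipolar constraint x2^T F x1 = 0 between corresponding points. A minimal sketch of the standard normalized eight-point estimator follows; this is textbook machinery assumed for illustration, not a method taken from the thesis.

```python
import numpy as np

def fundamental_eight_point(x1, x2):
    """Estimate the fundamental matrix F from >= 8 correspondences
    (Nx2 arrays of image points) with the normalised eight-point
    algorithm, so that x2_h^T @ F @ x1_h is approximately 0."""
    def normalise(pts):
        c = pts.mean(axis=0)
        s = np.sqrt(2) / np.mean(np.linalg.norm(pts - c, axis=1))
        T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
        return np.c_[pts, np.ones(len(pts))] @ T.T, T
    p1, T1 = normalise(x1)
    p2, T2 = normalise(x2)
    # Each correspondence contributes one row of the system A f = 0
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    # Enforce the rank-2 constraint, then undo the normalisation
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0]) @ Vt
    return T2.T @ F @ T1
```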
An interactive system for viewing a scene via a monochrome or colour anaglyph is presented, which allows the user to choose the depth plane of interest and the level of compromise between colour fidelity and perceived ghosting by controlling colour saturation. An improved method of extracting 3D coordinates from disparity values in the presence of significant error is also presented.
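The saturation/ghosting trade-off described above can be sketched as a simple red-cyan anaglyph composer; the `saturation` parameter here is a hypothetical stand-in for the interactive control, not the thesis's implementation.

```python
import numpy as np

def make_anaglyph(left_rgb, right_rgb, saturation=1.0):
    """Compose a red-cyan anaglyph from a stereo pair.  `saturation`
    (0..1) trades colour against ghosting: 0 yields a grey anaglyph
    (least ghosting), 1 keeps full colour (most ghosting)."""
    def desaturate(img, s):
        grey = img.mean(axis=2, keepdims=True)
        return s * img + (1 - s) * grey
    L = desaturate(left_rgb.astype(float), saturation)
    R = desaturate(right_rgb.astype(float), saturation)
    out = np.empty_like(L)
    out[..., 0] = L[..., 0]    # red channel from the left eye
    out[..., 1:] = R[..., 1:]  # green + blue (cyan) from the right eye
    return out
```

Ghosting arises because saturated colours leak through the opposite filter of the glasses; desaturating both views before composition reduces that leakage at the cost of colour fidelity, which is exactly the compromise the interactive system exposes.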
Interactive methods, while very flexible, require significant effort from the user in finding and fusing corresponding points. The thesis therefore continues by presenting several variants of existing scale-invariant feature transform (SIFT) methods to automatically find correspondences in uncalibrated high-resolution aerial images with improved speed and memory requirements. In addition, a contribution to estimating lens-distortion correction by a Levenberg-Marquardt-based method is presented, which generates the data strings for straight lines that are essential input for the estimation.
The remainder of the thesis presents correlation-based methods for generating dense disparity maps based on single and multiple image rectifications using sets of automatically found correspondences, and demonstrates the improvements obtained with the latter method. Some example views of point clouds for 3D surfaces produced from pairs of uncalibrated images using the methods presented in the thesis are included.
Al-Baath University
The appendix files and images are not available online.
Three-Dimensional Hand Tracking and Surface-Geometry Measurement for a Robot-Vision System
Tracking of human motion, and object identification and recognition, are important in many applications, including motion capture for human-machine interaction systems. This research is part of a global project to enable a service robot to recognize new objects and perform different object-related tasks based on task guidance and demonstration provided by a general user. It consists of the calibration and testing of two vision systems that are part of a robot-vision system. First, real-time tracking of a human hand is achieved using images acquired from three calibrated, synchronized cameras. Hand pose is determined from the positions of physical markers and input to the robot system in real time. Second, a multi-line laser-camera range sensor is designed, calibrated, and mounted on a robot end-effector to provide three-dimensional (3D) geometry information about objects in the robot environment. The laser-camera sensor includes two cameras to provide stereo vision. For the 3D hand tracking, a novel score-based hand-tracking scheme is presented, employing dynamic multi-threshold marker detection, a stereo camera-pair utilization scheme, and marker matching and labeling using epipolar geometry and hand-pose axis analysis, to enable real-time hand tracking under occlusion and non-uniform lighting. For surface-geometry measurement using the multi-line laser range sensor, two approaches to two-dimensional (2D) to 3D coordinate mapping are analyzed, using Bezier surface fitting and neural networks, respectively. The neural-network approach was found to be the more viable for surface-geometry measurement, worth future exploration for its lower 3D reconstruction error and its consistency over different regions of the object space.
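Recovering a marker's 3D position from its projections in calibrated, synchronized cameras, as the hand tracker above does, rests on triangulation. A minimal sketch using the standard linear (DLT) method follows; the projection matrices and points are assumed for illustration, not taken from the thesis.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from its projections
    x1, x2 (image coordinates) in two calibrated views with 3x4
    projection matrices P1, P2.  Each view contributes two rows of a
    homogeneous system A X = 0, solved by SVD."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]   # null vector of A, homogeneous point
    return X[:3] / X[3]
```

With three cameras, as in the tracker above, the system simply gains two more rows per extra view, making the estimate more robust to occlusion of any single camera.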
LEVEL-BASED CORRESPONDENCE APPROACH TO COMPUTATIONAL STEREO
One fundamental problem in computational stereo reconstruction is correspondence: the task of detecting where the same real-world point appears in two camera views. This research focuses on correspondence, proposing an algorithm to improve such detection for low-quality cameras (webcams) while aiming at real-time image processing.

Correspondence plays an important role in computational stereo reconstruction and has a vast spectrum of applicability. It is also useful in other areas such as structure-from-motion reconstruction, object detection, tracking in robot vision, and virtual reality. Due to this importance, a correspondence method needs to be accurate enough to meet the requirements of such fields, yet inexpensive and easy to use and configure, so as to be accessible to everyone.

By comparing current local correspondence methods and discussing their weaknesses and strengths, this research enhances an algorithm that improves on previous work to achieve fast detection, lower cost, and accuracy acceptable for reconstruction. The correspondence process is divided into four stages. The two preprocessing stages, noise reduction and edge detection, are compared across the different methods available. In the next stage, the feature-detection process is introduced and discussed, focusing on possible ways to reduce errors created by the system or by problems occurring in the scene, such as occlusion. The final stage elaborates different methods of displaying the reconstructed result.

Different sets of data are processed through these correspondence stages, and the results are discussed and compared in detail. The findings show how the system achieves high speed and acceptable outcomes despite poor-quality input. In conclusion, some possible improvements are proposed based on the final outcome.
3D point recovery from stereo video sequences based on OpenCV 2.1 libraries
Master's in Mechanical Engineering
The purpose of this study was to implement a C++ program, using the OpenCV image-processing library and the Microsoft Visual Studio 2008 development environment, to perform camera calibration and calibration-parameter optimization, stereo rectification, stereo correspondence, and recovery of sets of 3D points from a pair of synchronized video sequences obtained from a stereo configuration. The study comprised two pretest laboratory sessions and one intervention laboratory session. Measurements included setting up different stereo configurations with two Phantom v9.1 high-speed cameras to capture video sequences of a MELFA RV-2AJ robot executing a simple 3D path, and additionally capturing video sequences of a planar calibration object, moved by a person, to calibrate each stereo configuration. Significant improvements were made from the pretest to the intervention session in minimizing procedural errors and choosing the best camera capture settings. The cameras' intrinsic and extrinsic parameters, the stereo relations, and the disparity-to-depth matrix were better estimated for the last measurements, and the comparison between the obtained sets of 3D points (the 3D path) and the robot's 3D path showed them to be similar.
Acquisition and Processing of ToF and Stereo data
Providing a computer with the capability to estimate the three-dimensional geometry of a scene is a fundamental problem in computer vision. A classical system adopted for solving this problem is the so-called stereo vision system (stereo system). Such a system consists of a pair of cameras and exploits the principle of triangulation to estimate the geometry of the framed scene. In the last ten years, new devices based on the time-of-flight principle have been proposed to solve the same problem, i.e., matricial time-of-flight range cameras (ToF cameras).
This thesis focuses on the analysis of the two systems (ToF and stereo cameras) from a theoretical and an experimental point of view. ToF cameras are introduced in Chapter 2 and stereo systems in Chapter 3. In particular, for the ToF cameras, a new formal model that describes the acquisition process is derived and presented. In order to understand the strengths and weaknesses of such different systems, a comparison methodology is introduced and explained in Chapter 4. From the analysis of ToF cameras and stereo systems it is possible to understand the complementarity of the two systems, and it is natural to expect that a synergic fusion of their data might improve the quality of the measurements performed by the two devices. In Chapter 5 a method for fusing ToF and stereo data based on a probabilistic approach is presented. In Chapter 6 a method that exploits color and three-dimensional geometry information for solving the classical problem of scene segmentation is explained.
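One simple instance of the probabilistic fusion idea is per-pixel inverse-variance weighting of the two depth estimates; this is a hedged illustration of the general approach, not the specific method of Chapter 5.

```python
import numpy as np

def fuse_depth(z_tof, var_tof, z_stereo, var_stereo):
    """Maximum-likelihood fusion of two Gaussian depth measurements:
    weight each sensor by the inverse of its variance.  Works
    elementwise on scalars or per-pixel arrays."""
    w_t = 1.0 / var_tof
    w_s = 1.0 / var_stereo
    z = (w_t * z_tof + w_s * z_stereo) / (w_t + w_s)
    var = 1.0 / (w_t + w_s)   # fused variance is always the smaller
    return z, var
```

The complementarity noted above shows up directly in the weights: stereo depth error grows roughly with Z squared (since disparity d = fB/Z), while ToF error is more nearly constant with depth, so the fusion naturally trusts stereo up close and ToF far away.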