76 research outputs found
Learning monocular depth estimation with unsupervised trinocular assumptions
Obtaining accurate depth measurements out of a single image represents a
fascinating solution to 3D sensing. CNNs led to considerable improvements in
this field, and recent trends replaced the need for ground-truth labels with
geometry-guided image reconstruction signals enabling unsupervised training.
Currently, for this purpose, state-of-the-art techniques rely on images
acquired with a binocular stereo rig to predict inverse depth (i.e., disparity)
according to the aforementioned supervision principle. However, these methods
suffer from well-known problems near occlusions, the left image border, etc.,
inherited from the stereo setup. Therefore, in this paper, we tackle these
issues by moving to a trinocular domain for training. Assuming the central
image as the reference, we train a CNN to infer disparity representations
pairing such image with frames on its left and right side. This strategy allows
obtaining depth maps not affected by typical stereo artifacts. Moreover, since
trinocular datasets are seldom available, we introduce a novel interleaved
training procedure that enforces the outlined trinocular assumption using current
binocular datasets. Exhaustive experimental results on the KITTI dataset
confirm that our proposal outperforms state-of-the-art methods for unsupervised
monocular depth estimation trained on binocular stereo pairs as well as any
known methods relying on other cues.
Comment: 14 pages, 7 figures, 4 tables. Accepted to 3DV 201
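The abstract above treats disparity as inverse depth. As a rough illustration of that relation for a rectified stereo pair, the sketch below converts a disparity in pixels to a metric depth. The function name, the focal length, and the baseline are assumptions chosen for illustration; none of them come from the paper.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero or negative disparity has no finite depth in this model.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera: 720 px focal length, 0.5 m baseline.
depth_m = disparity_to_depth(36.0, focal_px=720.0, baseline_m=0.5)
```

Because depth is inversely proportional to disparity, a fixed disparity error costs more accuracy at long range than at short range.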
Segment-based stereo matching algorithm with rectification for single-lens bi-prism stereovision system
Ph.D., Doctor of Philosophy
Automated Mobile System for Accurate Outdoor Tree Crop Enumeration Using an Uncalibrated Camera.
This paper demonstrates an automated computer vision system for outdoor tree crop enumeration in a seedling nursery. The complete system incorporates both hardware components (including an embedded microcontroller, an odometry encoder, and an uncalibrated digital color camera) and software algorithms (including microcontroller algorithms and the proposed algorithm for tree crop enumeration) required to obtain robust performance in a natural outdoor environment. The enumeration system uses a three-step image analysis process based upon: (1) an orthographic plant projection method integrating a perspective transform with automatic parameter estimation; (2) a plant counting method based on projection histograms; and (3) a double-counting avoidance method based on a homography transform. Experimental results demonstrate the ability to count large numbers of plants automatically with no human effort. Results show that, for tree seedlings having a height up to 40 cm and a within-row tree spacing of approximately 10 cm, the algorithms successfully estimated the number of plants with an average accuracy of 95.2% for trees within a single image and 98% for counting of the whole plant population in a large sequence of images
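The second step of the abstract above, counting plants from projection histograms, can be sketched roughly as follows. This is a minimal illustration and not the authors' algorithm: it assumes a binary plant mask is already available, and the function names and the `min_height` threshold are hypothetical.

```python
def column_projection(mask):
    """Sum a binary mask (list of rows of 0/1) down each column."""
    return [sum(column) for column in zip(*mask)]


def count_runs(profile, min_height):
    """Count contiguous runs of columns whose projection reaches min_height."""
    count = 0
    inside_run = False
    for value in profile:
        if value >= min_height and not inside_run:
            count += 1          # a new run (candidate plant) starts here
            inside_run = True
        elif value < min_height:
            inside_run = False  # the run has ended
    return count


# Toy mask with three separate plant-like blobs.
mask = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0, 0, 1],
]
profile = column_projection(mask)
plants = count_runs(profile, min_height=2)
```

Counting runs rather than individual columns means a plant that spans several columns is counted once.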
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization – two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.
Funding: National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
Stereo Correspondence and Depth Recovery of Single-lens Bi-prism Based Stereovision System
Ph.D., Doctor of Philosophy
Visual grasp point localization, classification and state recognition in robotic manipulation of cloth: an overview
This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Cloth manipulation by robots is gaining popularity among researchers because of its relevance, mainly (but not only) in domestic and assistive robotics. The required science and technologies are becoming mature enough for the challenges posed by the manipulation of soft materials, and many contributions have appeared in recent years. This survey provides a systematic review of existing techniques for the basic perceptual tasks of grasp point localization, state estimation and classification of cloth items, from the perspective of their manipulation by robots. This choice is grounded on the fact that any manipulative action requires instructing the robot where to grasp, and most garment handling activities depend on the correct recognition of the type to which the particular cloth item belongs and its state. The high inter- and intraclass variability of garments, the continuous nature of the possible deformations of cloth and the evident difficulties in predicting their localization and extension on the garment piece are challenges that have encouraged researchers to provide a plethora of methods to confront such problems, with some promising results. The present review is a first effort to furnish a structured framework of these works, with the aim of helping future contributors to gain both insight and perspective on the subject.
Peer Reviewed. Postprint (author's final draft)
Stereo matching algorithm by propagation of correspondences and stereo vision instrumentation
A new image processing method is described for measuring the 3-D coordinates of a complex, biological surface. One of the problems in stereo vision is known as the accuracy-precision tradeoff problem. This thesis proposes a new method that promises to solve this problem. To do so, two issues are addressed. First, stereo vision instrumentation methods are described. This instrumentation includes a camera system as well as camera calibration, rectification, matching and triangulation. Second, the approach employs an array of cameras that allows accurate computation of the depth map of a surface by propagation of correspondences through pair-wise camera views.
The new method proposed in this thesis employs an array of cameras, and preserves the small baseline advantage by finding accurate correspondences in pairs of adjacent cameras. These correspondences are then propagated along the consecutive pairs of cameras in the array until a large baseline is reached. The resulting large baseline disparities are then used for triangulation to gain precision in depth measurement.
The matching is done by an area-based intensity correlation function called Sum of Squared Differences (SSD). In this thesis, the feasibility of using these data for further processing to achieve surface or volume measurements in the future is discussed.
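The SSD matching named in the abstract above can be sketched for a single pixel on one rectified scanline, as below. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the window half-width `half`, and the search range `max_d` are hypothetical, and the matching point in the right image is assumed to lie at `x - d`.

```python
def ssd(window_a, window_b):
    """Sum of squared differences between two equal-length windows."""
    return sum((a - b) ** 2 for a, b in zip(window_a, window_b))


def ssd_disparity(left_row, right_row, x, half, max_d):
    """Return the disparity d in [0, max_d] that minimizes the SSD cost."""
    reference = left_row[x - half : x + half + 1]
    best_d, best_cost = None, None
    for d in range(max_d + 1):
        start = x - d - half
        if start < 0:
            break  # the candidate window would leave the image
        candidate = right_row[start : start + 2 * half + 1]
        cost = ssd(reference, candidate)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Toy scanlines: the left row is the right row shifted by two pixels.
right_row = [0, 0, 1, 5, 9, 4, 2, 0, 0, 0, 0, 0]
left_row = [0, 0] + right_row[:-2]
d = ssd_disparity(left_row, right_row, x=6, half=1, max_d=3)
```

With adjacent cameras the true disparity is small, so `max_d` can stay small. That keeps the search cheap and limits the number of ambiguous matches.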