Solving Uncalibrated Photometric Stereo using Total Variation
Estimating the shape and appearance of an object from one or several images is still an open and challenging research problem, known as 3D reconstruction. Among the available techniques, photometric stereo (PS) produces highly accurate results when the lighting conditions are known. When these conditions are unknown, the problem becomes the so-called uncalibrated PS problem, which is ill-posed. In this paper, we show how total variation can be used to reduce the ambiguities of uncalibrated PS, and we study two methods for estimating the parameters of the generalized bas-relief ambiguity. These methods are evaluated through the 3D reconstruction of real-world objects.
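As a reminder of what the generalized bas-relief (GBR) ambiguity refers to, the standard Lambertian formulation can be sketched as below. The symbols (G, μ, ν, λ, b, s, z) are notation assumed here for illustration and are not taken from the paper itself: b is the albedo-scaled normal at a pixel, s is a light vector, and z(x, y) is the depth map.

```latex
G =
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
\mu & \nu & \lambda
\end{pmatrix},
\qquad \lambda \neq 0,
\qquad
\tilde{b} = G^{-\top} b,
\quad
\tilde{s} = G\, s
\;\Longrightarrow\;
\tilde{b}^{\top} \tilde{s} = b^{\top} G^{-1} G\, s = b^{\top} s,
\qquad
\tilde{z}(x, y) = \lambda\, z(x, y) + \mu\, x + \nu\, y .
```

Because the image intensities b^T s are unchanged for any choice of (μ, ν, λ), the images alone cannot fix these three parameters, which is why an additional prior such as total variation is needed to estimate them.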
Image Understanding and Robotics Research at Columbia University
Over the past year, the research investigations of the Vision/Robotics Laboratory at Columbia University have reflected the interests of its four faculty members, two staff programmers, and 16 Ph.D. students. Several of the projects involve other faculty members in the department or the university, or researchers at AT&T, IBM, or Philips. We list below a summary of our interests and results, together with the principal researchers associated with them. Since it is difficult to separate those aspects of robotic research that are purely visual from those that are vision-like (for example, tactile sensing) or vision-related (for example, integrated vision-robotic systems), we have listed all robotic research that is not purely manipulative. The majority of our current investigations extend work reported last year; this was the second year of both our basic Image Understanding contract and our Strategic Computing contract. Therefore, the form of this year's report closely resembles last year's. Although there are a few new initiatives, we mainly report the new results we have obtained in the same five basic research areas. Much of this work is summarized on a video tape that is available on request. We also note two service contributions this past year. The Special Issue on Computer Vision of the Proceedings of the IEEE, August 1988, was co-edited by one of us (John Kender [27]). In addition, the upcoming IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 1989, is co-program chaired by one of us (John Kender [23]).
Self-Supervised Monocular Depth Hints
Monocular depth estimators can be trained with various forms of
self-supervision from binocular-stereo data to circumvent the need for
high-quality laser scans or other ground-truth data. The disadvantage, however,
is that the photometric reprojection losses used with self-supervised learning
typically have multiple local minima. These plausible-looking alternatives to
ground truth can restrict what a regression network learns, causing it to
predict depth maps of limited quality. As one prominent example, depth
discontinuities around thin structures are often incorrectly estimated by
current state-of-the-art methods.
Here, we study the problem of ambiguous reprojections in depth prediction
from stereo-based self-supervision, and introduce Depth Hints to alleviate
their effects. Depth Hints are complementary depth suggestions obtained from
simple off-the-shelf stereo algorithms. These hints enhance an existing
photometric loss function, and are used to guide a network to learn better
weights. They require no additional data, and are assumed to be right only
sometimes. We show that using our Depth Hints gives a substantial boost when
training several leading self-supervised-from-stereo models, not just our own.
Further, combined with other good practices, we produce state-of-the-art depth
predictions on the KITTI benchmark.
Comment: Accepted to ICCV 201
Robust multimodal dense SLAM
To enable increasingly intelligent behaviours, autonomous robots will need to be equipped with a deep understanding of their surrounding environment. It would be particularly desirable if this level of perception could be achieved automatically through the use of vision-based sensing, as passive cameras make a compelling sensor choice for robotic platforms due to their low cost, low weight, and low power consumption.
Fundamental to extracting a high-level understanding from a set of 2D images is an understanding of the underlying 3D geometry of the environment. In mobile robotics, the most popular and successful technique for building a representation of 3D geometry from 2D images is Visual Simultaneous Localisation and Mapping (SLAM). While sparse, landmark-based SLAM systems have demonstrated high levels of accuracy and robustness, they are only capable of producing sparse maps. In general, to move beyond simple navigation to scene understanding and interaction, dense 3D reconstructions are required.
Dense SLAM systems naturally allow for online dense scene reconstruction, but suffer from a lack of robustness due to the fact that the dense image alignment used in the tracking step has a narrow convergence basin and that the photometric-based depth estimation used in the mapping step is typically poorly constrained due to the presence of occlusions and homogeneous textures.
This thesis develops methods that can be used to increase the robustness of dense SLAM by fusing additional sensing modalities into standard dense SLAM pipelines. In particular, this thesis looks at two sensing modalities: acceleration and rotation rate measurements from an inertial measurement unit (IMU) to address the tracking issue, and learned priors on dense reconstructions from deep neural networks (DNNs) to address the mapping issue.
Numerical Linear Algebra applications in Archaeology: the seriation and the photometric stereo problems
The aim of this thesis is to explore the application of Numerical Linear Algebra to Archaeology. An ordering problem called the seriation problem, used for dating findings and/or artifact deposits, is analysed in terms of graph theory. In particular, a Matlab implementation of an algorithm for spectral seriation, based on the use of the Fiedler vector of the Laplacian matrix associated with the problem, is presented. We consider bipartite graphs for describing the seriation problem, since the interrelationship between the units (i.e. archaeological sites) to be reordered can be described in terms of these graphs. In our archaeological metaphor of seriation, the two disjoint node sets into which the vertices of a bipartite graph can be divided represent the excavation sites and the artifacts found inside them.
Since it is a difficult task to determine the closest bipartite network to a given one, we describe how a starting network can be approximated by a bipartite one by solving a sequence of fairly simple optimization problems.
Another numerical problem related to Archaeology is the 3D reconstruction of the shape of an object from a set of digital pictures. In particular, the Photometric Stereo (PS) photographic technique is considered.
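The spectral seriation idea mentioned in this abstract (ordering units by the Fiedler vector of a graph Laplacian) can be illustrated with a minimal sketch. This is not the thesis's Matlab implementation: the function name `spectral_seriation`, the toy similarity matrix, and the shuffling permutation are all hypothetical, and the sketch assumes a symmetric, non-negative similarity matrix whose graph is connected.

```python
import numpy as np

def spectral_seriation(similarity):
    """Order units by sorting the Fiedler vector of the graph Laplacian.

    Assumes `similarity` is a symmetric, non-negative matrix describing a
    connected graph (hypothetical helper, for illustration only).
    """
    S = np.asarray(similarity, dtype=float)
    # Graph Laplacian: degree matrix minus similarity matrix.
    L = np.diag(S.sum(axis=1)) - S
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(L)
    # The Fiedler vector belongs to the second-smallest eigenvalue.
    fiedler = eigvecs[:, 1]
    return np.argsort(fiedler)

# Toy data (assumed): six units on a line, similarity decaying with distance.
positions = np.arange(6)
S_true = np.exp(-np.abs(positions[:, None] - positions[None, :]))

# Shuffle the units so the underlying order is hidden.
perm = np.array([3, 0, 5, 1, 4, 2])
S_shuffled = S_true[np.ix_(perm, perm)]

order = spectral_seriation(S_shuffled)
recovered = perm[order]  # original positions, in the recovered order
print(recovered)
```

The recovered sequence is determined only up to reversal, since the sign of an eigenvector is arbitrary; seriation fixes the relative order of the units, not its direction.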
Dense Vision in Image-guided Surgery
Image-guided surgery needs an efficient and effective camera tracking system in order to perform augmented reality, overlaying preoperative models or labelling cancerous tissues on the 2D video images of the surgical scene. Tracking in endoscopic/laparoscopic scenes, however, is an extremely difficult task, primarily due to tissue deformation, instrument invasion into the surgical scene, and the presence of specular highlights. State-of-the-art feature-based SLAM systems such as PTAM fail in tracking such scenes, since the number of good features to track is very limited. Smoke in the scene and instrument motion cause feature-based tracking to fail immediately.
The work of this thesis provides a systematic approach to this problem using dense vision. We initially attempted to register a 3D preoperative model with multiple 2D endoscopic/laparoscopic images using a dense method but this approach did not perform well. We subsequently proposed stereo reconstruction to directly obtain the 3D structure of the scene. By using the dense reconstructed model together with robust estimation, we demonstrate that dense stereo tracking can be incredibly robust even within extremely challenging endoscopic/laparoscopic scenes.
Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm has turned out to be the state-of-the-art method for several publicly available ground truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has proved highly accurate in a synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when applied to real scenes in RALP prostatectomy surgery. This is an important step toward achieving accurate image-guided laparoscopic surgery.