135 research outputs found
Topological evaluation of volume reconstructions by voxel carving
Space or voxel carving [1, 4, 10, 15] is a technique for creating a three-dimensional reconstruction of an object from a series of two-dimensional images captured from cameras placed around the object at different viewing angles. However, little work has been done to date on evaluating the quality of space carving results. This paper extends the work reported in [8], where the application of persistent homology was initially proposed as a tool for providing a topological analysis of the carving process along the sequence of 3D reconstructions with an increasing number of cameras. We now give a more extensive treatment by: (1) developing the formal framework by which persistent homology can be applied in this context; (2) computing persistent homology of the 3D reconstructions of 66 new frames, including different poses, resolutions and camera orders; (3) studying what information about stability, topological correctness and the influence of the camera orders on the carving performance can be drawn from the computed barcodes.
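The carving idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes idealised orthographic cameras aligned with the grid axes and binary silhouettes given as sets of pixels, and the function name `carve` and the toy data are hypothetical.

```python
from itertools import product

def carve(size, silhouettes):
    """Keep the voxels whose projection lies inside every silhouette.

    size: edge length of the cubic voxel grid.
    silhouettes: dict mapping a projection axis (0, 1 or 2) to the set of
    2D pixel coordinates covered by the object in that orthographic view
    (an assumption made for illustration; real cameras are perspective).
    """
    kept = set()
    for voxel in product(range(size), repeat=3):
        inside_all = True
        for axis, mask in silhouettes.items():
            # Orthographic projection: drop the coordinate along the axis.
            pixel = tuple(c for i, c in enumerate(voxel) if i != axis)
            if pixel not in mask:
                inside_all = False
                break
        if inside_all:
            kept.add(voxel)
    return kept

# Toy object: a 2x2x2 cube in the corner of a 4x4x4 grid.
square = {(a, b) for a in range(2) for b in range(2)}
one_view = carve(4, {0: square})
three_views = carve(4, {0: square, 1: square, 2: square})
print(len(one_view), len(three_views))
```

With a single view the reconstruction is a full column of voxels behind the silhouette; adding views removes voxels, which is why the abstract studies the sequence of reconstructions as the number of cameras grows.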
Persistent homology for 3D reconstruction evaluation
Space or voxel carving is a non-invasive technique that is used to produce a 3D volume and can be used in particular for the reconstruction of a 3D human model from images captured from a set of cameras placed around the subject. In [1], the authors present a technique to quantitatively evaluate spatially carved volumetric representations of humans using a synthetic dataset of typical sports motion in a tennis court scenario, with regard to the number of cameras used. In this paper, we compute persistent homology over the sequence of chain complexes obtained from the 3D outcomes with an increasing number of cameras. This allows us to analyze the topological evolution of the reconstruction process, something that, as far as we are aware, has not been investigated to date.
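To give a feel for what a persistence barcode records, here is a minimal sketch of 0-dimensional persistent homology (connected components only) on a filtered graph, using union-find and the elder rule. It is a generic illustration, not the chain-complex construction used in the paper; the function name `h0_barcode` and the toy filtration values are hypothetical.

```python
def h0_barcode(vertex_birth, edges):
    """0-dimensional persistence barcode of a filtered graph.

    vertex_birth: dict vertex -> filtration value at which it appears.
    edges: list of (filtration value, u, v); each edge is assumed to
    appear no earlier than both of its endpoints.
    Returns (birth, death) intervals sorted by birth; death is None for
    components that never merge.
    """
    parent = {v: v for v in vertex_birth}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    bars = []
    for t, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # the edge closes a cycle; components are unchanged
        # Elder rule: the younger component dies when the two merge.
        if vertex_birth[ru] > vertex_birth[rv]:
            ru, rv = rv, ru
        bars.append((vertex_birth[rv], t))
        parent[rv] = ru  # the older root stays the representative
    roots = {find(v) for v in vertex_birth}
    bars.extend((vertex_birth[r], None) for r in roots)
    return sorted(bars, key=lambda bar: bar[0])

# Toy filtration: three vertices joined by three edges.
births = {"a": 0, "b": 1, "c": 2}
links = [(3, "a", "b"), (4, "b", "c"), (5, "a", "c")]
print(h0_barcode(births, links))
```

Each interval is one bar of the barcode: long bars correspond to components that survive across many filtration steps, short bars to components that are merged away quickly.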
Designing a topological algorithm for 3D activity recognition
Voxel carving is a non-invasive and low-cost technique that is used for the reconstruction of a 3D volume from images captured from a set of cameras placed around the object of interest. In this paper, we propose a method to topologically analyze a video sequence of 3D reconstructions representing a tennis player performing different forehand and backhand strokes, with the aim of providing an approach that could be useful in other sport activities.
From small to large baseline multiview stereo : dealing with blur, clutter and occlusions
This thesis addresses the problem of reconstructing the three-dimensional
(3D) digital model of a scene from a collection of two-dimensional (2D)
images taken from it. To address this fundamental computer vision
problem, we propose three algorithms. They are the main contributions
of this thesis.
First, we solve multiview stereo with the off-axis aperture camera.
This system has a very small baseline as images are captured from
viewpoints close to each other. The key idea is to change the size or
the 3D location of the aperture of the camera so as to extract selected
portions of the scene. Our imaging model takes both defocus and
stereo information into account and allows us to solve shape reconstruction
and image restoration in one go. The off-axis aperture camera can
be used in a small-scale space where the camera motion is constrained
by the surrounding environment, such as in 3D endoscopy.
Second, to solve multiview stereo with large baseline, we present a
framework that poses the problem of recovering a 3D surface in the
scene as a regularized minimal partition problem of a visibility function.
The formulation is convex and hence guarantees that the solution
converges to the global minimum. Our formulation is robust
to view-varying extensive occlusions, clutter and image noise. At
any stage during the estimation process the method does not rely on
the visual hull, 2D silhouettes, approximate depth maps, or knowing
which views are dependent (i.e., overlapping) and which are independent
(i.e., non-overlapping). Furthermore, the degenerate solution, the
null surface, is not included as a global solution in this formulation.
One limitation of this algorithm is that its computational complexity
grows with the number of views that we combine simultaneously. To
address this limitation, we propose a third formulation. In this formulation,
the visibility functions are integrated within a narrow band
around the estimated surface by setting weights to each point along
optical rays.
This thesis presents technical descriptions for each algorithm and detailed
analyses to show how these algorithms improve existing reconstruction
techniques.
Large-Scale Automatic Reconstruction of Neuronal Processes from Electron Microscopy Images
Automated sample preparation and electron microscopy enable acquisition of
very large image data sets. These technical advances are of special importance
to the field of neuroanatomy, as 3D reconstructions of neuronal processes at
the nm scale can provide new insight into the fine grained structure of the
brain. Segmentation of large-scale electron microscopy data is the main
bottleneck in the analysis of these data sets. In this paper we present a
pipeline that provides state-of-the-art reconstruction performance while
scaling to data sets in the GB-TB range. First, we train a random forest
classifier on interactive sparse user annotations. The classifier output is
combined with an anisotropic smoothing prior in a Conditional Random Field
framework to generate multiple segmentation hypotheses per image. These
segmentations are then combined into geometrically consistent 3D objects by
segmentation fusion. We provide qualitative and quantitative evaluation of the
automatic segmentation and demonstrate large-scale 3D reconstructions of
neuronal processes from a volume of brain
tissue over a cube of in each dimension corresponding to
1000 consecutive image sections. We also introduce Mojo, a proofreading tool
including semi-automated correction of merge errors based on sparse user
scribbles.
Approaches to three-dimensional reconstruction of plant shoot topology and geometry
There are currently 805 million people classified as chronically undernourished, and yet the world’s population is still increasing. At the same time, global warming is causing more frequent and severe flooding and drought, thus destroying crops and reducing the amount of land available for agriculture. Recent studies show that without crop climate adaptation, crop productivity will deteriorate. With access to 3D models of real plants it is possible to acquire detailed morphological and gross developmental data that can be used to study their ecophysiology, leading to an increase in crop yield and stability across hostile and changing environments. Here we review approaches to the reconstruction of 3D models of plant shoots from image data, consider current applications in plant and crop science, and identify remaining challenges. We conclude that although phenotyping is receiving an increasing amount of attention – particularly from computer vision researchers – and numerous vision approaches have been proposed, it still remains a highly interactive process. An automated system capable of producing 3D models of plants would significantly aid phenotyping practice, increasing accuracy and repeatability of measurements.
Multiview stereo via volumetric graph-cuts and occlusion robust photo-consistency
This paper presents a volumetric formulation for the multiview stereo problem which is amenable to a computationally tractable global optimization using Graph-cuts. Our approach is to seek the optimal partitioning of 3D space into two regions labeled as "object" and "empty" under a cost functional consisting of the following two terms: 1) A term that forces the boundary between the two regions to pass through photo-consistent locations; and 2) a ballooning term that inflates the "object" region. To take account of the effect of occlusion on the first term, we use an occlusion robust photo-consistency metric based on normalized cross correlation, which does not assume any geometric knowledge about the reconstructed object. The globally optimal 3D partitioning can be obtained as the minimum cut solution of a weighted graph.
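The final sentence of this abstract, obtaining a two-label partition as a minimum cut, can be sketched on a toy graph. The example below uses a plain Edmonds-Karp max-flow rather than whatever solver the authors used, and every capacity is an invented number standing in for the photo-consistency and ballooning costs; `min_cut` and the node names `p`, `q`, `r` are hypothetical.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow; returns (cut value, source-side node set)."""
    # Residual graph with missing reverse edges initialised to zero.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs()
        if sink not in parent:
            # Nodes still reachable from the source form one side of the cut.
            return flow, set(parent)
        # Walk back from the sink to collect the augmenting path.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

# Three voxels p, q, r in a chain. Edges from "object" and to "empty" are
# per-voxel label costs; edges between voxels penalise a label change.
capacity = {
    "object": {"p": 5, "q": 4, "r": 1},
    "p": {"empty": 1, "q": 3},
    "q": {"empty": 2, "p": 3, "r": 1},
    "r": {"empty": 6, "q": 1},
}
value, object_side = min_cut(capacity, "object", "empty")
print(value, sorted(object_side))
```

In this toy instance the cheapest partition labels `p` and `q` as "object" and `r` as "empty". In a volumetric setting the same construction would have one node per voxel, which is why an efficient max-flow solver matters.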
3D Shape Understanding and Generation
In recent years, Machine Learning techniques have revolutionized solutions to longstanding image-based problems, like image classification, generation, semantic segmentation, object detection and many others. However, if we want to be able to build agents that can successfully interact with the real world, those techniques need to be capable of reasoning about the world as it truly is: a tridimensional space. There are two main challenges while handling 3D information in machine learning models. First, it is not clear what is the best 3D representation. For images, convolutional neural networks (CNNs) operating on raster images yield the best results in virtually all image-based benchmarks. For 3D data, the best combination of model and representation is still an open question. Second, 3D data is not available on the same scale as images – taking pictures is a common procedure in our daily lives, whereas capturing 3D content is an activity usually restricted to specialized professionals. This thesis is focused on addressing both of these issues. Which model and representation should we use for generating and recognizing 3D data? What are efficient ways of learning 3D representations from a few examples? Is it possible to leverage image data to build models capable of reasoning about the world in 3D?
Our research findings show that it is possible to build models that efficiently generate 3D shapes as irregularly structured representations. Those models require significantly less memory while generating higher quality shapes than the ones based on voxels and multi-view representations. We start by developing techniques to generate shapes represented as point clouds. This class of models leads to high quality reconstructions and better unsupervised feature learning. However, since point clouds are not amenable to editing and human manipulation, we also present models capable of generating shapes as sets of shape handles -- simpler primitives that summarize complex 3D shapes and were specifically designed for high-level tasks and user interaction. Despite their effectiveness, those approaches require some form of 3D supervision, which is scarce. We present multiple alternatives to this problem. First, we investigate how approximate convex decomposition techniques can be used as self-supervision to improve recognition models when only a limited number of labels are available. Second, we study how neural network architectures induce shape priors that can be used in multiple reconstruction tasks -- using both volumetric and manifold representations. In this regime, reconstruction is performed from a single example -- either a sparse point cloud or multiple silhouettes. Finally, we demonstrate how to train generative models of 3D shapes without using any 3D supervision by combining differentiable rendering techniques and Generative Adversarial Networks.