1,369 research outputs found
Simultaneously Reconstructing Transparent and Opaque Surfaces from Texture Images
This paper addresses the problem of reconstructing non-overlapping transparent and opaque surfaces from multiple view images. The reconstruction is attained through progressive refinement of an initial 3D shape by minimizing the error between the images of the object and the initial 3D shape. The challenge is to simultaneously reconstruct both the transparent and opaque surfaces given only a limited number of images. Any refinement methods can theoretically be applied if analytic relation between pixel value in the training images and vertices position of the initial 3D shape is known. This paper investigates such analytic relations for reconstructing opaque and transparent surfaces. The analytic relation for opaque surface follows diffuse reflection model, whereas for transparent surface follows ray tracing model. However, both relations can be converged for reconstruction both surfaces into texture mapping model. To improve the reconstruction results several strategies including regularization, hierarchical learning, and simulated annealing are investigated
Light-scattering reconstruction of transparent shapes using neural networks
We propose a cheap non-intrusive high-resolution method of visualising
transparent or translucent objects which may translate, rotate and shapeshift.
We propose a method of reconstructing a strongly deformed time-evolving surface
from a time-series of noisy clouds of points using a lightweight neural
network. We benchmark the method against three different geometries and varying
levels of noise and find that the Gaussian curvature is accurately recovered
when the noise level is below of the diameter of the surface and the data
from distinct regions of the surface do not overlap
NeRRF: 3D Reconstruction and View Synthesis for Transparent and Specular Objects with Neural Refractive-Reflective Fields
Neural radiance fields (NeRF) have revolutionized the field of image-based
view synthesis. However, NeRF uses straight rays and fails to deal with
complicated light path changes caused by refraction and reflection. This
prevents NeRF from successfully synthesizing transparent or specular objects,
which are ubiquitous in real-world robotics and A/VR applications. In this
paper, we introduce the refractive-reflective field. Taking the object
silhouette as input, we first utilize marching tetrahedra with a progressive
encoding to reconstruct the geometry of non-Lambertian objects and then model
refraction and reflection effects of the object in a unified framework using
Fresnel terms. Meanwhile, to achieve efficient and effective anti-aliasing, we
propose a virtual cone supersampling technique. We benchmark our method on
different shapes, backgrounds and Fresnel terms on both real-world and
synthetic datasets. We also qualitatively and quantitatively benchmark the
rendering results of various editing applications, including material editing,
object replacement/insertion, and environment illumination estimation. Codes
and data are publicly available at https://github.com/dawning77/NeRRF
The role of terminators and occlusion cues in motion integration and segmentation: a neural network model
The perceptual interaction of terminators and occlusion cues with the functional processes of motion integration and segmentation is examined using a computational model. Inte-gration is necessary to overcome noise and the inherent ambiguity in locally measured motion direction (the aperture problem). Segmentation is required to detect the presence of motion discontinuities and to prevent spurious integration of motion signals between objects with different trajectories. Terminators are used for motion disambiguation, while occlusion cues are used to suppress motion noise at points where objects intersect. The model illustrates how competitive and cooperative interactions among cells carrying out these functions can account for a number of perceptual effects, including the chopsticks illusion and the occluded diamond illusion. Possible links to the neurophysiology of the middle temporal visual area (MT) are suggested
Simultaneously Reconstructing Transparent and Opaque Surfaces from Texture Images
This paper addresses the problem of reconstructing non-overlapping transparent and opaque surfaces from multiple view images. The reconstruction is attained through progressive refinement of an initial 3D shape by minimizing the error between the images of the object and the initial 3D shape. The challenge is to simultaneously reconstruct both the transparent and opaque surfaces given only a limited number of images. Any refinement methods can theoretically be applied if analytic relation between pixel value in the training images and vertices position of the initial 3D shape is known. This paper investigates such analytic relations for reconstructing opaque and transparent surfaces. The analytic relation for opaque surface follows diffuse reflection model, whereas for transparent surface follows ray tracing model. However, both relations can be converged for reconstruction both surfaces into texture mapping model. To improve the reconstruction results several strategies including regularization, hierarchical learning, and simulated annealing are investigated
Semantic Validation in Structure from Motion
The Structure from Motion (SfM) challenge in computer vision is the process
of recovering the 3D structure of a scene from a series of projective
measurements that are calculated from a collection of 2D images, taken from
different perspectives. SfM consists of three main steps; feature detection and
matching, camera motion estimation, and recovery of 3D structure from estimated
intrinsic and extrinsic parameters and features.
A problem encountered in SfM is that scenes lacking texture or with
repetitive features can cause erroneous feature matching between frames.
Semantic segmentation offers a route to validate and correct SfM models by
labelling pixels in the input images with the use of a deep convolutional
neural network. The semantic and geometric properties associated with classes
in the scene can be taken advantage of to apply prior constraints to each class
of object. The SfM pipeline COLMAP and semantic segmentation pipeline DeepLab
were used. This, along with planar reconstruction of the dense model, were used
to determine erroneous points that may be occluded from the calculated camera
position, given the semantic label, and thus prior constraint of the
reconstructed plane. Herein, semantic segmentation is integrated into SfM to
apply priors on the 3D point cloud, given the object detection in the 2D input
images. Additionally, the semantic labels of matched keypoints are compared and
inconsistent semantically labelled points discarded. Furthermore, semantic
labels on input images are used for the removal of objects associated with
motion in the output SfM models. The proposed approach is evaluated on a
data-set of 1102 images of a repetitive architecture scene. This project offers
a novel method for improved validation of 3D SfM models
HISR: Hybrid Implicit Surface Representation for Photorealistic 3D Human Reconstruction
Neural reconstruction and rendering strategies have demonstrated
state-of-the-art performances due, in part, to their ability to preserve high
level shape details. Existing approaches, however, either represent objects as
implicit surface functions or neural volumes and still struggle to recover
shapes with heterogeneous materials, in particular human skin, hair or clothes.
To this aim, we present a new hybrid implicit surface representation to model
human shapes. This representation is composed of two surface layers that
represent opaque and translucent regions on the clothed human body. We segment
different regions automatically using visual cues and learn to reconstruct two
signed distance functions (SDFs). We perform surface-based rendering on opaque
regions (e.g., body, face, clothes) to preserve high-fidelity surface normals
and volume rendering on translucent regions (e.g., hair). Experiments
demonstrate that our approach obtains state-of-the-art results on 3D human
reconstructions, and also shows competitive performances on other objects.Comment: Accepted by AAAI 2024 main trac
Challenges for Monocular 6D Object Pose Estimation in Robotics
Object pose estimation is a core perception task that enables, for example,
object grasping and scene understanding. The widely available, inexpensive and
high-resolution RGB sensors and CNNs that allow for fast inference based on
this modality make monocular approaches especially well suited for robotics
applications. We observe that previous surveys on object pose estimation
establish the state of the art for varying modalities, single- and multi-view
settings, and datasets and metrics that consider a multitude of applications.
We argue, however, that those works' broad scope hinders the identification of
open challenges that are specific to monocular approaches and the derivation of
promising future challenges for their application in robotics. By providing a
unified view on recent publications from both robotics and computer vision, we
find that occlusion handling, novel pose representations, and formalizing and
improving category-level pose estimation are still fundamental challenges that
are highly relevant for robotics. Moreover, to further improve robotic
performance, large object sets, novel objects, refractive materials, and
uncertainty estimates are central, largely unsolved open challenges. In order
to address them, ontological reasoning, deformability handling, scene-level
reasoning, realistic datasets, and the ecological footprint of algorithms need
to be improved.Comment: arXiv admin note: substantial text overlap with arXiv:2302.1182
- …