A comparative study of breast surface reconstruction for aesthetic outcome assessment
Breast cancer is the most prevalent cancer type in women, and while its
survival rate is generally high, the aesthetic outcome is an increasingly
important factor when evaluating different treatment alternatives. 3D scanning
and reconstruction techniques offer a flexible tool for building detailed and
accurate 3D breast models that can be used both pre-operatively for surgical
planning and post-operatively for aesthetic evaluation. This paper compares
the accuracy of low-cost 3D scanning technologies with that of the
significantly more expensive state-of-the-art commercial 3D scanners in the
context of breast 3D reconstruction. We present results from 28 synthetic and
clinical RGBD sequences, including 12 unique patients and an anthropomorphic
phantom, demonstrating the applicability of low-cost RGBD sensors to real
clinical cases. Body deformation and homogeneous skin texture pose challenges
to the studied reconstruction systems. Although these should be addressed
appropriately if higher model quality is warranted, we observe that low-cost
sensors are able to obtain valuable reconstructions comparable to the
state-of-the-art within an error margin of 3 mm.
Comment: This paper has been accepted to MICCAI201
Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video
We present a novel real-time capable learning method that jointly perceives a
3D scene's geometry structure and semantic labels. Recent approaches to
real-time 3D scene reconstruction mostly adopt a volumetric scheme, where a
Truncated Signed Distance Function (TSDF) is directly regressed. However, these
volumetric approaches tend to focus on the global coherence of their
reconstructions, which leads to a lack of local geometric detail. To overcome
this issue, we propose to leverage the latent geometric prior knowledge in 2D
image features by explicit depth prediction and anchored feature generation, to
refine the occupancy learning in TSDF volume. Besides, we find that this
cross-dimensional feature refinement methodology can also be adopted for the
semantic segmentation task by utilizing semantic priors. Hence, we propose an
end-to-end cross-dimensional refinement neural network (CDRNet) to extract both
3D mesh and 3D semantic labeling in real time. The experimental results show that
this method achieves a state-of-the-art 3D perception efficiency on multiple
datasets, which indicates the great potential of our method for industrial
applications.
Comment: Accepted to ICCV 2023 Workshops. Project page:
https://hafred.github.io/cdrnet
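For readers unfamiliar with the TSDF representation mentioned in this abstract, the following is a minimal sketch of classical weighted-average TSDF integration along a single ray. It is an illustration only: the paper regresses the TSDF with a neural network rather than fusing depth measurements this way, and the function names `integrate_depth` and `surface_depth` are hypothetical.

```python
def integrate_depth(tsdf, weights, voxel_depths, measured_depth, truncation):
    """Fuse one depth measurement into a 1D TSDF sampled along a camera ray.

    tsdf, weights and voxel_depths are equally long lists; tsdf and weights
    are updated in place using a running weighted average.
    """
    for i, z in enumerate(voxel_depths):
        sdf = measured_depth - z  # positive in front of the surface
        if sdf < -truncation:
            continue  # voxel lies far behind the surface: not observed
        value = min(1.0, sdf / truncation)  # truncate and normalise to [-1, 1]
        w = weights[i]
        tsdf[i] = (tsdf[i] * w + value) / (w + 1)
        weights[i] = w + 1


def surface_depth(tsdf, weights, voxel_depths):
    """Return the depth of the first zero crossing, or None if there is none."""
    for i in range(len(tsdf) - 1):
        if weights[i] == 0 or weights[i + 1] == 0:
            continue  # skip voxels that were never observed
        a, b = tsdf[i], tsdf[i + 1]
        if a > 0 >= b:
            t = a / (a - b)  # linear interpolation between the two samples
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None
```

In this sketch, fusing two noisy measurements of the same surface moves the zero crossing towards their average, which is the smoothing behaviour that volumetric methods rely on.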
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
Online reconstruction and rendering of large-scale indoor scenes is a
long-standing challenge. SLAM-based methods can reconstruct 3D scene geometry
progressively in real time but cannot render photorealistic results. While
NeRF-based methods produce promising novel view synthesis results, their long
offline optimization time and lack of geometric constraints pose challenges to
efficiently handling online input. Inspired by the complementary advantages of
classical 3D reconstruction and NeRF, we thus investigate marrying explicit
geometric representation with NeRF rendering to achieve efficient online
reconstruction and high-quality rendering. We introduce SurfelNeRF, a variant
of neural radiance field which employs a flexible and scalable neural surfel
representation to store geometric attributes and extracted appearance features
from input images. We further extend the conventional surfel-based fusion
scheme to progressively integrate incoming input frames into the reconstructed
global neural scene representation. In addition, we propose a highly efficient
differentiable rasterization scheme for rendering neural surfel radiance
fields, which helps SurfelNeRF achieve speedups in both training and
inference time. Experimental results show that our method
achieves the state-of-the-art 23.82 PSNR and 29.58 PSNR on ScanNet in
feedforward inference and per-scene optimization settings, respectively.
Comment: To appear in CVPR 202
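The PSNR figures quoted in this abstract follow the standard definition, PSNR = 10 log10(MAX^2 / MSE). A minimal sketch of that computation is given below, assuming images are supplied as flat sequences of pixel values with a peak value of 255; the function name `psnr` and this input layout are assumptions for illustration, not the authors' evaluation code.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities in [0, max_value].
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the rendered view is closer to the reference image.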
Nonrigid reconstruction of 3D breast surfaces with a low-cost RGBD camera for surgical planning and aesthetic evaluation
Accounting for 26% of all new cancer cases worldwide, breast cancer remains
the most common form of cancer in women. Although early breast cancer has a
favourable long-term prognosis, roughly a third of patients suffer from a
suboptimal aesthetic outcome despite breast-conserving cancer treatment.
Clinical-quality 3D modelling of the breast surface therefore assumes an
increasingly important role in advancing treatment planning, prediction and
evaluation of breast cosmesis. Yet, existing 3D torso scanners are expensive
and either infrastructure-heavy or subject to motion artefacts. In this paper
we employ a single consumer-grade RGBD camera with an ICP-based registration
approach to jointly align all points from a sequence of depth images
non-rigidly. Subtle body deformation due to postural sway and respiration is
successfully mitigated through regularised locally affine transformations,
leading to higher geometric accuracy. We present results from 6 clinical
cases where our method compares well with the gold standard and outperforms a
previous approach. We show that our method produces better reconstructions
qualitatively by visual assessment and quantitatively by consistently obtaining
lower landmark error scores and yielding more accurate breast volume estimates.
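As background for the ICP-based registration mentioned in this abstract, the sketch below shows the basic iterative closest point loop in a deliberately simplified setting: two 2D point sets aligned pairwise with a single rigid transform. The paper's method is different (3D, non-rigid, joint alignment of a whole sequence with regularised locally affine transformations); the function names here are hypothetical and serve only to illustrate the match-then-align idea.

```python
import math


def nearest(p, points):
    """Return the point in `points` closest to p (brute-force search)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py, qx, qy = px - csx, py - csy, qx - cdx, qy - cdy
        s_dot += px * qx + py * qy
        s_cross += px * qy - py * qx
    theta = math.atan2(s_cross, s_dot)  # closed-form optimal rotation in 2D
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)


def icp_2d(source, target, iterations=20):
    """Rigidly align `source` to `target`; returns the transformed source points."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]  # correspondence step
        theta, tx, ty = best_rigid_2d(current, matches)  # alignment step
        c, s = math.cos(theta), math.sin(theta)
        current = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in current]
    return current
```

Like any ICP variant, this sketch only converges to the correct alignment when the initial misalignment is small relative to the spacing of the points.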
Large-Scale Textured 3D Scene Reconstruction
Creating three-dimensional models of the environment is a fundamental task in computer vision. Reconstructions are useful for a range of applications, such as surveying, the preservation of cultural heritage, or the creation of virtual worlds in the entertainment industry. In automated driving, they help to address a variety of challenges, including localization, the annotation of large datasets, and the fully automatic creation of simulation scenarios.
The challenge in 3D reconstruction is the joint estimation of sensor poses and an environment model. Redundant and potentially erroneous measurements from different sensors must be integrated into a common representation of the world to obtain a metrically and photometrically correct model. At the same time, the method must use resources efficiently to achieve run times that permit practical use.
In this work, we present a reconstruction method capable of producing photorealistic 3D reconstructions of large areas extending over several kilometres. Range measurements from laser scanners and stereo camera systems are fused using a volumetric reconstruction method. Loop closures are detected and introduced as additional constraints to obtain a globally consistent map. The resulting mesh is textured from camera images, with each observation weighted by its quality. For a seamless appearance, the unknown exposure times and parameters of the optical system are estimated jointly and the images are corrected accordingly.
We evaluate our method on synthetic data, real sensor data from our test vehicle, and publicly available datasets. We show qualitative results for large inner-city areas, as well as quantitative evaluations of the vehicle trajectory and the reconstruction quality.
Finally, we present several applications and thereby demonstrate the usefulness of our method for applications in the field of automated driving.
Robust multimodal dense SLAM
To enable increasingly intelligent behaviours, autonomous robots will need to be equipped with a deep understanding of their surrounding environment. It would be particularly desirable if this level of perception could be achieved automatically through the use of vision-based sensing, as passive cameras make a compelling sensor choice for robotic platforms due to their low cost, low weight, and low power consumption.
Fundamental to extracting a high-level understanding from a set of 2D images is an understanding of the underlying 3D geometry of the environment. In mobile robotics, the most popular and successful technique for building a representation of 3D geometry from 2D images is Visual Simultaneous Localisation and Mapping (SLAM). While sparse, landmark-based SLAM systems have demonstrated high levels of accuracy and robustness, they are only capable of producing sparse maps. In general, to move beyond simple navigation to scene understanding and interaction, dense 3D reconstructions are required.
Dense SLAM systems naturally allow for online dense scene reconstruction, but suffer from a lack of robustness due to the fact that the dense image alignment used in the tracking step has a narrow convergence basin and that the photometric-based depth estimation used in the mapping step is typically poorly constrained due to the presence of occlusions and homogeneous textures.
This thesis develops methods that can be used to increase the robustness of dense SLAM by fusing additional sensing modalities into standard dense SLAM pipelines. In particular, this thesis will look at two sensing modalities: acceleration and rotation rate measurements from an inertial measurement unit (IMU) to address the tracking issue, and learned priors on dense reconstructions from deep neural networks (DNNs) to address the mapping issue.
Open Access
FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction
Recent works on 3D reconstruction from posed images have demonstrated that
direct inference of scene-level 3D geometry without test-time optimization is
feasible using deep neural networks, showing remarkable promise and high
efficiency. However, the reconstructed geometry, typically represented as a 3D
truncated signed distance function (TSDF), is often coarse without fine
geometric details. To address this problem, we propose three effective
solutions for improving the fidelity of inference-based 3D reconstructions. We
first present a resolution-agnostic TSDF supervision strategy to provide the
network with a more accurate learning signal during training, avoiding the
pitfalls of TSDF interpolation seen in previous work. We then introduce a depth
guidance strategy using multi-view depth estimates to enhance the scene
representation and recover more accurate surfaces. Finally, we develop a novel
architecture for the final layers of the network, conditioning the output TSDF
prediction on high-resolution image features in addition to coarse voxel
features, enabling sharper reconstruction of fine details. Our method,
FineRecon, produces smooth and highly accurate reconstructions, showing
significant improvements across multiple depth and 3D reconstruction metrics.
Comment: ICCV 202