SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow
Scene flow estimation is a long-standing problem in computer vision, where
the goal is to find the 3D motion of a scene from its consecutive observations.
Recently, there have been efforts to compute the scene flow from 3D point
clouds. A common approach is to train a regression model that consumes source
and target point clouds and outputs the per-point translation vectors. An
alternative is to learn point matches between the point clouds concurrently
with regressing a refinement of the initial correspondence flow. In both cases,
the learning task is very challenging since the flow regression is done in the
free 3D space, and a typical solution is to resort to a large annotated
synthetic dataset. We introduce SCOOP, a new method for scene flow estimation
that can be learned on a small amount of data without employing ground-truth
flow supervision. In contrast to previous work, we train a pure correspondence
model focused on learning point feature representation and initialize the flow
as the difference between a source point and its softly corresponding target
point. Then, in the run-time phase, we directly optimize a flow refinement
component with a self-supervised objective, which leads to a coherent and
accurate flow field between the point clouds. Experiments on widely used
datasets demonstrate the performance gains achieved by our method compared to
existing leading techniques while using a fraction of the training data. Our
code is publicly available at https://github.com/itailang/SCOOP
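To make the soft-correspondence flow initialization concrete, here is a minimal PyTorch sketch of the idea described above: the flow is initialized as the difference between each source point and a softmax-weighted combination of target points, based on learned point features. The cosine similarity, temperature value, and tensor shapes are illustrative assumptions, not SCOOP's actual implementation.

import torch
import torch.nn.functional as F

def init_flow_from_soft_correspondence(src_xyz, tgt_xyz, src_feat, tgt_feat,
                                        temperature=0.03):
    """src_xyz, tgt_xyz: (N, 3), (M, 3) point coordinates.
    src_feat, tgt_feat: (N, C), (M, C) learned point features."""
    # Cosine similarity between every source and target feature.
    sim = F.normalize(src_feat, dim=1) @ F.normalize(tgt_feat, dim=1).T  # (N, M)
    # Soft matching weights over the target points.
    weights = torch.softmax(sim / temperature, dim=1)                    # (N, M)
    # Softly corresponding target location for each source point.
    soft_tgt = weights @ tgt_xyz                                          # (N, 3)
    # Initial flow = soft target minus source.
    return soft_tgt - src_xyz

# Example with random data:
src, tgt = torch.rand(2048, 3), torch.rand(2048, 3)
f_src, f_tgt = torch.rand(2048, 64), torch.rand(2048, 64)
flow0 = init_flow_from_soft_correspondence(src, tgt, f_src, f_tgt)

In the setting described above, a flow refinement on top of such an initialization would then be optimized at run time with a self-supervised objective; the sketch covers only the initialization step.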
Adversarial Self-Supervised Scene Flow Estimation
This work proposes a metric learning approach for self-supervised scene flow
estimation. Scene flow estimation is the task of estimating 3D flow vectors for
consecutive 3D point clouds. Such flow vectors are useful, e.g., for
recognizing actions or avoiding collisions. Training a neural network via
supervised learning for scene flow is impractical, as this requires manual
annotations for each 3D point at each new timestamp for each scene. To that
end, we seek a self-supervised approach, where a network learns a latent
metric to distinguish between points translated by flow estimations and the
target point cloud. Our adversarial metric learning includes a multi-scale
triplet loss on sequences of two point clouds as well as a cycle consistency
loss. Furthermore, we outline a benchmark for self-supervised scene flow
estimation: the Scene Flow Sandbox. The benchmark consists of five datasets
designed to study individual aspects of flow estimation in progressive order of
complexity, from a moving object to real-world scenes. Experimental evaluation
on the benchmark shows that our approach obtains state-of-the-art
self-supervised scene flow results, outperforming recent neighbor-based
approaches. We use our proposed benchmark to expose shortcomings and draw
insights on various training setups. We find that our setup captures motion
coherence and preserves local geometries. Dealing with occlusions, on the other
hand, is still an open challenge.
Comment: Published at 3DV 2020
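For illustration, below is a minimal PyTorch sketch of the two self-supervised signals named above: a cycle-consistency term on the estimated flow and a single-scale triplet term in a learned latent metric. The network interfaces and the naive negative sampling are simplifying assumptions and differ from the authors' adversarial, multi-scale formulation.

import torch

def cycle_consistency_loss(src, tgt, flow_net):
    """src, tgt: (N, 3) consecutive point clouds.
    flow_net(a, b) -> (N, 3) per-point flow moving cloud a towards cloud b."""
    warped = src + flow_net(src, tgt)            # forward pass: src -> tgt
    restored = warped + flow_net(warped, src)    # backward pass: warped -> src
    return torch.norm(restored - src, dim=1).mean()  # points should return home

def latent_triplet_loss(embed, warped, tgt, margin=1.0):
    """embed(pts) -> per-point features in the learned metric space."""
    f_w, f_t = embed(warped), embed(tgt)                      # (N, D), (M, D)
    d = torch.cdist(f_w, f_t)                                 # (N, M) latent distances
    positive = d.min(dim=1).values                            # nearest target feature
    neg_idx = torch.randint(0, f_t.shape[0], (f_w.shape[0],))
    negative = torch.norm(f_w - f_t[neg_idx], dim=1)          # random target as negative
    return torch.relu(positive - negative + margin).mean()

# Example with a dummy flow network that predicts zero motion:
src, tgt = torch.rand(1024, 3), torch.rand(1024, 3)
zero_flow = lambda a, b: torch.zeros_like(a)
cyc = cycle_consistency_loss(src, tgt, zero_flow)   # exactly zero for zero flow
identity_embed = lambda pts: pts                    # trivially uses xyz as the latent feature
tri = latent_triplet_loss(identity_embed, src, tgt)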
Deep Learning for Scene Flow Estimation on Point Clouds: A Survey and Prospective Trends
Aiming at obtaining structural information and the 3D motion of dynamic scenes, scene flow estimation has long been a topic of interest in computer vision and computer graphics. It is also a fundamental task for various applications such as autonomous driving. Compared with previous methods that utilize image representations, much recent research builds on deep learning and focuses on the point cloud representation for 3D flow estimation. This paper comprehensively reviews the pioneering literature on scene flow estimation from point clouds. It also examines the learning paradigms in detail and presents insightful comparisons between state-of-the-art methods that use deep learning for scene flow estimation. Furthermore, the paper investigates various higher-level scene understanding tasks, including object tracking, motion segmentation, and others, and concludes with an overview of foreseeable research trends for scene flow estimation.
Flow-based GAN for 3D Point Cloud Generation from a Single Image
Generating a 3D point cloud from a single 2D image is of great importance for
3D scene understanding applications. To reconstruct the whole 3D shape of the
object shown in the image, the existing deep learning based approaches use
either explicit or implicit generative modeling of point clouds, which,
however, suffer from limited quality. In this work, we aim to alleviate this
issue by introducing a hybrid explicit-implicit generative modeling scheme,
which inherits the flow-based explicit generative models for sampling point
clouds with arbitrary resolutions while improving the detailed 3D structures of
point clouds by leveraging the implicit generative adversarial networks (GANs).
We evaluate our method on the large-scale synthetic dataset ShapeNet, with the
experimental results demonstrating its superior performance. In addition, the
generalization ability of our method is demonstrated by testing on
cross-category synthetic images as well as on real images from the PASCAL3D+
dataset.
Comment: 13 pages, 5 figures, accepted to BMVC 2022
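A minimal PyTorch sketch of the hybrid explicit-implicit idea: an invertible coupling-based generator maps per-point Gaussian noise to 3D points conditioned on an image code, so any number of points can be sampled, while a point-cloud discriminator supplies the adversarial signal. The layer sizes, module names, and the simple coupling design are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Invertible layer: the first coordinate and the image code drive an
    affine map of the remaining two coordinates."""
    def __init__(self, cond_dim=128, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1 + cond_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 4))  # 2 log-scales + 2 shifts

    def forward(self, x, cond):
        x1, x2 = x[..., :1], x[..., 1:]
        h = self.net(torch.cat([x1, cond], dim=-1))
        log_s, t = h[..., :2].tanh(), h[..., 2:]
        return torch.cat([x1, x2 * log_s.exp() + t], dim=-1)

class FlowGenerator(nn.Module):
    """Explicit generator: stacked invertible couplings conditioned on an image code."""
    def __init__(self, cond_dim=128, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([AffineCoupling(cond_dim) for _ in range(n_layers)])

    def forward(self, z, cond):
        # z: (B, N, 3) per-point noise; cond: (B, cond_dim) image feature.
        c = cond.unsqueeze(1).expand(-1, z.shape[1], -1)
        x = z
        for layer in self.layers:
            x = layer(x, c)
            x = x.roll(shifts=1, dims=-1)  # permute coords so every one gets updated
        return x  # (B, N, 3) generated point cloud

class PointDiscriminator(nn.Module):
    """PointNet-style critic on whole point clouds (the implicit GAN part)."""
    def __init__(self):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 256))
        self.head = nn.Linear(256, 1)

    def forward(self, pts):
        return self.head(self.point_mlp(pts).max(dim=1).values)  # (B, 1) real/fake logit

# Sampling at an arbitrary resolution (here 4096 points) from one image code:
gen, disc = FlowGenerator(), PointDiscriminator()
cond = torch.randn(1, 128)                  # stand-in for an image encoder output
points = gen(torch.randn(1, 4096, 3), cond)
realism = disc(points)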
RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation
Recently, methods that fuse RGB images and point clouds have been proposed
to jointly estimate 2D optical flow and 3D scene flow. However, as both
conventional RGB cameras and LiDAR sensors adopt a frame-based data acquisition
mechanism, their performance is limited by the fixed low sampling rates,
especially in highly-dynamic scenes. By contrast, the event camera can
asynchronously capture the intensity changes with a very high temporal
resolution, providing complementary dynamic information of the observed scenes.
In this paper, we incorporate RGB images, Point clouds and Events for joint
optical flow and scene flow estimation with our proposed multi-stage multimodal
fusion model, RPEFlow. First, we present an attention fusion module with a
cross-attention mechanism to implicitly explore the internal cross-modal
correlation for 2D and 3D branches, respectively. Second, we introduce a mutual
information regularization term to explicitly model the complementary
information of three modalities for effective multimodal feature learning. We
also contribute a new synthetic dataset to advocate further research.
Experiments on both synthetic and real datasets show that our model outperforms
the existing state of the art by a wide margin. Code and dataset are available
at https://npucvr.github.io/RPEFlow.
Comment: ICCV 2023. Project page: https://npucvr.github.io/RPEFlow Code:
https://github.com/danqu130/RPEFlow
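As a rough illustration of the cross-attention fusion ingredient, the PyTorch sketch below lets tokens from one branch (e.g., the 3D point branch) attend to features of another modality (e.g., events). The dimensions, the residual design, and the use of nn.MultiheadAttention are assumptions for illustration rather than RPEFlow's actual module.

import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feat, context_feat):
        # query_feat: (B, Nq, C) features of one branch (e.g., 3D points)
        # context_feat: (B, Nk, C) features of another modality (e.g., events)
        fused, _ = self.attn(query_feat, context_feat, context_feat)
        return self.norm(query_feat + fused)  # residual fusion

# Example: the 3D branch queries event features.
fusion = CrossModalFusion()
point_tokens, event_tokens = torch.rand(2, 1024, 128), torch.rand(2, 2048, 128)
fused_points = fusion(point_tokens, event_tokens)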
Unleash the Potential of 3D Point Cloud Modeling with A Calibrated Local Geometry-driven Distance Metric
Quantifying the dissimilarity between two unstructured 3D point clouds is a
challenging task, with existing metrics often relying on measuring the distance
between corresponding points that can be either inefficient or ineffective. In
this paper, we propose a novel distance metric called Calibrated Local Geometry
Distance (CLGD), which computes the difference between the underlying 3D
surfaces calibrated and induced by a set of reference points. By associating
each reference point with the two given point clouds through its directional
distances to them, the difference between the directional distances of the
same reference point characterizes the geometric difference between the local
regions of the two point clouds around that point. Finally, CLGD is obtained by
averaging the directional distance differences of all reference points. We
evaluate CLGD on various optimization and unsupervised learning-based tasks,
including shape reconstruction, rigid registration, scene flow estimation, and
feature representation. Extensive experiments show that CLGD achieves
significantly higher accuracy across all tasks in a memory- and
computation-efficient manner compared with existing metrics. As a generic metric, CLGD has
the potential to advance 3D point cloud modeling. The source code is publicly
available at https://github.com/rsy6318/CLGD
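A small NumPy sketch of the reference-point idea described above: sample reference points, measure each one's directional distance to both point clouds, and average the differences. Here the directional distance is taken to be the offset vector to the nearest point in each cloud, which is a simplifying assumption for illustration; the calibrated formulation in the paper differs in detail.

import numpy as np

def nearest_offsets(refs, cloud):
    # refs: (R, 3), cloud: (N, 3); return the (R, 3) vectors to the nearest points.
    d = np.linalg.norm(refs[:, None, :] - cloud[None, :, :], axis=-1)  # (R, N)
    return cloud[d.argmin(axis=1)] - refs

def local_geometry_distance(cloud_a, cloud_b, num_refs=256, seed=0):
    rng = np.random.default_rng(seed)
    # Reference points sampled inside the joint bounding box of both clouds.
    both = np.concatenate([cloud_a, cloud_b], axis=0)
    lo, hi = both.min(axis=0), both.max(axis=0)
    refs = rng.uniform(lo, hi, size=(num_refs, 3))
    # Difference of directional distances, averaged over all reference points.
    diff = nearest_offsets(refs, cloud_a) - nearest_offsets(refs, cloud_b)
    return np.linalg.norm(diff, axis=1).mean()

# Example: the metric is zero for identical clouds and grows with displacement.
pts = np.random.rand(1000, 3)
print(local_geometry_distance(pts, pts))        # 0.0
print(local_geometry_distance(pts, pts + 0.1))  # > 0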