FlowNet3D++: Geometric Losses For Deep Scene Flow Estimation
We present FlowNet3D++, a deep scene flow estimation network. Inspired by
classical methods, FlowNet3D++ incorporates geometric constraints into
FlowNet3D, in the form of a point-to-plane distance and angular alignment
between individual vectors in the flow field. We demonstrate that the addition
of these geometric loss terms improves the accuracy of the previous
state-of-the-art, FlowNet3D, from 57.85% to 63.43%. To further demonstrate the
effectiveness of our geometric
constraints, we propose a benchmark for flow estimation on the task of dynamic
3D reconstruction, thus providing a more holistic and practical measure of
performance than the breakdown of individual metrics previously used to
evaluate scene flow. This is made possible through the contribution of a novel
pipeline to integrate point-based scene flow predictions into a global dense
volume. FlowNet3D++ achieves up to a 15.0% reduction in reconstruction error
over FlowNet3D, and up to a 35.2% improvement over KillingFusion alone. We will
release our scene flow estimation code later. Comment: Accepted in WACV 2020
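The two geometric loss terms named above can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation: the same-index pairing between warped source points and target points, and the function names, are simplifying assumptions.

```python
import numpy as np

def point_to_plane_loss(warped_pts, target_pts, target_normals):
    # Distance from each warped source point to the tangent plane of its
    # paired target point (same-index pairing assumed here for brevity;
    # in practice a nearest-neighbor search would supply the pairing).
    diff = warped_pts - target_pts                      # (N, 3)
    return np.mean(np.abs(np.sum(diff * target_normals, axis=1)))

def angular_alignment_loss(pred_flow, ref_flow, eps=1e-8):
    # 1 - cosine similarity between predicted and reference flow vectors:
    # zero when every predicted vector points in the reference direction.
    num = np.sum(pred_flow * ref_flow, axis=1)
    den = (np.linalg.norm(pred_flow, axis=1)
           * np.linalg.norm(ref_flow, axis=1) + eps)
    return np.mean(1.0 - num / den)
```

A point lying exactly on its target plane contributes zero to the first loss, and a perfectly aligned flow vector contributes (numerically) zero to the second.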
Adversarial Self-Supervised Scene Flow Estimation
This work proposes a metric learning approach for self-supervised scene flow
estimation. Scene flow estimation is the task of estimating 3D flow vectors for
consecutive 3D point clouds. Such flow vectors are useful, e.g., for
recognizing actions or avoiding collisions. Training a neural network via
supervised learning for scene flow is impractical, as this requires manual
annotations for each 3D point at each new timestamp for each scene. To that
end, we seek a self-supervised approach, where a network learns a latent
metric to distinguish between points translated by flow estimations and the
target point cloud. Our adversarial metric learning includes a multi-scale
triplet loss on sequences of two point clouds as well as a cycle consistency
loss. Furthermore, we outline a benchmark for self-supervised scene flow
estimation: the Scene Flow Sandbox. The benchmark consists of five datasets
designed to study individual aspects of flow estimation in progressive order of
complexity, from a moving object to real-world scenes. Experimental evaluation
on the benchmark shows that our approach obtains state-of-the-art
self-supervised scene flow results, outperforming recent neighbor-based
approaches. We use our proposed benchmark to expose shortcomings and draw
insights on various training setups. We find that our setup captures motion
coherence and preserves local geometries. Dealing with occlusions, on the other
hand, is still an open challenge. Comment: Published at 3DV 2020
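The two self-supervised objectives mentioned above can be sketched in numpy. This is an illustrative simplification under assumed names, not the paper's multi-scale formulation: a single-scale triplet margin loss on point embeddings, and a cycle term that asks forward-then-backward flow to return each point to its origin.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    # Hinge on the gap between anchor-positive and anchor-negative
    # embedding distances; zero once negatives are margin-far away.
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

def cycle_consistency_loss(points, flow_fwd, flow_bwd):
    # A point translated by the forward flow and then by the backward
    # flow should land back where it started.
    cycled = points + flow_fwd + flow_bwd
    return np.mean(np.linalg.norm(cycled - points, axis=1))
```

With an exact inverse flow the cycle term vanishes, which is what makes it usable as a supervision signal without ground-truth flow.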
SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow
Scene flow estimation is a long-standing problem in computer vision, where
the goal is to find the 3D motion of a scene from its consecutive observations.
Recently, there have been efforts to compute the scene flow from 3D point
clouds. A common approach is to train a regression model that consumes source
and target point clouds and outputs the per-point translation vectors. An
alternative is to learn point matches between the point clouds concurrently
with regressing a refinement of the initial correspondence flow. In both cases,
the learning task is very challenging since the flow regression is done in the
free 3D space, and a typical solution is to resort to a large annotated
synthetic dataset. We introduce SCOOP, a new method for scene flow estimation
that can be learned on a small amount of data without employing ground-truth
flow supervision. In contrast to previous work, we train a pure correspondence
model focused on learning point feature representation and initialize the flow
as the difference between a source point and its softly corresponding target
point. Then, in the run-time phase, we directly optimize a flow refinement
component with a self-supervised objective, which leads to a coherent and
accurate flow field between the point clouds. Experiments on widespread
datasets demonstrate the performance gains achieved by our method compared to
existing leading techniques while using a fraction of the training data. Our
code is publicly available at https://github.com/itailang/SCOOP
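The soft-correspondence flow initialization described above can be sketched as follows. This is a numpy approximation of the idea, not SCOOP's actual code: the feature inputs, cosine similarity, and softmax temperature are assumptions for illustration.

```python
import numpy as np

def soft_correspondence_flow(src_pts, tgt_pts, src_feat, tgt_feat, temp=0.1):
    # Cosine feature similarity between every (source, target) point pair.
    src_n = src_feat / np.linalg.norm(src_feat, axis=1, keepdims=True)
    tgt_n = tgt_feat / np.linalg.norm(tgt_feat, axis=1, keepdims=True)
    sim = src_n @ tgt_n.T                              # (Ns, Nt)
    # Softmax over target points yields soft matching weights per source point.
    w = np.exp(sim / temp)
    w /= w.sum(axis=1, keepdims=True)
    # Flow = softly corresponding target point minus the source point.
    return w @ tgt_pts - src_pts
```

At run time this initial flow would then be refined by directly optimizing a self-supervised objective, per the abstract; only the initialization is shown here.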
GMSF: Global Matching Scene Flow
We tackle the task of scene flow estimation from point clouds. Given a source
and a target point cloud, the objective is to estimate a translation from each
point in the source point cloud to the target, resulting in a 3D motion vector
field. Previous dominant scene flow estimation methods require complicated
coarse-to-fine or recurrent architectures for multi-stage refinement. In
contrast, we propose a significantly simpler single-scale one-shot global
matching to address the problem. Our key finding is that reliable feature
similarity between point pairs is essential and sufficient to estimate accurate
scene flow. To this end, we propose to decompose the feature extraction step
via a hybrid local-global-cross transformer architecture, which is crucial for
accurate and robust feature representations. Extensive experiments show that
GMSF sets a new state-of-the-art on multiple scene flow estimation benchmarks.
On FlyingThings3D, with the presence of occlusion points, GMSF reduces the
outlier percentage from the previous best performance of 27.4% to 11.7%. On
KITTI Scene Flow, without any fine-tuning, our proposed method shows
state-of-the-art performance.
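For context on the outlier numbers cited above, a commonly used definition on FlyingThings3D-style benchmarks counts a point as an outlier when its end-point error exceeds an absolute threshold or a fraction of the ground-truth flow magnitude. The sketch below assumes the standard 0.3 m / 10% thresholds; treat the exact thresholds as an assumption rather than a statement about GMSF's evaluation code.

```python
import numpy as np

def outlier_percentage(pred_flow, gt_flow, abs_thresh=0.3, rel_thresh=0.1):
    # End-point error (EPE): Euclidean distance between predicted and
    # ground-truth 3D flow vectors, per point.
    epe = np.linalg.norm(pred_flow - gt_flow, axis=1)
    gt_norm = np.linalg.norm(gt_flow, axis=1)
    rel = epe / np.maximum(gt_norm, 1e-8)
    # Outlier: EPE above abs_thresh metres OR above rel_thresh of the
    # ground-truth magnitude; report as a percentage of all points.
    return 100.0 * np.mean((epe > abs_thresh) | (rel > rel_thresh))
```

Under this metric, the abstract's reported drop from 27.4% to 11.7% means far fewer points exceed either threshold.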