12,678 research outputs found
Cascaded Scene Flow Prediction using Semantic Segmentation
Given two consecutive frames from a pair of stereo cameras, 3D scene flow
methods simultaneously estimate the 3D geometry and motion of the observed
scene. Many existing approaches use superpixels for regularization, but may
predict inconsistent shapes and motions inside rigidly moving objects. We
instead assume that scenes consist of foreground objects rigidly moving in
front of a static background, and use semantic cues to produce pixel-accurate
scene flow estimates. Our cascaded classification framework accurately models
3D scenes by iteratively refining semantic segmentation masks, stereo
correspondences, 3D rigid motion estimates, and optical flow fields. We
evaluate our method on the challenging KITTI autonomous driving benchmark, and
show that accounting for the motion of segmented vehicles leads to
state-of-the-art performance.Comment: International Conference on 3D Vision (3DV), 2017 (oral presentation
Learning the dynamics and time-recursive boundary detection of deformable objects
We propose a principled framework for recursively segmenting deformable objects across a sequence
of frames. We demonstrate the usefulness of this method on left ventricular segmentation across a cardiac
cycle. The approach involves a technique for learning the system dynamics together with methods of
particle-based smoothing as well as non-parametric belief propagation on a loopy graphical model capturing
the temporal periodicity of the heart. The dynamic system state is a low-dimensional representation
of the boundary, and the boundary estimation involves incorporating curve evolution into recursive state
estimation. By formulating the problem as one of state estimation, the segmentation at each particular
time is based not only on the data observed at that instant, but also on predictions based on past and future
boundary estimates. Although the paper focuses on left ventricle segmentation, the method generalizes
to temporally segmenting any deformable object
CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data
This paper presents a novel method for ground segmentation in Velodyne point
clouds. We propose an encoding of sparse 3D data from the Velodyne sensor
suitable for training a convolutional neural network (CNN). This general
purpose approach is used for segmentation of the sparse point cloud into ground
and non-ground points. The LiDAR data are represented as a multi-channel 2D
signal where the horizontal axis corresponds to the rotation angle and the
vertical axis the indexes channels (i.e. laser beams). Multiple topologies of
relatively shallow CNNs (i.e. 3-5 convolutional layers) are trained and
evaluated using a manually annotated dataset we prepared. The results show
significant improvement of performance over the state-of-the-art method by
Zhang et al. in terms of speed and also minor improvements in terms of
accuracy.Comment: ICRA 2018 submissio
Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many of
the existing approaches, the inference-learning blending allows us to learn
efficiently high-order graphical models, over regions of any size, and very
large number of parameters. We demonstrate the effectiveness of our approach,
while presenting state-of-the-art results in stereo estimation, semantic
segmentation, shape reconstruction, and indoor scene understanding
- …