16,703 research outputs found
SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation
We introduce Similarity Group Proposal Network (SGPN), a simple and intuitive
deep learning framework for 3D object instance segmentation on point clouds.
SGPN uses a single network to predict point grouping proposals and a
corresponding semantic class for each proposal, from which we can directly
extract instance segmentation results. Important to the effectiveness of SGPN
is its novel representation of 3D instance segmentation results in the form of
a similarity matrix that indicates the similarity between each pair of points
in embedded feature space, thus producing an accurate grouping proposal for
each point. To the best of our knowledge, SGPN is the first framework to learn
3D instance-aware semantic segmentation on point clouds. Experimental results
on various 3D scenes show the effectiveness of our method on 3D instance
segmentation, and we also evaluate the capability of SGPN to improve 3D object
detection and semantic segmentation results. We also demonstrate its
flexibility by seamlessly incorporating 2D CNN features into the framework to
boost performance
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from
multiple moving cameras without prior knowledge or limiting constraints on the
scene structure, appearance, or illumination. Existing techniques for dynamic
scene reconstruction from multiple wide-baseline camera views primarily focus
on accurate reconstruction in controlled environments, where the cameras are
fixed and calibrated and background is known. These approaches are not robust
for general dynamic scenes captured with sparse moving cameras. Previous
approaches for outdoor dynamic scene reconstruction assume prior knowledge of
the static background appearance and structure. The primary contributions of
this paper are twofold: an automatic method for initial coarse dynamic scene
segmentation and reconstruction without prior knowledge of background
appearance or structure; and a general robust approach for joint segmentation
refinement and dense reconstruction of dynamic scenes from multiple
wide-baseline static or moving cameras. Evaluation is performed on a variety of
indoor and outdoor scenes with cluttered backgrounds and multiple dynamic
non-rigid objects such as people. Comparison with state-of-the-art approaches
demonstrates improved accuracy in both multiple view segmentation and dense
reconstruction. The proposed approach also eliminates the requirement for prior
knowledge of scene structure and appearance
Difference of Normals as a Multi-Scale Operator in Unorganized Point Clouds
A novel multi-scale operator for unorganized 3D point clouds is introduced.
The Difference of Normals (DoN) provides a computationally efficient,
multi-scale approach to processing large unorganized 3D point clouds. The
application of DoN in the multi-scale filtering of two different real-world
outdoor urban LIDAR scene datasets is quantitatively and qualitatively
demonstrated. In both datasets the DoN operator is shown to segment large 3D
point clouds into scale-salient clusters, such as cars, people, and lamp posts
towards applications in semi-automatic annotation, and as a pre-processing step
in automatic object recognition. The application of the operator to
segmentation is evaluated on a large public dataset of outdoor LIDAR scenes
with ground truth annotations.Comment: To be published in proceedings of 3DIMPVT 201
Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that incorporates in an end-to-end Visual Question Answering system and evaluated it on two visual question answering datasets for compositional language and elementary visual reasoning
Event-Based Motion Segmentation by Motion Compensation
In contrast to traditional cameras, whose pixels have a common exposure time,
event-based cameras are novel bio-inspired sensors whose pixels work
independently and asynchronously output intensity changes (called "events"),
with microsecond resolution. Since events are caused by the apparent motion of
objects, event-based cameras sample visual information based on the scene
dynamics and are, therefore, a more natural fit than traditional cameras to
acquire motion, especially at high speeds, where traditional cameras suffer
from motion blur. However, distinguishing between events caused by different
moving objects and by the camera's ego-motion is a challenging task. We present
the first per-event segmentation method for splitting a scene into
independently moving objects. Our method jointly estimates the event-object
associations (i.e., segmentation) and the motion parameters of the objects (or
the background) by maximization of an objective function, which builds upon
recent results on event-based motion-compensation. We provide a thorough
evaluation of our method on a public dataset, outperforming the
state-of-the-art by as much as 10%. We also show the first quantitative
evaluation of a segmentation algorithm for event cameras, yielding around 90%
accuracy at 4 pixels relative displacement.Comment: When viewed in Acrobat Reader, several of the figures animate. Video:
https://youtu.be/0q6ap_OSBA
- …