Robust tracking of objects with dynamic topology
In many instances of the object tracking problem the topological properties of objects can change over time. Such changes include the splitting of an object into multiple objects or the merging of multiple objects into a single object. We propose a novel tracking model which is robust to such changes. This model is formulated in terms of homology theory, whereby 0-dimensional homology classes, which correspond to path-connected components, are tracked. A generalisation of this model for tracking spatially close objects lying in an ambient metric space is also proposed. This generalisation is particularly suitable for tracking spatial-temporal phenomena such as weather phenomena. The utility of the proposed model is demonstrated with respect to tracking rain clouds in radar imagery.
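As a rough illustration of what tracking path-connected components across frames involves, the sketch below labels 4-connected components in two small binary frames and reports merges and splits from their pixel overlap. This is not the homology-based model from the abstract; the overlap heuristic, the function names (`label_components`, `overlaps`, `classify_events`) and the toy frames are all assumptions made for the example.

```python
from collections import deque

def label_components(grid):
    """Label 4-connected components of truthy cells; returns (labels, count)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one component
                    y, x = queue.popleft()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and grid[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def overlaps(labels_a, labels_b):
    """Pairs (label in frame A, label in frame B) that share at least one cell."""
    pairs = set()
    for row_a, row_b in zip(labels_a, labels_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                pairs.add((a, b))
    return pairs

def classify_events(pairs):
    """Return (merges, splits) as dicts keyed by the merged or split label."""
    successors, predecessors = {}, {}
    for a, b in pairs:
        successors.setdefault(a, set()).add(b)
        predecessors.setdefault(b, set()).add(a)
    splits = {a: sorted(bs) for a, bs in successors.items() if len(bs) > 1}
    merges = {b: sorted(srcs) for b, srcs in predecessors.items() if len(srcs) > 1}
    return merges, splits

# Toy frames (hypothetical data): two blobs at t0 become one blob at t1.
frame_t0 = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
]
frame_t1 = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

labels_t0, count_t0 = label_components(frame_t0)
labels_t1, count_t1 = label_components(frame_t1)
merges, splits = classify_events(overlaps(labels_t0, labels_t1))
print(count_t0, count_t1, merges, splits)
```

Here the two components of the first frame both overlap the single component of the second frame, so the sketch reports one merge and no splits. Pixel overlap only works when objects move little between frames, which is one reason a more principled formulation is attractive.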
SurfelWarp: Efficient Non-Volumetric Single View Dynamic Reconstruction
We contribute a dense SLAM system that takes a live stream of depth images as
input and reconstructs non-rigid deforming scenes in real time, without
templates or prior models. In contrast to existing approaches, we do not
maintain any volumetric data structures, such as truncated signed distance
function (TSDF) fields or deformation fields, which are performance and memory
intensive. Our system works with a flat point (surfel) based representation of
geometry, which can be directly acquired from commodity depth sensors. Standard
graphics pipelines and general purpose GPU (GPGPU) computing are leveraged for
all central operations: i.e., nearest neighbor maintenance, non-rigid
deformation field estimation and fusion of depth measurements. Our pipeline
inherently avoids expensive volumetric operations such as marching cubes,
volumetric fusion and dense deformation field update, leading to significantly
improved performance. Furthermore, the explicit and flexible surfel based
geometry representation enables efficient tackling of topology changes and
tracking failures, which makes our reconstructions consistent with updated
depth observations. Our system allows robots to maintain a scene description
with non-rigidly deformed objects that potentially enables interactions with
dynamic working environments. Comment: RSS 2018. The video and source code are available on
https://sites.google.com/view/surfelwarp/hom
ROAM: a Rich Object Appearance Model with Application to Rotoscoping
Rotoscoping, the detailed delineation of scene elements through a video shot,
is a painstaking task of tremendous importance in professional post-production
pipelines. While pixel-wise segmentation techniques can help for this task,
professional rotoscoping tools rely on parametric curves that offer the artists
a much better interactive control on the definition, editing and manipulation
of the segments of interest. Sticking to this prevalent rotoscoping paradigm,
we propose a novel framework to capture and track the visual aspect of an
arbitrary object in a scene, given a first closed outline of this object. This
model combines a collection of local foreground/background appearance models
spread along the outline, a global appearance model of the enclosed object and
a set of distinctive foreground landmarks. The structure of this rich
appearance model allows simple initialization, efficient iterative optimization
with exact minimization at each step, and on-line adaptation in videos. We
demonstrate qualitatively and quantitatively the merit of this framework
through comparisons with tools based on either dynamic segmentation with a
closed curve or pixel-wise binary labelling.
CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data
This paper presents a novel method for ground segmentation in Velodyne point
clouds. We propose an encoding of sparse 3D data from the Velodyne sensor
suitable for training a convolutional neural network (CNN). This general
purpose approach is used for segmentation of the sparse point cloud into ground
and non-ground points. The LiDAR data are represented as a multi-channel 2D
signal where the horizontal axis corresponds to the rotation angle and the
vertical axis indexes the channels (i.e. laser beams). Multiple topologies of
relatively shallow CNNs (i.e. 3-5 convolutional layers) are trained and
evaluated using a manually annotated dataset we prepared. The results show
significant improvement of performance over the state-of-the-art method by
Zhang et al. in terms of speed and also minor improvements in terms of
accuracy. Comment: ICRA 2018 submission
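The encoding described in this abstract, a multi-channel 2D signal with rotation angle on the horizontal axis and laser beam index on the vertical axis, can be sketched as follows. The abstract does not say which channels are stored, so the choice here (range, height and an occupancy flag) is an assumption, as are the function name `encode_scan`, the `(x, y, z, ring)` point layout and the tiny grid size.

```python
import math

def encode_scan(points, num_rings, num_columns):
    """Project (x, y, z, ring) points onto a ring-by-azimuth grid.

    Each cell holds a tuple (range, height, occupied); empty cells stay at zero.
    The channel choice is an assumption made for this illustration.
    """
    image = [[(0.0, 0.0, 0.0)] * num_columns for _ in range(num_rings)]
    for x, y, z, ring in points:
        azimuth = math.atan2(y, x)  # rotation angle in (-pi, pi]
        # Map the angle to a column index; the modulo wraps azimuth == pi to column 0.
        col = int((azimuth + math.pi) / (2 * math.pi) * num_columns) % num_columns
        rng = math.sqrt(x * x + y * y + z * z)
        image[ring][col] = (rng, z, 1.0)  # a later point in the same cell overwrites
    return image

# Two hypothetical returns from a 2-beam sensor, binned into 4 azimuth columns.
points = [
    (1.0, 1.0, 0.0, 0),
    (-1.0, 1.0, 0.5, 1),
]
image = encode_scan(points, num_rings=2, num_columns=4)
print(image[0][2], image[1][3])
```

A grid of this shape can be fed to an ordinary 2D convolutional network, with one row per laser beam and one column per azimuth bin. A real sensor would use far more columns and typically 32 or 64 rings.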
Unsupervised Object Discovery and Tracking in Video Collections
This paper addresses the problem of automatically localizing dominant objects
as spatio-temporal tubes in a noisy collection of videos with minimal or even
no supervision. We formulate the problem as a combination of two complementary
processes: discovery and tracking. The first one establishes correspondences
between prominent regions across videos, and the second one associates
successive similar object regions within the same video. Interestingly, our
algorithm also discovers the implicit topology of frames associated with
instances of the same object class across different videos, a role normally
left to supervisory information in the form of class labels in conventional
image and video understanding methods. Indeed, as demonstrated by our
experiments, our method can handle video collections featuring multiple object
classes, and substantially outperforms the state of the art in colocalization,
even though it tackles a broader problem with much less supervision.