Learning non-maximum suppression
Object detectors have hugely profited from moving towards an end-to-end
learning paradigm: proposals, features, and the classifier becoming one neural
network improved results two-fold on general object detection. One
indispensable component is non-maximum suppression (NMS), a post-processing
algorithm responsible for merging all detections that belong to the same
object. The de facto standard NMS algorithm is still fully hand-crafted,
suspiciously simple, and -- being based on greedy clustering with a fixed
distance threshold -- forces a trade-off between recall and precision. We
propose a new network architecture designed to perform NMS, using only boxes
and their score. We report experiments for person detection on PETS and for
general object categories on the COCO dataset. Our approach shows promise,
providing improved localization and occlusion handling.
Comment: Added "Supplementary material" title
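The hand-crafted algorithm this abstract critiques is greedy NMS with a fixed overlap threshold: keep the highest-scoring box, suppress all boxes overlapping it beyond the threshold, and repeat. A minimal sketch (function and helper names are illustrative, not from the paper):

```python
def iou(a, b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def greedy_nms(boxes, scores, thresh=0.5):
    # Classic greedy clustering: highest score wins, neighbors above
    # the fixed IoU threshold are suppressed. The fixed `thresh` is
    # exactly what forces the recall/precision trade-off noted above.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep
```

A too-low threshold merges detections of distinct but occluding objects (hurting recall); a too-high one leaves duplicates (hurting precision), which is the trade-off the proposed learned NMS network aims to avoid.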
Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval
Humans use context and scene knowledge to easily localize moving objects in
conditions of complex illumination changes, scene clutter and occlusions. In
this paper, we present a method to leverage human knowledge in the form of
annotated video libraries in a novel search and retrieval based setting to
track objects in unseen video sequences. For every video sequence, a document
that represents motion information is generated. Documents of the unseen video
are queried against the library at multiple scales to find videos with similar
motion characteristics. This provides us with coarse localization of objects in
the unseen video. We further adapt these retrieved object locations to the new
video using an efficient warping scheme. The proposed method is validated on
in-the-wild video surveillance datasets where we outperform state-of-the-art
appearance-based trackers. We also introduce a new challenging dataset with
complex object appearance changes.
Comment: Under review with the IEEE Transactions on Circuits and Systems for Video Technology
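The retrieval step described above amounts to comparing a motion descriptor of the unseen video against descriptors of annotated library videos and taking the closest match. A minimal sketch of that idea using cosine similarity (the descriptor representation and names here are illustrative assumptions, not the paper's actual document model):

```python
import math

def cosine(a, b):
    # Cosine similarity between two motion-descriptor vectors.
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def retrieve(query_doc, library):
    # library: {video_name: motion descriptor vector}
    # Return the library video whose motion document best matches the query.
    return max(library, key=lambda name: cosine(query_doc, library[name]))
```

In the paper's full pipeline, this query is run at multiple scales and the retrieved object locations are then warped onto the new video; this sketch only illustrates the nearest-document lookup.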
Stochastic Occupancy Grid Map Prediction in Dynamic Scenes
This paper presents two variations of a novel stochastic prediction algorithm
that enables mobile robots to accurately and robustly predict the future state
of complex dynamic scenes. The proposed algorithm uses a variational
autoencoder to predict a range of possible future states of the environment.
The algorithm takes full advantage of the motion of the robot itself, the
motion of dynamic objects, and the geometry of static objects in the scene to
improve prediction accuracy. Three simulated and real-world datasets collected
by different robot models are used to demonstrate that the proposed algorithm
is able to achieve more accurate and robust prediction performance than other
prediction algorithms. Furthermore, a predictive uncertainty-aware planner is
proposed to demonstrate the effectiveness of the proposed predictor in
simulation and real-world navigation experiments. Implementations are open
source at https://github.com/TempleRAIL/SOGMP.
Comment: Accepted by 7th Annual Conference on Robot Learning (CoRL), 2023
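The "range of possible future states" comes from the variational autoencoder's stochastic latent space: drawing several latent samples via the reparameterization trick and decoding each yields several plausible future occupancy grids. A minimal sketch of that sampling step (the `decode` function and all names are hypothetical stand-ins, not the SOGMP implementation):

```python
import math
import random

def sample_latent(mu, log_var, rng):
    # VAE reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1).
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def predict_futures(mu, log_var, decode, n_samples=5, seed=0):
    # Decode several latent samples to get a distribution of future states.
    rng = random.Random(seed)
    return [decode(sample_latent(mu, log_var, rng)) for _ in range(n_samples)]
```

A downstream uncertainty-aware planner, like the one the abstract mentions, can then plan against the spread of decoded futures rather than a single point prediction.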