IROS 2019 Lifelong Robotic Vision Challenge -- Lifelong Object Recognition Report
This report summarizes the IROS 2019 Lifelong Robotic Vision Competition
(Lifelong Object Recognition Challenge) with methods and results from the top
finalists (out of over~ teams). The competition dataset (L)ifel(O)ng
(R)obotic V(IS)ion (OpenLORIS) - Object Recognition (OpenLORIS-object) is
designed to drive lifelong/continual learning research and applications in
the robotic vision domain, with everyday objects in home, office, campus, and mall
scenarios. The dataset explicitly quantifies variations in illumination,
object occlusion, object size, camera-object distance/angles, and clutter
information. Rules are designed to quantify the learning capability of the
robotic vision system when faced with the objects appearing in the dynamic
environments in the contest. Individual reports, dataset information, rules,
and released source code can be found at the project homepage:
"https://lifelong-robotic-vision.github.io/competition/".
Comment: 9 pages, 11 figures, 3 tables, accepted into IEEE Robotics and
Automation Magazine. arXiv admin note: text overlap with arXiv:1911.0648
A Spectral Nonlocal Block for Neural Networks
The nonlocal-based blocks are designed for capturing long-range
spatial-temporal dependencies in computer vision tasks. Although they have shown
excellent performance, they lack a mechanism to encode the rich, structured
information among elements in an image. In this paper, to theoretically analyze
the properties of these nonlocal-based blocks, we provide a unified approach to
interpreting them, where we view them as a graph filter generated on a
fully-connected graph. When the graph filter is approximated by Chebyshev
polynomials, a generalized formulation can be derived for explaining the
existing nonlocal-based blocks (nonlocal block, nonlocal
stage, double attention block). Furthermore, we propose an efficient and robust
spectral nonlocal block, which can be flexibly inserted into deep neural
networks to capture long-range dependencies between spatial pixels or
temporal frames. Experimental results demonstrate clear-cut improvements
and the practical applicability of the spectral nonlocal block on image
classification (CIFAR-10/100, ImageNet), fine-grained image classification
(CUB-200), action recognition (UCF-101), and person re-identification
(iLIDS-VID, MARS, PRID-2011) tasks.
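
The graph-filter view described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the softmax dot-product affinity, the choice of Laplacian L = I - A with rescaling L_tilde = L - I, and the names `softmax_affinity`, `chebyshev_filter`, and `theta` are all assumptions made for illustration. The only part taken from the abstract is the idea of approximating a filter on a fully-connected graph with Chebyshev polynomials, here evaluated with the standard recurrence T_k = 2 * L_tilde * T_{k-1} - T_{k-2}.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_affinity(x):
    """Row-normalised affinity between all pairs of elements.

    Assumption: dot-product similarity followed by a row-wise softmax,
    which yields a dense (fully-connected) weighted graph.
    """
    x_t = [list(col) for col in zip(*x)]
    scores = matmul(x, x_t)
    affinity = []
    for row in scores:
        m = max(row)  # subtract the row maximum for numerical stability
        e = [math.exp(s - m) for s in row]
        z = sum(e)
        affinity.append([v / z for v in e])
    return affinity


def chebyshev_filter(x, theta):
    """Return sum_k theta[k] * T_k(L_tilde) @ x for an N x C feature matrix x.

    theta holds hypothetical filter coefficients; in a trained block these
    would be learned, here they are plain floats.
    """
    n, c = len(x), len(x[0])
    a = softmax_affinity(x)
    # Assumption: L = I - A, rescaled to L_tilde = L - I = -A.
    l_tilde = [[-a[i][j] for j in range(n)] for i in range(n)]

    terms = [x]  # T_0(L_tilde) x = x
    if len(theta) > 1:
        terms.append(matmul(l_tilde, x))  # T_1(L_tilde) x = L_tilde x
    for _ in range(2, len(theta)):
        prod = matmul(l_tilde, terms[-1])
        # Chebyshev recurrence: T_k = 2 * L_tilde * T_{k-1} - T_{k-2}
        nxt = [[2.0 * p - q for p, q in zip(pr, qr)]
               for pr, qr in zip(prod, terms[-2])]
        terms.append(nxt)

    return [[sum(th * t[i][j] for th, t in zip(theta, terms)) for j in range(c)]
            for i in range(n)]


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 elements, 2 channels
    print(chebyshev_filter(features, [0.5, 0.3, 0.2]))
```

With `theta = [1.0]` the filter reduces to the identity, and each additional coefficient mixes in information that has propagated one more step over the affinity graph. A practical block would typically add learned channel projections and a residual connection, which this sketch omits.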