Developmental object learning through manipulation and human demonstration
We present a cognitive developmental approach for a humanoid robot exploring its close environment in an interactive scenario, taking inspiration from the way infants learn about objects. The proposed approach makes it possible to detect physical entities in the visual space, to create multi-view appearance models of these entities, and to categorize them into robot parts, human parts and manipulated objects without supervision and without prior knowledge about their appearances. All information about the entities' appearances and behaviour is incrementally acquired while the robot and its human partner interact with objects.
Mining Object Parts from CNNs via Active Question-Answering
Given a convolutional neural network (CNN) that is pre-trained for object
classification, this paper proposes to use active question-answering to
semanticize neural patterns in conv-layers of the CNN and mine part concepts.
For each part concept, we mine neural patterns in the pre-trained CNN, which
are related to the target part, and use these patterns to construct an And-Or
graph (AOG) to represent a four-layer semantic hierarchy of the part. As an
interpretable model, the AOG associates different CNN units with different
explicit object parts. We use active human-computer communication to
incrementally grow such an AOG on the pre-trained CNN as follows. We allow the
computer to actively identify objects whose neural patterns cannot be
explained by the current AOG. Then, the computer asks a human about the
unexplained objects, and uses the answers to automatically discover certain CNN
patterns corresponding to the missing knowledge. We incrementally grow the AOG
to encode new knowledge discovered during the active-learning process. In
experiments, our method exhibits high learning efficiency. Our method uses
about 1/6-1/3 of the part annotations for training, but achieves similar or
better part-localization performance than fast-RCNN methods. Comment: Published in CVPR 201
Boosting Deep Open World Recognition by Clustering
While convolutional neural networks have brought significant advances in
robot vision, their ability is often limited to closed world scenarios, where
the number of semantic concepts to be recognized is determined by the available
training set. Since it is practically impossible to capture all possible
semantic concepts present in the real world in a single training set, we need
to break the closed world assumption, equipping our robot with the capability
to act in an open world. To provide such ability, a robot vision system should
be able to (i) identify whether an instance does not belong to the set of known
categories (i.e. open set recognition), and (ii) extend its knowledge to learn
new classes over time (i.e. incremental learning). In this work, we show how we
can boost the performance of deep open world recognition algorithms by means of
a new loss formulation enforcing a global to local clustering of class-specific
features. In particular, a first loss term, i.e. global clustering, forces the
network to map samples closer to the class centroid they belong to while the
second one, local clustering, shapes the representation space in such a way
that samples of the same class get closer in the representation space while
pushing away neighbours belonging to other classes. Moreover, we propose a
strategy to learn class-specific rejection thresholds, instead of heuristically
estimating a single global threshold, as in previous works. Experiments on
RGB-D Object and Core50 datasets show the effectiveness of our approach. Comment: IROS/RAL 202
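The global clustering term and the class-specific rejection thresholds described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact loss or the way the thresholds are learned, so everything here is an assumption: the squared Euclidean distance to the class centroid, the rule that sets each threshold to the largest training distance in that class, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

def class_centroids(features, labels):
    """Mean feature vector per class."""
    groups = defaultdict(list)
    for x, c in zip(features, labels):
        groups[c].append(x)
    return {c: [sum(col) / len(xs) for col in zip(*xs)] for c, xs in groups.items()}

def global_clustering_loss(features, labels, centroids):
    """Assumed form: mean squared distance of each sample to its own class centroid."""
    sq = [math.dist(x, centroids[c]) ** 2 for x, c in zip(features, labels)]
    return sum(sq) / len(sq)

def class_thresholds(features, labels, centroids):
    """Illustrative rule (not the paper's): largest training distance per class."""
    thresholds = defaultdict(float)
    for x, c in zip(features, labels):
        thresholds[c] = max(thresholds[c], math.dist(x, centroids[c]))
    return dict(thresholds)

def predict_or_reject(x, centroids, thresholds):
    """Return the nearest class, or None (unknown) if x exceeds that class's threshold."""
    nearest = min(centroids, key=lambda c: math.dist(x, centroids[c]))
    return nearest if math.dist(x, centroids[nearest]) <= thresholds[nearest] else None

# Toy 2-D features for two known classes.
features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
labels = [0, 0, 1, 1]
centroids = class_centroids(features, labels)
thresholds = class_thresholds(features, labels, centroids)
loss = global_clustering_loss(features, labels, centroids)
```

With these toy values, a query near a centroid is assigned to that class, while a query far from both centroids is rejected as unknown, which is the open set behaviour point (i) of the abstract asks for. A per-class threshold lets tight and loose classes reject at different distances, unlike a single global cut-off.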
Incremental Learning of Object Detectors without Catastrophic Forgetting
Despite their success for object detection, convolutional neural networks are
ill-equipped for incremental learning, i.e., adapting the original model
trained on a set of classes to additionally detect objects of new classes, in
the absence of the initial training data. They suffer from "catastrophic
forgetting" - an abrupt degradation of performance on the original set of
classes, when the training objective is adapted to the new classes. We present
a method to address this issue, and learn object detectors incrementally, when
neither the original training data nor annotations for the original classes in
the new training set are available. The core of our proposed solution is a loss
function to balance the interplay between predictions on the new classes and a
new distillation loss which minimizes the discrepancy between responses for old
classes from the original and the updated networks. This incremental learning
can be performed multiple times, for a new set of classes in each step, with a
moderate drop in performance compared to the baseline network trained on the
ensemble of data. We present object detection results on the PASCAL VOC 2007
and COCO datasets, along with a detailed empirical analysis of the approach. Comment: To appear in ICCV 201
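The balance this abstract describes, between a loss on the new classes and a distillation term that keeps old-class responses close to those of the original network, can be sketched in Python as follows. The abstract does not specify the exact terms, so this is only an illustration under stated assumptions: a softmax cross-entropy for the new-class prediction, a mean squared difference for the distillation term, a hypothetical weight `lam`, and the premise that old classes keep the same indices in both networks. It is a per-sample classification toy, not the detection pipeline of the paper.

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample; target is a class index."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def distillation_loss(old_logits, new_logits, old_class_ids):
    """Assumed form: mean squared gap between the two networks on old classes."""
    gaps = [(old_logits[i] - new_logits[i]) ** 2 for i in old_class_ids]
    return sum(gaps) / len(gaps)

def incremental_loss(old_logits, new_logits, target, old_class_ids, lam=1.0):
    """New-class term plus lam times the distillation term (lam is hypothetical)."""
    return cross_entropy(new_logits, target) + lam * distillation_loss(
        old_logits, new_logits, old_class_ids
    )

# The original (frozen) network knows classes 0 and 1; the updated one adds class 2.
old_logits = [2.0, 0.5]
preserved = [2.0, 0.5, 3.0]   # old-class responses unchanged
drifted = [0.0, 0.5, 3.0]     # response for class 0 has drifted
old_ids = [0, 1]

penalty_preserved = distillation_loss(old_logits, preserved, old_ids)
penalty_drifted = distillation_loss(old_logits, drifted, old_ids)
```

The distillation term is zero when the updated network reproduces the original responses on the old classes and grows as they drift, which is what discourages catastrophic forgetting when the original training data is unavailable.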
CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
Open-world object detection (OWOD), as a more general and challenging goal,
requires the model trained from data on known objects to detect both known and
unknown objects and incrementally learn to identify these unknown objects.
Existing works, which employ a standard detection framework and a fixed
pseudo-labelling mechanism (PLM), have the following problems: (i) The inclusion
of detecting unknown objects substantially reduces the model's ability to
detect known ones. (ii) The PLM does not adequately utilize the prior
knowledge of inputs. (iii) The fixed selection manner of PLM cannot guarantee
that the model is trained in the right direction. We observe that humans
subconsciously prefer to focus on all foreground objects and then identify each
one in detail, rather than localize and identify a single object
simultaneously, which alleviates confusion. This motivates us to propose a
novel solution called CAT: LoCalization and IdentificAtion Cascade Detection
Transformer, which decouples the detection process via a shared decoder in a
cascade decoding manner. Meanwhile, we propose a self-adaptive
pseudo-labelling mechanism that combines model-driven and input-driven
PLM and self-adaptively generates robust pseudo-labels for unknown objects,
significantly improving the ability of CAT to retrieve unknown objects.
Comprehensive experiments on two benchmark datasets, i.e., MS-COCO and PASCAL
VOC, show that our model outperforms the state-of-the-art in terms of all
metrics in the task of OWOD, incremental object detection (IOD) and open-set
detection. Comment: CVPR 2023 camera-ready version