Interactive and life-long learning for identification and categorization tasks
Abstract (English)
This thesis focuses on life-long and interactive learning for recognition tasks. To achieve these goals, a separation into a short-term memory (STM) and a long-term memory (LTM) is proposed. For the incremental build-up of the STM, a similarity-based one-shot learning method was developed. Furthermore, two consolidation algorithms were proposed that enable the incremental learning of LTM representations. Based on the Learning Vector Quantization (LVQ) network architecture, an error-based node insertion rule and a node-dependent learning rate are proposed to enable life-long learning. For the learning of categories, a forward feature-selection method was additionally introduced to separate co-occurring categories. In experiments, the performance of these learning methods was demonstrated on difficult visual recognition problems.
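The two LVQ extensions named in the abstract can be illustrated with a minimal sketch: insert a new prototype whenever the winning prototype carries the wrong label (error-based node insertion), and shrink each prototype's step size with its age (node-dependent learning rate). All class names, constants, and update rules here are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

class IncrementalLVQ:
    """Minimal LVQ-style learner (illustrative, not the thesis code):
    error-based node insertion plus a per-node, age-decayed learning rate."""

    def __init__(self, base_lr=0.3):
        self.protos = []  # each entry: [vector, label, age]
        self.base_lr = base_lr

    def _nearest(self, x):
        dists = [np.linalg.norm(x - p[0]) for p in self.protos]
        return int(np.argmin(dists))

    def partial_fit(self, x, y):
        x = np.asarray(x, dtype=float)
        if not self.protos:
            self.protos.append([x.copy(), y, 1])
            return
        i = self._nearest(x)
        proto, label, age = self.protos[i]
        lr = self.base_lr / age              # node-dependent learning rate
        if label == y:
            proto += lr * (x - proto)        # attract the correct winner
        else:
            proto -= lr * (x - proto)        # repel the wrong winner
            # error-based insertion: a new node for the misclassified sample
            self.protos.append([x.copy(), y, 1])
        self.protos[i][2] = age + 1

    def predict(self, x):
        i = self._nearest(np.asarray(x, dtype=float))
        return self.protos[i][1]
```

Because older, well-used prototypes move more slowly, previously consolidated knowledge is disturbed less by new samples, which is one common way to trade off plasticity against stability in life-long learning.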
FAME: Face Association through Model Evolution
We attack the problem of learning face models for public figures from weakly labelled images collected from the web by querying a name. The data is very noisy even after face detection, with several irrelevant faces corresponding to other people. We propose a novel method, Face Association through Model Evolution (FAME), that is able to prune the data iteratively so that the face models associated with a name can evolve. The idea is based on capturing the discriminativeness and representativeness of each instance and eliminating the outliers. The final models are used to classify faces on novel datasets with possibly different characteristics. On benchmark datasets, our results are comparable to or better than state-of-the-art studies for the task of face identification.
Comment: Draft version of the stud
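The iterative prune-and-retrain idea behind FAME can be sketched schematically: score each candidate instance by how representative it is of the evolving class (here, similarity to the class mean) and how discriminative it is against negatives (here, dissimilarity to a negative pool mean), then drop the worst-scoring fraction and repeat. The function name, scoring formula, and constants are hypothetical stand-ins for the paper's actual models.

```python
import numpy as np

def fame_style_prune(candidates, negatives, drop_frac=0.2, rounds=3):
    """Illustrative FAME-like loop (not the authors' implementation):
    keep candidate feature vectors that stay close to the evolving class
    mean (representative) and far from a negative pool (discriminative)."""
    def cos(a, b):
        # cosine similarity between rows of a and vector b
        return a @ b / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b) + 1e-9)

    kept = np.asarray(candidates, dtype=float)
    neg_mean = np.asarray(negatives, dtype=float).mean(axis=0)
    for _ in range(rounds):
        mean = kept.mean(axis=0)
        # high score = representative of the class AND unlike the negatives
        scores = cos(kept, mean) - cos(kept, neg_mean)
        n_drop = int(len(kept) * drop_frac)
        if n_drop == 0:
            break
        order = np.argsort(scores)       # ascending: worst instances first
        kept = kept[order[n_drop:]]      # evolve the model on cleaner data
    return kept
```

As outliers are removed, the class mean shifts toward the true identity, so later rounds score the remaining instances against a progressively cleaner model, mirroring the "model evolution" described above.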
On the Challenges of Open World Recognition under Shifting Visual Domains
Robotic visual systems operating in the wild must act in unconstrained scenarios, under different environmental conditions, while facing a variety of semantic concepts, including unknown ones. To this end, recent works have tried to empower visual object recognition methods with the capability to i) detect unseen concepts and ii) extend their knowledge over time as images of new semantic classes arrive. This setting, called Open World Recognition (OWR), aims to produce systems capable of breaking the semantic limits present in the initial training set. However, this training set imposes on the system not only its own semantic limits but also environmental ones, due to its bias toward certain acquisition conditions that do not necessarily reflect the high variability of the real world. This discrepancy between training and test distributions is called domain shift. This work investigates whether OWR algorithms are effective under domain shift, presenting the first benchmark setup for fairly assessing the performance of OWR algorithms with and without domain shift. We then use this benchmark to conduct analyses in various scenarios, showing how existing OWR algorithms indeed suffer severe performance degradation when training and test distributions differ. Our analysis shows that this degradation is only slightly mitigated by coupling OWR with domain generalization techniques, indicating that merely plugging in existing algorithms is not enough to recognize new and unknown categories in unseen domains. Our results clearly point toward open issues and future research directions that need to be investigated for building robot visual systems able to function reliably under these challenging yet very real conditions. Code available at https://github.com/DarioFontanel/OWR-VisualDomains
Comment: RAL/ICRA 202