Incremental Few-Shot Object Detection
Most existing object detection methods rely on the availability of abundant
labelled training samples per class and offline model training in a batch mode.
These requirements substantially limit their scalability to open-ended
accommodation of novel classes with limited labelled training data. We present
a study aiming to go beyond these limitations by considering the Incremental
Few-Shot Detection (iFSD) problem setting, where new classes must be registered
incrementally (without revisiting base classes) and with few examples. To this
end we propose OpeN-ended Centre nEt (ONCE), a detector designed for
incrementally learning to detect novel class objects with few examples. This is
achieved by an elegant adaptation of the CentreNet detector to the few-shot
learning scenario, and meta-learning a class-specific code generator model for
registering novel classes. ONCE fully respects the incremental learning
paradigm, with novel class registration requiring only a single forward pass of
few-shot training samples, and no access to base classes -- thus making it
suitable for deployment on embedded devices. Extensive experiments conducted on
both the standard object detection and fashion landmark detection tasks show
the feasibility of iFSD for the first time, opening an interesting and very
important line of research.
Comment: CVPR 202
Context-Transformer: Tackling Object Confusion for Few-Shot Detection
Few-shot object detection is a challenging but realistic scenario, where only
a few annotated training images are available for training detectors. A popular
approach to handle this problem is transfer learning, i.e., fine-tuning a
detector pretrained on a source-domain benchmark. However, such a transferred
detector often fails to recognize new objects in the target domain, due to low
data diversity of training samples. To tackle this problem, we propose a novel
Context-Transformer within a concise deep transfer framework. Specifically,
Context-Transformer can effectively leverage source-domain object knowledge as
guidance, and automatically exploit contexts from only a few training images in
the target domain. Subsequently, it can adaptively integrate these relational
clues to enhance the discriminative power of the detector, in order to reduce
object confusion in few-shot scenarios. Moreover, Context-Transformer can be
flexibly embedded in popular SSD-style detectors, which makes it a
plug-and-play module for end-to-end few-shot learning. Finally, we evaluate
Context-Transformer on the challenging settings of few-shot detection and
incremental few-shot detection. The experimental results show that our
framework outperforms recent state-of-the-art approaches.
Comment: Accepted by AAAI-202
Incremental Few-Shot Object Detection via Simple Fine-Tuning Approach
In this paper, we explore incremental few-shot object detection (iFSD), which
incrementally learns novel classes using only a few examples without revisiting
base classes. Previous iFSD works pursued this goal by applying
meta-learning. However, the performance of meta-learning approaches is too low
for practical problems. In this light, we propose a
simple fine-tuning-based approach, the Incremental Two-stage Fine-tuning
Approach (iTFA) for iFSD, which contains three steps: 1) base training using
abundant base classes with the class-agnostic box regressor, 2) separation of
the RoI feature extractor and classifier into the base and novel class branches
for preserving base knowledge, and 3) fine-tuning the novel branch using only a
few novel class examples. We evaluate our iTFA on the real-world datasets
PASCAL VOC, COCO, and LVIS. iTFA achieves competitive performance on COCO and
shows a 30% higher AP than meta-learning methods on the LVIS dataset.
Experimental results show the effectiveness and applicability of our proposed
method.
Comment: Accepted to ICRA 202
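The three steps named in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: a single linear layer with ReLU stands in for the RoI feature extractor, another linear layer stands in for the classifier, the squared-error loss is an assumption, and all names, shapes, and values (`base_branch`, `novel_branch`, `finetune_step`) are hypothetical. It only shows the idea of cloning a novel branch and updating it while the base branch stays frozen.

```python
import copy
import numpy as np

# Step 1 (assumed already done): weights obtained from base training.
# Toy stand-ins: "roi_feat" plays the RoI feature extractor, "cls" the classifier.
base_branch = {
    "roi_feat": np.eye(8, 4) + 0.1,   # hypothetical 4-dim input, 8-dim feature
    "cls": np.full((3, 8), 0.5),      # 3 hypothetical base classes
}
base_snapshot = copy.deepcopy(base_branch)  # kept only to check the base stays frozen

# Step 2: separate a novel branch from the base branch.
novel_branch = copy.deepcopy(base_branch)
novel_branch["cls"] = np.zeros((2, 8))      # 2 hypothetical novel classes

def forward(branch, x):
    """Return class scores for one toy RoI input x."""
    feat = np.maximum(branch["roi_feat"] @ x, 0.0)  # ReLU features
    return branch["cls"] @ feat

def finetune_step(branch, x, target, lr=0.1):
    """One gradient step on 0.5 * ||scores - target||^2 (assumed loss)."""
    pre = branch["roi_feat"] @ x
    feat = np.maximum(pre, 0.0)
    err = branch["cls"] @ feat - target
    grad_cls = np.outer(err, feat)
    grad_pre = (branch["cls"].T @ err) * (pre > 0)
    grad_roi = np.outer(grad_pre, x)
    branch["cls"] -= lr * grad_cls
    branch["roi_feat"] -= lr * grad_roi

# Step 3: fine-tune only the novel branch on a few novel-class examples.
x = np.array([1.0, 0.5, -0.2, 0.3])   # one hypothetical few-shot example
target = np.array([1.0, 0.0])         # it belongs to novel class 0
for _ in range(5):
    finetune_step(novel_branch, x, target)

# At inference, scores from both branches are concatenated.
scores = np.concatenate([forward(base_branch, x), forward(novel_branch, x)])
```

Because only `novel_branch` is passed to `finetune_step`, the base weights are untouched by construction, which is the property the abstract describes as preserving base knowledge.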
Few-shot Class-incremental Learning: A Survey
Few-shot Class-Incremental Learning (FSCIL) presents a unique challenge in
machine learning, as it necessitates the continuous learning of new classes
from sparse labeled training samples without forgetting previous knowledge.
While this field has seen recent progress, it remains an active area of
exploration. This paper aims to provide a comprehensive and systematic review
of FSCIL. In our in-depth examination, we delve into various facets of FSCIL,
encompassing the problem definition, the discussion of primary challenges of
unreliable empirical risk minimization and the stability-plasticity dilemma,
general schemes, and relevant problems of incremental learning and few-shot
learning. Besides, we offer an overview of benchmark datasets and evaluation
metrics. Furthermore, we introduce the classification methods in FSCIL from
data-based, structure-based, and optimization-based approaches and the object
detection methods in FSCIL from anchor-free and anchor-based approaches. Beyond
these, we illuminate several promising research directions within FSCIL that
merit further investigation.
A Survey on Few-Shot Class-Incremental Learning
Large deep learning models are impressive, but they struggle when real-time
data is not available. Few-shot class-incremental learning (FSCIL) poses a
significant challenge for deep neural networks to learn new tasks from just a
few labeled samples without forgetting the previously learned ones. This setup
easily leads to catastrophic forgetting and overfitting problems, severely
affecting model performance. Studying FSCIL helps overcome deep learning model
limitations on data volume and acquisition time, while improving practicality
and adaptability of machine learning models. This paper provides a
comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize
few-shot learning and incremental learning, focusing on introducing FSCIL from
two perspectives, while reviewing over 30 theoretical research studies and more
than 20 applied research studies. From the theoretical perspective, we provide
a novel categorization approach that divides the field into five subcategories,
including traditional machine learning methods, meta-learning based methods,
feature and feature space-based methods, replay-based methods, and dynamic
network structure-based methods. We also evaluate the performance of recent
theoretical research on benchmark datasets of FSCIL. From the application
perspective, FSCIL has achieved impressive results in various fields of
computer vision such as image classification, object detection, and image
segmentation, as well as in natural language processing and graph learning. We summarize
the important applications. Finally, we point out potential future research
directions, including applications, problem setups, and theory development.
Overall, this paper offers a comprehensive analysis of the latest advances in
FSCIL from a methodological, performance, and application perspective.
Memory Based Online Learning of Deep Representations from Video Streams
We present a novel online unsupervised method for face identity learning from
video streams. The method exploits deep face descriptors together with a memory
based learning mechanism that takes advantage of the temporal coherence of
visual data. Specifically, we introduce a discriminative feature matching
solution based on Reverse Nearest Neighbour and a feature forgetting strategy
that detects redundant features and discards them appropriately as time
progresses. It is shown that the proposed learning procedure is asymptotically
stable and can be effectively used in relevant applications like multiple face
identification and tracking from unconstrained video streams. Experimental
results show that, compared with offline approaches exploiting future
information, the proposed method achieves comparable results in multiple face
tracking and better performance in face identification. Code will be publicly
available.
Comment: arXiv admin note: text overlap with arXiv:1708.0361
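As a rough illustration of the reverse nearest neighbour idea mentioned in the abstract above: for each stored memory feature, find its closest newly observed descriptor, then group memory features by the descriptor they point to. This is a generic sketch of the concept, not the paper's matching rule. The Euclidean distance, the array shapes, and the function name `reverse_nearest_neighbours` are assumptions.

```python
import numpy as np

def reverse_nearest_neighbours(memory, queries):
    """For each query, list the memory indices whose nearest query is that query."""
    # Pairwise Euclidean distances, shape (num_memory, num_queries).
    dists = np.linalg.norm(memory[:, None, :] - queries[None, :, :], axis=2)
    # Index of the closest query for every memory feature.
    nearest_query = dists.argmin(axis=1)
    return [np.flatnonzero(nearest_query == q).tolist() for q in range(len(queries))]

# Hypothetical 2-D descriptors: three stored in memory, two newly observed.
memory = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
queries = np.array([[0.0, 0.1], [0.9, 1.0]])
matches = reverse_nearest_neighbours(memory, queries)
print(matches)  # [[0, 1], [2]]
```

Searching from the memory side means a new descriptor can attract several stored features or none at all, unlike a plain nearest-neighbour lookup where every descriptor always gets exactly one match.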