PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Online class-incremental continual learning is a specific task of continual
learning. It aims to continuously learn new classes from a data stream in
which each sample is seen only once, and it suffers from the catastrophic
forgetting issue, i.e., forgetting historical knowledge of old classes.
Existing replay-based methods effectively alleviate this issue by saving and
replaying part of old data in a proxy-based or contrastive-based replay manner.
Although both replay manners are effective, the former is biased toward new
classes due to class imbalance, while the latter is unstable and hard to
converge because of the limited number of samples. In this paper, we conduct
a comprehensive analysis of these two replay manners and find that they can be
complementary. Inspired by this finding, we propose a novel replay-based method
called proxy-based contrastive replay (PCR). The key operation is to replace
the contrastive samples of anchors with corresponding proxies in the
contrastive-based loss. This alleviates catastrophic forgetting by
effectively addressing the class-imbalance issue while preserving faster
model convergence. We conduct extensive experiments on three real-world
benchmark datasets, and empirical results consistently demonstrate the
superiority of PCR over various state-of-the-art methods.Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 table
MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
Few-Shot Learning (FSL) is a challenging task, \emph{i.e.}, recognizing
novel classes from only a few examples. Pre-training based methods effectively tackle
the problem by pre-training a feature extractor and then predicting novel
classes via a cosine nearest neighbor classifier with mean-based prototypes.
Nevertheless, due to the data scarcity, the mean-based prototypes are usually
biased. In this paper, we attempt to diminish the prototype bias by regarding
it as a prototype optimization problem. To this end, we propose a novel
meta-learning based prototype optimization framework to rectify prototypes,
\emph{i.e.}, introducing a meta-optimizer to optimize prototypes. Although the
existing meta-optimizers can also be adapted to our framework, they all
overlook a crucial gradient bias issue, \emph{i.e.}, the mean-based gradient
estimation is also biased on sparse data. To address the issue, we regard the
gradient and its flow as meta-knowledge and then propose a novel Neural
Ordinary Differential Equation (ODE)-based meta-optimizer, called MetaNODE,
to polish prototypes. In this meta-optimizer, we first view the mean-based
prototypes as initial prototypes, and then model the process of prototype
optimization as continuous-time dynamics specified by a Neural ODE. A gradient
flow inference network is carefully designed to learn to estimate the
continuous gradient flow for prototype dynamics. Finally, the optimal
prototypes can be obtained by solving the Neural ODE. Extensive experiments on
miniImageNet, tieredImageNet, and CUB-200-2011 show the effectiveness of our
method. Comment: Accepted by AAAI 202
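In code, the optimization loop is compact. The two-layer flow network and the fixed-step Euler integration below are simplifying assumptions made for illustration; the paper's gradient flow inference network and ODE solver are more elaborate.

```python
import torch
import torch.nn as nn

class GradientFlowNet(nn.Module):
    """Toy stand-in for a gradient flow inference network: maps the
    current prototypes (and time t) to an update direction dP/dt."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, t, prototypes):                 # (N, D) -> (N, D)
        t_col = t.expand(prototypes.size(0), 1)
        return self.net(torch.cat([prototypes, t_col], dim=1))

def refine_prototypes(proto0, flow, steps=10):
    """Integrate the prototype ODE from t=0 to t=1 with fixed-step
    Euler; a library solver such as torchdiffeq.odeint could be used
    instead. proto0 holds the mean-based initial prototypes."""
    p, dt = proto0, 1.0 / steps
    for k in range(steps):
        t = torch.tensor([k * dt])
        p = p + dt * flow(t, p)                       # one Euler step
    return p
```

The refined prototypes then replace the biased class means in the cosine nearest-neighbor classifier.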
UER: A Heuristic Bias Addressing Approach for Online Continual Learning
Online continual learning aims to continuously train neural networks from a
continuous data stream with a single pass through the data. As the most effective
approach, rehearsal-based methods replay part of the previous data. Commonly
used predictors in existing methods tend to generate biased dot-product logits
that favor the classes of the current data, which is known as a bias issue and
a manifestation of forgetting. Many approaches have been proposed to overcome the
forgetting problem by correcting this bias; however, they remain limited in
the online setting. In this paper, we address the bias issue with
a more straightforward and more efficient method. By decomposing the
dot-product logits into an angle factor and a norm factor, we empirically find
that the bias problem mainly occurs in the angle factor, which can be used to
learn novel knowledge as cosine logits. In contrast, the norm factor,
abandoned by existing methods, helps retain historical knowledge. Based on
this observation, we propose to leverage the norm factor to balance new and
old knowledge and thereby address the bias. To this end, we develop a
heuristic approach called unbias experience replay (UER). UER learns current
samples only by the angle factor and further replays previous samples by both
the norm and angle factors. Extensive experiments on three datasets show that
UER achieves superior performance over various state-of-the-art methods. The
code is available at https://github.com/FelixHuiweiLin/UER. Comment: 9 pages, 12 figures, ACM MM202
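Taken at face value, the norm/angle split can be sketched in a few lines of PyTorch. The per-row routing below and the absence of any extra scaling are our simplifications, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def uer_logits(features, weights, is_replay):
    """Sketch of the norm/angle decomposition behind UER. A dot-product
    logit factorizes as ||z|| * ||w|| * cos(theta): current samples are
    trained on the angle factor alone (cosine logits), while replayed
    samples keep the norm factors as well to retain old knowledge."""
    cos = F.normalize(features, dim=1) @ F.normalize(weights, dim=1).t()
    dot = features @ weights.t()           # ||z|| * ||w|| * cos(theta)
    # Route each row by whether it comes from the replay buffer.
    return torch.where(is_replay.unsqueeze(1), dot, cos)

# One cross-entropy over the mixed batch, e.g.:
#   loss = F.cross_entropy(uer_logits(z, W, replay_mask), labels)
```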
HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning
Online continual learning (OCL) aims to continuously learn new data from a
single pass over the online data stream. It generally suffers from the
catastrophic forgetting issue. Existing replay-based methods effectively
alleviate this issue by replaying part of old data in a proxy-based or
contrastive-based replay manner. In this paper, we conduct a comprehensive
analysis of these two replay manners and find they can be complementary.
Inspired by this finding, we propose a novel replay-based method called
proxy-based contrastive replay (PCR), which replaces anchor-to-sample pairs
with anchor-to-proxy pairs in the contrastive-based loss to alleviate the
phenomenon of forgetting. Based on PCR, we further develop a more advanced
method named holistic proxy-based contrastive replay (HPCR), which consists of
three components. The first is a contrastive component that conditionally
incorporates anchor-to-sample pairs into PCR, learning more fine-grained
semantic information with a large training batch. The second is a temperature component that
decouples the temperature coefficient into two parts based on their impacts on
the gradient and sets different values for them to learn more novel knowledge.
The third is a distillation component that constrains the learning process to
keep more historical knowledge. Experiments on four datasets consistently
demonstrate the superiority of HPCR over various state-of-the-art methods. Comment: 18 pages, 11 figures.
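As one plausible reading of the temperature component, the sketch below assigns the positive (numerator) term and the denominator separate temperatures, since the two parts pull on the gradient differently; the split and the values are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def decoupled_proxy_loss(features, labels, proxies,
                         t_pos=0.09, t_neg=0.07):
    """Anchor-to-proxy contrastive loss with a decoupled temperature:
    t_pos scales the positive pair, t_neg the denominator. Both
    temperature values here are illustrative."""
    z = F.normalize(features, dim=1)
    w = F.normalize(proxies, dim=1)
    sim = z @ w.t()                                   # (B, C)
    pos = sim.gather(1, labels.view(-1, 1)) / t_pos   # positive pairs
    neg = torch.logsumexp(sim / t_neg, dim=1, keepdim=True)
    return (neg - pos).mean()
```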
Improving Utility of GPU in Accelerating Industrial Applications with User-centred Automatic Code Translation
SMEs (small and medium-sized enterprises), particularly those whose business is focused on developing innovative products, are limited by a major bottleneck: the speed of computation in many applications. Recent developments in GPUs (graphics processing units) have markedly increased their versatility across many computational areas, but due to the lack of specialist GPU programming skills, this explosion of GPU power has not been fully utilized in general SME applications by inexperienced users. Moreover, existing automatic CPU-to-GPU code translators are mainly designed for research purposes, have poor user interfaces, and are hard to use. Little attention has been paid to the applicability, usability, and learnability of these tools for ordinary users. In this paper, we present an online automated CPU-to-GPU source translation system (GPSME) that lets inexperienced users exploit GPU capability to accelerate general SME applications. The system designs and implements a directive-based programming model with a new kernel generation scheme and a memory management hierarchy to optimize performance. A web-service-based interface lets inexperienced users easily and flexibly invoke the automatic source translator. Our experiments with non-expert GPU users in 4 SMEs show that the GPSME system can efficiently accelerate real-world applications by at least 4x and offers better applicability, usability, and learnability than existing automatic CPU-to-GPU source translators.