CLR: Channel-wise Lightweight Reprogramming for Continual Learning
Continual learning aims to emulate the human ability to continually
accumulate knowledge over sequential tasks. The main challenge is to maintain
performance on previously learned tasks after learning new tasks, i.e., to
avoid catastrophic forgetting. We propose a Channel-wise Lightweight
Reprogramming (CLR) approach that helps convolutional neural networks (CNNs)
overcome catastrophic forgetting during continual learning. We show that a CNN
model trained on an old task (or a self-supervised proxy task) can be
"reprogrammed" to solve a new task using our proposed lightweight (very
cheap) reprogramming parameters. With CLR, we achieve a better
stability-plasticity trade-off to solve continual learning problems: To
maintain stability and retain previous task ability, we use a common
task-agnostic immutable part as the shared "anchor" parameter set. We then add
task-specific lightweight reprogramming parameters to reinterpret the outputs
of the immutable parts, to enable plasticity and integrate new knowledge. To
learn sequential tasks, we only train the lightweight reprogramming parameters
to learn each new task. Reprogramming parameters are task-specific and
exclusive to each task, which makes our method immune to catastrophic
forgetting. To minimize the parameter requirement of reprogramming to learn new
tasks, we make reprogramming lightweight by only adjusting essential kernels
and learning channel-wise linear mappings from anchor parameters to
task-specific domain knowledge. We show that, for general CNNs, the CLR
parameter increase is less than 0.6% for any new task. Our method outperforms
13 state-of-the-art continual learning baselines on a new challenging sequence
of 53 image classification datasets. Code and data are available at
https://github.com/gyhandy/Channel-wise-Lightweight-Reprogramming
Comment: ICCV 202
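To make the parameter-overhead claim concrete, the sketch below counts weights for a small, hypothetical stack of convolution layers. It assumes each reprogramming layer is a single `k_r x k_r` kernel per output channel; the abstract does not state the kernel shape, so `k_r=3`, the layer sizes, and all function names are illustrative assumptions rather than the authors' implementation.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def channelwise_params(c_out, k_r=3):
    """Weights in an assumed per-channel k_r x k_r reprogramming kernel."""
    return c_out * k_r * k_r


# Hypothetical frozen "anchor" backbone: (c_in, c_out, kernel size) per layer.
layers = [(64, 128, 3), (128, 256, 3), (256, 512, 3)]

# Shared, task-agnostic weights that are never updated.
anchor = sum(conv_params(c_in, c_out, k) for c_in, c_out, k in layers)

# Task-specific weights added for ONE new task (one set per task).
extra = sum(channelwise_params(c_out) for _, c_out, _ in layers)

# Fraction of extra parameters relative to the anchor for a single task.
overhead = extra / anchor
print(f"anchor={anchor}, extra per task={extra}, overhead={overhead:.2%}")
```

Under these assumptions the per-task cost scales with the number of output channels only, while the anchor cost scales with input channels times output channels, which is why the ratio stays small for wide layers.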
Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models
We focus on the challenge of out-of-distribution (OOD) detection in deep
learning models, a crucial aspect of ensuring reliability. Despite considerable
effort, the problem remains challenging because such models tend to output
over-confident predictions for OOD inputs. We
propose a novel one-class open-set OOD detector that leverages text-image
pre-trained models in a zero-shot fashion and incorporates various descriptions
of in-domain and OOD. Our approach is designed to detect anything not in-domain
and offers the flexibility to detect a wide variety of OOD, defined via fine-
or coarse-grained labels, or even in natural language. We evaluate our approach
on challenging benchmarks including large-scale datasets containing
fine-grained, semantically similar classes, distributionally shifted images,
and multi-object images containing a mixture of in-domain and OOD objects. Our
method shows superior performance over previous methods on all benchmarks. Code
is available at https://github.com/gyhandy/One-Class-Anything
Comment: 16 pages (including appendix and references), 3 figure
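One plausible way to score an image against in-domain and OOD text descriptions is sketched below. It assumes the image and each description have already been embedded by a text-image model, and it uses a softmax over cosine similarities; the scoring rule, the `temperature` value, and the toy two-dimensional embeddings are assumptions for illustration, not the method described in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ood_score(image_emb, in_text_embs, ood_text_embs, temperature=0.01):
    """Return a score in [0, 1]; higher means more likely OOD.

    Softmax is taken over similarities to all descriptions, and the score
    is the probability mass that falls on the OOD descriptions.
    """
    sims = [cosine(image_emb, t) for t in in_text_embs + ood_text_embs]
    top = max(sims)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in sims]
    in_mass = sum(exps[: len(in_text_embs)]) / sum(exps)
    return 1.0 - in_mass


# Toy embeddings standing in for real text-image model outputs.
in_domain_texts = [[1.0, 0.1]]   # e.g. a description of the in-domain class
ood_texts = [[0.0, 1.0]]         # e.g. a description of something else

score_in = ood_score([1.0, 0.0], in_domain_texts, ood_texts)
score_out = ood_score([0.0, 1.0], in_domain_texts, ood_texts)
```

A threshold on this score (for example 0.5) would then separate in-domain from OOD inputs; adding more OOD descriptions, fine- or coarse-grained, only extends the `ood_text_embs` list.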