Expert Gate: Lifelong Learning with a Network of Experts
In this paper we introduce a model of lifelong learning based on a Network
of Experts. New tasks/experts are learned and added to the model
sequentially, building on what was learned before. To ensure scalability of
this process, data from previous tasks cannot be stored and hence is not
available when learning a new task. A critical issue in this context, not
addressed in the literature so far, is deciding which expert to deploy at
test time. We introduce a set of gating autoencoders that learn a
representation for the task at hand and, at test time, automatically forward
the test sample to the relevant expert. This also brings memory efficiency,
as only one expert network has to be loaded into memory at any given time.
Further, the autoencoders inherently capture the relatedness of one task to
another, based on which the most relevant prior model for training a new
expert, with finetuning or learning-without-forgetting, can be selected. We
evaluate our method on image classification and video prediction problems.
Comment: CVPR 2017 paper
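The routing idea above can be sketched in a few lines. This is only an illustrative stand-in: the paper trains one undercomplete autoencoder per task and routes each test sample to the expert whose autoencoder reconstructs it best; here a linear (PCA-based) autoencoder plays that role, and the class and method names (`LinearTaskGate`, `add_task`, `route`) are mine, not the paper's.

```python
import numpy as np

class LinearTaskGate:
    """One linear autoencoder (PCA) per task; routes a sample to the
    expert whose autoencoder reconstructs it with the lowest error.
    A minimal sketch of the gating idea, not the paper's architecture."""

    def __init__(self, n_components=2):
        self.n_components = n_components
        self.tasks = {}  # task name -> (mean, principal directions)

    def add_task(self, name, X):
        mu = X.mean(axis=0)
        # Top right-singular vectors act as tied encoder/decoder weights.
        _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
        self.tasks[name] = (mu, vt[: self.n_components])

    def reconstruction_error(self, name, x):
        mu, w = self.tasks[name]
        z = (x - mu) @ w.T        # encode into the task subspace
        x_hat = z @ w + mu        # decode back to input space
        return float(np.sum((x - x_hat) ** 2))

    def route(self, x):
        # Forward the sample to the task with the smallest error.
        return min(self.tasks, key=lambda t: self.reconstruction_error(t, x))
```

Because reconstruction error also measures how well one task's autoencoder fits another task's data, the same quantity doubles as the task-relatedness signal the abstract mentions.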
Generalizing attentional control across dimensions and tasks: evidence from transfer of proportion-congruent effects
Three experiments investigated transfer of list-wide proportion congruent (LWPC) effects from a set of congruent and incongruent items with unequal frequencies (inducer task) to a set of congruent and incongruent items with equal frequencies (diagnostic task). Experiments 1 and 2 mixed items from horizontal and vertical Simon tasks. The tasks always involved different stimuli, which varied on the same dimension (colour) in Experiment 1 and on different dimensions (colour, shape) in Experiment 2. Experiment 3 mixed trials from a manual Simon task with trials from a vocal Stroop task, with colour being the relevant stimulus dimension in both tasks. There were two major results. First, we observed transfer of LWPC effects in Experiments 1 and 3, when tasks shared the relevant dimension, but not in Experiment 2. Second, sequential modulations of congruency effects transferred in Experiment 1 only. Hence, the different transfer patterns suggest that LWPC effects and sequential modulations arise from different mechanisms. Moreover, the observation of transfer supports an account of LWPC effects in terms of list-wide cognitive control, while being at odds with accounts in terms of stimulus–response (contingency) learning and item-specific control.
Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
In a wide range of multimodal tasks, contrastive learning has become a
particularly appealing approach since it can successfully learn representations
from abundant unlabeled data with only pairing information (e.g., image-caption
or video-audio pairs). Underpinning these approaches is the assumption of
multi-view redundancy - that shared information between modalities is necessary
and sufficient for downstream tasks. However, in many real-world settings,
task-relevant information is also contained in modality-unique regions:
information that is only present in one modality but still relevant to the
task. How can we learn self-supervised multimodal representations to capture
both shared and unique information relevant to downstream tasks? This paper
proposes FactorCL, a new multimodal representation learning method to go beyond
multi-view redundancy. FactorCL is built from three new contributions: (1)
factorizing task-relevant information into shared and unique representations,
(2) capturing task-relevant information via maximizing MI lower bounds and
removing task-irrelevant information via minimizing MI upper bounds, and (3)
multimodal data augmentations to approximate task relevance without labels. On
large-scale real-world datasets, FactorCL captures both shared and unique
information and achieves state-of-the-art results on six benchmarks.
Comment: Code available at: https://github.com/pliang279/FactorC
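The MI lower bounds in contribution (2) are of the InfoNCE family. The sketch below shows a plain InfoNCE estimate between paired embeddings; it illustrates the kind of bound being maximized, not FactorCL's exact estimators (the matching MI upper bounds used to remove task-irrelevant information are not shown), and the function name is mine.

```python
import numpy as np

def infonce_lower_bound(z1, z2, temperature=0.1):
    """InfoNCE: a standard contrastive lower bound on the mutual
    information between paired views (e.g., image and caption
    embeddings). Higher values mean the pairing is more informative."""
    # Cosine similarities between all pairs in the batch.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    # Row-wise log-softmax, shifted for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    n = z1.shape[0]
    # True pairs sit on the diagonal; bound = E[log p(positive)] + log N.
    return float(np.diag(log_probs).mean() + np.log(n))
```

The estimate is capped at log N for a batch of N pairs, which is why contrastive MI estimation benefits from large batches.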
Active Transfer Learning with Zero-Shot Priors: Reusing Past Datasets for Future Tasks
How can we reuse existing knowledge, in the form of available datasets, when
solving a new and apparently unrelated target task from a set of unlabeled
data? In this work we make a first contribution to answer this question in the
context of image classification. We frame this quest as an active learning
problem and use zero-shot classifiers to guide the learning process by linking
the new task to the existing classifiers. By revisiting the dual formulation of
adaptive SVM, we reveal two basic conditions for greedily choosing only the
most relevant samples for annotation. On this basis we propose an effective active
learning algorithm which learns the best possible target classification model
with minimum human labeling effort. Extensive experiments on two challenging
datasets show the value of our approach compared to the state-of-the-art active
learning methodologies, as well as its potential to reuse past datasets with
minimal effort for future tasks.
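The greedy selection step can be sketched generically. The paper derives its two selection conditions from the dual of an adaptive SVM, which is not reproduced here; the function below is a common margin-based stand-in that uses a zero-shot prior's class scores to pick the least confident unlabeled samples for annotation, and its name and signature are mine.

```python
import numpy as np

def greedy_active_selection(scores, budget):
    """Pick the `budget` unlabeled samples whose zero-shot prior scores
    are least confident (smallest gap between the top two classes).

    scores : (n_samples, n_classes) array of prior class scores.
    Returns indices of the samples to send for human labeling."""
    part = np.partition(scores, -2, axis=1)
    margins = part[:, -1] - part[:, -2]   # top-1 minus top-2 score
    return np.argsort(margins)[:budget]   # smallest margins first
```

In an active-learning loop, the selected samples would be labeled, the target classifier retrained, the prior scores refreshed, and the selection repeated until the labeling budget is spent.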