
    Expert Gate: Lifelong Learning with a Network of Experts

    In this paper we introduce a model of lifelong learning based on a Network of Experts. New tasks/experts are learned and added to the model sequentially, building on what was learned before. To ensure scalability of this process, data from previous tasks cannot be stored and hence is not available when learning a new task. A critical issue in this context, not addressed in the literature so far, is deciding which expert to deploy at test time. We introduce a set of gating autoencoders that learn a representation for the task at hand and, at test time, automatically forward the test sample to the relevant expert. This also brings memory efficiency, as only one expert network has to be loaded into memory at any given time. Further, the autoencoders inherently capture the relatedness of one task to another, based on which the most relevant prior model to use for training a new expert, with finetuning or learning-without-forgetting, can be selected. We evaluate our method on image classification and video prediction problems. Comment: CVPR 2017 paper
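    The gating mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names and the use of a closed-form PCA-style linear autoencoder are assumptions made for brevity (Expert Gate trains shallow neural autoencoders on pretrained features). The routing rule, however, is the one the abstract describes: forward the sample to the expert whose task autoencoder reconstructs it with the lowest error.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    class LinearAutoencoder:
        """Tiny linear autoencoder standing in for Expert Gate's per-task
        gating autoencoders (illustrative stand-in, not the paper's model)."""

        def __init__(self, dim, code_dim):
            self.code_dim = code_dim
            self.W = np.zeros((dim, code_dim))  # set by fit()

        def fit(self, X):
            # Closed-form "training": use the top principal directions of the
            # task's data as the code subspace.
            Xc = X - X.mean(axis=0)
            _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
            self.W = Vt[: self.code_dim].T  # (dim, code_dim), orthonormal columns
            return self

        def reconstruction_error(self, x):
            code = x @ self.W          # encode
            recon = code @ self.W.T    # decode (projection onto the subspace)
            return float(np.sum((x - recon) ** 2))

    def route_to_expert(x, gates):
        """Send the test sample to the expert whose gating autoencoder
        reconstructs it best (lowest error)."""
        errors = [g.reconstruction_error(x) for g in gates]
        return int(np.argmin(errors))
    ```

    Because only the winning expert's network then needs to be loaded, memory cost at test time stays constant in the number of tasks.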

    Generalizing attentional control across dimensions and tasks: evidence from transfer of proportion-congruent effects

    Three experiments investigated transfer of list-wide proportion congruent (LWPC) effects from a set of congruent and incongruent items with different frequencies (inducer task) to a set of congruent and incongruent items with equal frequencies (diagnostic task). Experiments 1 and 2 mixed items from horizontal and vertical Simon tasks. Tasks always involved different stimuli, which varied on the same dimension (colour) in Experiment 1 and on different dimensions (colour, shape) in Experiment 2. Experiment 3 mixed trials from a manual Simon task with trials from a vocal Stroop task, with colour being the relevant stimulus dimension in both tasks. There were two major results. First, we observed transfer of LWPC effects in Experiments 1 and 3, when tasks shared the relevant dimension, but not in Experiment 2. Second, sequential modulations of congruency effects transferred in Experiment 1 only. Hence, the different transfer patterns suggest that LWPC effects and sequential modulations arise from different mechanisms. Moreover, the observation of transfer supports an account of LWPC effects in terms of list-wide cognitive control, while being at odds with accounts in terms of stimulus–response (contingency) learning and item-specific control.

    Factorized Contrastive Learning: Going Beyond Multi-view Redundancy

    In a wide range of multimodal tasks, contrastive learning has become a particularly appealing approach since it can successfully learn representations from abundant unlabeled data with only pairing information (e.g., image-caption or video-audio pairs). Underpinning these approaches is the assumption of multi-view redundancy: that shared information between modalities is necessary and sufficient for downstream tasks. However, in many real-world settings, task-relevant information is also contained in modality-unique regions: information that is only present in one modality but still relevant to the task. How can we learn self-supervised multimodal representations to capture both shared and unique information relevant to downstream tasks? This paper proposes FactorCL, a new multimodal representation learning method to go beyond multi-view redundancy. FactorCL is built from three new contributions: (1) factorizing task-relevant information into shared and unique representations, (2) capturing task-relevant information via maximizing MI lower bounds and removing task-irrelevant information via minimizing MI upper bounds, and (3) multimodal data augmentations to approximate task relevance without labels. On large-scale real-world datasets, FactorCL captures both shared and unique information and achieves state-of-the-art results on six benchmarks. Comment: Code available at: https://github.com/pliang279/FactorC
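    The MI lower bounds mentioned in contribution (2) are typically contrastive objectives of the InfoNCE family. As a minimal sketch (not FactorCL itself, which combines several lower and upper bounds over factorized representations), here is the basic InfoNCE computation: paired rows from two modalities are positives, all other in-batch pairings are negatives, and minimizing the loss maximizes a lower bound on the mutual information between the two views.

    ```python
    import numpy as np

    def info_nce(z1, z2, temperature=0.1):
        """InfoNCE contrastive loss between two batches of embeddings.

        Row i of z1 and row i of z2 form a positive pair; every other
        row of z2 serves as a negative for z1[i]. Lower loss means a
        tighter (higher) lower bound on I(z1; z2)."""
        # Cosine similarities via L2-normalized embeddings.
        z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
        z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
        logits = z1 @ z2.T / temperature                 # (n, n) similarity matrix
        logits -= logits.max(axis=1, keepdims=True)      # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        # Cross-entropy with the correct pairing on the diagonal.
        return float(-np.mean(np.diag(log_probs)))
    ```

    Correctly paired batches score a much lower loss than shuffled ones, which is exactly the signal that drives representation learning from pairing information alone.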

    Active Transfer Learning with Zero-Shot Priors: Reusing Past Datasets for Future Tasks

    How can we reuse existing knowledge, in the form of available datasets, when solving a new and apparently unrelated target task from a set of unlabeled data? In this work we make a first contribution towards answering this question in the context of image classification. We frame this quest as an active learning problem and use zero-shot classifiers to guide the learning process by linking the new task to the existing classifiers. By revisiting the dual formulation of adaptive SVM, we reveal two basic conditions for greedily choosing only the most relevant samples to be annotated. On this basis we propose an effective active learning algorithm which learns the best possible target classification model with minimum human labeling effort. Extensive experiments on two challenging datasets show the value of our approach compared to state-of-the-art active learning methodologies, as well as its potential to reuse past datasets with minimal effort for future tasks.
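    The greedy selection loop described above can be sketched as follows. This is a deliberately simplified stand-in: the function name and the use of plain margin-based uncertainty (picking samples whose zero-shot prior score lies closest to the decision boundary) are assumptions for illustration, not the paper's actual conditions, which are derived from the adaptive-SVM dual.

    ```python
    import numpy as np

    def greedy_select(prior_scores, budget):
        """Greedily pick the unlabeled samples most worth annotating.

        prior_scores: signed decision values from a zero-shot classifier
        built on past datasets (one score per unlabeled sample).
        Samples with scores closest to zero are the most ambiguous under
        the prior, so they are queried first (simplified margin sampling)."""
        order = np.argsort(np.abs(np.asarray(prior_scores, dtype=float)))
        return order[:budget].tolist()
    ```

    Each selected sample would then be labeled by the annotator and fed back into the target classifier before the next greedy pick, keeping human labeling effort focused on the samples the zero-shot prior is least sure about.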