Expert Gate: Lifelong Learning with a Network of Experts
In this paper we introduce a model of lifelong learning, based on a Network
of Experts. New tasks / experts are learned and added to the model
sequentially, building on what was learned before. To ensure scalability of
this process, data from previous tasks cannot be stored and hence is not
available when learning a new task. A critical issue in such a context, not
addressed in the literature so far, is deciding which expert to
deploy at test time. We introduce a set of gating autoencoders that learn a
representation for the task at hand, and, at test time, automatically forward
the test sample to the relevant expert. This also brings memory efficiency as
only one expert network has to be loaded into memory at any given time.
Further, the autoencoders inherently capture the relatedness of one task to
another, based on which the most relevant prior model to be used for training a
new expert, with finetuning or learning-without-forgetting, can be selected. We
evaluate our method on image classification and video prediction problems.
Comment: CVPR 2017 paper
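The gating step described in this abstract can be illustrated with a small sketch. It assumes that the gate picks the task whose autoencoder reconstructs the test sample with the lowest error, scored through a softmax with a temperature; the abstract itself does not spell out the scoring rule. The names `toy_autoencoder`, `gate_probabilities` and `route`, and the task names, are hypothetical, and the "autoencoder" is a hand-written stand-in rather than a trained network.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Reconstructor = Callable[[Vector], List[float]]


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def gate_probabilities(x: Vector, gates: Dict[str, Reconstructor],
                       temperature: float = 1.0) -> Dict[str, float]:
    """Softmax over negative reconstruction errors (assumed scoring rule)."""
    errors = {task: reconstruction_error(x, ae(x)) for task, ae in gates.items()}
    best = min(errors.values())  # shift by the minimum for numerical stability
    weights = {t: math.exp(-(e - best) / temperature) for t, e in errors.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def route(x: Vector, gates: Dict[str, Reconstructor],
          temperature: float = 1.0) -> str:
    """Return the task whose gate is most confident; only that expert is loaded."""
    probs = gate_probabilities(x, gates, temperature)
    return max(probs, key=probs.get)


def toy_autoencoder(center: Vector) -> Reconstructor:
    """Hypothetical stand-in for a trained autoencoder: it pulls every input
    halfway toward the task's training mean, so inputs near that mean are
    reconstructed with low error."""
    return lambda x: [0.5 * (a + c) for a, c in zip(x, center)]


# One gate per task, added sequentially as new tasks arrive.
gates = {
    "digits": toy_autoencoder([0.0, 0.0]),
    "scenes": toy_autoencoder([5.0, 5.0]),
}
print(route([4.5, 5.2], gates))  # sample lies close to the "scenes" task data
```

Because routing needs only the small gates, a single expert network has to be in memory per test sample, which is the memory benefit the abstract mentions.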
Learning Independent Causal Mechanisms
Statistical learning relies upon data sampled from a distribution, and we
usually do not care what actually generated it in the first place. From the
point of view of causal modeling, the structure of each distribution is induced
by physical mechanisms that give rise to dependences between observables.
Mechanisms, however, can be meaningful autonomous modules of generative models
that make sense beyond a particular entailed data distribution, lending
themselves to transfer between problems. We develop an algorithm to recover a
set of independent (inverse) mechanisms from a set of transformed data points.
The approach is unsupervised and based on a set of experts that compete for
data generated by the mechanisms, driving specialization. We analyze the
proposed method in a series of experiments on image data. Each expert learns to
map a subset of the transformed data back to a reference distribution. The
learned mechanisms generalize to novel domains. We discuss implications for
transfer learning and links to recent trends in generative modeling.
Comment: ICML 201
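The competition between experts described in this abstract can be sketched as winner-take-all training on a toy one-dimensional problem. This is a heavily simplified illustration, not the authors' algorithm: each expert is a single learnable offset, the hidden mechanisms are shifts by +3 and -3, and the score is the distance to a known reference mean of 0, which is an assumed stand-in for whatever scoring the paper uses. The function name `train_competing_experts` is hypothetical.

```python
import random
from typing import List


def train_competing_experts(samples: List[float], offsets: List[float],
                            lr: float = 0.1) -> List[float]:
    """Winner-take-all training: only the best-scoring expert learns per sample."""
    offsets = list(offsets)  # do not mutate the caller's list
    for x in samples:
        # Each expert proposes an inverted sample x + offset.
        outputs = [x + b for b in offsets]
        # Assumed score: closeness of the output to the reference mean 0.
        winner = min(range(len(offsets)), key=lambda k: abs(outputs[k]))
        # Gradient step on 0.5 * output**2, applied to the winner only.
        offsets[winner] -= lr * outputs[winner]
    return offsets


rng = random.Random(0)
# Reference data lies near 0; two hidden mechanisms shift it by +3 or -3.
samples = [rng.gauss(0.0, 0.1) + rng.choice([3.0, -3.0]) for _ in range(400)]
learned = train_competing_experts(samples, offsets=[0.5, -0.5])
print(sorted(learned))  # each offset should end up close to undoing one shift
```

Since only the winner is updated, a small initial advantage on one kind of transformed data is reinforced, which is the specialization effect the abstract refers to.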
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Object-based approaches for learning action-conditioned dynamics have
demonstrated promise for generalization and interpretability. However, existing
approaches suffer from structural limitations and optimization difficulties for
common environments with multiple dynamic objects. In this paper, we present a
novel self-supervised learning framework, called Multi-level Abstraction
Object-oriented Predictor (MAOP), which employs a three-level learning
architecture that enables efficient object-based dynamics learning from raw
visual observations. We also design a spatial-temporal relational reasoning
mechanism for MAOP to support instance-level dynamics learning and handle
partial observability. Our results show that MAOP significantly outperforms
previous methods in terms of sample efficiency and generalization over novel
environments for learning environment models. We also demonstrate that learned
dynamics models enable efficient planning in unseen environments, comparable to
true environment models. In addition, MAOP learns semantically and visually
interpretable disentangled representations.
Comment: Accepted to the Thirty-Fourth AAAI Conference on Artificial
Intelligence (AAAI), 202
What is Computational Intelligence and where is it going?
What is Computational Intelligence (CI) and what are its relations with Artificial Intelligence (AI)? A brief survey of the scope of CI journals and books with "computational intelligence" in their title shows that at present it is an umbrella for three core technologies (neural, fuzzy and evolutionary), their applications, and selected fashionable pattern recognition methods. At present CI has no comprehensive foundations and is more a bag of tricks than a solid branch of science. The change of focus from methods to challenging problems is advocated, with CI defined as a part of computer and engineering sciences devoted to the solution of non-algorithmizable problems. In this view AI is a part of CI focused on problems related to higher cognitive functions, while the rest of the CI community works on problems related to perception and control, or lower cognitive functions. Grand challenges on both sides of this spectrum are addressed.
A Neural, Interactive-predictive System for Multimodal Sequence to Sequence Tasks
We present a demonstration of a neural interactive-predictive system for
tackling multimodal sequence to sequence tasks. The system generates text
predictions for different sequence to sequence tasks: machine translation, image
and video captioning. These predictions are revised by a human agent, who
introduces corrections in the form of characters. The system reacts to each
correction, providing alternative hypotheses that comply with the feedback
provided by the user. The final objective is to reduce the human effort
required during this correction process.
This system is implemented following a client-server architecture. For
accessing the system, we developed a website, which communicates with the
neural model, hosted on a local server. From this website, the different tasks
can be tackled following the interactive-predictive framework. We open-source
all the code developed for building this system. The demonstration is hosted at
http://casmacat.prhlt.upv.es/interactive-seq2seq.
Comment: ACL 2019 - System demonstration
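The interactive-predictive loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: instead of a neural decoder constrained by the user's prefix, it reranks a fixed n-best list and returns the best-scored hypothesis that starts with the validated prefix. The function names, the example sentences and the scores are hypothetical, and the user is simulated by comparing each hypothesis with a reference sentence.

```python
from typing import List, Optional, Tuple

Hypothesis = Tuple[str, float]  # (text, model score); higher is better


def best_with_prefix(nbest: List[Hypothesis], prefix: str) -> Optional[str]:
    """Best-scored hypothesis that agrees with the user-validated prefix."""
    compatible = [h for h in nbest if h[0].startswith(prefix)]
    if not compatible:
        return None
    return max(compatible, key=lambda h: h[1])[0]


def simulate_session(nbest: List[Hypothesis], reference: str) -> int:
    """Count the character corrections a simulated user needs to reach `reference`."""
    prefix = ""
    corrections = 0
    while True:
        hyp = best_with_prefix(nbest, prefix)
        if hyp == reference:
            return corrections
        if hyp is None:
            # No candidate fits the prefix: the user types the remaining characters.
            return corrections + len(reference) - len(prefix)
        # First position where the hypothesis deviates from the reference.
        limit = min(len(hyp), len(reference))
        i = next((k for k in range(limit) if hyp[k] != reference[k]), limit)
        if i >= len(reference):
            # The hypothesis only has extra trailing text: one action to cut it.
            return corrections + 1
        # The user fixes one character, validating everything before it.
        prefix = reference[: i + 1]
        corrections += 1


nbest = [
    ("a dog runs on the grass", -1.0),
    ("a dog runs on the beach", -1.4),
    ("a cat sits on the sofa", -2.0),
]
print(simulate_session(nbest, "a dog runs on the beach"))  # one correction suffices
```

The correction count is a simple proxy for the human effort that the system aims to reduce: a single typed character can redirect the whole remainder of the prediction.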