A Bayesian Perspective of Statistical Machine Learning for Big Data
Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers discover important features of input data sets that are often very large in size. This task of feature discovery from data is essentially what the keyword `learning' means in SML.
Theoretical justifications for the effectiveness of SML algorithms rest on sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings justified in particular by statistical inference methods are collectively termed statistical learning theory.
This paper provides a review of SML from a Bayesian decision-theoretic point of view, where we argue that many SML techniques are closely connected to making inference under the so-called Bayesian paradigm. We discuss many important SML techniques, such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes, especially in the context of the very large data sets where these are often employed. We present a dictionary that maps the key concepts of SML between Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets and also discuss many practical implementation issues. The review is thus especially targeted at statisticians and computer scientists who aspire to understand and apply SML to moderately large to big data sets. Comment: 26 pages, 3 figures, Review paper
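For orientation, the Bayesian decision-theoretic viewpoint referred to above can be summarized by two textbook formulas (standard notation, not taken from the paper): the posterior combines prior and likelihood, and an action is chosen to minimize the posterior expected loss:

    \pi(\theta \mid x) = \frac{p(x \mid \theta)\,\pi(\theta)}{\int p(x \mid \theta')\,\pi(\theta')\,d\theta'},
    \qquad
    \hat{a}(x) = \arg\min_{a} \int L(\theta, a)\,\pi(\theta \mid x)\,d\theta .

Many SML procedures can be read as instances of the second formula for particular choices of the loss L.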
Lifelong Metric Learning
State-of-the-art online learning approaches are only capable of learning the metric for predefined tasks. In this paper, we consider the lifelong learning problem to mimic "human learning", i.e., endowing the learned metric with a new capability for a new task from new online samples while incorporating previous experience and knowledge. Therefore, we propose a new metric learning
framework: lifelong metric learning (LML), which only utilizes the data of the
new task to train the metric model while preserving the original capabilities.
More specifically, the proposed LML maintains a common subspace for all learned metrics, named the lifelong dictionary, transfers knowledge from the common subspace to each new metric task with task-specific idiosyncrasies, and redefines the common subspace over time to maximize performance across all metric tasks.
For model optimization, we apply an online passive-aggressive optimization algorithm to solve the proposed LML framework, where the lifelong dictionary and the task-specific partition are optimized alternately. Finally, we evaluate our approach on several multi-task metric learning datasets. Extensive experimental results demonstrate the effectiveness and efficiency of the proposed framework. Comment: 10 pages, 6 figures
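To make the shared-dictionary idea concrete, the following is a minimal, illustrative sketch (not the paper's LML formulation; it uses plain gradient steps instead of the passive-aggressive updates mentioned above). Each task's Mahalanobis metric is factorized as M_t = D diag(s_t) D^T, with a shared dictionary D and a task-specific weight vector s_t updated alternately from online pairs:

    # Illustrative lifelong metric learner: shared dictionary D, per-task weights s_t.
    # Not the paper's algorithm; plain gradient steps on a hinge-style pair loss.
    import numpy as np

    class LifelongMetricSketch:
        def __init__(self, dim, n_atoms, lr=0.01):
            rng = np.random.default_rng(0)
            self.D = rng.normal(scale=0.1, size=(dim, n_atoms))  # lifelong dictionary
            self.lr = lr
            self.task_weights = {}                                # task id -> s_t

        def _dist2(self, s, x, y):
            z = self.D.T @ (x - y)   # project the pair difference onto the dictionary
            return float(np.sum(s * z * z))

        def update(self, task, x, y, similar, margin=1.0):
            """One online update from a pair labeled similar (True) or dissimilar (False)."""
            s = self.task_weights.setdefault(task, np.ones(self.D.shape[1]))
            sign = 1.0 if similar else -1.0
            # hinge-style loss: similar pairs should fall inside the margin, dissimilar outside
            loss = max(0.0, sign * (self._dist2(s, x, y) - margin))
            if loss == 0.0:
                return loss
            z = self.D.T @ (x - y)
            # alternate updates: task-specific weights first, then the shared dictionary
            s -= self.lr * sign * z * z
            np.clip(s, 0.0, None, out=s)             # keep the induced metric PSD
            self.D -= self.lr * sign * 2.0 * np.outer(x - y, s * z)
            return loss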
Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog
Goal-oriented dialog has received attention due to its numerous applications in artificial intelligence. Goal-oriented dialogue tasks occur when a questioner asks an action-oriented question and an answerer responds with the intent of letting the questioner know the correct action to take. To ask adequate questions, deep learning and reinforcement learning have recently been applied. However, these approaches struggle to find a competent
recurrent neural questioner, owing to the complexity of learning a series of
sentences. Motivated by theory of mind, we propose "Answerer in Questioner's
Mind" (AQM), a novel information theoretic algorithm for goal-oriented dialog.
With AQM, a questioner asks and infers based on an approximated probabilistic
model of the answerer. The questioner infers the answerer's intention by selecting a plausible question, explicitly calculating the information gain over the candidate intentions and the possible answers to each question. We test our
framework on two goal-oriented visual dialog tasks: "MNIST Counting Dialog" and
"GuessWhat?!". In our experiments, AQM outperforms comparative algorithms by a
large margin. Comment: Selected for a spotlight presentation at NIPS 2018. Camera ready version
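The core computation AQM performs, selecting the question with the largest expected reduction in uncertainty about the answerer's intention, can be sketched as follows (hypothetical data structures, not the released AQM code). Here prior maps candidate intentions to probabilities and answer_model[q][c][a] approximates the answerer's probability of giving answer a to question q under intention c:

    # Information-gain question selection: IG(q) = H(p(c)) - E_a[ H(p(c | q, a)) ].
    import math

    def entropy(p):
        return -sum(pi * math.log(pi) for pi in p.values() if pi > 0)

    def posterior(prior, answer_model, q, a):
        unnorm = {c: prior[c] * answer_model[q][c].get(a, 1e-12) for c in prior}
        z = sum(unnorm.values())
        return {c: v / z for c, v in unnorm.items()}

    def information_gain(prior, answer_model, q, answers):
        h_post = 0.0
        for a in answers:
            # probability of observing answer a under the current belief
            p_a = sum(prior[c] * answer_model[q][c].get(a, 0.0) for c in prior)
            if p_a > 0:
                h_post += p_a * entropy(posterior(prior, answer_model, q, a))
        return entropy(prior) - h_post

    def select_question(prior, answer_model, questions, answers):
        return max(questions, key=lambda q: information_gain(prior, answer_model, q, answers))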
Active Contextual Entropy Search
Contextual policy search allows adapting robotic movement primitives to
different situations. For instance, a locomotion primitive might be adapted to
different terrain inclinations or desired walking speeds. Such an adaptation is
often achievable by modifying a small number of hyperparameters. However,
learning, when performed on real robotic systems, is typically restricted to a
small number of trials. Bayesian optimization has recently been proposed as a
sample-efficient means for contextual policy search that is well suited under
these conditions. In this work, we extend entropy search, a variant of Bayesian
optimization, such that it can be used for active contextual policy search
where the agent selects those tasks during training in which it expects to
learn the most. Empirical results in simulation suggest that this allows
learning successful behavior with fewer trials. Comment: Corrected title of reference #1
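As a loose illustration of active task selection with a surrogate model (a simplified proxy only: it picks the candidate with the largest GP predictive uncertainty, which is much cruder than the entropy-search criterion the paper extends; all names are hypothetical):

    # Pick the next (context, parameters) trial where the GP surrogate is most uncertain.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def select_next_trial(X_observed, y_observed, candidate_contexts, candidate_params):
        """Rows of X_observed are [context, params] vectors; y_observed are observed returns."""
        gp = GaussianProcessRegressor(normalize_y=True).fit(X_observed, y_observed)
        best, best_std = None, -np.inf
        for c in candidate_contexts:
            for p in candidate_params:
                x = np.concatenate([np.atleast_1d(c), np.atleast_1d(p)])[None, :]
                _, std = gp.predict(x, return_std=True)
                if std[0] > best_std:
                    best, best_std = (c, p), std[0]
        return best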
Reinforcement Learning
Reinforcement learning (RL) is a general framework for adaptive control,
which has proven to be efficient in many domains, e.g., board games, video
games or autonomous vehicles. In such problems, an agent faces a sequential
decision-making problem where, at every time step, it observes its state,
performs an action, receives a reward and moves to a new state. An RL agent
learns by trial and error a good policy (or controller) based on observations
and numeric reward feedback on the previously performed action. In this
chapter, we present the basic framework of RL and recall the two main families
of approaches that have been developed to learn a good policy. The first, which is value-based, consists in estimating the value of an optimal policy, from which a policy can be recovered, while the other, called policy search, works directly in a policy space. Actor-critic methods can be seen as a policy search technique in which a learned policy value guides the policy improvement. In addition, we give an overview of some extensions of the
standard RL framework, notably when risk-averse behavior needs to be taken into
account or when rewards are not available or not known. Comment: Chapter in "A Guided Tour of Artificial Intelligence Research", Springer
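For concreteness, the value-based family mentioned above is built around the Bellman optimality equation and, in its simplest online form, the tabular Q-learning update (standard notation, not specific to this chapter):

    Q^*(s,a) = \mathbb{E}[r(s,a)] + \gamma \sum_{s'} P(s' \mid s, a)\, \max_{a'} Q^*(s', a'),

    Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \,\big[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \big].

A greedy policy with respect to Q^* is optimal, which is what "estimating the value of an optimal policy, from which a policy can be recovered" refers to.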
Learning What Information to Give in Partially Observed Domains
In many robotic applications, an autonomous agent must act within and explore
a partially observed environment that is unobserved by its human teammate. We
consider such a setting in which the agent can, while acting, transmit
declarative information to the human that helps them understand aspects of this
unseen environment. In this work, we address the algorithmic question of how
the agent should plan out what actions to take and what information to
transmit. Naturally, one would expect the human to have preferences, which we
model information-theoretically by scoring transmitted information based on the
change it induces in weighted entropy of the human's belief state. We formulate
this setting as a belief MDP and give a tractable algorithm for solving it
approximately. Then, we give an algorithm that allows the agent to learn the
human's preferences online, through exploration. We validate our approach
experimentally in simulated discrete and continuous partially observed
search-and-recover domains. Visit http://tinyurl.com/chitnis-corl-18 for a
supplementary video. Comment: CoRL 2018 final version
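The information-theoretic scoring described above, ranking a candidate message by how much it reduces the weighted entropy of the human's belief, can be sketched in a few lines (the belief-update and weight functions are hypothetical placeholders, not the paper's implementation):

    # Score a message by the weighted-entropy reduction it induces in the human's belief.
    import math

    def weighted_entropy(belief, weights):
        """belief: dict state -> probability; weights: dict state -> importance."""
        return -sum(weights[s] * p * math.log(p) for s, p in belief.items() if p > 0)

    def message_score(belief, weights, message, update_belief):
        """update_belief(belief, message) -> the human's belief after hearing the message."""
        new_belief = update_belief(belief, message)
        return weighted_entropy(belief, weights) - weighted_entropy(new_belief, weights)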
Learning to see like children: proof of concept
In the last few years we have seen a growing interest in machine learning
approaches to computer vision and, especially, to semantic labeling. Nowadays, state-of-the-art systems use deep learning on millions of labeled images, with very successful results on benchmarks, though similar results cannot be expected in unrestricted visual environments. Most learning schemes essentially
ignore the inherent sequential structure of videos: this might be a critical
issue, since any visual recognition process is remarkably more complex when
shuffling video frames. Based on this remark, we propose a re-foundation of the
communication protocol between visual agents and the environment, which is
referred to as learning to see like children. As in human interaction,
visual concepts are acquired by the agents solely by processing their own
visual stream along with human supervisions on selected pixels. We give a proof
of concept that remarkable semantic labeling can emerge within this protocol by
using only a few supervised examples. This is made possible by exploiting a motion-coherent labeling constraint that virtually offers an abundance of supervisions. Additional visual constraints, including those associated with
object supervisions, are used within the context of learning from constraints.
The framework is extended in the direction of lifelong learning, so that our visual agents live in their own visual environment without distinguishing between learning and test sets. Learning takes place in deep architectures under a
progressive developmental scheme. In order to evaluate our Developmental Visual
Agents (DVAs), in addition to classic benchmarks, we open the doors of our lab,
allowing people to evaluate DVAs by crowd-sourcing. Such an assessment mechanism might result in a paradigm shift in methodologies and algorithms for computer vision, encouraging truly novel solutions within the proposed framework.
Game theoretic modelling of infectious disease dynamics and intervention methods: a mini-review
We review research papers which use game theory to model the decision making
of individuals during an epidemic, attempting to classify the literature and
identify the emerging trends in this field. We show that the literature can be
classified based on (i) type of population modelling (compartmental or
network-based), (ii) frequency of the game (non-iterative or iterative), and
(iii) type of strategy adoption (self-evaluation or imitation). We highlight
that the choice of model depends on many factors, such as the type of immunity the disease confers, the type of immunity the vaccine confers, and the size of the population and level of mixing therein. We show that while early studies used
compartmental modelling with self-evaluation based strategy adoption, the
recent trend is to use network-based modelling with imitation-based strategy
adoption. Our review indicates that game theory continues to be an effective
tool to model intervention (vaccination or social distancing) decision-making
by individuals. Comment: 24 pages, 10 figures
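As a reminder of the kind of model being reviewed, the textbook non-iterative vaccination game compares an individual's expected costs of the two strategies (standard notation; not a formula from this review): with vaccination cost c_v, infection cost c_i, and infection probability \pi(p) for an unvaccinated individual when a fraction p of the population vaccinates,

    E[\text{vaccinate}] = -c_v, \qquad E[\text{do not vaccinate}] = -c_i\,\pi(p),

so a self-evaluating individual vaccinates when c_v < c_i\,\pi(p), and an interior Nash equilibrium coverage p^* satisfies c_v = c_i\,\pi(p^*).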
Provable Guarantees for Gradient-Based Meta-Learning
We study the problem of meta-learning through the lens of online convex
optimization, developing a meta-algorithm bridging the gap between popular
gradient-based meta-learning and classical regularization-based multi-task
transfer methods. Our method is the first to simultaneously satisfy good sample
efficiency guarantees in the convex setting, with generalization bounds that
improve with task-similarity, while also being computationally scalable to
modern deep learning architectures and the many-task setting. Despite its
simplicity, the algorithm matches, up to a constant factor, a lower bound on
the performance of any such parameter-transfer method under natural task
similarity assumptions. We use experiments in both convex and deep learning
settings to verify and demonstrate the applicability of our theory. Comment: ICML 2019
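As a rough illustration of the gradient-based meta-learning family the paper analyzes (a generic Reptile-style loop, not the paper's meta-algorithm; sample_task and task_gradient are hypothetical problem-specific callables), the meta-learner maintains an initialization, adapts it on each task with a few gradient steps, and moves the initialization toward the adapted parameters:

    # Generic Reptile-style gradient-based meta-learning loop (illustrative only).
    import numpy as np

    def meta_train(init_params, sample_task, task_gradient,
                   meta_steps=1000, inner_steps=5, inner_lr=0.01, meta_lr=0.1):
        phi = np.array(init_params, dtype=float)        # shared initialization
        for _ in range(meta_steps):
            task = sample_task()
            theta = phi.copy()
            for _ in range(inner_steps):                # within-task adaptation
                theta -= inner_lr * task_gradient(theta, task)
            phi += meta_lr * (theta - phi)              # move initialization toward adapted params
        return phi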
General AI Challenge - Round One: Gradual Learning
The General AI Challenge is an initiative to encourage the wider artificial
intelligence community to focus on important problems in building intelligent
machines with more general scope than is currently possible. The challenge comprises multiple rounds, with the first round focusing on gradual
learning, i.e. the ability to re-use already learned knowledge for efficiently
learning to solve subsequent problems. In this article, we will present details
of the first round of the challenge, its inspiration and aims. We also outline
a more formal description of the challenge and present a preliminary analysis
of its curriculum, based on ideas from computational mechanics. We believe that such a formalism will allow for a more principled approach to investigating tasks in the challenge, building new curricula, and potentially improving subsequent challenge rounds. Comment: Presented as a keynote talk at the IJCAI Workshop on Evaluating General-Purpose AI (EGPAI 2017)