5,874 research outputs found

    A Bayesian Perspective of Statistical Machine Learning for Big Data

    Full text link
    Statistical Machine Learning (SML) refers to a body of algorithms and methods by which computers discover important features of input data sets that are often very large in size. The very task of feature discovery from data is essentially the meaning of the keyword `learning' in SML. Theoretical justifications for the effectiveness of SML algorithms are underpinned by sound principles from different disciplines, such as Computer Science and Statistics. The theoretical underpinnings particularly justified by statistical inference methods are collectively termed statistical learning theory. This paper provides a review of SML from a Bayesian decision-theoretic point of view -- where we argue that many SML techniques are closely connected to making inference by using the so-called Bayesian paradigm. We discuss many important SML techniques such as supervised and unsupervised learning, deep learning, online learning and Gaussian processes, especially in the context of the very large data sets where these are often employed. We present a dictionary that maps the key concepts of SML between Computer Science and Statistics. We illustrate the SML techniques with three moderately large data sets, where we also discuss many practical implementation issues. The review is thus especially targeted at statisticians and computer scientists aspiring to understand and apply SML to moderately large to big data sets. Comment: 26 pages, 3 figures, review paper
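    To make the Bayesian decision-theoretic recipe concrete, a minimal sketch follows, assuming a toy Beta-Bernoulli model (an illustration of the general recipe, not code from the paper): form a posterior from the data, then choose the action that minimizes posterior expected loss, here 0-1 loss.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.7, size=100)      # observed binary labels, true rate 0.7

# Beta(1, 1) prior on the success rate; conjugacy gives a Beta posterior.
a_post = 1 + data.sum()
b_post = 1 + len(data) - data.sum()

# Posterior predictive probability that the next observation is 1.
p_next = a_post / (a_post + b_post)

# Under 0-1 loss, the Bayes-optimal decision is the more probable label.
decision = int(p_next >= 0.5)
print(f"P(next = 1 | data) = {p_next:.3f}, Bayes decision = {decision}")
```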

    Lifelong Metric Learning

    Full text link
    State-of-the-art online learning approaches are only capable of learning the metric for predefined tasks. In this paper, we consider the lifelong learning problem so as to mimic "human learning", i.e., endowing the learned metric with a new capability for a new task from new online samples while incorporating previous experiences and knowledge. We therefore propose a new metric learning framework, lifelong metric learning (LML), which utilizes only the data of the new task to train the metric model while preserving the original capabilities. More specifically, the proposed LML maintains a common subspace for all learned metrics, named the lifelong dictionary, transfers knowledge from the common subspace to each new metric task with task-specific idiosyncrasy, and redefines the common subspace over time to maximize performance across all metric tasks. For model optimization, we apply an online passive-aggressive optimization algorithm to solve the proposed LML framework, where the lifelong dictionary and the task-specific partitions are optimized alternately. Finally, we evaluate our approach on several multi-task metric learning datasets. Extensive experimental results demonstrate the effectiveness and efficiency of the proposed framework. Comment: 10 pages, 6 figures
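    A hypothetical sketch of the alternating scheme described above: a shared lifelong dictionary L and a task-specific factor A define a per-task Mahalanobis metric, and each labelled online pair triggers a hinge-loss update of A and then L. The plain gradient step stands in for the paper's passive-aggressive update, and all names and constants are illustrative.

```python
import numpy as np

def lml_step(L, A, x, y, similar, lr=0.01, margin=1.0):
    """One online update on a labelled pair (x, y): hinge loss on the squared
    task metric d^2 = u^T L A A^T L^T u, with u = x - y."""
    u = x - y
    z = A.T @ (L.T @ u)
    d2 = float(z @ z)
    if similar and d2 <= margin:          # similar pair already close enough
        return L, A
    if not similar and d2 >= margin:      # dissimilar pair already far enough
        return L, A
    sign = 1.0 if similar else -1.0       # shrink similar pairs, push dissimilar apart
    uu = np.outer(u, u)
    gA = 2.0 * L.T @ uu @ L @ A           # d(d^2)/dA
    gL = 2.0 * uu @ L @ A @ A.T           # d(d^2)/dL
    A = A - lr * sign * gA                # task-specific factor first...
    L = L - lr * sign * gL                # ...then the shared lifelong dictionary
    return L, A

# Toy usage: 4-dim inputs, a rank-3 dictionary, a rank-2 task factor.
rng = np.random.default_rng(0)
L, A = rng.standard_normal((4, 3)), rng.standard_normal((3, 2))
L, A = lml_step(L, A, rng.standard_normal(4), rng.standard_normal(4), similar=True)
```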

    Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog

    Full text link
    Goal-oriented dialog has received attention due to its numerous applications in artificial intelligence. Goal-oriented dialogue tasks occur when a questioner asks an action-oriented question and an answerer responds with the intent of letting the questioner know a correct action to take. To ask adequate questions, deep learning and reinforcement learning have recently been applied. However, these approaches struggle to find a competent recurrent neural questioner, owing to the complexity of learning a series of sentences. Motivated by theory of mind, we propose "Answerer in Questioner's Mind" (AQM), a novel information-theoretic algorithm for goal-oriented dialog. With AQM, a questioner asks and infers based on an approximated probabilistic model of the answerer. The questioner figures out the answerer's intention by selecting a plausible question, explicitly calculating the information gain over the candidate intentions and the possible answers to each question. We test our framework on two goal-oriented visual dialog tasks: "MNIST Counting Dialog" and "GuessWhat?!". In our experiments, AQM outperforms comparative algorithms by a large margin. Comment: Selected for a spotlight presentation at NIPS 2018. Camera-ready version
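    The question-selection rule can be sketched directly from the abstract: choose the question q maximizing the mutual information I(C; A | q) between the answerer's intention C and the answer A under the questioner's approximate answerer model. A minimal sketch, with array shapes and toy numbers of my choosing:

```python
import numpy as np

def entropy(p, axis=-1):
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=axis)

def best_question(prior_c, p_a_given_cq):
    """prior_c: (C,) belief over intentions.
    p_a_given_cq: (Q, C, A) answerer model, one answer dist per question/intention."""
    p_a = np.einsum('c,qca->qa', prior_c, p_a_given_cq)   # P(a | q), marginalised over c
    h_a = entropy(p_a)                                    # H(A | q)
    h_a_c = np.einsum('c,qc->q', prior_c, entropy(p_a_given_cq))  # H(A | C, q)
    gain = h_a - h_a_c                                    # I(C; A | q)
    return int(gain.argmax()), gain

# Two candidate questions, three intentions, binary answers.
prior = np.array([0.5, 0.3, 0.2])
model = np.array([[[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]],   # question 0
                  [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]])  # question 1 (uninformative)
print(best_question(prior, model))   # picks question 0
```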

    Active Contextual Entropy Search

    Full text link
    Contextual policy search allows adapting robotic movement primitives to different situations. For instance, a locomotion primitive might be adapted to different terrain inclinations or desired walking speeds. Such an adaptation is often achievable by modifying a small number of hyperparameters. However, learning, when performed on real robotic systems, is typically restricted to a small number of trials. Bayesian optimization has recently been proposed as a sample-efficient means for contextual policy search that is well suited to these conditions. In this work, we extend entropy search, a variant of Bayesian optimization, so that it can be used for active contextual policy search, where the agent selects those tasks during training in which it expects to learn the most. Empirical results in simulation suggest that this allows learning successful behavior with fewer trials. Comment: Corrected title of reference #1
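    As a much-simplified stand-in for the entropy-search criterion (not the paper's algorithm), one can model return over (context, parameter) pairs with a Gaussian process and select the candidate task where the model is most uncertain, i.e. where the agent expects to learn the most. The synthetic return function and all constants below are illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
X_seen = rng.uniform(0, 1, size=(10, 2))        # past trials: (context, parameter)
y_seen = np.sin(3 * X_seen[:, 0]) * X_seen[:, 1] + 0.05 * rng.standard_normal(10)

gp = GaussianProcessRegressor().fit(X_seen, y_seen)  # surrogate model of the return

candidates = rng.uniform(0, 1, size=(200, 2))   # candidate tasks for the next trial
_, std = gp.predict(candidates, return_std=True)
print("next (context, parameter):", candidates[std.argmax()])
```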

    Reinforcement Learning

    Full text link
    Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles. In such problems, an agent faces a sequential decision-making problem where, at every time step, it observes its state, performs an action, receives a reward and moves to a new state. An RL agent learns by trial and error a good policy (or controller) based on observations and numeric reward feedback on the previously performed action. In this chapter, we present the basic framework of RL and recall the two main families of approaches that have been developed to learn a good policy. The first, which is value-based, consists of estimating the value of an optimal policy, from which a policy can be recovered, while the other, called policy search, works directly in a policy space. Actor-critic methods can be seen as a policy search technique in which the learned policy value guides the policy improvement. Besides, we give an overview of some extensions of the standard RL framework, notably when risk-averse behavior needs to be taken into account or when rewards are not available or not known. Comment: Chapter in "A Guided Tour of Artificial Intelligence Research", Springer
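    The value-based family is easy to make concrete. A minimal tabular Q-learning sketch on a toy 5-state chain (the environment and constants are illustrative, not from the chapter): the agent repeatedly observes a state, acts, receives a reward, and nudges its value estimate toward the observed return.

```python
import numpy as np

n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)
alpha, gamma, eps = 0.1, 0.9, 0.3

for _ in range(500):                       # episodes
    s = 0
    while s != n_states - 1:               # episode ends at the rightmost state
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # TD update: move Q(s, a) toward r + gamma * max_a' Q(s2, a')
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1)[:-1])   # greedy action in each non-terminal state: all 1 ("right")
```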

    Learning What Information to Give in Partially Observed Domains

    Full text link
    In many robotic applications, an autonomous agent must act within and explore a partially observed environment that is unobserved by its human teammate. We consider such a setting in which the agent can, while acting, transmit declarative information to the human that helps them understand aspects of this unseen environment. In this work, we address the algorithmic question of how the agent should plan out what actions to take and what information to transmit. Naturally, one would expect the human to have preferences, which we model information-theoretically by scoring transmitted information based on the change it induces in the weighted entropy of the human's belief state. We formulate this setting as a belief MDP and give a tractable algorithm for solving it approximately. Then, we give an algorithm that allows the agent to learn the human's preferences online, through exploration. We validate our approach experimentally in simulated discrete and continuous partially observed search-and-recover domains. Visit http://tinyurl.com/chitnis-corl-18 for a supplementary video. Comment: CoRL 2018 final version
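    The scoring rule described above admits a short sketch, assuming a discrete state space and a transmitted fact that rules out some states: weight each state's contribution to the entropy of the human's belief, update the belief on the fact, and score the fact by the weighted-entropy reduction. Names and numbers are illustrative.

```python
import numpy as np

def weighted_entropy(belief, weights):
    b = np.clip(belief, 1e-12, 1.0)
    return float(-(weights * b * np.log(b)).sum())

def info_score(belief, weights, consistent):
    """consistent: boolean mask of states compatible with the transmitted fact
    (assumed to cover at least one state with positive belief)."""
    posterior = belief * consistent
    posterior = posterior / posterior.sum()      # Bayes update on the fact
    return weighted_entropy(belief, weights) - weighted_entropy(posterior, weights)

belief  = np.array([0.25, 0.25, 0.25, 0.25])
weights = np.array([1.0, 1.0, 0.1, 0.1])         # the human cares about states 0-1
# A fact ruling out states 1 and 3 scores higher than one ruling out 2 and 3,
# because it resolves more of the *weighted* uncertainty.
print(info_score(belief, weights, np.array([1, 0, 1, 0], bool)),
      info_score(belief, weights, np.array([1, 1, 0, 0], bool)))
```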

    Learning to see like children: proof of concept

    Full text link
    In the last few years we have seen a growing interest in machine learning approaches to computer vision and, especially, to semantic labeling. Nowadays, state-of-the-art systems use deep learning on millions of labeled images, with very successful results on benchmarks, though similar results cannot be expected in unrestricted visual environments. Most learning schemes essentially ignore the inherent sequential structure of videos: this might be a critical issue, since any visual recognition process is remarkably more complex when video frames are shuffled. Based on this observation, we propose a re-foundation of the communication protocol between visual agents and the environment, which we refer to as learning to see like children. As in human interaction, visual concepts are acquired by the agents solely by processing their own visual stream along with human supervision on selected pixels. We give a proof of concept that remarkable semantic labeling can emerge within this protocol using only a few supervised examples. This is made possible by exploiting a constraint of motion-coherent labeling that virtually offers an abundance of supervision. Additional visual constraints, including those associated with object supervision, are used within the context of learning from constraints. The framework is extended in the direction of lifelong learning, so that our visual agents live in their own visual environment without distinguishing between learning and test sets. Learning takes place in deep architectures under a progressive developmental scheme. In order to evaluate our Developmental Visual Agents (DVAs), in addition to classic benchmarks, we open the doors of our lab, allowing people to evaluate DVAs by crowd-sourcing. Such an assessment mechanism might result in a paradigm shift in methodologies and algorithms for computer vision, encouraging truly novel solutions within the proposed framework.
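    A hypothetical sketch of the motion-coherence constraint at the core of this protocol: predictions for a pixel at time t and for its flow-displaced position at time t+1 should agree, which turns unlabeled video into supervision. The integer-flow simplification and the squared-error form are my own, not the paper's formulation.

```python
import numpy as np

def motion_coherence_loss(pred_t, pred_t1, flow):
    """pred_t, pred_t1: (H, W, K) per-pixel class probabilities for two frames.
    flow: (H, W, 2) integer optical flow (dx, dy) from frame t to frame t+1."""
    H, W, _ = pred_t.shape
    ys, xs = np.mgrid[0:H, 0:W]
    ys2 = np.clip(ys + flow[..., 1], 0, H - 1)    # displaced row indices
    xs2 = np.clip(xs + flow[..., 0], 0, W - 1)    # displaced column indices
    warped = pred_t1[ys2, xs2]   # frame-(t+1) prediction at each pixel's new position
    return float(((pred_t - warped) ** 2).mean())
```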

    Game theoretic modelling of infectious disease dynamics and intervention methods: a mini-review

    Full text link
    We review research papers that use game theory to model the decision making of individuals during an epidemic, attempting to classify the literature and identify the emerging trends in this field. We show that the literature can be classified based on (i) the type of population modelling (compartmental or network-based), (ii) the frequency of the game (non-iterative or iterative), and (iii) the type of strategy adoption (self-evaluation or imitation). We highlight that the choice of model depends on many factors, such as the type of immunity the disease confers, the type of immunity the vaccine confers, and the size of the population and the level of mixing therein. We show that while early studies used compartmental modelling with self-evaluation-based strategy adoption, the recent trend is to use network-based modelling with imitation-based strategy adoption. Our review indicates that game theory continues to be an effective tool for modelling intervention (vaccination or social distancing) decision-making by individuals. Comment: 24 pages, 10 figures
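    Imitation-based strategy adoption, the recent trend noted above, is commonly modelled in this literature with the Fermi rule: an individual compares payoffs with a random neighbour and copies the neighbour's strategy (e.g. vaccinate or not) with a probability that grows with the payoff difference. A minimal sketch with illustrative parameters:

```python
import numpy as np

def imitation_step(strategies, payoffs, neighbors, kappa=0.1, rng=None):
    """Each individual compares payoffs with one random neighbour and copies
    the neighbour's strategy with the Fermi probability
    1 / (1 + exp(-(payoff_j - payoff_i) / kappa))."""
    if rng is None:
        rng = np.random.default_rng()
    new = strategies.copy()
    for i, nbrs in enumerate(neighbors):
        j = nbrs[rng.integers(len(nbrs))]
        p_adopt = 1.0 / (1.0 + np.exp(-(payoffs[j] - payoffs[i]) / kappa))
        if rng.random() < p_adopt:
            new[i] = strategies[j]
    return new

# Toy ring of 4 individuals; strategy 1 = vaccinate, 0 = do not vaccinate.
strategies = np.array([1, 0, 1, 0])
payoffs    = np.array([0.8, 0.3, 0.8, 0.3])   # vaccination pays off here
neighbors  = [[3, 1], [0, 2], [1, 3], [2, 0]]
print(imitation_step(strategies, payoffs, neighbors))
```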

    Provable Guarantees for Gradient-Based Meta-Learning

    Full text link
    We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm that bridges the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods. Our method is the first to simultaneously satisfy good sample-efficiency guarantees in the convex setting, with generalization bounds that improve with task similarity, while also being computationally scalable to modern deep learning architectures and the many-task setting. Despite its simplicity, the algorithm matches, up to a constant factor, a lower bound on the performance of any such parameter-transfer method under natural task-similarity assumptions. We use experiments in both convex and deep learning settings to verify and demonstrate the applicability of our theory. Comment: ICML 2019
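    The general idea, a shared initialization that within-task gradient descent starts from and that is then pulled toward each task's solution, can be sketched in a few lines. This is a simplified Reptile-style average, not the paper's exact algorithm, and the toy quadratic tasks are illustrative.

```python
import numpy as np

def within_task(phi, grad_fn, steps=10, lr=0.1):
    w = phi.copy()
    for _ in range(steps):
        w -= lr * grad_fn(w)              # ordinary gradient descent on the task
    return w

def meta_train(tasks, dim, meta_lr=0.2, rounds=100, seed=0):
    phi = np.zeros(dim)                   # the transferred initialization
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        grad_fn = tasks[rng.integers(len(tasks))]
        w = within_task(phi, grad_fn)     # (approximately) solve a sampled task
        phi += meta_lr * (w - phi)        # pull the initialization toward it
    return phi

# Similar quadratic tasks: minimize ||w - c_t||^2 with centers near 1.0.
centers = [np.array([1.0 + 0.1 * k]) for k in range(-2, 3)]
tasks = [(lambda w, c=c: 2.0 * (w - c)) for c in centers]
print(meta_train(tasks, dim=1))           # converges near the mean of the task centers
```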

    General AI Challenge - Round One: Gradual Learning

    Full text link
    The General AI Challenge is an initiative to encourage the wider artificial intelligence community to focus on important problems in building intelligent machines with more general scope than is currently possible. The challenge comprises multiple rounds, with the first round focusing on gradual learning, i.e. the ability to re-use already learned knowledge for efficiently learning to solve subsequent problems. In this article, we present details of the first round of the challenge, its inspiration and aims. We also outline a more formal description of the challenge and present a preliminary analysis of its curriculum, based on ideas from computational mechanics. We believe that such a formalism will allow for a more principled approach to investigating tasks in the challenge, building new curricula, and potentially improving subsequent challenge rounds. Comment: Presented as a keynote talk at the IJCAI Workshop on Evaluating General-Purpose AI (EGPAI 2017)