HAMLET -- A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection
Automated algorithm selection and hyperparameter tuning facilitates the
application of machine learning. Traditional multi-armed bandit strategies look
to the history of observed rewards to identify the most promising arms for
optimizing expected total reward in the long run. When considering limited time
budgets and computational resources, this backward view of rewards is
inappropriate as the bandit should look into the future for anticipating the
highest final reward at the end of a specified time budget. This work addresses
that insight by introducing HAMLET, which extends the bandit approach with
learning curve extrapolation and computation time-awareness for selecting among
a set of machine learning algorithms. Results show that the HAMLET Variants 1-3
exhibit equal or better performance than other bandit-based algorithm selection
strategies in experiments with recorded hyperparameter tuning traces for the
majority of considered time budgets. The best performing HAMLET Variant 3
combines learning curve extrapolation with the well-known upper confidence
bound exploration bonus. That variant performs better than all non-HAMLET
policies with statistical significance at the 95% level for 1,485 runs.
Comment: 8 pages, 8 figures; IJCNN 2020: International Joint Conference on Neural Networks
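The core idea — scoring each algorithm by a forecast of its final reward plus a UCB-style exploration bonus — can be sketched as follows. The `extrapolate` helper and its log-damped slope forecast are illustrative assumptions, not the paper's actual learning curve model:

```python
import math

def extrapolate(curve, horizon):
    """Naively extrapolate a learning curve (reward per step) to a future
    step from its most recent improvement. Hypothetical helper, not the
    paper's extrapolation model."""
    if len(curve) < 2:
        return curve[-1] if curve else 0.0
    slope = curve[-1] - curve[-2]
    steps = max(horizon - len(curve), 0)
    # Dampen the slope so the forecast saturates instead of growing linearly.
    return min(1.0, curve[-1] + slope * math.log1p(steps))

def select_arm(curves, pulls, total_pulls, horizon, c=1.0):
    """Pick the algorithm whose forecast final reward plus a UCB-style
    exploration bonus is highest."""
    best, best_score = None, -float("inf")
    for arm, curve in curves.items():
        bonus = c * math.sqrt(math.log(total_pulls + 1) / (pulls[arm] + 1))
        score = extrapolate(curve, horizon) + bonus
        if score > best_score:
            best, best_score = arm, score
    return best
```

Note how the slowly improving arm with the higher current reward can lose to a faster-improving arm once the forecast looks far enough ahead.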
A Survey on Practical Applications of Multi-Armed and Contextual Bandits
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of
attention in various applications, from recommender systems and information
retrieval to healthcare and finance, due to its stellar performance combined
with certain attractive properties, such as learning from less feedback. The
multi-armed bandit field is currently flourishing, as novel problem settings
and algorithms motivated by various practical applications are being
introduced, building on top of the classical bandit problem. This article aims
to provide a comprehensive review of top recent developments in multiple
real-life applications of the multi-armed bandit. Specifically, we introduce a
taxonomy of common MAB-based applications and summarize the state of the art for each
of those domains. Furthermore, we identify important current trends and provide
new perspectives pertaining to the future of this exciting and fast-growing
field.
Comment: under review by IJCAI 2019 Survey
Automated Curriculum Learning for Neural Networks
We introduce a method for automatically selecting the path, or syllabus, that
a neural network follows through a curriculum so as to maximise learning
efficiency. A measure of the amount that the network learns from each data
sample is provided as a reward signal to a nonstationary multi-armed bandit
algorithm, which then determines a stochastic syllabus. We consider a range of
signals derived from two distinct indicators of learning progress: rate of
increase in prediction accuracy, and rate of increase in network complexity.
Experimental results for LSTM networks on three curricula demonstrate that our
approach can significantly accelerate learning, in some cases halving the time
required to attain a satisfactory performance level.
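The syllabus sampler described above can be sketched as an Exp3-style nonstationary bandit whose reward is a measured learning-progress signal. The `reward_fn` callback, the learning rate `eta`, and the softmax weighting are illustrative assumptions, not the paper's exact algorithm:

```python
import math, random

def exp3_syllabus(num_tasks, reward_fn, steps, eta=0.1):
    """Sample a stochastic syllabus with an Exp3-style bandit: each task's
    weight is boosted by the learning-progress reward it yields.
    `reward_fn(task)` stands in for a progress signal such as the gain in
    prediction accuracy after training on a sample from that task."""
    weights = [0.0] * num_tasks
    history = []
    for _ in range(steps):
        probs = [math.exp(w) for w in weights]
        z = sum(probs)
        probs = [p / z for p in probs]
        task = random.choices(range(num_tasks), probs)[0]
        r = reward_fn(task)
        # Importance-weighted update keeps the reward estimate unbiased.
        weights[task] += eta * r / probs[task]
        history.append(task)
    return history
```

A task that stops producing progress stops being reinforced, so the syllabus naturally drifts toward whatever is currently learnable.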
AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning
Multi-task learning (MTL) has achieved success over a wide range of problems,
where the goal is to improve the performance of a primary task using a set of
relevant auxiliary tasks. However, when the usefulness of the auxiliary tasks
w.r.t. the primary task is not known a priori, the success of MTL models
depends on the correct choice of these auxiliary tasks and also a balanced
mixing ratio of these tasks during alternate training. These two problems could
be resolved via manual intuition or hyper-parameter tuning over all
combinatorial task choices, but this introduces inductive bias or is not
scalable when the number of candidate auxiliary tasks is very large. To address
these issues, we present AutoSeM, a two-stage MTL pipeline, where the first
stage automatically selects the most useful auxiliary tasks via a
Beta-Bernoulli multi-armed bandit with Thompson Sampling, and the second stage
learns the training mixing ratio of these selected auxiliary tasks via a
Gaussian Process based Bayesian optimization framework. We conduct several MTL
experiments on the GLUE language understanding tasks, and show that our AutoSeM
framework can successfully find relevant auxiliary tasks and automatically
learn their mixing ratio, achieving significant performance boosts on several
primary tasks. Finally, we present ablations for each stage of AutoSeM and
analyze the learned auxiliary task choices.
Comment: NAACL 2019 (12 pages)
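The first AutoSeM stage can be sketched as one round of Beta-Bernoulli Thompson Sampling over candidate auxiliary tasks; the success/failure bookkeeping shown here is a simplified assumption about how task utility is recorded, not the paper's exact formulation:

```python
import random

def thompson_select(successes, failures):
    """One round of Beta-Bernoulli Thompson Sampling: sample a utility
    from each task's Beta posterior and pick the argmax. Tasks that have
    helped the primary task more often get sampled more often, while
    uncertain tasks still get explored."""
    best_task, best_draw = None, -1.0
    for task in successes:
        draw = random.betavariate(successes[task] + 1, failures[task] + 1)
        if draw > best_draw:
            best_task, best_draw = task, draw
    return best_task
```

Repeating this selection inside the training loop, and updating the counts by whether the chosen task improved primary-task validation performance, yields the adaptive task mixing the abstract describes.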
A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
The key challenge in multiagent learning is learning a best response to the
behaviour of other agents, which may be non-stationary: if the other agents
adapt their strategy as well, the learning target moves. Disparate streams of
research have approached non-stationarity from several angles, which make a
variety of implicit assumptions that make it hard to keep an overview of the
state of the art and to validate the innovation and significance of new works.
This survey presents a coherent overview of work that addresses
opponent-induced non-stationarity with tools from game theory, reinforcement
learning and multi-armed bandits. Further, we reflect on the principal
approaches by which algorithms model and cope with this non-stationarity, arriving
at a new framework and five categories (in increasing order of sophistication):
ignore, forget, respond to target models, learn models, and theory of mind. A
wide range of state-of-the-art algorithms is classified into a taxonomy, using
these categories and key characteristics of the environment (e.g.,
observability) and adaptation behaviour of the opponents (e.g., smooth,
abrupt). To clarify even further we present illustrative variations of one
domain, contrasting the strengths and limitations of each category. Finally, we
discuss in which environments the different approaches yield most merit, and
point to promising avenues of future research.
Comment: 64 pages, 7 figures. Under review since November 201
Dynamic Multi-Level Multi-Task Learning for Sentence Simplification
Sentence simplification aims to improve readability and understandability,
based on several operations such as splitting, deletion, and paraphrasing.
However, a valid simplified sentence should also be logically entailed by its
input sentence. In this work, we first present a strong pointer-copy mechanism
based sequence-to-sequence sentence simplification model, and then improve its
entailment and paraphrasing capabilities via multi-task learning with related
auxiliary tasks of entailment and paraphrase generation. Moreover, we propose a
novel 'multi-level' layered soft sharing approach where each auxiliary task
shares different (higher versus lower) level layers of the sentence
simplification model, depending on the task's semantic versus lexico-syntactic
nature. We also introduce a novel multi-armed bandit based training approach
that dynamically learns how to effectively switch across tasks during
multi-task learning. Experiments on multiple popular datasets demonstrate that
our model outperforms competitive simplification systems in SARI and FKGL
automatic metrics, and human evaluation. Further, we present several ablation
analyses on alternative layer sharing methods, soft versus hard sharing,
dynamic multi-armed bandit sampling approaches, and our model's learned
entailment and paraphrasing skills.
Comment: COLING 2018 (15 pages)
An Empirical Comparison of Syllabuses for Curriculum Learning
Syllabuses for curriculum learning have been developed on an ad-hoc, per task
basis and little is known about the relative performance of different
syllabuses. We identify a number of syllabuses used in the literature. We
compare the identified syllabuses based on their effect on the speed of
learning and generalization ability of an LSTM network on three sequential
learning tasks. We find that the choice of syllabus has limited effect on the
generalization ability of a trained network. In terms of speed of learning our
results demonstrate that the best syllabus is task dependent but that a
recently proposed automated curriculum learning approach, Predictive Gain,
performs very competitively against all identified hand-crafted syllabuses. The
best performing hand-crafted syllabus which we term Look Back and Forward
combines a syllabus which steps through tasks in the order of their difficulty
with a uniform distribution over all tasks. Our experimental results provide an
empirical basis for the choice of syllabus on a new problem that could benefit
from curriculum learning. Additionally, insights derived from our results shed
light on how to successfully design new syllabuses.
Benchmark and Survey of Automated Machine Learning Frameworks
Machine learning (ML) has become a vital part in many aspects of our daily
life. However, building well performing machine learning applications requires
highly specialized data scientists and domain experts. Automated machine
learning (AutoML) aims to reduce the demand for data scientists by enabling
domain experts to build machine learning applications automatically without
extensive knowledge of statistics and machine learning. This paper is a
combination of a survey on current AutoML methods and a benchmark of popular
AutoML frameworks on real data sets. Driven by the selected frameworks for
evaluation, we summarize and review important AutoML techniques and methods
concerning every step in building an ML pipeline. The selected AutoML
frameworks are evaluated on 137 data sets from established AutoML benchmark
suites.
Comment: Revised version accepted for publication at the Journal of Artificial Intelligence Research (JAIR)
Adaptive Model Selection Framework: An Application to Airline Pricing
Multiple machine learning and prediction models are often used for the same
prediction or recommendation task. In our recent work, where we develop and
deploy airline ancillary pricing models in an online setting, we found that
among multiple pricing models developed, no one model clearly dominates other
models for all incoming customer requests. Thus, as algorithm designers, we
face an exploration - exploitation dilemma. In this work, we introduce an
adaptive meta-decision framework that uses Thompson sampling, a popular
multi-armed bandit solution method, to route customer requests to various
pricing models based on their online performance. We show that this adaptive
approach outperforms a uniformly random selection policy by improving the
expected revenue per offer by 43% and conversion score by 58% in an offline
simulation.
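The routing loop can be sketched with Thompson Sampling over Bernoulli conversion feedback; the `convert_fn` callback and the conversion/non-conversion bookkeeping are illustrative assumptions, not the deployed system's interface:

```python
import random

def route_and_update(models, stats, convert_fn):
    """Route one customer request via Thompson Sampling and update the
    chosen pricing model's conversion counts. `stats[m]` holds
    [conversions, non_conversions]; `convert_fn(model)` stands in for the
    observed outcome of the offer (1 = converted)."""
    draws = {m: random.betavariate(stats[m][0] + 1, stats[m][1] + 1)
             for m in models}
    chosen = max(draws, key=draws.get)
    outcome = convert_fn(chosen)
    stats[chosen][0] += outcome          # conversions
    stats[chosen][1] += 1 - outcome      # non-conversions
    return chosen, outcome
```

Because the posterior is updated after every offer, traffic shifts toward whichever model is currently converting best while weaker models keep receiving occasional exploratory requests.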
AutoML from Service Provider's Perspective: Multi-device, Multi-tenant Model Selection with GP-EI
AutoML has become a popular service that is provided by most leading cloud
service providers today. In this paper, we focus on the AutoML problem from the
\emph{service provider's perspective}, motivated by the following practical
consideration: When an AutoML service needs to serve {\em multiple users} with
{\em multiple devices} at the same time, how can we allocate these devices to
users in an efficient way? We focus on GP-EI, one of the most popular
algorithms for automatic model selection and hyperparameter tuning, used by
systems such as Google Vizier. The technical contribution of this paper is the
first multi-device, multi-tenant algorithm for GP-EI that is aware of
\emph{multiple} computation devices and multiple users sharing the same set of
computation devices. Theoretically, given the numbers of users and devices, we
obtain a regret bound expressed in terms of the maximal incremental
uncertainty of the covariance matrix up to a given time. Empirically, we evaluate our algorithm
on two applications of automatic model selection, and show that our algorithm
significantly outperforms the strategy of serving users independently.
Moreover, when multiple computation devices are available, we achieve
near-linear speedup when the number of users is much larger than the number of
devices.
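GP-EI ranks candidate configurations by their expected improvement under the Gaussian-process posterior. The single-candidate acquisition it builds on can be sketched as follows; this shows only the standard EI formula for maximization, not the paper's multi-device, multi-tenant scheduling:

```python
import math

def expected_improvement(mu, sigma, best):
    """Expected improvement of a candidate with posterior mean `mu` and
    posterior standard deviation `sigma` over the incumbent value `best`
    (maximization): EI = (mu - best) * Phi(z) + sigma * phi(z),
    with z = (mu - best) / sigma."""
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf
```

Scoring every pending (user, model) candidate with this acquisition and assigning devices to the highest scorers is one simple way to picture the multi-tenant allocation problem the abstract poses.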