Selecting Near-Optimal Learners via Incremental Data Allocation
We study a novel machine learning (ML) problem setting of sequentially
allocating small subsets of training data amongst a large set of classifiers.
The goal is to select a classifier that will give near-optimal accuracy when
trained on all data, while also minimizing the cost of misallocated samples.
This is motivated by large modern datasets and ML toolkits with many
combinations of learning algorithms and hyper-parameters. Inspired by the
principle of "optimism under uncertainty," we propose an innovative strategy,
Data Allocation using Upper Bounds (DAUB), which robustly achieves these
objectives across a variety of real-world datasets.
We further develop substantial theoretical support for DAUB in an idealized
setting where the expected accuracy of a classifier trained on a given number of samples can
be known exactly. Under these conditions we establish a rigorous sub-linear
bound on the regret of the approach (in terms of misallocated data), as well as
a rigorous bound on suboptimality of the selected classifier. Our accuracy
estimates using real-world datasets only entail mild violations of the
theoretical scenario, suggesting that the practical behavior of DAUB is likely
to approach the idealized behavior.
Comment: AAAI-2016: The Thirtieth AAAI Conference on Artificial Intelligence
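The control flow the abstract describes, repeatedly training each candidate on a growing data subset and routing the next increment to the learner with the highest projected full-data accuracy, can be sketched as follows. The linear-extrapolation upper bound and the final best-observed-accuracy selection are illustrative stand-ins for the paper's actual bound and stopping rule; `learners` is assumed to be a dict of unfitted scikit-learn-style estimators.

```python
# Minimal sketch of upper-bound-driven incremental data allocation.
from sklearn.base import clone

def allocate_incrementally(learners, X, y, X_val, y_val,
                           start=100, growth=2, budget=None):
    n = len(X)
    budget = budget or 10 * n
    state = {name: {"n": 0, "curve": []} for name in learners}

    def upper_bound(curve):
        # Optimism under uncertainty: untried learners get an infinite bound.
        if len(curve) < 2:
            return float("inf")
        (n1, a1), (n2, a2) = curve[-2], curve[-1]
        slope = max((a2 - a1) / (n2 - n1), 0.0)   # non-negative extrapolation
        return min(1.0, a2 + slope * (n - n2))    # optimistic full-data accuracy

    spent = 0
    while spent < budget:
        # Allocate the next increment to the learner with the highest bound.
        name = max(state, key=lambda k: upper_bound(state[k]["curve"]))
        s = state[name]
        s["n"] = min(max(s["n"] * growth, start), n)
        model = clone(learners[name]).fit(X[:s["n"]], y[:s["n"]])
        s["curve"].append((s["n"], model.score(X_val, y_val)))
        spent += s["n"]
        if s["n"] == n:                            # this learner saw all data
            break
    # Return the learner with the best observed validation accuracy so far.
    best = max(state, key=lambda k: max((a for _, a in state[k]["curve"]), default=0.0))
    return best, max(a for _, a in state[best]["curve"])
```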
Feature Engineering for Predictive Modeling using Reinforcement Learning
Feature engineering is a crucial step in the process of predictive modeling.
It involves the transformation of a given feature space, typically using
mathematical functions, with the objective of reducing the modeling error for a
given target. However, there is no well-defined basis for performing effective
feature engineering. It involves domain knowledge, intuition, and most of all,
a lengthy process of trial and error. The human attention involved in
overseeing this process significantly influences the cost of model generation.
We present a new framework to automate feature engineering. It is based on
performance driven exploration of a transformation graph, which systematically
and compactly enumerates the space of given options. A highly efficient
exploration strategy is derived through reinforcement learning on past
examples.
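A minimal sketch of exploring such a transformation graph: each node is a (derived) feature matrix, each edge applies one transformation to every column, and a fixed learner's cross-validated score defines a node's value. The epsilon-greedy choice below is only a placeholder for the reinforcement-learned exploration policy the abstract refers to, and the three transformations are arbitrary examples.

```python
# Sketch of performance-driven exploration of a transformation graph.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

TRANSFORMS = {
    "log":    lambda X: np.log1p(np.abs(X)),
    "square": lambda X: X ** 2,
    "sqrt":   lambda X: np.sqrt(np.abs(X)),
}

def node_score(X, y):
    # Performance of a fixed learner defines the value of a graph node.
    return cross_val_score(RandomForestClassifier(n_estimators=50), X, y, cv=3).mean()

def explore(X, y, steps=10, eps=0.3, rng=np.random.default_rng(0)):
    nodes = [("root", X, node_score(X, y))]
    for _ in range(steps):
        # Epsilon-greedy node selection: placeholder for the learned policy.
        if rng.random() < eps:
            name, Xn, _ = nodes[rng.integers(len(nodes))]
        else:
            name, Xn, _ = max(nodes, key=lambda t: t[2])
        t = rng.choice(list(TRANSFORMS))
        Xt = np.hstack([Xn, TRANSFORMS[t](Xn)])        # augment, don't replace
        nodes.append((f"{name}->{t}", Xt, node_score(Xt, y)))
    return max(nodes, key=lambda t: t[2])
```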
Memoizing a monadic mixin DSL
Modular extensibility is a highly desirable property of a domain-specific language (DSL): the ability to add new features without affecting the implementation of existing features. Functional mixins (also known as open recursion) are very suitable for this purpose.
We study the use of mixins in Haskell for a modular DSL for search heuristics used in systematic solvers for combinatorial problems, which generates optimized C++ code from a high-level specification. We show how to apply memoization techniques to tackle performance issues and code explosion due to the high recursion inherent to the semantics of combinatorial search.
As such heuristics are conventionally implemented as highly entangled imperative algorithms, our Haskell mixins are monadic. Memoization of monadic components causes further complications, which we also address.
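For readers unfamiliar with open recursion, the following sketch illustrates the pattern in Python rather than the paper's monadic Haskell: a component is written against an open recursive call, memoization is added as just another wrapper, and only then is the recursive knot tied. The names `base`, `with_memo`, and `tie` are ours, not the paper's.

```python
# Open recursion (mixin-style) with memoization added before tying the knot.
def base(self_call, n):
    # Base component: defined against the *open* recursive call, not itself.
    return n if n < 2 else self_call(n - 1) + self_call(n - 2)

def with_memo(component):
    cache = {}
    def wrapped(self_call, n):
        if n not in cache:
            cache[n] = component(self_call, n)
        return cache[n]
    return wrapped

def tie(component):
    # Close the recursion: the component ends up calling its own wrapped self.
    def closed(n):
        return component(closed, n)
    return closed

fib = tie(with_memo(base))
print(fib(80))   # fast: each value is computed once
```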
An ADMM Based Framework for AutoML Pipeline Configuration
We study the AutoML problem of automatically configuring machine learning
pipelines by jointly selecting algorithms and their appropriate
hyper-parameters for all steps in supervised learning pipelines. This black-box
(gradient-free) optimization with mixed integer & continuous variables is a
challenging problem. We propose a novel AutoML scheme by leveraging the
alternating direction method of multipliers (ADMM). The proposed framework is
able to (i) decompose the optimization problem into easier sub-problems that
have a reduced number of variables and circumvent the challenge of mixed
variable categories, and (ii) incorporate black-box constraints along-side the
black-box optimization objective. We empirically evaluate the flexibility (in
utilizing existing AutoML techniques), effectiveness (against open source
AutoML toolkits), and unique capability (of executing AutoML with practically
motivated black-box constraints) of our proposed scheme on a collection of
binary classification data sets from UCI ML & OpenML repositories. We observe
that on average our framework provides significant gains in comparison to
other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical
advantages of this framework.
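To make the decomposition concrete, here is a rough sketch of alternating between a continuous sub-problem (hyper-parameters, with algorithm choices fixed) and an integer sub-problem (algorithm choices per pipeline step). It deliberately omits the augmented-Lagrangian coupling and dual updates of actual ADMM, so it only illustrates the variable splitting; `evaluate` stands in for a black-box pipeline scorer, and the random search inside each sub-problem is a placeholder.

```python
# Sketch of alternating over continuous and integer pipeline variables.
import random

def alternate(pipeline_steps, evaluate, iters=20, rng=random.Random(0)):
    # pipeline_steps: {step_name: {algo_name: {hp_name: (low, high)}}}
    choice = {s: rng.choice(list(algos)) for s, algos in pipeline_steps.items()}
    hps = {s: {h: (lo + hi) / 2 for h, (lo, hi) in pipeline_steps[s][choice[s]].items()}
           for s in pipeline_steps}
    best = evaluate(choice, hps)

    for _ in range(iters):
        # Sub-problem 1: continuous variables only (algorithms held fixed).
        for s in pipeline_steps:
            for h, (lo, hi) in pipeline_steps[s][choice[s]].items():
                trial = dict(hps[s]); trial[h] = rng.uniform(lo, hi)
                cand = {**hps, s: trial}
                score = evaluate(choice, cand)
                if score > best:
                    best, hps = score, cand
        # Sub-problem 2: integer variables only (hyper-parameters re-centred).
        for s in pipeline_steps:
            for a in pipeline_steps[s]:
                if a == choice[s]:
                    continue
                trial_choice = {**choice, s: a}
                trial_hps = {**hps, s: {h: (lo + hi) / 2
                                        for h, (lo, hi) in pipeline_steps[s][a].items()}}
                score = evaluate(trial_choice, trial_hps)
                if score > best:
                    best, choice, hps = score, trial_choice, trial_hps
    return choice, hps, best
```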
Choosing a Classical Planner with Graph Neural Networks
Online planner selection is the task of choosing a solver out of a predefined
set for a given planning problem. As planning is computationally hard, the
performance of solvers varies greatly on planning problems. Thus, the ability
to predict their performance on a given problem is of great importance. While a
variety of learning methods have been employed, for classical cost-optimal
planning the prevailing approach uses Graph Neural Networks (GNNs). In this
work, we continue the line of work on using GNNs for online planner selection.
We perform a thorough investigation of the impact of the chosen GNN model,
graph representation, and node features, as well as the prediction task. Going
further, we propose using the graph representation obtained by a GNN as an
input to the Extreme Gradient Boosting (XGBoost) model, resulting in a more
resource-efficient yet accurate approach. We show the effectiveness of a
variety of GNN-based online planner selection methods, opening up exciting new
avenues for research on online planner selection.
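A minimal sketch of the GNN-to-XGBoost hand-off: node embeddings from a small graph network are mean-pooled into one vector per planning task, and those vectors become the features of an XGBoost classifier that predicts the planner. The encoder below is untrained, and the `load_benchmark` call, graph data, and integer planner labels are placeholders; the paper's specific graph representations and node features are not reproduced here.

```python
# Sketch: pooled GNN embeddings as input features for XGBoost.
import torch
from torch_geometric.nn import GCNConv, global_mean_pool
from xgboost import XGBClassifier

class GraphEncoder(torch.nn.Module):
    def __init__(self, in_dim, hid=64):
        super().__init__()
        self.c1 = GCNConv(in_dim, hid)
        self.c2 = GCNConv(hid, hid)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.c1(x, edge_index))
        h = torch.relu(self.c2(h, edge_index))
        return global_mean_pool(h, batch)            # one vector per graph

def embed(encoder, graphs):
    encoder.eval()
    with torch.no_grad():
        return torch.cat([
            encoder(g.x, g.edge_index,
                    torch.zeros(g.num_nodes, dtype=torch.long))
            for g in graphs
        ]).numpy()

# graphs, best_planner = load_benchmark(...)        # placeholder data source
# enc = GraphEncoder(in_dim=graphs[0].num_node_features)
# X = embed(enc, graphs)                            # GNN representation as features
# clf = XGBClassifier(n_estimators=200).fit(X, best_planner)
```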
Adaptive data augmentation for image classification
Data augmentation is the process of generating samples by transforming training data, with the target of improving the accuracy and robustness of classifiers. In this paper, we propose a new automatic and adaptive algorithm for choosing the transformations of the samples used in data augmentation. Specifically, for each sample, our main idea is to seek a small transformation that yields maximal classification loss on the transformed sample. We employ a trust-region optimization strategy, which consists of solving a sequence of linear programs. Our data augmentation scheme is then integrated into a Stochastic Gradient Descent algorithm for training deep neural networks. We perform experiments on two datasets, and show that the proposed scheme outperforms random data augmentation algorithms in terms of accuracy and robustness, while yielding comparable or superior results with respect to existing selective sampling approaches.
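The integration with SGD can be sketched as below. The paper finds the loss-maximizing small transformation per sample with a trust-region sequence of linear programs; this sketch replaces that inner solve with a search over a handful of fixed candidate transformations (and works per batch rather than per sample), so it only shows the outer training-loop structure.

```python
# Sketch of loss-maximizing augmentation inside an SGD training step.
import torch
import torchvision.transforms.functional as TF

CANDIDATES = [
    lambda x: x,
    lambda x: TF.rotate(x, 5.0),
    lambda x: TF.rotate(x, -5.0),
    lambda x: TF.adjust_brightness(x, 1.1),
]

def train_step(model, opt, loss_fn, x, y):
    # Pick the small transformation that currently hurts the classifier most.
    with torch.no_grad():
        losses = [loss_fn(model(t(x)), y) for t in CANDIDATES]
        worst = CANDIDATES[int(torch.stack(losses).argmax())]
    # Standard SGD update on the adversarially transformed batch.
    opt.zero_grad()
    loss = loss_fn(model(worst(x)), y)
    loss.backward()
    opt.step()
    return loss.item()
```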