Weighted Sampling for Combined Model Selection and Hyperparameter Tuning
The combined algorithm selection and hyperparameter tuning (CASH) problem is
characterized by large hierarchical hyperparameter spaces. Model-free
hyperparameter tuning methods can explore such large spaces efficiently since
they are highly parallelizable across multiple machines. When no prior
knowledge or meta-data exists to boost their performance, these methods
commonly sample random configurations following a uniform distribution. In this
work, we propose a novel sampling distribution as an alternative to uniform
sampling and prove theoretically that it has a better chance of finding the
best configuration in a worst-case setting. In order to compare competing
methods rigorously in an experimental setting, one must perform statistical
hypothesis testing. We show that there is little-to-no agreement in the
automated machine learning literature regarding which methods should be used.
We contrast this disparity with the methods recommended by the broader
statistics literature, and identify a suitable approach. We then select three
popular model-free solutions to CASH and evaluate their performance, with
uniform sampling as well as the proposed sampling scheme, across 67 datasets
from the OpenML platform. We investigate the trade-off between exploration and
exploitation across the three algorithms, and verify empirically that the
proposed sampling distribution improves performance in all cases.
Comment: Accepted for presentation at The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020).
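The contrast between uniform and weighted sampling over a hierarchical CASH space can be sketched as follows. The models, hyperparameter ranges, and weights below are illustrative assumptions, not the distribution the paper proves optimal:

```python
import random

# Hypothetical two-level CASH space: model -> hyperparameter ranges.
# Models, ranges, and weights are illustrative, not taken from the paper.
SPACE = {
    "svm": {"C": (0.01, 100.0)},
    "rf":  {"n_estimators": (10.0, 500.0)},
    "knn": {"n_neighbors": (1.0, 50.0)},
}

def sample_uniform(rng):
    """Baseline: pick a model uniformly, then its hyperparameters uniformly."""
    model = rng.choice(list(SPACE))
    params = {k: rng.uniform(lo, hi) for k, (lo, hi) in SPACE[model].items()}
    return model, params

def sample_weighted(rng, weights):
    """Alternative: bias the model choice by a per-model weight (an assumed
    stand-in for the paper's proposed distribution)."""
    models = list(SPACE)
    model = rng.choices(models, weights=[weights[m] for m in models], k=1)[0]
    params = {k: rng.uniform(lo, hi) for k, (lo, hi) in SPACE[model].items()}
    return model, params

rng = random.Random(0)
print(sample_uniform(rng))
print(sample_weighted(rng, {"svm": 0.5, "rf": 0.3, "knn": 0.2}))
```

Because each draw is independent, either sampler parallelizes trivially across machines, which is the property the abstract highlights for model-free tuners.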
Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good
combination of control parameters for a data miner. While widely applied in
empirical Software Engineering, there has not been much discussion on which
hyperparameter tuner is best for software analytics. To address this gap in the
literature, this paper applied a range of hyperparameter optimizers (grid
search, random search, differential evolution, and Bayesian optimization) to
the defect prediction problem. Surprisingly, no hyperparameter optimizer was
observed to be 'best' and, for one of the two evaluation measures studied here
(F-measure), hyperparameter optimization was no better than using default
configurations in 50% of cases.
We conclude that hyperparameter optimization is more nuanced than previously
believed. While such optimization can certainly lead to large improvements in
the performance of classifiers used in software analytics, it remains to be
seen which specific optimizers should be applied to a new dataset.
Comment: 7 pages, 2 columns, accepted for SWAN1
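The finding that tuning sometimes fails to beat defaults is easy to reproduce on a toy objective. The scoring surface below is made up (the defaults sit at its optimum by construction); it is not the paper's defect-prediction data, only a sketch of the comparison:

```python
import random

def toy_score(params):
    """Stand-in for a defect-prediction F-measure; purely illustrative.
    Peaks at learning_rate=0.1, max_depth=6 (a made-up surface)."""
    lr, depth = params["learning_rate"], params["max_depth"]
    return 1.0 - abs(lr - 0.1) - 0.01 * abs(depth - 6)

# Hypothetical default configuration; here it happens to be optimal,
# mimicking the cases where tuning cannot improve on defaults.
DEFAULTS = {"learning_rate": 0.1, "max_depth": 6}

def random_search(n_trials, rng):
    """Plain random search: sample configurations, keep the best one seen."""
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        cand = {"learning_rate": rng.uniform(0.001, 0.5),
                "max_depth": rng.randint(1, 12)}
        s = toy_score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score

tuned, tuned_score = random_search(30, random.Random(42))
default_score = toy_score(DEFAULTS)
print(round(default_score, 3), round(tuned_score, 3))
```

On this surface the tuned score can never exceed the default score, illustrating (in an exaggerated way) why the paper urges checking defaults before paying for optimization.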
A Nonparametric Bayesian Approach to Uncovering Rat Hippocampal Population Codes During Spatial Navigation
Rodent hippocampal population codes represent important spatial information
about the environment during navigation. Several computational methods have
been developed to uncover the neural representation of spatial topology
embedded in rodent hippocampal ensemble spike activity. Here we extend our
previous work and propose a nonparametric Bayesian approach to infer rat
hippocampal population codes during spatial navigation. To tackle the model
selection problem, we leverage a nonparametric Bayesian model. Specifically, to
analyze rat hippocampal ensemble spiking activity, we apply a hierarchical
Dirichlet process-hidden Markov model (HDP-HMM) using two Bayesian inference
methods, one based on Markov chain Monte Carlo (MCMC) and the other based on
variational Bayes (VB). We demonstrate the effectiveness of our Bayesian
approaches on recordings from a freely-behaving rat navigating in an open field
environment. We find that MCMC-based inference with Hamiltonian Monte Carlo
(HMC) hyperparameter sampling is flexible and efficient, and outperforms VB and
MCMC approaches with hyperparameters set by empirical Bayes.
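The core idea of sampling hyperparameters rather than fixing them by empirical Bayes can be sketched with a random-walk Metropolis sampler over a single concentration parameter. The target below is a toy Gamma(2, 1) posterior, not the HDP-HMM posterior, and the paper uses HMC (which additionally exploits gradients); the accept/reject logic is the shared idea:

```python
import math
import random

def log_post(alpha):
    """Toy log-posterior over a concentration hyperparameter alpha > 0.
    Gamma(2, 1) up to a constant: log(alpha) - alpha. A stand-in for the
    intractable HDP-HMM hyperparameter posterior."""
    if alpha <= 0:
        return float("-inf")
    return math.log(alpha) - alpha

def metropolis(n_samples, step=0.5, seed=0):
    """Random-walk Metropolis: propose a Gaussian step, accept with
    probability min(1, posterior ratio)."""
    rng = random.Random(seed)
    alpha, lp = 1.0, log_post(1.0)
    samples = []
    for _ in range(n_samples):
        prop = alpha + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        if math.log(rng.random()) < lp_prop - lp:
            alpha, lp = prop, lp_prop
        samples.append(alpha)
    return samples

draws = metropolis(5000)
print(sum(draws) / len(draws))  # should wander near the Gamma(2,1) mean of 2
```

Sampling the hyperparameter this way lets its uncertainty propagate into the model, whereas empirical Bayes pins it to a single point estimate.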
An Analysis of the Value of Information when Exploring Stochastic, Discrete Multi-Armed Bandits
In this paper, we propose an information-theoretic exploration strategy for
stochastic, discrete multi-armed bandits that achieves optimal regret. Our
strategy is based on the value of information criterion. This criterion
measures the trade-off between policy information and obtainable rewards. High
amounts of policy information are associated with exploration-dominant searches
of the space and yield high rewards. Low amounts of policy information favor
the exploitation of existing knowledge. Information, in this criterion, is
quantified by a parameter that can be varied during search. We demonstrate that
a simulated-annealing-like update of this parameter, with a sufficiently fast
cooling schedule, leads to an optimal regret that is logarithmic with respect
to the number of episodes.
Comment: Entrop
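The annealed exploration parameter can be sketched with softmax (Boltzmann) action selection, where a temperature plays the role of the paper's information parameter: high temperature yields exploration-dominant search, low temperature yields exploitation. The 1/log cooling schedule and arm means below are illustrative assumptions, not the paper's exact criterion:

```python
import math
import random

def softmax_bandit(true_means, episodes, seed=0):
    """Bernoulli bandit with softmax exploration and an annealed temperature.
    tau -> 0 as t grows, shifting from exploration to exploitation."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    values = [0.0] * k          # running mean reward per arm
    total_reward = 0.0
    for t in range(1, episodes + 1):
        tau = 1.0 / math.log(t + 1)          # illustrative cooling schedule
        prefs = [v / tau for v in values]
        m = max(prefs)                        # subtract max for stability
        probs = [math.exp(p - m) for p in prefs]
        z = sum(probs)
        # Sample an arm from the softmax distribution.
        r, arm, acc = rng.random() * z, 0, probs[0]
        while acc < r:
            arm += 1
            acc += probs[arm]
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return total_reward, counts

reward, counts = softmax_bandit([0.2, 0.5, 0.8], episodes=10000)
print(counts)  # the 0.8 arm should receive most of the pulls
```

The paper's result is that a sufficiently fast cooling of the information parameter makes the regret of such a scheme logarithmic in the number of episodes; this sketch only shows the mechanism, not the regret proof.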