Transferable Neural Processes for Hyperparameter Optimization
Automated machine learning aims to automate the whole process of machine
learning, including model configuration. In this paper, we focus on automated
hyperparameter optimization (HPO) based on sequential model-based optimization
(SMBO). Though conventional SMBO algorithms work well when abundant HPO trials
are available, they are far from satisfactory in practical applications where a
trial on a huge dataset may be so costly that an optimal hyperparameter
configuration must be found in as few trials as possible. Observing
that human experts draw on their experience with a machine learning model by
trying configurations that once performed well on other datasets, we are
inspired to speed up HPO by transferring knowledge from historical HPO trials
on other datasets. We propose an efficient, end-to-end HPO algorithm,
Transfer Neural Processes (TNP), which achieves transfer learning by
incorporating trials on other datasets, initializing the model with
well-generalized parameters, and learning an initial set of hyperparameters to
evaluate. Experiments on a large collection of OpenML datasets and three
computer vision datasets show that the proposed model achieves
state-of-the-art performance with at least an order of magnitude fewer trials.
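For intuition, here is a minimal sketch of the warm-starting idea in generic SMBO terms: seed a surrogate with (configuration, score) pairs from trials on other datasets, then run the usual fit-and-propose loop on the target task. This is not the paper's TNP, which replaces the surrogate with a neural process; the GP surrogate, the expected-improvement criterion, and the names `objective`, `candidates`, and `historical_trials` are all illustrative assumptions.

```python
# Sketch: SMBO warm-started with trials from other datasets.
# NOT the paper's TNP (which uses a neural process surrogate); this only
# illustrates the transfer idea with a GP surrogate and expected improvement.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(mu, sigma, best):
    """EI for maximization; guards against near-zero sigma."""
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def warm_started_smbo(objective, candidates, historical_trials, n_trials=10):
    """objective: costly evaluation on the target dataset (hypothetical).
    candidates: 2-D array of hyperparameter vectors to choose among.
    historical_trials: list of (config_vector, score) pairs from OTHER
    datasets, used to seed the surrogate before any target trial runs."""
    X = [np.asarray(x) for x, _ in historical_trials]
    y = [s for _, s in historical_trials]
    for _ in range(n_trials):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                      normalize_y=True).fit(X, y)
        mu, sigma = gp.predict(candidates, return_std=True)
        x_next = candidates[np.argmax(expected_improvement(mu, sigma, max(y)))]
        X.append(np.asarray(x_next))
        y.append(objective(x_next))  # one costly trial on the target task
    return X[int(np.argmax(y))], max(y)
```

A real transfer method would also normalize scores across datasets before pooling them in one surrogate (the incumbent `max(y)` here naively mixes historical and target scores); the sketch skips that step for brevity.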
Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for Hyperparameter Recommendation
With the surge in the number of hyperparameters and training times of modern
machine learning models, hyperparameter tuning is becoming increasingly
expensive. Although methods have been proposed to speed up tuning via knowledge
transfer, they typically rely on the final performance of hyperparameter
configurations and do not leverage low-fidelity information. This common
practice is suboptimal and can waste resources unnecessarily. It is more
cost-efficient to instead leverage the low-fidelity tuning observations to
measure inter-task similarity and transfer knowledge from existing to new tasks
accordingly. However, performing multi-fidelity tuning comes with its own
challenges in the transfer setting: the noise in the additional observations
and the need for performance forecasting. Therefore, we conduct a thorough
analysis of the multi-task multi-fidelity Bayesian optimization framework,
which leads to the best instantiation, amortized auto-tuning (AT2). We further
present an offline-computed 27-task hyperparameter recommendation (HyperRec)
database to serve the community. Extensive experiments on HyperRec and other
real-world databases demonstrate the effectiveness of our AT2 method.
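To make the "measure inter-task similarity from low-fidelity observations" step concrete, the sketch below scores a shared list of configurations on each task at a cheap fidelity (e.g., one training epoch) and compares tasks by rank correlation of those scores. This is an assumed simplification for illustration, not the paper's AT2 instantiation; `low_fid_scores` and the task names are hypothetical.

```python
# Sketch: inter-task similarity from low-fidelity tuning observations.
# Not the paper's AT2; just one plausible similarity measure (Spearman
# rank correlation of low-fidelity scores over a shared config set).
import numpy as np
from scipy.stats import spearmanr

def task_similarity(low_fid_scores, new_task):
    """low_fid_scores: {task: scores}, where each score list covers the
    SAME candidate configurations at a cheap fidelity (e.g., 1 epoch).
    Returns {task: similarity to new_task} in [0, 1]."""
    target = np.asarray(low_fid_scores[new_task])
    sims = {}
    for task, scores in low_fid_scores.items():
        if task == new_task:
            continue
        rho, _ = spearmanr(target, np.asarray(scores))
        sims[task] = max(rho, 0.0)  # clamp: ignore anti-correlated tasks
    return sims

# Hypothetical usage: weight each source task's observations by similarity
# before fitting the transfer surrogate on the new task.
scores = {"cifar10": [0.61, 0.55, 0.70],
          "svhn":    [0.80, 0.77, 0.88],
          "new":     [0.58, 0.52, 0.66]}
print(task_similarity(scores, "new"))  # {'cifar10': 1.0, 'svhn': 1.0}
```

Rank correlation rather than raw-score distance is used here because low-fidelity scores live on different scales across tasks; only the ordering of configurations needs to transfer.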