Hyperparameter optimization with approximate gradient
Most models in machine learning contain at least one hyperparameter to
control for model complexity. Choosing an appropriate set of hyperparameters is
both crucial in terms of model accuracy and computationally challenging. In
this work we propose an algorithm for the optimization of continuous
hyperparameters using inexact gradient information. An advantage of this method
is that hyperparameters can be updated before model parameters have fully
converged. We also give sufficient conditions for the global convergence of
this method, based on regularity conditions of the involved functions and
summability of errors. Finally, we validate the empirical performance of this
method on the estimation of regularization constants of L2-regularized logistic
regression and kernel Ridge regression. Empirical benchmarks indicate that our
approach is highly competitive with state-of-the-art methods.
Comment: Proceedings of the International Conference on Machine Learning
(ICML
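The idea of updating hyperparameters from an inexact gradient can be illustrated on a small ridge regression problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`conjugate_gradient`, `approx_hypergradient`), the parametrization of the regularization constant as `exp(alpha)`, the synthetic data, the geometric tolerance schedule, and the clipped fixed step are all choices made for this illustration. The inner problem and the linear system in the implicit-differentiation formula are each solved only up to a tolerance, and the tolerances form a summable sequence.

```python
import numpy as np


def conjugate_gradient(matvec, b, x0, tol, max_iter=1000):
    """Approximately solve A x = b (A symmetric positive definite) until ||r|| <= tol."""
    x = x0.copy()
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            break
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x = x + step * p
        r = r - step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def approx_hypergradient(alpha, X_tr, y_tr, X_val, y_val, w0, tol):
    """Inexact gradient of the validation loss with respect to alpha.

    Inner problem:  w(alpha) = argmin_w 0.5*||X_tr w - y_tr||^2 + 0.5*exp(alpha)*||w||^2
    Outer loss:     0.5*||X_val w(alpha) - y_val||^2
    Both linear solves stop once the residual norm is below `tol`.
    """
    lam = np.exp(alpha)

    def hess(v):
        # Hessian of the inner objective applied to v
        return X_tr.T @ (X_tr @ v) + lam * v

    # Inexact inner solve, warm-started from w0
    w = conjugate_gradient(hess, X_tr.T @ y_tr, w0, tol)
    # Gradient of the outer loss with respect to w
    grad_val = X_val.T @ (X_val @ w - y_val)
    # Inexact solve of H q = grad_val (implicit differentiation)
    q = conjugate_gradient(hess, grad_val, np.zeros_like(w), tol)
    # The cross derivative of the inner gradient with respect to alpha is lam * w
    return -lam * (w @ q), w


# Synthetic data (assumption of this sketch)
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
X_tr = rng.normal(size=(40, 5))
y_tr = X_tr @ w_true + 0.5 * rng.normal(size=40)
X_val = rng.normal(size=(40, 5))
y_val = X_val @ w_true + 0.5 * rng.normal(size=40)

alpha = 0.0
w = np.zeros(5)
for k in range(20):
    tol = 0.1 * 0.5 ** k  # summable sequence of tolerances
    g, w = approx_hypergradient(alpha, X_tr, y_tr, X_val, y_val, w, tol)
    alpha -= 0.1 * np.clip(g, -1.0, 1.0)  # clipped fixed step, chosen for safety here
```

Early iterations use loose tolerances, so the hyperparameter moves before the model parameters have converged; the tolerance tightens as the outer loop proceeds. The clipped fixed step is a simplification and is not the step-size rule analyzed in the paper.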
CPMLHO: Hyperparameter Tuning via Cutting Plane and Mixed-Level Optimization
The hyperparameter optimization of a neural network can be expressed as a
bilevel optimization problem. Bilevel optimization is used to update the
hyperparameters automatically, and the hypergradient is approximated using
the best-response function. Finding the best-response function is very
time-consuming. In this paper we propose CPMLHO, a new hyperparameter
optimization method that uses the cutting plane method and a mixed-level
objective function. Cutting planes are added to the inner level to constrain
the space of the response function. To obtain a more accurate hypergradient,
the mixed-level objective flexibly adjusts the loss function by combining the
losses on the training set and the validation set. Compared to existing
methods, experimental results show that our method can automatically update
the hyperparameters during training and can find better hyperparameters, with
higher accuracy and faster convergence.
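For orientation, the bilevel problem and the best-response hypergradient mentioned in this abstract are commonly written as below. The notation ($\lambda$ for hyperparameters, $w$ for model weights, $\mathcal{L}_{\mathrm{train}}$ and $\mathcal{L}_{\mathrm{val}}$ for the two losses) is chosen here for illustration and is not taken from the abstract; the paper's specific mixed-level objective and cutting-plane constraints are not reproduced.

```latex
\begin{aligned}
  \min_{\lambda}\quad & \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\lambda), \lambda\bigr) \\
  \text{s.t.}\quad    & w^{*}(\lambda) \in \arg\min_{w}\; \mathcal{L}_{\mathrm{train}}(w, \lambda),
\end{aligned}
\qquad
\frac{\mathrm{d}\mathcal{L}_{\mathrm{val}}}{\mathrm{d}\lambda}
  = \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial \lambda}
  + \left(\frac{\partial w^{*}}{\partial \lambda}\right)^{\!\top}
    \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial w}.
```

Here $w^{*}(\lambda)$ is the best-response function; the second term of the hypergradient requires its Jacobian, which is why approximating the best response dominates the cost.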