On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice
Machine learning algorithms are widely used across many applications and
areas. To fit a machine learning model to a given problem, its
hyper-parameters must be tuned. Selecting the best hyper-parameter
configuration for machine learning models has a direct impact on the model's
performance. It often requires deep knowledge of machine learning algorithms
and appropriate hyper-parameter optimization techniques. Although several
automatic optimization techniques exist, they have different strengths and
drawbacks when applied to different types of problems. In this paper,
optimizing the hyper-parameters of common machine learning models is studied.
We introduce several state-of-the-art optimization techniques and discuss how
to apply them to machine learning algorithms. We also survey the many
libraries and frameworks developed for hyper-parameter optimization and
discuss open challenges in hyper-parameter optimization research. Moreover,
experiments are conducted on benchmark
datasets to compare the performance of different optimization methods and
provide practical examples of hyper-parameter optimization. This survey paper
will help industrial users, data analysts, and researchers to better develop
machine learning models by identifying the proper hyper-parameter
configurations effectively.
Comment: 69 pages, 10 tables, accepted in Neurocomputing, Elsevier. GitHub link:
https://github.com/LiYangHart/Hyperparameter-Optimization-of-Machine-Learning-Algorithm
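The random-search baseline that papers in this area build on can be sketched in a few lines. This is a minimal, illustrative sketch, not code from the paper: the `objective` function is a made-up stand-in for a model's validation error, and the hyperparameter names and ranges (`lr`, `depth`) are assumptions chosen only for the example.

```python
import random

# Hypothetical stand-in for validation error; a real objective would
# train a model with these hyperparameters and return its error.
def objective(lr, depth):
    return (lr - 0.1) ** 2 + 0.01 * (depth - 5) ** 2

def random_search(n_trials, seed=0):
    """Sample configurations uniformly and keep the best one seen."""
    rng = random.Random(seed)
    best_score, best_config = float("inf"), None
    for _ in range(n_trials):
        config = {"lr": rng.uniform(0.001, 1.0),
                  "depth": rng.randint(1, 10)}
        score = objective(**config)
        if score < best_score:
            best_score, best_config = score, config
    return best_score, best_config

best_score, best_config = random_search(200)
```

More sophisticated techniques covered by surveys like the one above (Bayesian optimization, Hyperband, genetic algorithms) replace the uniform sampling step with an informed choice of the next configuration.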
Weighted Random Search for Hyperparameter Optimization
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms. Unlike standard RS, which generates new values for all hyperparameters at each trial, we generate a new value for each hyperparameter with a given probability of change. The intuition behind our approach is that a value that has already produced a good result is a good candidate for the next step and should be tested in new combinations of hyperparameter values. Within the same computational budget, our method yields better results than standard RS, and our theoretical results prove this statement. We test our method on a variation of one of the most commonly used objective functions for this class of problems (the Griewank function) and on the hyperparameter optimization of a deep learning CNN architecture. Our results can be generalized to any optimization problem defined on a discrete domain.
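The core idea above can be sketched as follows. This is a simplified illustration under assumed details, not the authors' implementation: each hyperparameter is resampled with probability `p_change` and otherwise keeps the best value found so far, and a Griewank-style test function stands in for the real objective.

```python
import math
import random

def griewank(x):
    """2-D Griewank-style test function; its minimum is 0 at the origin."""
    s = sum(v * v for v in x) / 4000.0
    p = math.prod(math.cos(v / math.sqrt(i + 1)) for i, v in enumerate(x))
    return s - p + 1.0

def weighted_random_search(dim, n_trials, p_change=0.5, seed=0):
    """Resample each coordinate with probability p_change; otherwise
    reuse the best value found so far (the weighted-RS intuition)."""
    rng = random.Random(seed)
    best_x = [rng.uniform(-10.0, 10.0) for _ in range(dim)]
    best_f = griewank(best_x)
    for _ in range(n_trials):
        cand = [rng.uniform(-10.0, 10.0) if rng.random() < p_change else v
                for v in best_x]
        f = griewank(cand)
        if f < best_f:
            best_f, best_x = f, cand
    return best_f, best_x

best_f, best_x = weighted_random_search(dim=2, n_trials=500)
```

With `p_change = 1.0` this reduces to standard random search; smaller values of `p_change` exploit good coordinate values already found, which is the mechanism the abstract describes.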
SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Computer vision is experiencing an AI renaissance, in which machine learning
models are expediting important breakthroughs in academic research and
commercial applications. Effectively training these models, however, is not
trivial due in part to hyperparameters: user-configured values that control a
model's ability to learn from data. Existing hyperparameter optimization
methods are highly parallel but make no effort to balance the search across
heterogeneous hardware or to prioritize searching high-impact spaces. In this
paper, we introduce a framework for massively Scalable Hardware-Aware
Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the
relative complexity of each search space and monitors performance on the
learning task over all trials. These metrics are then used as heuristics to
assign hyperparameters to distributed workers based on their hardware. We first
demonstrate that our framework achieves double the throughput of a standard
distributed hyperparameter optimization framework by optimizing SVM for MNIST
using 150 distributed workers. We then conduct model search with SHADHO over
the course of one week using 74 GPUs across two compute clusters to optimize
U-Net for a cell segmentation task, discovering 515 models that achieve a lower
validation loss than standard U-Net.
Comment: 10 pages, 6 figures
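The hardware-aware scheduling heuristic described above can be illustrated with a toy sketch: more complex search spaces are matched to faster workers. The complexity scores, worker speeds, and names below are invented for illustration and do not come from SHADHO itself.

```python
def assign_spaces(spaces, workers):
    """Greedily pair search spaces with workers: the most complex
    space goes to the fastest worker, and so on down the ranking.

    spaces:  {space_name: relative_complexity}
    workers: {worker_name: relative_speed}
    """
    by_complexity = sorted(spaces, key=spaces.get, reverse=True)
    by_speed = sorted(workers, key=workers.get, reverse=True)
    return dict(zip(by_complexity, by_speed))

# Hypothetical example values, not measurements from the paper.
spaces = {"svm_rbf": 3.0, "svm_linear": 1.0, "svm_poly": 2.0}
workers = {"gpu0": 10.0, "cpu0": 1.0, "cpu1": 2.0}
assignment = assign_spaces(spaces, workers)
# assignment == {"svm_rbf": "gpu0", "svm_poly": "cpu1", "svm_linear": "cpu0"}
```

SHADHO additionally updates these heuristics with observed per-trial performance, so the assignment adapts as the search progresses; a static greedy matching like this is only the starting intuition.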
Water Surfaces Detection from Sentinel-1 SAR Images using Deep Learning
Nowadays, Synthetic Aperture Radar (SAR) images are widely used in industry and the scientific community for many remote sensing applications. The main advantage of SAR technology is that it can acquire images at night, since it does not require sunlight, and it can capture images through cloud cover, where traditional optical sensors are limited. SAR imagery is well suited to surface water detection because a calm, flat water surface reflects the radar energy away from the sensor, so surface water appears in a SAR image as dark pixels. The traditional way to extract water from SAR images is simple thresholding, in which a pixel is classified as water when its value is below a certain threshold. This method works well in flat rural areas, but the complex features of urban areas make it more challenging: for example, highways and building shadows can easily be misclassified as water. To address this problem, we propose the Fully Convolutional Neural Network (FCN) Encoder method, a deep learning model based on the convolutional implementation of sliding windows. The FCN Encoder is designed to detect water in SAR images by considering both a pixel's intensity and its spatial context (i.e., its neighborhood). In our experiments, we first train the network on the So2Sat dataset, which contains patches of Sentinel-1 satellite SAR images. Next, we use the trained network to detect water in SAR images of several cities. The obtained results show satisfactory scores and also appear visually accurate. In the final optimization phase, we (a) train the FCN Encoder with our custom HARD dataset, a dataset of images that are harder to classify, and (b) optimize the hyperparameters of the model. We test the resulting classifier on public SAR images and compare it with other methods such as Smooth Labeling, Random Forest, and FCN Segmentation.
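The baseline thresholding approach that the abstract contrasts with is simple enough to sketch directly. The pixel values and threshold below are invented for illustration; real Sentinel-1 pixels would be calibrated backscatter values, and a real pipeline would use arrays rather than nested lists.

```python
def threshold_water(image, threshold):
    """Label a pixel as water when its intensity is below the threshold.

    image: 2-D list of pixel intensities (dark = low backscatter)
    returns: 2-D list of booleans (True = water)
    """
    return [[pixel < threshold for pixel in row] for row in image]

# Hypothetical 2x3 SAR patch: low values model calm water,
# high values model land or building returns.
sar_patch = [
    [0.02, 0.03, 0.40],
    [0.01, 0.35, 0.50],
]
mask = threshold_water(sar_patch, threshold=0.1)
# mask == [[True, True, False], [True, False, False]]
```

This per-pixel rule is exactly what fails on urban scenes: a dark highway pixel would be marked water here, whereas the FCN Encoder also looks at the pixel's neighborhood before deciding.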