
    On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice

    Machine learning algorithms are widely used across many applications and domains. To fit a machine learning model to different problems, its hyper-parameters must be tuned. Selecting the best hyper-parameter configuration for a machine learning model has a direct impact on its performance, and it often requires deep knowledge of machine learning algorithms and of appropriate hyper-parameter optimization techniques. Although several automatic optimization techniques exist, they have different strengths and drawbacks when applied to different types of problems. In this paper, we study the optimization of the hyper-parameters of common machine learning models. We introduce several state-of-the-art optimization techniques and discuss how to apply them to machine learning algorithms. We also survey the libraries and frameworks available for hyper-parameter optimization, and we discuss some open challenges in hyper-parameter optimization research. Moreover, we conduct experiments on benchmark datasets to compare the performance of different optimization methods and to provide practical examples of hyper-parameter optimization. This survey will help industrial users, data analysts, and researchers develop better machine learning models by identifying proper hyper-parameter configurations effectively.
    Comment: 69 pages, 10 tables; accepted in Neurocomputing, Elsevier. GitHub link: https://github.com/LiYangHart/Hyperparameter-Optimization-of-Machine-Learning-Algorithm
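    For orientation, here is a minimal sketch, ours rather than the paper's code, of one technique this survey covers: random search over an SVM's hyper-parameters with scikit-learn. The search ranges, budget, and dataset below are illustrative assumptions.

    ```python
    # Minimal random-search sketch with scikit-learn; the ranges and budget
    # are illustrative assumptions, not values taken from the paper.
    from scipy.stats import loguniform
    from sklearn.datasets import load_digits
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)

    # C and gamma span several orders of magnitude, so sample them log-uniformly.
    param_distributions = {
        "C": loguniform(1e-2, 1e3),
        "gamma": loguniform(1e-4, 1e0),
    }

    search = RandomizedSearchCV(
        SVC(kernel="rbf"),
        param_distributions,
        n_iter=50,        # computational budget: 50 sampled configurations
        cv=3,             # 3-fold cross-validation per configuration
        random_state=0,
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)
    ```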

    Weighted Random Search for Hyperparameter Optimization

    We introduce an improved version of Random Search (RS), used here for the hyperparameter optimization of machine learning algorithms. Unlike standard RS, which generates new values for all hyperparameters at every trial, we generate a new value for each hyperparameter with a given probability of change. The intuition behind our approach is that a value which has already produced a good result is a good candidate for the next step and should be tested in new combinations of hyperparameter values. Within the same computational budget, our method yields better results than standard RS, and our theoretical results prove this claim. We test our method on a variation of one of the most commonly used objective functions for this class of problems (the Griewank function) and on the hyperparameter optimization of a deep learning CNN architecture. Our results can be generalized to any optimization problem defined on a discrete domain.
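    To make the idea concrete, here is a hedged sketch of the per-hyperparameter resampling step described above. All names (domains, change_prob, objective) are ours, not the authors'; the paper also addresses how the change probabilities should be chosen, whereas this sketch simply takes them as inputs.

    ```python
    # Hedged sketch of the weighted-random-search idea: each hyperparameter is
    # resampled only with some probability, otherwise it keeps its best-known
    # value. Names here are illustrative, not the authors'.
    import random

    def weighted_random_search(domains, change_prob, objective, budget):
        """domains: dict name -> list of candidate values (discrete domain).
        change_prob: dict name -> probability of resampling that hyperparameter.
        objective: maps a configuration dict to a score (higher is better)."""
        best = {name: random.choice(values) for name, values in domains.items()}
        best_score = objective(best)
        for _ in range(budget - 1):
            trial = dict(best)  # start from the best configuration found so far
            for name, values in domains.items():
                if random.random() < change_prob[name]:
                    trial[name] = random.choice(values)  # resample this one only
            score = objective(trial)
            if score > best_score:
                best, best_score = trial, score
        return best, best_score
    ```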

    SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

    Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial, due in part to hyperparameters: user-configured values that control a model's ability to learn from data. Existing hyperparameter optimization methods are highly parallel, but make no effort to balance the search across heterogeneous hardware or to prioritize searching high-impact spaces. In this paper, we introduce a framework for massively Scalable Hardware-Aware Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the relative complexity of each search space and monitors performance on the learning task over all trials. These metrics are then used as heuristics to assign hyperparameters to distributed workers based on their hardware. We first demonstrate that our framework achieves double the throughput of a standard distributed hyperparameter optimization framework by optimizing an SVM for MNIST using 150 distributed workers. We then conduct a model search with SHADHO over the course of one week, using 74 GPUs across two compute clusters to optimize U-Net for a cell segmentation task, discovering 515 models that achieve a lower validation loss than standard U-Net.
    Comment: 10 pages, 6 figures
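    The hardware-aware scheduling idea can be illustrated with a small sketch. This is not SHADHO's actual API, just our reading of the heuristic: weight each search space by its complexity and observed performance, then route higher-priority spaces to higher-throughput workers.

    ```python
    # Illustrative sketch of hardware-aware assignment (not SHADHO's real API):
    # rank search spaces by a complexity/performance heuristic and pair them
    # with workers ranked by hardware throughput.
    def assign_spaces(spaces, workers):
        """spaces: list of (space_name, complexity, best_loss) tuples.
        workers: list of (worker_id, throughput) tuples, e.g. GPU > CPU."""
        # More complex spaces, and spaces with lower loss so far, get priority.
        ranked_spaces = sorted(spaces, key=lambda s: s[1] / (s[2] + 1e-9), reverse=True)
        ranked_workers = sorted(workers, key=lambda w: w[1], reverse=True)
        # Send the highest-priority spaces to the fastest hardware.
        return {wid: name for (wid, _), (name, _, _) in zip(ranked_workers, ranked_spaces)}
    ```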

    Water Surfaces Detection from Sentinel-1 SAR Images using Deep Learning

    Nowadays, Synthetic Aperture Radar (SAR) images are widely used in industry and the scientific community for various remote sensing applications. The main advantage of SAR technology is that it can acquire images at night, since it does not require sunlight, and it can capture images through cloud cover, where traditional optical sensors are limited. SAR images are well suited to surface water detection because a calm, flat water surface reflects the radar energy away from the sensor, so surface water appears in a SAR image as dark pixels. The traditional way to extract water from SAR images is a simple thresholding method, in which a pixel is classified as water when its value is below a certain threshold. This works well in plain rural areas, but the complex features of urban areas make it more challenging: highways and buildings' shadows, for example, are easily misclassified as water. To address this problem, we propose the Fully Convolutional Neural Network (FCN) Encoder method, a deep learning model based on a convolutional implementation of sliding windows. The FCN Encoder is designed to detect water in SAR images by considering both a pixel's intensity and its spatial context (i.e., its neighborhood). In our experiments, we first train the network on the So2sat dataset, which contains patches of Sentinel-1 satellite SAR images. Next, we use the trained network to detect water in SAR images of several cities. The results show satisfactory scores and also appear visually accurate. In the final optimization phase, we: a) train the FCN Encoder with our custom HARD dataset, a dataset of images that are harder to classify, and b) optimize the hyperparameters of the model. We test the resulting classifier on public SAR images and compare it with other methods such as Smooth Labeling, Random Forest, and FCN Segmentation.
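    As a reference point, the baseline thresholding method mentioned above amounts to a one-line rule. Here is a minimal sketch, with an illustrative dB cutoff of our choosing, not the paper's:

    ```python
    # Minimal sketch of the baseline thresholding method: pixels darker than a
    # fixed backscatter threshold are labelled water. The -18 dB cutoff is an
    # illustrative assumption, not a value from the paper.
    import numpy as np

    def threshold_water_mask(sar_db: np.ndarray, threshold_db: float = -18.0) -> np.ndarray:
        """sar_db: 2-D array of SAR backscatter in dB (e.g. a Sentinel-1 VV band).
        Returns a boolean mask that is True where a pixel is classified as water."""
        return sar_db < threshold_db
    ```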