
    Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks

    Compared to traditional machine learning models, deep neural networks (DNNs) are known to be highly sensitive to the choice of hyperparameters. While the time and effort required for manual tuning have decreased rapidly for well-developed and commonly used DNN architectures, hyperparameter optimization will undoubtedly remain a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs to be improved further. For hyperparameter optimization of general machine learning problems, numerous automated solutions have been developed, and some of the most popular are based on Bayesian Optimization (BO). In this work, we analyze four fundamental strategies for enhancing BO when it is used for DNN hyperparameter optimization: diversification, early termination, parallelization, and cost function transformation. Based on the analysis, we provide a simple yet robust algorithm for DNN hyperparameter optimization, DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO mostly outperformed well-known solutions including GP-Hedge, BOHB, and speed-up variants that use the Median Stopping Rule or Learning Curve Extrapolation. In fact, DEEP-BO consistently provided the top, or close to the top, performance across all benchmark types we tested, indicating that it is a robust solution compared to existing approaches. The DEEP-BO code is publicly available at https://github.com/snu-adsl/DEEP-BO.
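    As context for the strategies named above, below is a minimal sketch of a Bayesian-optimization loop with a median-stopping-style early-termination check. The Gaussian-process surrogate, expected-improvement acquisition, search bounds, and the stand-in objective are illustrative assumptions; this is not DEEP-BO's actual API or implementation.

    ```python
    # Hypothetical sketch: BO over a 2-D hyperparameter space (log learning rate, dropout)
    # with a median-stopping early-termination check. The objective is a stand-in for
    # "train a DNN and return its validation error"; it is NOT DEEP-BO's code.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    def objective(hp):                      # placeholder for a full training run
        lr, drop = hp
        return (lr + 3.0) ** 2 + (drop - 0.2) ** 2 + 0.01 * np.random.rand()

    def expected_improvement(gp, candidates, y_best):
        mu, sigma = gp.predict(candidates, return_std=True)
        z = (y_best - mu) / np.maximum(sigma, 1e-9)
        return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    rng = np.random.default_rng(0)
    bounds = np.array([[-6.0, -1.0], [0.0, 0.5]])              # log10(lr), dropout rate
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))   # initial random designs
    y = np.array([objective(x) for x in X])

    for it in range(20):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(256, 2))
        x_next = cand[np.argmax(expected_improvement(gp, cand, y.min()))]

        # Early termination (median-stopping flavour): abandon a configuration whose
        # partial result is already worse than the median of completed evaluations.
        partial = objective(x_next) * 1.1       # stand-in for a short, partial training run
        if partial > np.median(y):
            continue
        X = np.vstack([X, x_next]); y = np.append(y, objective(x_next))

    print("best hyperparameters:", X[np.argmin(y)], "val. error:", y.min())
    ```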

    Automated-tuned hyper-parameter deep neural network by using arithmetic optimization algorithm for Lorenz chaotic system

    Deep neural networks (DNNs) are highly dependent on their parameterization and typically require experts to decide which method to implement and how to set the hyper-parameter values. This study proposes automated hyper-parameter tuning for a DNN using a metaheuristic optimization algorithm, the arithmetic optimization algorithm (AOA). AOA exploits the distribution properties of the primary arithmetic operators of mathematics: multiplication, division, addition, and subtraction. It is mathematically modeled and implemented to optimize processes across a broad range of search spaces; its performance has been evaluated against 29 benchmark functions, and several real-world engineering design problems demonstrate its applicability. The hyper-parameter tuning framework consists of a set of Lorenz chaotic system datasets, a hybrid DNN architecture, and AOA, and it operates automatically. As a result, AOA produced the highest accuracy on the test dataset with a combination of optimized hyper-parameters for the DNN architecture. Boxplot analysis further indicated that AOA with ten particles was the most accurate choice: it had the smallest boxplot spread for all hyper-parameters and therefore yielded the best solution. In particular, the proposed system outperformed the same architecture tuned with particle swarm optimization.
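    For reference, below is a compact sketch of the AOA position-update rules as commonly formulated (a math-optimizer-accelerated schedule and a math-optimizer-probability schedule driving division/multiplication exploration and subtraction/addition exploitation). The bounds, schedule constants, and toy fitness function are assumptions for illustration; this is not the paper's hyper-parameter tuning framework.

    ```python
    # Illustrative sketch of the arithmetic optimization algorithm (AOA) on a toy
    # fitness function; constants and bounds are assumptions for demonstration only.
    import numpy as np

    def fitness(x):                     # stand-in for "validation loss of the tuned DNN"
        return np.sum((x - 0.3) ** 2)

    def aoa(n_particles=10, dim=4, iters=200, lb=0.0, ub=1.0,
            alpha=5.0, mu=0.5, moa_min=0.2, moa_max=1.0, eps=1e-12):
        rng = np.random.default_rng(0)
        pop = rng.uniform(lb, ub, size=(n_particles, dim))
        best = min(pop, key=fitness).copy()

        for t in range(1, iters + 1):
            moa = moa_min + t * (moa_max - moa_min) / iters             # accelerated schedule
            mop = 1.0 - t ** (1.0 / alpha) / iters ** (1.0 / alpha)     # probability schedule
            for i in range(n_particles):
                for j in range(dim):
                    r1, r2, r3 = rng.random(3)
                    scale = (ub - lb) * mu + lb
                    if r1 > moa:        # exploration: division / multiplication operators
                        new = best[j] / (mop + eps) * scale if r2 < 0.5 else best[j] * mop * scale
                    else:               # exploitation: subtraction / addition operators
                        new = best[j] - mop * scale if r3 < 0.5 else best[j] + mop * scale
                    pop[i, j] = np.clip(new, lb, ub)
                if fitness(pop[i]) < fitness(best):
                    best = pop[i].copy()
        return best, fitness(best)

    print(aoa())
    ```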

    Construction of Reduced Order Models for Fluid Flows Using Deep Feedforward Neural Networks

    We present a numerical methodology for the construction of reduced order models (ROMs) of fluid flows through the combination of flow modal decomposition and regression analysis. Spectral proper orthogonal decomposition (SPOD) is applied to reduce the dimensionality of the model and, at the same time, filter the POD temporal modes. The regression step is performed by a deep feedforward neural network (DNN), and the current framework is implemented in a context similar to the sparse identification of nonlinear dynamics algorithm (SINDy). A discussion on the optimization of the DNN hyperparameters is provided for obtaining the best ROMs, and an assessment of these models is presented for a canonical nonlinear oscillator and the compressible flow past a cylinder. Then, the method is tested on the reconstruction of a turbulent flow computed by a large eddy simulation of a plunging airfoil under dynamic stall. The reduced order model is able to capture the dynamics of the leading edge stall vortex and the subsequent trailing edge vortex. For the cases analyzed, the numerical framework allows the prediction of the flowfield beyond the training window using larger time increments than those employed by the full order model. We also demonstrate the robustness of the current ROMs constructed via deep feedforward neural networks through a comparison with sparse regression. The DNN approach is able to learn transient features of the flow and presents more accurate and stable long-term predictions compared to sparse regression.
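    To make the modal-decomposition-plus-regression framework concrete, below is a minimal sketch that uses plain POD via SVD (rather than the paper's SPOD filtering) and a small scikit-learn MLP (rather than the authors' deep feedforward network) to learn the one-step map on modal coefficients and roll it out beyond the training window. The snapshot data is synthetic and all sizes are illustrative assumptions.

    ```python
    # Minimal sketch: POD (via SVD) reduces snapshot data to a few modal coefficients,
    # and a feedforward network learns the map a(t) -> a(t + dt), which is then rolled
    # out beyond the training window. Synthetic data; sizes are illustrative assumptions.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Synthetic "flow" snapshots: columns are flow states at successive time steps.
    nx, nt = 200, 400
    x = np.linspace(0, 2 * np.pi, nx)[:, None]
    t = np.linspace(0, 20, nt)[None, :]
    snapshots = np.sin(x - t) + 0.3 * np.sin(2 * x + 3 * t)          # shape (nx, nt)

    # POD: keep the leading r spatial modes and their temporal coefficients.
    U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
    r = 4
    modes = U[:, :r]                                  # spatial modes
    coeffs = (np.diag(s[:r]) @ Vt[:r, :]).T           # temporal coefficients, shape (nt, r)

    # Regression step: learn the one-step map on the first 300 snapshots.
    n_train = 300
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    net.fit(coeffs[:n_train - 1], coeffs[1:n_train])

    # Roll the ROM forward past the training window and reconstruct the field.
    a = coeffs[n_train - 1]
    pred = []
    for _ in range(nt - n_train):
        a = net.predict(a[None, :])[0]
        pred.append(a)
    rom_field = modes @ np.array(pred).T              # reconstructed snapshots, (nx, nt - n_train)
    err = np.linalg.norm(rom_field - snapshots[:, n_train:]) / np.linalg.norm(snapshots[:, n_train:])
    print(f"relative reconstruction error beyond training window: {err:.3f}")
    ```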