
Do optimization methods in deep learning applications matter?

Authors: Kiran Mariam; Ozyildirim Buse Melis
Publication venue: eScholarship, University of California
Publication date: 28/02/2020

With advances in deep learning, exponential data growth and increasing model complexity, the development of efficient optimization methods is attracting much research attention. Several implementations favor Conjugate Gradient (CG) and Stochastic Gradient Descent (SGD) as practical and elegant ways to achieve quick convergence; however, these optimization processes also present many limitations in learning across deep learning applications. Recent research is exploring higher-order optimization functions as better approaches, but these present very complex computational challenges for practical use. Comparing first- and higher-order optimization functions, our experiments in this paper reveal that Levenberg-Marquardt (LM) achieves significantly better convergence but suffers from very long processing time, which increases the training complexity of both classification and reinforcement learning problems. Our experiments compare off-the-shelf optimization functions (CG, SGD, LM and L-BFGS) in standard CIFAR, MNIST, CartPole and FlappyBird experiments. The paper presents arguments on which optimization functions to use and, further, which functions would benefit from parallelization efforts to improve pretraining time and learning rate convergence.

arXiv.org e-Print Archive

eScholarship - University of California
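
The first-order versus higher-order trade-off described in the abstract above can be illustrated with a minimal pure-Python sketch. This is not the paper's experimental setup: the two-parameter model y = a*exp(b*x), the synthetic noiseless data, the step size, the damping schedule and all function names are assumptions chosen only to keep the example small.

```python
import math

# Hypothetical toy data: y = 2 * exp(-x), sampled without noise.
XS = [0.25 * i for i in range(9)]
YS = [2.0 * math.exp(-x) for x in XS]


def residuals(a, b):
    """Model error a*exp(b*x) - y at every sample."""
    return [a * math.exp(b * x) - y for x, y in zip(XS, YS)]


def loss(a, b):
    """Half the sum of squared residuals."""
    return 0.5 * sum(r * r for r in residuals(a, b))


def normal_terms(a, b):
    """Return the entries of J^T J and J^T r for the current parameters."""
    saa = sab = sbb = ga = gb = 0.0
    for x, r in zip(XS, residuals(a, b)):
        ja = math.exp(b * x)   # d(residual)/da
        jb = a * x * ja        # d(residual)/db
        saa += ja * ja
        sab += ja * jb
        sbb += jb * jb
        ga += ja * r
        gb += jb * r
    return saa, sab, sbb, ga, gb


def gradient_descent(a, b, lr=0.02, tol=1e-12, max_iter=20000):
    """First-order method: step along the negative gradient J^T r."""
    for k in range(max_iter):
        if loss(a, b) < tol:
            return a, b, k
        _, _, _, ga, gb = normal_terms(a, b)
        a, b = a - lr * ga, b - lr * gb
    return a, b, max_iter


def levenberg_marquardt(a, b, lam=1e-3, tol=1e-12, max_iter=200):
    """Basic LM loop: solve (J^T J + lam*I) delta = -J^T r, adapt lam."""
    for k in range(max_iter):
        current = loss(a, b)
        if current < tol:
            return a, b, k
        saa, sab, sbb, ga, gb = normal_terms(a, b)
        m11, m22 = saa + lam, sbb + lam
        det = m11 * m22 - sab * sab
        da = (-ga * m22 + sab * gb) / det
        db = (-gb * m11 + sab * ga) / det
        if loss(a + da, b + db) < current:
            a, b, lam = a + da, b + db, lam / 10  # accept, trust the model more
        else:
            lam *= 10                             # reject, damp more strongly
    return a, b, max_iter


a_gd, b_gd, n_gd = gradient_descent(1.0, 0.0)
a_lm, b_lm, n_lm = levenberg_marquardt(1.0, 0.0)
print("gradient descent iterations:", n_gd)
print("Levenberg-Marquardt iterations:", n_lm)
```

On this toy problem LM reaches the tolerance in far fewer iterations than gradient descent, but each LM iteration builds and solves a damped linear system whose size grows with the number of parameters. That per-iteration cost is one plausible reading of the long processing time the abstract reports for LM.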

Robust optimization of control parameters for WEC arrays using stochastic methods

Authors: Ciaramella Gabriele; Gambarini Marco; Miglio Edie; Vanzan Tommaso
Publication date: 29/08/2023

This work presents a new computational optimization framework for the robust control of parks of Wave Energy Converters (WEC) in irregular waves. The power of WEC parks is maximized with respect to the individual control damping and stiffness coefficients of each device. The results are robust with respect to the incident wave direction, which is treated as a random variable. Hydrodynamic properties are computed using the linear potential model, and the dynamics of the system are computed in the frequency domain. A slamming constraint is enforced to ensure that the results are physically realistic. We show that the stochastic optimization problem is well posed. Two optimization approaches for dealing with stochasticity are then considered: stochastic approximation and sample average approximation. The accuracy and computational time of these two methods are presented. The results of the optimization for complex and realistic array configurations of possible engineering interest are then discussed. Results of extensive numerical experiments demonstrate the efficiency of the proposed computational framework.

arXiv.org e-Print Archive
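
The two approaches named in the abstract above, stochastic approximation and sample average approximation, can be contrasted on a toy problem. The sketch below is an illustration only and does not model a WEC array: the scalar control parameter x, the quadratic cost (x - cos(theta))^2, the uniform distribution of the direction theta, the step sizes and all function names are assumptions.

```python
import math
import random


def cost_gradient(x, theta):
    """d/dx of the hypothetical cost (x - cos(theta))**2."""
    return 2.0 * (x - math.cos(theta))


def sample_direction(rng):
    """Random incident direction; uniform on [-pi/4, pi/4] is an assumption."""
    return rng.uniform(-math.pi / 4, math.pi / 4)


def stochastic_approximation(rng, n_steps, x0=0.0):
    """Robbins-Monro style: one fresh sample per step, decreasing step size."""
    x = x0
    for k in range(n_steps):
        theta = sample_direction(rng)
        step = 0.5 / (k + 1)
        x -= step * cost_gradient(x, theta)
    return x


def sample_average_approximation(rng, n_samples, x0=0.0, lr=0.25, n_iters=60):
    """Draw the samples once, then minimize the fixed sample average
    with plain deterministic gradient descent."""
    thetas = [sample_direction(rng) for _ in range(n_samples)]
    x = x0
    for _ in range(n_iters):
        grad = sum(cost_gradient(x, t) for t in thetas) / n_samples
        x -= lr * grad
    return x


# For this toy cost the exact minimizer is E[cos(theta)] = 2*sqrt(2)/pi.
exact = 2.0 * math.sqrt(2.0) / math.pi
x_sa = stochastic_approximation(random.Random(0), n_steps=20000)
x_saa = sample_average_approximation(random.Random(1), n_samples=2000)
print("exact:", exact, "SA:", x_sa, "SAA:", x_saa)
```

Stochastic approximation uses one cheap gradient sample per iteration, while sample average approximation pays for every sample at every iteration of a deterministic solver. That difference is the kind of accuracy versus computational time trade-off the abstract says it reports.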

Neural Networks: Training and Application to Nonlinear System Identification and Control

Author: Khodabandehlou Hamid
Publication date: 06/07/2018

This dissertation investigates training neural networks for system identification and classification. The research contains two main contributions, as follows:
1. Reducing the number of hidden layer nodes using a feedforward component
This research reduces the number of hidden layer nodes and the training time of neural networks, making them better suited to online identification and control applications, by adding a parallel feedforward component. Implementing the feedforward component with a wavelet neural network and an echo state network provides good models for nonlinear systems. The wavelet neural network with a feedforward component, together with a model predictive controller, can reliably identify and control a seismically isolated structure during an earthquake. The network model provides the predictions for model predictive control. Simulations of a 5-story seismically isolated structure with conventional lead-rubber bearings showed significant reductions of all response amplitudes for both near-field (pulse) and far-field ground motions, including reduced deformations along with a corresponding reduction in acceleration response. The controller effectively regulated the apparent stiffness at the isolation level. The approach is also applied to the online identification and control of an unmanned vehicle. Lyapunov theory is used to prove the stability of the wavelet neural network and the model predictive controller.
2. Training neural networks using trajectory-based optimization approaches
Training a neural network is a nonlinear, non-convex optimization problem that determines the weights of the network. Traditional training algorithms can be inefficient and can get trapped in local minima. Two global optimization approaches are adapted to train neural networks and avoid the local minima problem. Lyapunov theory is used to prove the stability of the proposed methodology and its convergence in the presence of measurement errors.
The first approach transforms the constraint satisfaction problem into an unconstrained optimization problem. The constraints define a quotient gradient system (QGS) whose stable equilibrium points are local minima of the unconstrained optimization. The QGS is integrated to determine local minima, and the local minimum with the best generalization performance is chosen as the optimal solution. The second approach uses the QGS together with a projected gradient system (PGS). The PGS is a nonlinear dynamical system, defined from the optimization problem, that searches the components of the feasible region for solutions. Lyapunov theory is used to prove the stability of the PGS and QGS and their stability in the presence of measurement noise.

University of Nevada, Reno ScholarWorks Repository
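
The idea of integrating a quotient gradient system to find points that satisfy a constraint can be sketched on a toy problem. The sketch assumes the commonly used form dx/dt = -Dh(x)^T h(x) for a constraint h(x) = 0, which the abstract does not spell out. The unit-circle constraint, the forward Euler integrator, the step size and the function names are hypothetical and are not taken from the dissertation.

```python
def constraint(x1, x2):
    """Toy constraint h(x) = x1^2 + x2^2 - 1, zero on the unit circle."""
    return x1 * x1 + x2 * x2 - 1.0


def qgs_rhs(x1, x2):
    """Right-hand side -Dh(x)^T h(x), with Dh(x) = [2*x1, 2*x2]."""
    h = constraint(x1, x2)
    return -2.0 * x1 * h, -2.0 * x2 * h


def integrate_qgs(x1, x2, dt=0.05, n_steps=500):
    """Follow the trajectory with fixed-step forward Euler (an assumption;
    a fixed step can diverge for starting points far from the circle)."""
    for _ in range(n_steps):
        d1, d2 = qgs_rhs(x1, x2)
        x1, x2 = x1 + dt * d1, x2 + dt * d2
    return x1, x2


# Different starting points settle on different equilibria of the system.
for start in [(2.0, 1.0), (0.1, -0.3)]:
    end = integrate_qgs(*start)
    print(start, "->", end, "h =", constraint(*end))
```

Each trajectory decreases 0.5*h(x)^2 and settles where h(x) = 0, so the equilibria reached from different starting points are candidate solutions. Choosing among such candidates by generalization performance, as the abstract describes, would be a separate step that this sketch does not cover.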