
    Directional k-Step Newton Methods in n Variables and its Semilocal Convergence Analysis

    The directional k-step Newton method (k a positive integer) is developed for solving a single nonlinear equation in n variables. Its semilocal convergence analysis is established using two different approaches (recurrent relations and recurrent functions) under the assumption that the first derivative satisfies a combination of the Lipschitz and the center-Lipschitz continuity conditions instead of the Lipschitz condition alone. Convergence theorems for the existence and uniqueness of the solution are established for each approach. Numerical examples, including nonlinear Hammerstein-type integral equations, are worked out, and significantly improved results are obtained. It is shown that the second approach, based on recurrent functions, solves problems that the first approach, using recurrent relations, fails to solve. This demonstrates the efficacy and applicability of both approaches. This work extends the directional one- and two-step Newton methods for solving a single nonlinear equation in n variables, whose semilocal convergence analysis using majorizing sequences is studied in Levin and Ben-Israel (Math. Comput. 71(237), 251–262, 2002) and Argyros and Hilout (Num. Algorithms 55(4), 503–528, 2010) under the assumption that the first derivative satisfies the Lipschitz condition and a combination of the Lipschitz and the center-Lipschitz continuity conditions, respectively. Finally, the computational order of convergence and the computational efficiency of the developed method are studied.

    The authors thank the referees for their fruitful suggestions, which uncovered several weaknesses and led to improvements in the paper. A. Kumar wishes to thank UGC-CSIR (Grant no. 2061441001), New Delhi, and IIT Kharagpur, India, for their financial assistance during this work.

    Kumar, A.; Gupta, D.; Martínez Molada, E.; Singh, S. (2018). Directional k-Step Newton Methods in n Variables and its Semilocal Convergence Analysis. Mediterranean Journal of Mathematics 15(2):15-34. https://doi.org/10.1007/s00009-018-1077-0

    References:
    Levin, Y., Ben-Israel, A.: Directional Newton methods in n variables. Math. Comput. 71(237), 251–262 (2002)
    Argyros, I.K., Hilout, S.: A convergence analysis for directional two-step Newton methods. Num. Algorithms 55(4), 503–528 (2010)
    Lukács, G.: The generalized inverse matrix and the surface-surface intersection problem. In: Theory and Practice of Geometric Modeling, pp. 167–185. Springer (1989)
    Argyros, I.K., Magreñán, Á.A.: Extending the applicability of Gauss–Newton method for convex composite optimization on Riemannian manifolds. Appl. Math. Comput. 249, 453–467 (2014)
    Argyros, I.K.: A semilocal convergence analysis for directional Newton methods. Math. Comput. 80(273), 327–343 (2011)
    Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. SIAM (2000)
    Argyros, I.K., Hilout, S.: Weaker conditions for the convergence of Newton's method. J. Complex. 28(3), 364–387 (2012)
    Argyros, I.K., Hilout, S.: On an improved convergence analysis of Newton's method. Appl. Math. Comput. 225, 372–386 (2013)
    Tapia, R.A.: The Kantorovich theorem for Newton's method. Am. Math. Mon. 78(4), 389–392 (1971)
    Argyros, I.K., George, S.: Local convergence for some high convergence order Newton-like methods with frozen derivatives. SeMA J. 70(1), 47–59 (2015)
    Martínez, E., Singh, S., Hueso, J.L., Gupta, D.K.: Enlarging the convergence domain in local convergence studies for iterative methods in Banach spaces. Appl. Math. Comput. 281, 252–265 (2016)
    Argyros, I.K., Behl, R., Motsa, S.S.: Ball convergence for a family of quadrature-based methods for solving equations in Banach space. Int. J. Comput. Methods, 1750017 (2016)
    Parhi, S.K., Gupta, D.K.: Convergence of Stirling's method under weak differentiability condition. Math. Methods Appl. Sci. 34(2), 168–175 (2011)
    Prashanth, M., Gupta, D.K.: A continuation method and its convergence for solving nonlinear equations in Banach spaces. Int. J. Comput. Methods 10(04), 1350021 (2013)
    Parida, P.K., Gupta, D.K.: Recurrence relations for semilocal convergence of a Newton-like method in Banach spaces. J. Math. Anal. Appl. 345(1), 350–361 (2008)
    Argyros, I.K., Hilout, S.: Convergence of directional methods under mild differentiability and applications. Appl. Math. Comput. 217(21), 8731–8746 (2011)
    Amat, S., Bermúdez, C., Hernández-Verón, M.A., Martínez, E.: On an efficient k-step iterative method for nonlinear equations. J. Comput. Appl. Math. 302, 258–271 (2016)
    Hernández-Verón, M.A., Martínez, E., Teruel, C.: Semilocal convergence of a k-step iterative process and its application for solving a special kind of conservative problems. Num. Algorithms, pp. 1–23
    Argyros, I.K., Hernández, M.A., Hilout, S., Romero, N.: Directional Chebyshev-type methods for solving equations. Math. Comput. 84(292), 815–830 (2015)
    Davis, P.J., Rabinowitz, P.: Methods of Numerical Integration. Courier Corporation (2007)
    Cordero, A., Torregrosa, J.R.: Variants of Newton's method using fifth-order quadrature formulas. Appl. Math. Comput. 190(1), 686–698 (2007)
    Weerakoon, S., Fernando, T.G.I.: A variant of Newton's method with accelerated third-order convergence. Appl. Math. Lett. 13(8), 87–93 (2000)
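    As a rough illustration of the kind of scheme the abstract describes, the sketch below implements one plausible directional k-step Newton iteration for a single equation f(x) = 0 in n variables, with the search direction taken as the normalized gradient and the directional derivative frozen over the k inner substeps. The function names, the choice of direction, and the stopping rule are assumptions for illustration; the paper's actual scheme and its semilocal convergence conditions may differ.

```python
import numpy as np

def directional_k_step_newton(f, grad_f, x0, k=2, tol=1e-10, max_iter=50):
    """Sketch of a directional k-step Newton iteration for a single
    equation f(x) = 0 with x in R^n.

    Assumptions (not taken from the paper itself): the direction is the
    normalized gradient at the start of each cycle, and the directional
    derivative is frozen over the k inner substeps.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)                      # first derivative (gradient of f)
        d = g / np.linalg.norm(g)          # search direction
        slope = g @ d                      # directional derivative f'(x; d)
        z = x
        for _ in range(k):                 # k substeps with the frozen slope
            z = z - (f(z) / slope) * d
        x = z
        if abs(f(x)) < tol:
            break
    return x

# Example: f(x, y) = x^2 + y^2 - 4, a single equation in two variables.
f = lambda v: v[0]**2 + v[1]**2 - 4.0
grad_f = lambda v: np.array([2.0 * v[0], 2.0 * v[1]])
print(directional_k_step_newton(f, grad_f, x0=[3.0, 1.0], k=3))
```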

    Stochastic Training of Neural Networks via Successive Convex Approximations

    This paper proposes a new family of algorithms for training neural networks (NNs), based on recent developments in non-convex optimization known collectively as successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, high-dimensional) learning problem with a sequence of (strongly convex) approximations that are both accurate and simple to optimize. Unlike related ideas (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information of the neural network function, in a stochastic fashion, while exploiting the overall structure of the learning problem for faster convergence. We discuss several use cases, based on different choices of the loss function (e.g., squared loss and cross-entropy loss) and of the regularization of the NN's weights. We experiment on several medium-sized benchmark problems and on a large-scale dataset involving simulated physical data. The results show that the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show that the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be selected depending entirely on the available computational power.

    Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning Systems
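    A minimal sketch of one stochastic SCA update is given below, assuming the simplest possible surrogate: the loss is linearized at the current weights on a mini-batch, a proximal term makes the surrogate strongly convex, and an L2 regularizer is kept intact so that the surrogate minimizer has a closed form. The helper names (sca_step, grad_batch) and the parameter values are hypothetical; the paper's surrogates additionally exploit the structure of the network, which is omitted here.

```python
import numpy as np

def sca_step(w, grad_batch, tau=1.0, lam=1e-3, gamma=0.5):
    """One stochastic SCA update with an L2-regularized surrogate.

    Minimal sketch, not the paper's full construction: the loss is
    linearized at w on a mini-batch, a proximal term (tau/2)||w - w_t||^2
    makes the surrogate strongly convex, and the L2 regularizer is kept
    intact, so the surrogate minimizer has a closed form.
    """
    g = grad_batch(w)                      # stochastic gradient of the loss
    w_hat = (tau * w - g) / (tau + lam)    # argmin of the convex surrogate
    return w + gamma * (w_hat - w)         # convex combination (step size gamma)

# Toy usage: ridge-regularized least squares, with the full-batch gradient
# standing in for the stochastic oracle.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
w = np.zeros(5)
for t in range(200):
    gamma_t = 2.0 / (t + 2)                # diminishing step size
    w = sca_step(w, lambda v: A.T @ (A @ v - b) / len(b), gamma=gamma_t)
print(w)
```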

    Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

    A central challenge in many fields of science and engineering involves minimizing non-convex error functions over continuous, high-dimensional spaces. Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such minimizations, and it is often thought that a main source of difficulty for these local methods in finding the global minimum is the proliferation of local minima with much higher error than the global minimum. Here we argue, based on results from statistical physics, random matrix theory, neural network theory, and empirical evidence, that a deeper and more profound difficulty originates from the proliferation of saddle points, not local minima, especially in high-dimensional problems of practical interest. Such saddle points are surrounded by high-error plateaus that can dramatically slow down learning and give the illusory impression of the existence of a local minimum. Motivated by these arguments, we propose a new approach to second-order optimization, the saddle-free Newton method, that can rapidly escape high-dimensional saddle points, unlike gradient descent and quasi-Newton methods. We apply this algorithm to deep and recurrent neural network training, and provide numerical evidence for its superior optimization performance.

    Comment: The theoretical review and analysis in this article draw heavily from arXiv:1405.4604 [cs.LG]
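    The update at the heart of the saddle-free Newton method rescales the gradient by the inverse of |H|, the Hessian with its eigenvalues replaced by their absolute values, so that negative-curvature directions are followed away from a saddle rather than toward it. Below is a small dense sketch of that step on a toy quadratic saddle; the paper itself works in a low-rank Krylov subspace to keep the step tractable for neural networks, which this illustration omits.

```python
import numpy as np

def saddle_free_newton_step(grad, hess, eps=1e-4):
    """One dense saddle-free Newton step: rescale the gradient by the
    inverse of |H|, the Hessian with its eigenvalues replaced by their
    absolute values (clamped away from zero).

    Dense sketch for small problems only.
    """
    eigval, eigvec = np.linalg.eigh(hess)            # H = V diag(lambda) V^T
    abs_val = np.maximum(np.abs(eigval), eps)        # |lambda|, clamped
    return -eigvec @ ((eigvec.T @ grad) / abs_val)   # -|H|^{-1} grad

# Example: a saddle of f(x, y) = x^2 - y^2 at the origin.
x = np.array([0.5, 0.5])
grad = np.array([2 * x[0], -2 * x[1]])
hess = np.diag([2.0, -2.0])
print(x + saddle_free_newton_step(grad, hess))       # moves away from the saddle along y
```

    On this example a plain Newton step would jump straight to the saddle at the origin, whereas the saddle-free step drives the x-coordinate toward its minimizer while pushing the y-coordinate further away, which is the behaviour the abstract describes.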