Directional k-Step Newton Methods in n Variables and its Semilocal Convergence Analysis
The directional k-step Newton methods (k a positive integer) are developed for solving a single nonlinear equation in n variables. Their semilocal convergence analysis is established using two different approaches (recurrence relations and recurrence functions) under the assumption that the first derivative satisfies a combination of the Lipschitz and the center-Lipschitz continuity conditions, instead of the Lipschitz condition alone. Convergence theorems for the existence and uniqueness of the solution are established for each approach. Numerical examples, including nonlinear Hammerstein-type integral equations, are worked out, and significantly improved results are obtained. It is shown that the second approach, based on recurrence functions, solves problems that the first one, using recurrence relations, fails to solve; this demonstrates the efficacy and applicability of these approaches. This work extends the directional one- and two-step Newton methods for solving a single nonlinear equation in n variables, whose semilocal convergence analyses using majorizing sequences are studied in Levin and Ben-Israel (Math. Comput. 71(237), 251–262, 2002) and Argyros and Hilout (Num. Algorithms 55(4), 503–528, 2010) under the assumption that the first derivative satisfies the Lipschitz condition and a combination of the Lipschitz and center-Lipschitz conditions, respectively. Finally, the computational order of convergence and the computational efficiency of the developed methods are studied.

The authors thank the referees for their fruitful suggestions, which uncovered several weaknesses and led to the improvement of the paper. A. Kumar wishes to thank UGC-CSIR (Grant no. 2061441001), New Delhi, and IIT Kharagpur, India, for their financial assistance during this work.

Kumar, A.; Gupta, D.; Martínez Molada, E.; Singh, S. (2018). Directional k-Step Newton Methods in n Variables and its Semilocal Convergence Analysis. Mediterranean Journal of Mathematics. 15(2):15-34.
https://doi.org/10.1007/s00009-018-1077-0
Levin, Y., Ben-Israel, A.: Directional Newton methods in n variables. Math. Comput. 71(237), 251–262 (2002)
Argyros, I.K., Hilout, S.: A convergence analysis for directional two-step Newton methods. Num. Algorithms 55(4), 503–528 (2010)
Lukács, G.: The generalized inverse matrix and the surface-surface intersection problem. In: Theory and Practice of Geometric Modeling, pp. 167–185. Springer (1989)
Argyros, I.K., Magreñán, Á.A.: Extending the applicability of Gauss–Newton method for convex composite optimization on Riemannian manifolds. Appl. Math. Comput. 249, 453–467 (2014)
Argyros, I.K.: A semilocal convergence analysis for directional Newton methods. Math. Comput. 80(273), 327–343 (2011)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. SIAM (2000)
Argyros, I.K., Hilout, S.: Weaker conditions for the convergence of Newton's method. J. Complex. 28(3), 364–387 (2012)
Argyros, I.K., Hilout, S.: On an improved convergence analysis of Newton's method. Appl. Math. Comput. 225, 372–386 (2013)
Tapia, R.A.: The Kantorovich theorem for Newton's method. Am. Math. Mon. 78(4), 389–392 (1971)
Argyros, I.K., George, S.: Local convergence for some high convergence order Newton-like methods with frozen derivatives. SeMA J. 70(1), 47–59 (2015)
Martínez, E., Singh, S., Hueso, J.L., Gupta, D.K.: Enlarging the convergence domain in local convergence studies for iterative methods in Banach spaces. Appl. Math. Comput. 281, 252–265 (2016)
Argyros, I.K., Behl, R., Motsa, S.S.: Ball convergence for a family of quadrature-based methods for solving equations in Banach space. Int. J. Comput. Methods, 1750017 (2016)
Parhi, S.K., Gupta, D.K.: Convergence of Stirling's method under weak differentiability condition. Math. Methods Appl. Sci. 34(2), 168–175 (2011)
Prashanth, M., Gupta, D.K.: A continuation method and its convergence for solving nonlinear equations in Banach spaces. Int. J. Comput. Methods 10(04), 1350021 (2013)
Parida, P.K., Gupta, D.K.: Recurrence relations for semilocal convergence of a Newton-like method in Banach spaces. J. Math. Anal. Appl. 345(1), 350–361 (2008)
Argyros, I.K., Hilout, S.: Convergence of directional methods under mild differentiability and applications. Appl. Math. Comput. 217(21), 8731–8746 (2011)
Amat, S., Bermúdez, C., Hernández-Verón, M.A., Martínez, E.: On an efficient k-step iterative method for nonlinear equations. J. Comput. Appl. Math. 302, 258–271 (2016)
Hernández-Verón, M.A., Martínez, E., Teruel, C.: Semilocal convergence of a k-step iterative process and its application for solving a special kind of conservative problems. Num. Algorithms, pp. 1–23
Argyros, I.K., Hernández, M.A., Hilout, S., Romero, N.: Directional Chebyshev-type methods for solving equations. Math. Comput. 84(292), 815–830 (2015)
Davis, P.J., Rabinowitz, P.: Methods of Numerical Integration. Courier Corporation (2007)
Cordero, A., Torregrosa, J.R.: Variants of Newton's method using fifth-order quadrature formulas. Appl. Math. Comput. 190(1), 686–698 (2007)
Weerakoon, S., Fernando, T.G.I.: A variant of Newton's method with accelerated third-order convergence. Appl. Math. Lett. 13(8), 87–93 (2000)
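As a concrete illustration of the scheme the abstract describes, the sketch below implements a directional k-step Newton iteration for one equation f(x) = 0 with x in R^n. All names are illustrative, and the choice of direction d = ∇f/||∇f|| is one common option; this is not the paper's exact formulation.

```python
import numpy as np

def directional_newton(f, grad_f, x0, k=2, tol=1e-10, max_iter=100):
    """Sketch of a directional k-step Newton iteration for a single
    equation f(x) = 0, x in R^n. Each outer step freezes a direction d
    and the directional derivative g.d, then takes k Newton corrections
    of the one-dimensional problem along d."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        if abs(f(x)) < tol:
            break
        g = grad_f(x)
        d = g / np.linalg.norm(g)   # illustrative choice: gradient direction
        slope = g @ d               # directional derivative, frozen below
        for _ in range(k):          # the "k-step" part: reuse the frozen slope
            x = x - (f(x) / slope) * d
    return x
```

Freezing the directional derivative for k inner steps is what saves derivative evaluations relative to restarting the one-step method k times.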
Stochastic Training of Neural Networks via Successive Convex Approximations
This paper proposes a new family of algorithms for training neural networks
(NNs). These are based on recent developments in the field of non-convex
optimization, going under the general name of successive convex approximation
(SCA) techniques. The basic idea is to iteratively replace the original
(non-convex, high-dimensional) learning problem with a sequence of (strongly
convex) approximations, which are both accurate and simple to optimize.
Unlike similar approaches (e.g., quasi-Newton algorithms), the
approximations can be constructed using only first-order information of the
neural network function, in a stochastic fashion, while exploiting the overall
structure of the learning problem for a faster convergence. We discuss several
use cases, based on different choices for the loss function (e.g., squared loss
and cross-entropy loss), and for the regularization of the NN's weights. We
experiment on several medium-sized benchmark problems, and on a large-scale
dataset involving simulated physical data. The results show how the algorithm
outperforms state-of-the-art techniques, providing faster convergence to a
better minimum. Additionally, we show how the algorithm can be easily
parallelized over multiple computational units without hindering its
performance. In particular, each computational unit can optimize a tailored
surrogate function defined on a randomly assigned subset of the input
variables, whose dimension can be selected depending entirely on the available
computational power.

Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning Systems.
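The surrogate idea can be sketched in a few lines. The quadratic surrogate below uses only first-order information plus a proximal term; the parameter names (`rho`, `step`) and the closed-form minimizer are illustrative assumptions for this simple L2-regularized case, not the paper's general construction.

```python
import numpy as np

def sca_train(grad_f, w0, reg=1e-3, rho=1.0, step=0.5, n_iter=200):
    """Sketch of one SCA variant: around the current iterate w, build the
    strongly convex surrogate
        u(v) = grad_f(w)^T (v - w) + (rho/2)||v - w||^2 + (reg/2)||v||^2,
    minimize it in closed form, and move part-way toward its minimizer."""
    w = np.asarray(w0, dtype=float)
    for _ in range(n_iter):
        g = grad_f(w)                  # first-order information only
        # surrogate stationarity: g + rho*(v - w) + reg*v = 0
        v = (rho * w - g) / (rho + reg)
        w = w + step * (v - w)         # damped update toward the minimizer
    return w
```

At a fixed point, v = w implies grad_f(w) + reg*w = 0, i.e. a stationary point of the regularized objective; the proximal weight `rho` controls how conservative each surrogate is.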
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
A central challenge to many fields of science and engineering involves
minimizing non-convex error functions over continuous, high-dimensional spaces.
Gradient descent or quasi-Newton methods are almost ubiquitously used to
perform such minimizations, and it is often thought that a main source of
difficulty for these local methods to find the global minimum is the
proliferation of local minima with much higher error than the global minimum.
Here we argue, based on results from statistical physics, random matrix theory,
neural network theory, and empirical evidence, that a deeper and more profound
difficulty originates from the proliferation of saddle points, not local
minima, especially in high-dimensional problems of practical interest. Such
saddle points are surrounded by high error plateaus that can dramatically slow
down learning, and give the illusory impression of the existence of a local
minimum. Motivated by these arguments, we propose a new approach to
second-order optimization, the saddle-free Newton method, that can rapidly
escape high-dimensional saddle points, unlike gradient descent and quasi-Newton
methods. We apply this algorithm to deep or recurrent neural network training,
and provide numerical evidence for its superior optimization performance.

Comment: The theoretical review and analysis in this article draw heavily from arXiv:1405.4604 [cs.LG].
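The saddle-free Newton step described above can be sketched with an exact Hessian eigendecomposition. This is a simplification: for large networks the paper approximates the computation rather than decomposing the full Hessian, and the damping constant `eps` here is an illustrative assumption.

```python
import numpy as np

def saddle_free_newton_step(grad, hess, eps=1e-8):
    """One saddle-free Newton step: precondition the gradient by |H|^{-1},
    where |H| replaces every Hessian eigenvalue by its absolute value.
    Negative-curvature directions then repel the iterate from a saddle,
    whereas the plain Newton step -H^{-1} g is attracted to it."""
    eigvals, eigvecs = np.linalg.eigh(hess)           # H assumed symmetric
    inv_abs = 1.0 / np.maximum(np.abs(eigvals), eps)  # damp tiny curvature
    return -(eigvecs * inv_abs) @ eigvecs.T @ grad    # -V |D|^{-1} V^T g
```

On the saddle f(x, y) = x^2 - y^2 at (1, 1), plain Newton steps straight to the saddle at the origin, while this step moves away along the negative-curvature y direction.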