Ensemble Kalman filter for neural network based one-shot inversion
We study the use of novel techniques arising in machine learning for inverse problems. Our approach replaces the complex forward model by a neural network, which is trained simultaneously, in a one-shot sense, while estimating the unknown parameters from data, i.e. the neural network is trained only for the unknown parameter. By establishing a link to the Bayesian approach to inverse problems, an algorithmic framework is developed which ensures the feasibility of the parameter estimate with respect to the forward model. We propose an efficient, derivative-free optimization method based on variants of ensemble Kalman inversion. Numerical experiments show that the ensemble Kalman filter for neural network based one-shot inversion is a promising direction combining optimization and machine learning techniques for inverse problems.
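The abstract names ensemble Kalman inversion (EKI) but does not spell out the update. For reference, a standard EKI iteration with perturbed observations looks roughly like the sketch below; the names are illustrative, and `forward` stands in for the network-based forward map described in the paper.

```python
import numpy as np

def eki_step(ensemble, forward, y, gamma, rng):
    """One ensemble Kalman inversion (EKI) update.

    ensemble: (J, d) array of J parameter particles
    forward:  maps a (d,) parameter vector to a (k,) prediction
    y:        (k,) observed data
    gamma:    (k, k) observation-noise covariance
    """
    J = ensemble.shape[0]
    preds = np.array([forward(u) for u in ensemble])       # (J, k)
    u_mean, g_mean = ensemble.mean(0), preds.mean(0)
    # Empirical cross- and prediction-covariances.
    C_ug = (ensemble - u_mean).T @ (preds - g_mean) / J    # (d, k)
    C_gg = (preds - g_mean).T @ (preds - g_mean) / J       # (k, k)
    K = C_ug @ np.linalg.inv(C_gg + gamma)                 # Kalman gain
    # Perturbed observations keep the ensemble from collapsing.
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), gamma, size=J)
    return ensemble + (y_pert - preds) @ K.T
```

Iterating this step moves the ensemble toward parameters whose predictions match the data, using only forward evaluations and no derivatives, which is what makes the method attractive here.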
Adaptive Momentum for Neural Network Optimization
In this thesis, we develop a novel and efficient algorithm for optimizing neural networks, inspired by a recently proposed geodesic optimization algorithm. Our algorithm, which we call Stochastic Geodesic Optimization (SGeO), adds an adaptive coefficient on top of Polyak's Heavy Ball method, effectively controlling the weight placed on the previous update to the parameters based on the change of direction along the optimization path. Experimental results on strongly convex functions with Lipschitz gradients and on deep autoencoder benchmarks show that SGeO reaches lower errors than established first-order methods and achieves lower or similar errors compared to a recent second-order method called K-FAC (Kronecker-Factored Approximate Curvature). We also incorporate a Nesterov-style lookahead gradient into our algorithm (SGeO-N) and observe notable improvements. We believe that our research will open up new directions for high-dimensional neural network optimization, where combining the efficiency of first-order methods with the effectiveness of second-order methods is a promising avenue to explore.
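The abstract does not give SGeO's exact adaptive rule, so the sketch below is only one plausible reading: a Heavy Ball update whose momentum coefficient shrinks when the new descent direction disagrees with the previous velocity. All names and the cosine-based rule are assumptions for illustration.

```python
import numpy as np

def adaptive_heavy_ball(grad, x0, lr=0.01, beta_max=0.9, steps=500):
    """Heavy Ball whose momentum coefficient adapts to direction changes.

    Hypothetical rule: the momentum weight is scaled by the cosine
    between the previous velocity v and the new descent direction -g,
    clipped at zero, so momentum is damped after a turn in the path.
    """
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        norms = np.linalg.norm(v) * np.linalg.norm(g)
        cos = float(v @ (-g)) / norms if norms > 0 else 0.0
        beta = beta_max * max(cos, 0.0)   # less momentum after a reversal
        v = beta * v - lr * g
        x = x + v
    return x

# On a simple strongly convex quadratic this converges to the minimum:
x_star = adaptive_heavy_ball(lambda x: 2.0 * x, np.ones(3))
```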
How degenerate is the parametrization of neural networks with the ReLU activation function?
Neural network training is usually accomplished by solving a non-convex optimization problem using stochastic gradient descent. Although one optimizes over the network's parameters, the main loss function generally only depends on the realization of the neural network, i.e. the function it computes. Studying the optimization problem over the space of realizations opens up new ways to understand neural network training. In particular, common loss functions like mean squared error and categorical cross entropy are convex on spaces of neural network realizations, which are themselves non-convex. The approximation capabilities of neural networks can be used to deal with the latter non-convexity, which allows us to establish that, for sufficiently large networks, local minima of a regularized optimization problem on the realization space are almost optimal. Note, however, that each realization has many different, possibly degenerate, parametrizations. In particular, a local minimum in the parametrization space need not correspond to a local minimum in the realization space. To establish such a connection, inverse stability of the realization map is required, meaning that proximity of realizations must imply proximity of the corresponding parametrizations. We present pathologies which prevent inverse stability in general and, for shallow networks, proceed to establish a restricted space of parametrizations on which we have inverse stability w.r.t. a Sobolev norm. Furthermore, we show that by optimizing over such restricted sets, it is still possible to learn any function which can be learned by optimization over unrestricted sets.
Comment: Accepted at NeurIPS 2019
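One concrete source of the degeneracy the abstract refers to is the positive-rescaling symmetry of ReLU: scaling a hidden neuron's incoming weights by λ > 0 and its outgoing weight by 1/λ changes the parameters but not the realization, since relu(λz)/λ = relu(z). A minimal sketch (the network and names here are illustrative, not the paper's construction):

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def shallow_net(x, W1, b1, w2):
    """Realization of a one-hidden-layer ReLU network."""
    return relu(x @ W1 + b1) @ w2

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
W1, b1, w2 = rng.normal(size=(3, 4)), rng.normal(size=4), rng.normal(size=4)

# Rescale hidden neuron 0: incoming weights and bias by lam,
# outgoing weight by 1/lam. For lam > 0 the realization is unchanged.
lam = 7.3
W1s, b1s, w2s = W1.copy(), b1.copy(), w2.copy()
W1s[:, 0] *= lam; b1s[0] *= lam; w2s[0] /= lam

assert np.allclose(shallow_net(x, W1, b1, w2), shallow_net(x, W1s, b1s, w2s))
```

Because arbitrarily distant parametrizations can compute the same function, closeness of realizations cannot, in general, force closeness of parameters, which is exactly the inverse-stability failure the paper addresses.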
Bidirectional optimization of the melting spinning process
This is the author's accepted manuscript (under the provisional title "Bi-directional optimization of the melting spinning process with an immune-enhanced neural network"). The final published article is available from the link below. Copyright 2014 IEEE.
A bidirectional optimizing approach for the melting spinning process based on an immune-enhanced neural network is proposed. The proposed bidirectional model can not only reveal the internal nonlinear relationship between the process configuration and the quality indices of the fibers as the final product, but also provide a tool for engineers to develop new fiber products with expected quality specifications. A neural network is taken as the basis of the bidirectional model, and an immune component is introduced to enlarge the search scope of the solution field, so that the neural network has a greater chance of finding an appropriate and reasonable solution and the error of prediction can therefore be eliminated. The proposed intelligent model can also help determine what kind of process configuration should be used to produce satisfactory fiber products. To make the proposed model practical for manufacturing, a software platform is developed. Simulation results show that the proposed model can eliminate the approximation error introduced by the neural-network-based optimizing model, owing to the extension of the search scope by the artificial immune mechanism. Meanwhile, the proposed model, with the corresponding software, can conduct optimization in two directions, namely process optimization and category development, and the corresponding results outperform those of an ordinary neural-network-based intelligent model. It is also shown that the proposed model has the potential to act as a valuable tool from which the engineers and decision makers of the spinning process could benefit.
Funding: National Natural Science Foundation of China, Ministry of Education of China, the Shanghai Committee of Science and Technology, and the Fundamental Research Funds for the Central Universities.
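The abstract describes the immune component only at a high level. As one generic reading, the "reverse" direction of such a model can be sketched as a clonal-selection-style search over process configurations of a fitted surrogate, looking for a configuration whose predicted quality indices match a target; every name and the mutation scheme below are assumptions, not the paper's method.

```python
import numpy as np

def immune_inverse_search(surrogate, target, bounds, pop=30, gens=50,
                          clones=5, seed=0):
    """Clonal-selection-style search for a process configuration whose
    predicted quality indices (via `surrogate`) match `target`.

    surrogate: maps a (d,) configuration to predicted quality indices
    bounds:    (d, 2) array of per-variable [low, high] limits
    """
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    antibodies = rng.uniform(lo, hi, size=(pop, len(lo)))
    affinity = lambda c: -np.linalg.norm(surrogate(c) - target)
    for _ in range(gens):
        scores = np.array([affinity(a) for a in antibodies])
        order = np.argsort(scores)[::-1]            # best first
        elite = antibodies[order[:pop // 3]]
        pool = [elite]
        for rank, a in enumerate(elite, start=1):
            # Hypermutation: better antibodies get smaller mutations.
            step = (hi - lo) * 0.05 * rank
            pool.append(np.clip(a + rng.normal(0.0, step, (clones, len(lo))),
                                lo, hi))
        antibodies = np.vstack(pool)[:pop]
    scores = np.array([affinity(a) for a in antibodies])
    return antibodies[np.argmax(scores)]
```

The random cloning and mutation is what "enlarges the search scope" relative to gradient-following on the surrogate alone, which is the role the abstract ascribes to the immune mechanism.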
Genetically Generated Neural Networks I: Representational Effects
This paper studies several applications of genetic algorithms (GAs) within the neural networks field. After building a robust GA engine, the system was used to generate neural network circuit architectures. This was accomplished by using the GA to determine the weights in a fully interconnected network. The importance of the internal genetic representation was shown by testing different approaches. The effect of varying the constraints imposed on the desired network on the speed of optimization was also studied. It was observed that relatively loose constraints provided results comparable to a fully constrained system. The neural network circuits generated were recurrent competitive fields, as described by Grossberg (1982).
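The paper's point is that the internal genetic representation matters; the sketch below shows only the baseline scheme it builds on, a plain real-coded GA over the flat weight vector of a fixed network. The operators and names are generic illustrations, not the paper's specific encoding.

```python
import numpy as np

def evolve_weights(fitness, n_weights, pop=50, gens=100,
                   mut_rate=0.1, mut_scale=0.5, seed=0):
    """Plain GA over real-valued weight vectors of a fixed architecture.

    fitness: maps a flat weight vector to a score (higher is better).
    """
    rng = np.random.default_rng(seed)
    popn = rng.normal(0, 1, size=(pop, n_weights))
    for _ in range(gens):
        scores = np.array([fitness(w) for w in popn])
        # Tournament selection: keep the better of random pairs.
        i, j = rng.integers(0, pop, (2, pop))
        parents = np.where((scores[i] > scores[j])[:, None], popn[i], popn[j])
        # Uniform crossover between consecutive parents.
        mask = rng.random((pop, n_weights)) < 0.5
        children = np.where(mask, parents, np.roll(parents, 1, axis=0))
        # Gaussian mutation on a fraction of the genes.
        mut = rng.random((pop, n_weights)) < mut_rate
        popn = children + mut * rng.normal(0, mut_scale, (pop, n_weights))
    return popn[np.argmax([fitness(w) for w in popn])]
```

Different genetic representations of the same weights (e.g. binary vs. real coding, or different gene orderings) change how crossover and mutation explore the space, which is the representational effect the paper investigates.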
