Ensemble Kalman filter for neural network based one-shot inversion
We study the use of novel techniques arising in machine learning for inverse problems. Our approach replaces the complex forward model by a neural network, which is trained simultaneously, in a one-shot sense, while estimating the unknown parameters from data, i.e. the neural network is trained only for the unknown parameter. By establishing a link to the Bayesian approach to inverse problems, an algorithmic framework is developed which ensures the feasibility of the parameter estimate with respect to the forward model. We propose an efficient, derivative-free optimization method based on variants of ensemble Kalman inversion. Numerical experiments show that the ensemble Kalman filter for neural network based one-shot inversion is a promising direction combining optimization and machine learning techniques for inverse problems.
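The abstract names ensemble Kalman inversion (EKI) but does not spell out the update. For reference, a standard EKI iteration with perturbed observations looks roughly like the sketch below; the names are illustrative, and `forward` stands in for the network-based forward map described in the paper.

```python
import numpy as np

def eki_step(ensemble, forward, y, gamma, rng):
    """One ensemble Kalman inversion (EKI) update.

    ensemble: (J, d) array of J parameter particles
    forward:  maps a (d,) parameter vector to a (k,) prediction
    y:        (k,) observed data
    gamma:    (k, k) observation-noise covariance
    """
    J = ensemble.shape[0]
    preds = np.array([forward(u) for u in ensemble])       # (J, k)
    u_mean, g_mean = ensemble.mean(0), preds.mean(0)
    # Empirical cross- and prediction-covariances.
    C_ug = (ensemble - u_mean).T @ (preds - g_mean) / J    # (d, k)
    C_gg = (preds - g_mean).T @ (preds - g_mean) / J       # (k, k)
    K = C_ug @ np.linalg.inv(C_gg + gamma)                 # Kalman gain
    # Perturbed observations keep the ensemble from collapsing.
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), gamma, size=J)
    return ensemble + (y_pert - preds) @ K.T
```

Iterating this step moves the ensemble toward parameters whose predictions match the data, using only forward evaluations and no derivatives, which is what makes the method attractive here.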
Adaptive Momentum for Neural Network Optimization
In this thesis, we develop a novel and efficient algorithm for optimizing neural networks, inspired by a recently proposed geodesic optimization algorithm. Our algorithm, which we call Stochastic Geodesic Optimization (SGeO), adds an adaptive coefficient on top of Polyak's Heavy Ball method, effectively controlling the weight placed on the previous update to the parameters based on the change of direction along the optimization path. Experimental results on strongly convex functions with Lipschitz gradients and on deep autoencoder benchmarks show that SGeO reaches lower errors than established first-order methods and achieves lower or similar errors compared to a recent second-order method called K-FAC (Kronecker-Factored Approximate Curvature). We also incorporate a Nesterov-style lookahead gradient into our algorithm (SGeO-N) and observe notable improvements. We believe that our research will open up new directions for high-dimensional neural network optimization, where combining the efficiency of first-order methods with the effectiveness of second-order methods is a promising avenue to explore.
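The abstract does not give SGeO's exact adaptive rule, so the sketch below is only one plausible reading: a Heavy Ball update whose momentum coefficient shrinks when the new descent direction disagrees with the previous velocity. All names and the cosine-based rule are assumptions for illustration.

```python
import numpy as np

def adaptive_heavy_ball(grad, x0, lr=0.01, beta_max=0.9, steps=500):
    """Heavy Ball whose momentum coefficient adapts to direction changes.

    Hypothetical rule: the momentum weight is scaled by the cosine
    between the previous velocity v and the new descent direction -g,
    clipped at zero, so momentum is damped after a turn in the path.
    """
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        norms = np.linalg.norm(v) * np.linalg.norm(g)
        cos = float(v @ (-g)) / norms if norms > 0 else 0.0
        beta = beta_max * max(cos, 0.0)   # less momentum after a reversal
        v = beta * v - lr * g
        x = x + v
    return x

# On a simple strongly convex quadratic this converges to the minimum:
x_star = adaptive_heavy_ball(lambda x: 2.0 * x, np.ones(3))
```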
How degenerate is the parametrization of neural networks with the ReLU activation function?
Neural network training is usually accomplished by solving a non-convex optimization problem using stochastic gradient descent. Although one optimizes over the network's parameters, the main loss function generally only depends on the realization of the neural network, i.e. the function it computes. Studying the optimization problem over the space of realizations opens up new ways to understand neural network training. In particular, common loss functions like mean squared error and categorical cross entropy are convex on spaces of neural network realizations, which are themselves non-convex. The approximation capabilities of neural networks can be used to deal with the latter non-convexity, which allows us to establish that, for sufficiently large networks, local minima of a regularized optimization problem on the realization space are almost optimal. Note, however, that each realization has many different, possibly degenerate, parametrizations. In particular, a local minimum in the parametrization space need not correspond to a local minimum in the realization space. To establish such a connection, inverse stability of the realization map is required, meaning that proximity of realizations must imply proximity of the corresponding parametrizations. We present pathologies which prevent inverse stability in general and, for shallow networks, proceed to establish a restricted space of parametrizations on which we have inverse stability w.r.t. a Sobolev norm. Furthermore, we show that by optimizing over such restricted sets, it is still possible to learn any function which can be learned by optimization over unrestricted sets.
Comment: Accepted at NeurIPS 2019
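One concrete source of the degeneracy the abstract refers to is the positive-rescaling symmetry of ReLU: scaling a hidden neuron's incoming weights by λ > 0 and its outgoing weight by 1/λ changes the parameters but not the realization, since relu(λz)/λ = relu(z). A minimal sketch (the network and names here are illustrative, not the paper's construction):

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def shallow_net(x, W1, b1, w2):
    """Realization of a one-hidden-layer ReLU network."""
    return relu(x @ W1 + b1) @ w2

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
W1, b1, w2 = rng.normal(size=(3, 4)), rng.normal(size=4), rng.normal(size=4)

# Rescale hidden neuron 0: incoming weights and bias by lam,
# outgoing weight by 1/lam. For lam > 0 the realization is unchanged.
lam = 7.3
W1s, b1s, w2s = W1.copy(), b1.copy(), w2.copy()
W1s[:, 0] *= lam; b1s[0] *= lam; w2s[0] /= lam

assert np.allclose(shallow_net(x, W1, b1, w2), shallow_net(x, W1s, b1s, w2s))
```

Because arbitrarily distant parametrizations can compute the same function, closeness of realizations cannot, in general, force closeness of parameters, which is exactly the inverse-stability failure the paper addresses.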
Bidirectional optimization of the melting spinning process
This is the author's accepted manuscript (under the provisional title "Bi-directional optimization of the melting spinning process with an immune-enhanced neural network"). The final published article is available from the link below. Copyright 2014 IEEE.
A bidirectional optimizing approach for the melting spinning process based on an immune-enhanced neural network is proposed. The proposed bidirectional model can not only reveal the internal nonlinear relationship between the process configuration and the quality indices of the fibers as the final product, but also provide a tool for engineers to develop new fiber products with expected quality specifications. A neural network is taken as the basis of the bidirectional model, and an immune component is introduced to enlarge the search scope of the solution field, so that the neural network has a greater chance of finding an appropriate and reasonable solution and the error of prediction can therefore be eliminated. The proposed intelligent model can also help determine what kind of process configuration should be used to produce satisfactory fiber products. To make the proposed model practical for manufacturing, a software platform is developed. Simulation results show that the proposed model can eliminate the approximation error introduced by the neural-network-based optimizing model, owing to the extension of the search scope by the artificial immune mechanism. Meanwhile, the proposed model, with the corresponding software, can conduct optimization in two directions, namely process optimization and category development, and the corresponding results outperform those of an ordinary neural-network-based intelligent model. It is also shown that the proposed model has the potential to act as a valuable tool from which the engineers and decision makers of the spinning process could benefit.
Funding: National Natural Science Foundation of China, Ministry of Education of China, the Shanghai Committee of Science and Technology, and the Fundamental Research Funds for the Central Universities.
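The abstract describes the immune component only at a high level. As one generic reading, the "reverse" direction of such a model can be sketched as a clonal-selection-style search over process configurations of a fitted surrogate, looking for a configuration whose predicted quality indices match a target; every name and the mutation scheme below are assumptions, not the paper's method.

```python
import numpy as np

def immune_inverse_search(surrogate, target, bounds, pop=30, gens=50,
                          clones=5, seed=0):
    """Clonal-selection-style search for a process configuration whose
    predicted quality indices (via `surrogate`) match `target`.

    surrogate: maps a (d,) configuration to predicted quality indices
    bounds:    (d, 2) array of per-variable [low, high] limits
    """
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    antibodies = rng.uniform(lo, hi, size=(pop, len(lo)))
    affinity = lambda c: -np.linalg.norm(surrogate(c) - target)
    for _ in range(gens):
        scores = np.array([affinity(a) for a in antibodies])
        order = np.argsort(scores)[::-1]            # best first
        elite = antibodies[order[:pop // 3]]
        pool = [elite]
        for rank, a in enumerate(elite, start=1):
            # Hypermutation: better antibodies get smaller mutations.
            step = (hi - lo) * 0.05 * rank
            pool.append(np.clip(a + rng.normal(0.0, step, (clones, len(lo))),
                                lo, hi))
        antibodies = np.vstack(pool)[:pop]
    scores = np.array([affinity(a) for a in antibodies])
    return antibodies[np.argmax(scores)]
```

The random cloning and mutation is what "enlarges the search scope" relative to gradient-following on the surrogate alone, which is the role the abstract ascribes to the immune mechanism.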
Genetically Generated Neural Networks I: Representational Effects
This paper studies several applications of genetic algorithms (GAs) within the neural networks field. After building a robust GA engine, the system was used to generate neural network circuit architectures. This was accomplished by using the GA to determine the weights in a fully interconnected network. The importance of the internal genetic representation was shown by testing different approaches. The effect of varying the constraints imposed on the desired network on the speed of optimization was also studied. It was observed that relatively loose constraints provided results comparable to a fully constrained system. The neural network circuits generated were recurrent competitive fields, as described by Grossberg (1982).
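The paper's point is that the internal genetic representation matters; the sketch below shows only the baseline scheme it builds on, a plain real-coded GA over the flat weight vector of a fixed network. The operators and names are generic illustrations, not the paper's specific encoding.

```python
import numpy as np

def evolve_weights(fitness, n_weights, pop=50, gens=100,
                   mut_rate=0.1, mut_scale=0.5, seed=0):
    """Plain GA over real-valued weight vectors of a fixed architecture.

    fitness: maps a flat weight vector to a score (higher is better).
    """
    rng = np.random.default_rng(seed)
    popn = rng.normal(0, 1, size=(pop, n_weights))
    for _ in range(gens):
        scores = np.array([fitness(w) for w in popn])
        # Tournament selection: keep the better of random pairs.
        i, j = rng.integers(0, pop, (2, pop))
        parents = np.where((scores[i] > scores[j])[:, None], popn[i], popn[j])
        # Uniform crossover between consecutive parents.
        mask = rng.random((pop, n_weights)) < 0.5
        children = np.where(mask, parents, np.roll(parents, 1, axis=0))
        # Gaussian mutation on a fraction of the genes.
        mut = rng.random((pop, n_weights)) < mut_rate
        popn = children + mut * rng.normal(0, mut_scale, (pop, n_weights))
    return popn[np.argmax([fitness(w) for w in popn])]
```

Different genetic representations of the same weights (e.g. binary vs. real coding, or different gene orderings) change how crossover and mutation explore the space, which is the representational effect the paper investigates.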
