Learning Specialized Activation Functions for Physics-informed Neural Networks
Physics-informed neural networks (PINNs) are known to suffer from
optimization difficulty. In this work, we reveal the connection between the
optimization difficulty of PINNs and activation functions. Specifically, we
show that PINNs exhibit high sensitivity to activation functions when solving
PDEs with distinct properties. Existing works usually choose activation
functions by inefficient trial-and-error. To avoid the inefficient manual
selection and to alleviate the optimization difficulty of PINNs, we introduce
adaptive activation functions to search for the optimal function when solving
different problems. We compare different adaptive activation functions and
discuss their limitations in the context of PINNs. Furthermore, we propose to
tailor the idea of learning combinations of candidate activation functions to
the PINN optimization, which imposes stricter requirements on the smoothness and
diversity of the learned functions. This is achieved by removing activation
functions which cannot provide higher-order derivatives from the candidate set
and incorporating elementary functions with different properties according to
our prior knowledge about the PDE at hand. We further enhance the search space
with adaptive slopes. The proposed adaptive activation function can be used to
solve different PDE systems in an interpretable way. Its effectiveness is
demonstrated on a series of benchmarks. Code is available at
https://github.com/LeapLabTHU/AdaAFforPINNs
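The idea of learning a combination of smooth candidate activations with an adaptive slope can be sketched as follows. This is an illustrative toy, not the paper's implementation: the candidate set, softmax weighting, and single shared slope are assumptions; the abstract only requires that every candidate admit the higher-order derivatives PINN residuals need.

```python
import math

# Smooth candidates only: each must be infinitely differentiable so that
# PDE residuals can take higher-order derivatives through the activation.
# The specific candidate set here is illustrative.
CANDIDATES = {
    "tanh": math.tanh,
    "sin": math.sin,
    "gaussian": lambda x: math.exp(-x * x),
}

def softmax(logits):
    """Normalize learnable logits into positive mixing weights."""
    m = max(logits)
    exps = [math.exp(w - m) for w in logits]
    s = sum(exps)
    return [e / s for e in exps]

def adaptive_activation(x, logits, slope):
    """Weighted mix of candidate activations evaluated at slope * x.

    `logits` and `slope` would be trained jointly with the network
    parameters; here they are plain floats for illustration.
    """
    weights = softmax(logits)
    fs = list(CANDIDATES.values())
    return sum(w * f(slope * x) for w, f in zip(weights, fs))
```

With equal logits the function is the plain average of the candidates, and driving one logit up lets the search collapse onto a single activation, which is what makes the learned choice interpretable.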
Variable neural networks for adaptive control of nonlinear systems
This paper is concerned with the adaptive control of continuous-time nonlinear dynamical systems using neural networks. A novel neural network architecture, referred to as a variable neural network, is proposed and shown to be useful in approximating the unknown nonlinearities of dynamical systems. In the variable neural network, the number of basis functions can be either increased or decreased with time, according to specified design strategies, so that the network will neither overfit nor underfit the data set. Based on the Gaussian radial basis function (GRBF) variable neural network, an adaptive control scheme is presented. The location of the centers and the determination of the widths of the GRBFs in the variable neural network are analyzed to make a compromise between orthogonality and smoothness. The weight-adaptive laws developed using the Lyapunov synthesis approach guarantee the stability of the overall control scheme, even in the presence of modeling errors. The tracking errors converge to the required accuracy through the adaptive control algorithm derived by combining the variable neural network and Lyapunov synthesis techniques. The operation of an adaptive control scheme using the variable neural network is demonstrated using two simulated examples.
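The core data structure, a GRBF network whose basis set can grow or shrink at run time, can be sketched in a few lines. This is a minimal illustration: the toy centers, widths, and the scalar input are assumptions, and the paper's strategies for deciding when to add or remove a basis function are not reproduced.

```python
import math

class VariableGRBFNet:
    """Gaussian RBF network with a basis set that varies over time."""

    def __init__(self):
        self.centers, self.widths, self.weights = [], [], []

    def add_basis(self, center, width, weight=0.0):
        """Grow the network: insert one Gaussian basis function."""
        self.centers.append(center)
        self.widths.append(width)
        self.weights.append(weight)

    def remove_basis(self, i):
        """Shrink the network: drop basis function i."""
        for lst in (self.centers, self.widths, self.weights):
            lst.pop(i)

    def __call__(self, x):
        # Weighted sum of Gaussians; empty network returns 0.
        return sum(
            w * math.exp(-((x - c) ** 2) / (2.0 * s ** 2))
            for w, c, s in zip(self.weights, self.centers, self.widths)
        )
```

The center/width trade-off the abstract mentions shows up directly here: narrow widths make the Gaussians nearly orthogonal but the approximation rough, while wide ones overlap smoothly at the cost of orthogonality.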
System Identification for Nonlinear Control Using Neural Networks
An approach to incorporating artificial neural networks in nonlinear, adaptive control systems is described. The controller contains three principal elements: a nonlinear inverse dynamic control law whose coefficients depend on a comprehensive model of the plant, a neural network that models the system dynamics, and a state estimator whose outputs drive the control law and train the neural network. Attention is focused on the system identification task, which combines an extended Kalman filter with generalized spline function approximation. Continual learning is possible during normal operation, without taking the system off line for specialized training. Nonlinear inverse dynamic control requires smooth derivatives as well as function estimates, imposing stringent goals on the approximating technique.
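For a model that is linear in its weights, such as a spline- or basis-function approximator y = w·φ(x), the extended Kalman filter weight update reduces to a recursive least-squares step. The sketch below shows one such measurement update; the symbols, the scalar output, and the noise level `r` are illustrative assumptions, not the paper's exact formulation.

```python
def ekf_weight_update(w, P, phi, y, r=0.01):
    """One EKF measurement update for a linear-in-weights model.

    w   : current weight estimates (list of floats)
    P   : weight error covariance (n x n list of lists)
    phi : basis-function features evaluated at the current state
    y   : observed scalar output
    r   : assumed measurement-noise variance
    """
    n = len(w)
    # Innovation: observed output minus predicted output.
    y_hat = sum(wi * pi for wi, pi in zip(w, phi))
    e = y - y_hat
    # Scalar innovation covariance s = phi^T P phi + r.
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    s = sum(phi[i] * Pphi[i] for i in range(n)) + r
    # Kalman gain K = P phi / s.
    K = [p / s for p in Pphi]
    # Updated weights and covariance.
    w_new = [wi + ki * e for wi, ki in zip(w, K)]
    P_new = [[P[i][j] - K[i] * Pphi[j] for j in range(n)] for i in range(n)]
    return w_new, P_new
```

Because each update uses one sample at a time, the filter supports the continual, online learning the abstract describes, with no off-line training phase.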
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
There are two types of information in each handwritten word image: explicit
information which can be easily read or derived directly, such as lexical
content or word length, and implicit attributes such as the author's identity.
Whether features learned by a neural network for one task can be used for
another task remains an open question. In this paper, we present a deep
adaptive learning method for writer identification based on single-word images
using multi-task learning. An auxiliary task is added to the training process
to enforce the emergence of reusable features. Our proposed method transfers
the benefits of the learned features of a convolutional neural network from an
auxiliary task such as explicit content recognition to the main task of writer
identification in a single procedure. Specifically, we propose a new adaptive
convolutional layer to exploit the learned deep features. A multi-task neural
network with one or several adaptive convolutional layers is trained
end-to-end, to exploit robust generic features for a specific main task, i.e.,
writer identification. Three auxiliary tasks, corresponding to three explicit
attributes of handwritten word images (lexical content, word length and
character attributes), are evaluated. Experimental results on two benchmark
datasets show that the proposed deep adaptive learning method can improve the
performance of writer identification based on single-word images, compared to
non-adaptive and simple linear-adaptive approaches. Comment: under review at Pattern Recognition.
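The multi-task training objective described above can be sketched as a weighted sum of the main writer-identification loss and an auxiliary loss such as lexical-content recognition. The weighting factor `alpha` and the plain cross-entropy form are illustrative assumptions; the paper's adaptive convolutional layers are not reproduced here.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])

def multitask_loss(main_probs, main_label, aux_probs, aux_label, alpha=0.5):
    """Main (writer-ID) loss plus a weighted auxiliary-task loss.

    Training both heads against shared convolutional features is what
    forces the emergence of reusable features; alpha is a hypothetical
    hyperparameter balancing the two tasks.
    """
    main_loss = cross_entropy(main_probs, main_label)
    aux_loss = cross_entropy(aux_probs, aux_label)
    return main_loss + alpha * aux_loss
```

In a full model both probability vectors would come from task-specific heads on one shared backbone, so gradients from the auxiliary task shape the shared features used for writer identification.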
On-the-fly adaptivity for nonlinear twoscale simulations using artificial neural networks and reduced order modeling
A multi-fidelity surrogate model for highly nonlinear multiscale problems is
proposed. It is based on the introduction of two different surrogate models and
an adaptive on-the-fly switching. The two concurrent surrogates are built
incrementally starting from a moderate set of evaluations of the full order
model. To this end, a reduced order model (ROM) is first generated. Using a hybrid
ROM-preconditioned FE solver, additional effective stress-strain data is
simulated while the number of samples is kept to a moderate level by using a
dedicated and physics-guided sampling technique. Machine learning (ML) is
subsequently used to build the second surrogate by means of artificial neural
networks (ANN). Different ANN architectures are explored and the features used
as inputs to the ANN are fine-tuned to improve the overall quality of
the ML model. Additional ANN surrogates for the stress errors are generated.
Accordingly, conservative design guidelines for error surrogates are presented by
adapting the loss functions of the ANN training in pure regression or pure
classification settings. The error surrogates can be used as quality indicators
in order to adaptively select the appropriate -- i.e. efficient yet accurate --
surrogate. Two strategies for the on-the-fly switching are investigated and a
practicable and robust algorithm is proposed that avoids the relevant technical
difficulties associated with model switching. The provided algorithms and ANN
design guidelines can easily be adopted for different problem settings and,
thereby, they enable generalization of the used machine learning techniques for
a wide range of applications. The resulting hybrid surrogate is employed in
challenging multilevel FE simulations for a three-phase composite with
pseudo-plastic micro-constituents. Numerical examples highlight the performance
of the proposed approach.
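The on-the-fly switching logic can be sketched as a simple driver: an error surrogate predicts the ROM's stress error at each load step, and the scheme falls back to the ANN surrogate whenever that prediction exceeds a tolerance. Function names and the tolerance value are illustrative assumptions, not the paper's algorithm.

```python
def select_surrogate(predicted_rom_error, tol=1e-3):
    """Pick the cheap ROM when its predicted error is acceptable,
    otherwise fall back to the ANN surrogate (conservative choice)."""
    return "ROM" if predicted_rom_error <= tol else "ANN"

def run_steps(predicted_errors, tol=1e-3):
    """Drive a sequence of load steps and count model switches."""
    choices = [select_surrogate(e, tol) for e in predicted_errors]
    switches = sum(1 for a, b in zip(choices, choices[1:]) if a != b)
    return choices, switches
```

A conservative error surrogate (one trained to overestimate rather than underestimate the true error) biases this test toward the expensive model, trading some efficiency for accuracy, which matches the design guidelines described above.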