Formal Derivation of Mesh Neural Networks with Their Forward-Only Gradient Propagation
This paper proposes the Mesh Neural Network (MNN), a novel architecture that allows neurons to be connected in any topology in order to route information efficiently. In MNNs, information is propagated between neurons through a state transition function. State and error gradients are then computed directly from state updates, without backward computation. The MNN architecture and its error propagation scheme are formalized and derived in tensor algebra. The proposed computational model can fully supply a gradient descent process and is potentially suitable for very large-scale sparse NNs, owing to its expressivity and training efficiency relative to NNs based on back-propagation and computational graphs.
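The key idea of forward-only gradient propagation can be illustrated with forward-mode differentiation: the state and its tangent are propagated together through the state transition, so no backward pass over a computational graph is needed. The sketch below is an illustration of that principle on an arbitrarily connected (masked) recurrent state update, not the paper's exact MNN formulation; all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                          # number of neurons
mask = (rng.random((n, n)) < 0.4).astype(float)  # arbitrary sparse topology
W = rng.standard_normal((n, n)) * mask           # masked connection weights
V = rng.standard_normal((n, n)) * mask           # perturbation direction dW
s0 = rng.standard_normal(n)                      # initial neuron states

# Forward pass carrying the state AND its tangent (forward mode):
# s_{t+1} = tanh(W s_t), ds_{t+1} = tanh'(W s_t) * (V s_t + W ds_t)
s, ds = s0.copy(), np.zeros(n)
for _ in range(3):                               # three state transitions
    pre = W @ s
    ds = (1 - np.tanh(pre) ** 2) * (V @ s + W @ ds)
    s = np.tanh(pre)

loss = 0.5 * np.sum(s ** 2)
dloss = np.sum(s * ds)       # directional derivative of loss along V

# Check against central finite differences (no backward pass anywhere)
def rollout_loss(Wm):
    st = s0.copy()
    for _ in range(3):
        st = np.tanh(Wm @ st)
    return 0.5 * np.sum(st ** 2)

eps = 1e-6
numeric = (rollout_loss(W + eps * V) - rollout_loss(W - eps * V)) / (2 * eps)
assert abs(dloss - numeric) < 1e-6
```

The tangent recursion uses only quantities already available during the forward state update, which is the property that makes this style of gradient computation attractive for sparse, arbitrarily connected networks.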
Variational operator learning: A unified paradigm for training neural operators and solving partial differential equations
Based on the variational method, we propose a novel paradigm that provides a
unified framework of training neural operators and solving partial differential
equations (PDEs) with the variational form, which we refer to as the
variational operator learning (VOL). We first derive the functional
approximation of the system from the node solution prediction given by neural
operators, and then conduct the variational operation by automatic
differentiation, constructing a forward-backward propagation loop to derive the
residual of the linear system. One or several update steps of the steepest
descent method (SD) and the conjugate gradient method (CG) are provided in every
iteration as a cheap yet effective update for training the neural operators.
Experimental results show the proposed VOL can learn a variety of solution
operators in PDEs of the steady heat transfer and the variable stiffness
elasticity with satisfactory results and small error. The proposed VOL achieves
nearly label-free training. Only five to ten labels are used for the output
distribution-shift session in all experiments. Generalization benefits of the
VOL are investigated and discussed.
Comment: 35 pages, 22 figures
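The core of the cheap update can be seen on a plain linear system: one exact-line-search steepest-descent step on K u = f strictly reduces the residual of a predicted solution, and the improved iterate can serve as an inexpensive training target. The sketch below shows only that linear-algebra step, with a random SPD matrix standing in for a discretized PDE operator and a random vector standing in for the neural operator's prediction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)        # SPD stand-in for a stiffness matrix
f = rng.standard_normal(n)         # load vector

u_pred = rng.standard_normal(n)    # stand-in for a neural-operator prediction
r = f - K @ u_pred                 # residual of the linear system
alpha = (r @ r) / (r @ (K @ r))    # exact SD step length for SPD K
u_sd = u_pred + alpha * r          # one cheap steepest-descent update

res_before = np.linalg.norm(f - K @ u_pred)
res_after = np.linalg.norm(f - K @ u_sd)
assert res_after < res_before      # the update always improves the iterate
```

In VOL the residual itself is obtained variationally via automatic differentiation rather than by forming K explicitly, but the role of the SD/CG step as a label-free supervision signal is the same.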
End-to-End Differentiable Molecular Mechanics Force Field Construction
Molecular mechanics (MM) potentials have long been a workhorse of
computational chemistry. Leveraging accuracy and speed, these functional forms
find use in a wide variety of applications from rapid virtual screening to
detailed free energy calculations. Traditionally, MM potentials have relied on
human-curated, inflexible, and poorly extensible discrete chemical perception
rules (atom types) for applying parameters to molecules or biopolymers, making
them difficult to optimize to fit quantum chemical or physical property data.
Here, we propose an alternative approach that uses graph nets to perceive
chemical environments, producing continuous atom embeddings from which valence
and nonbonded parameters can be predicted using a feed-forward neural network.
Since all stages are built using smooth functions, the entire process of
chemical perception and parameter assignment is differentiable end-to-end with
respect to model parameters, allowing new force fields to be easily
constructed, extended, and applied to arbitrary molecules. We show that this
approach has the capacity to reproduce legacy atom types and can be fit to MM
and QM energies and forces, among other targets.
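The replacement of discrete atom types with continuous embeddings can be sketched in a few lines: message passing over the bond graph produces per-atom embeddings, and a shared feed-forward head maps each embedding to a parameter. This toy NumPy version (all weights and sizes are illustrative, not the paper's model) shows the key consequence: chemically equivalent atoms receive identical parameters with no hand-written typing rules.

```python
import numpy as np

rng = np.random.default_rng(2)
# toy fragment: atom 1 is central; atoms 0, 2, 3 are equivalent and
# bond only to atom 1 (adjacency matrix of the bond graph)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], float)
elem = np.array([[0, 1], [1, 0], [0, 1], [0, 1]], float)  # one-hot element

W_in = rng.standard_normal((2, 8)) * 0.5
W_msg = rng.standard_normal((8, 8)) * 0.2
X = elem @ W_in                            # initial atom embeddings
for _ in range(2):                         # message-passing rounds
    X = np.tanh(X + A @ X @ W_msg)         # aggregate neighbour embeddings

W1 = rng.standard_normal((8, 16)) * 0.2
W2 = rng.standard_normal((16, 1)) * 0.2
params = (np.tanh(X @ W1) @ W2).ravel()    # per-atom parameter head

# equivalent environments -> identical parameters, and every step is a
# smooth function, so the whole map is differentiable end-to-end
assert np.allclose(params[0], params[2]) and np.allclose(params[0], params[3])
```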
Solving Coupled Differential Equation Groups Using PINO-CDE
As a fundamental mathematical tool in many engineering disciplines, coupled
differential equation groups are being widely used to model complex structures
containing multiple physical quantities. Engineers constantly adjust structural
parameters at the design stage, which requires a highly efficient solver. The
rise of deep learning technologies has offered new perspectives on this task.
Unfortunately, existing black-box models suffer from poor accuracy and
robustness, while the advanced methodologies of single-output operator
regression cannot deal with multiple quantities simultaneously. To address
these challenges, we propose PINO-CDE, a deep learning framework for solving
coupled differential equation groups (CDEs), along with an equation
normalization algorithm for performance enhancement. Based on the theory of
the physics-informed neural operator (PINO), PINO-CDE uses a single network for all
quantities in a CDE group, instead of training dozens or even hundreds of networks
as in the existing literature. We demonstrate the flexibility and feasibility
of PINO-CDE for one toy example and two engineering applications: vehicle-track
coupled dynamics (VTCD) and reliability assessment for a four-storey building
(uncertainty propagation). The performance of VTCD indicates that PINO-CDE
outperforms existing software and deep learning-based methods in terms of
efficiency and precision, respectively. For the uncertainty propagation task,
PINO-CDE provides higher-resolution results in less than a quarter of the time
incurred when using the probability density evolution method (PDEM). This
framework integrates engineering dynamics and deep learning technologies and
may reveal a new concept for solving CDEs and for uncertainty propagation.
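One reason an equation-normalization step matters when a single network serves all quantities can be shown numerically: in a coupled group the residuals of different physical quantities often differ by orders of magnitude, so an unnormalized sum of squared residuals is dominated by one equation. The sketch below is an illustration of that imbalance and of a simple per-equation rescaling, not the paper's specific algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
# hypothetical residuals of two coupled equations on very different scales
r_disp = rng.standard_normal(100) * 1e-3   # e.g. a displacement equation
r_force = rng.standard_normal(100) * 1e3   # e.g. a force-balance equation

raw_loss = np.mean(r_disp ** 2) + np.mean(r_force ** 2)
# the displacement equation contributes almost nothing to the raw loss:
assert np.mean(r_disp ** 2) / raw_loss < 1e-6

# rescale each residual by its own RMS magnitude
scale_d = np.sqrt(np.mean(r_disp ** 2))
scale_f = np.sqrt(np.mean(r_force ** 2))
norm_terms = (np.mean((r_disp / scale_d) ** 2),
              np.mean((r_force / scale_f) ** 2))
# after normalization both equations contribute equally (each term = 1):
assert np.allclose(norm_terms, (1.0, 1.0))
```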
Graph Neural Networks and Applied Linear Algebra
Sparse matrix computations are ubiquitous in scientific computing. With the
recent interest in scientific machine learning, it is natural to ask how sparse
matrix computations can leverage neural networks (NN). Unfortunately,
multi-layer perceptron (MLP) neural networks are typically not natural for
either graph or sparse matrix computations. The issue lies with the fact that
MLPs require fixed-sized inputs while scientific applications generally
generate sparse matrices with arbitrary dimensions and a wide range of nonzero
patterns (or matrix graph vertex interconnections). While convolutional NNs
could possibly address matrix graphs where all vertices have the same number of
nearest neighbors, a more general approach is needed for arbitrary sparse
matrices, e.g. arising from discretized partial differential equations on
unstructured meshes. Graph neural networks (GNNs) are one approach suitable to
sparse matrices. GNNs define aggregation functions (e.g., summations) that
operate on variable size input data to produce data of a fixed output size so
that MLPs can be applied. The goal of this paper is to provide an introduction
to GNNs for a numerical linear algebra audience. Concrete examples are provided
to illustrate how many common linear algebra tasks can be accomplished using
GNNs. We focus on iterative methods that employ computational kernels such as
matrix-vector products, interpolation, relaxation methods, and
strength-of-connection measures. Our GNN examples include cases where
parameters are determined a priori as well as cases where parameters must be
learned. The intent with this article is to help computational scientists
understand how GNNs can be used to adapt machine learning concepts to
computational tasks associated with sparse matrices. It is hoped that this
understanding will stimulate data-driven extensions of classical sparse linear
algebra tasks.
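The central correspondence the article builds on fits in a few lines: a sparse matrix-vector product is exactly a GNN aggregation over the matrix graph, where each nonzero A_ij is an edge that sends the message A_ij * x_j to vertex i, and each vertex sums its incoming messages. A minimal gather/scatter sketch (COO storage, illustrative values):

```python
import numpy as np

# sparse matrix in COO form; y_i = sum_j A_ij x_j
rows = np.array([0, 0, 1, 2, 2, 3])
cols = np.array([0, 1, 1, 0, 2, 3])
vals = np.array([4.0, -1.0, 3.0, -1.0, 5.0, 2.0])
x = np.array([1.0, 2.0, 3.0, 4.0])

messages = vals * x[cols]        # edge step: each edge computes A_ij * x_j
y = np.zeros(4)
np.add.at(y, rows, messages)     # vertex step: sum incoming messages

assert np.allclose(y, [2.0, 6.0, 14.0, 8.0])
```

Iterative kernels built from matvecs, such as a Jacobi relaxation sweep x + D^{-1}(b - A x), then become stacked GNN layers in the same way, which is what lets MLP-based components be slotted into classical sparse solvers.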
Full Waveform Inversion Guided Wave Tomography Based on Recurrent Neural Network
Quantitative corrosion detection in plate or plate-like structures is a critical and challenging topic in industrial Non-Destructive Testing (NDT) research, as it determines the remaining life of the material. Compared with other methods (X-ray, magnetic powder, eddy current), ultrasonic guided wave tomography has the advantages of non-invasiveness, high efficiency, high precision and low cost. Among the various ultrasonic guided wave tomography algorithms, travel-time or diffraction algorithms can be used to reconstruct defect or corrosion models, but their accuracy is low and heavily influenced by noise. Full Waveform Inversion (FWI) can build accurate reconstructions of physical properties in plate structures; however, it requires a relatively accurate initial model, and there is still room for improvement in convergence speed, imaging resolution and robustness.
This thesis starts from the physical principle of ultrasonic guided waves: the dispersion characteristic of a guided wave propagating in a plate structure converts a change in the remaining thickness of the plate material into a variation of the wave velocity, which provides the physical basis for obtaining a thickness distribution map from a velocity reconstruction. Secondly, a guided wave tomography method based on Recurrent Neural Network Full Waveform Inversion (RNN-FWI) is proposed. Finally, the efficiency of the above method is verified through practical experiments. The main work of the thesis includes:
The feasibility of conventional full waveform inversion for guided wave tomography is introduced and verified.
An FWI algorithm based on RNN is proposed. In the framework of RNN-FWI, the effects of different optimization algorithms on imaging performance and the effects of different sensor numbers and positions on imaging performance are analyzed.
The quadratic Wasserstein distance is used as the objective function to further reduce the dependence on the initial model. The deep image prior (DIP), based on a convolutional neural network (CNN), is used as the regularization method to further improve the conventional FWI algorithm, and the effectiveness of the improved algorithm is verified by simulation and actual experiments.
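The RNN-FWI viewpoint rests on the observation that time-stepping a wave equation is a recurrence in which the velocity model plays the role of the RNN's shared weights, so inverting for velocity amounts to training the unrolled network. The sketch below is an illustrative 1-D scalar-wave recurrence (a leapfrog finite-difference scheme), not the thesis's guided-wave model; all grid sizes are arbitrary.

```python
import numpy as np

nx, nt = 100, 200
dx, dt = 1.0, 0.1
c = np.full(nx, 3.0)              # velocity model = the RNN's "weights"
u_prev = np.zeros(nx)
u = np.zeros(nx)
u[nx // 2] = 1.0                  # impulsive source at the centre

r = (c * dt / dx) ** 2            # CFL number squared; c*dt/dx = 0.3 <= 1
for _ in range(nt):               # each time step = one RNN cell
    lap = np.zeros(nx)
    lap[1:-1] = u[2:] - 2 * u[1:-1] + u[:-2]   # discrete Laplacian
    u_next = 2 * u - u_prev + r * lap          # leapfrog update
    u_prev, u = u, u_next

# the scheme is stable under the CFL condition, so the field stays finite
assert np.all(np.isfinite(u))
```

Because the update is a pure composition of differentiable operations, automatic differentiation through the unrolled loop yields the gradient of any data-misfit objective with respect to c, which is exactly what an FWI iteration needs.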