1,159 research outputs found
Modelling of methanol synthesis in a network of forced unsteady-state ring reactors by artificial neural networks for control purposes
A numerical model based on artificial neural networks (ANN) was developed to simulate the dynamic behaviour of a three-reactor network (or ring reactor), with periodic change of the feed position, when low-pressure methanol synthesis is carried out. A multilayer, feedforward, fully connected ANN was designed, and the history stack adaptation algorithm was implemented and tested with good results in terms of both model identification and learning rates. The influence of the ANN parameters was addressed, leading to simple guidelines for the selection of their values. A detailed model was used to generate the patterns adopted for the learning and testing phases. The simplified model was then used to develop a model predictive control scheme in order to maximise methanol yield and fulfil process constraints.
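As a sketch of the kind of multilayer, feedforward, fully connected ANN the abstract describes, here is a minimal NumPy forward pass. The layer sizes and the yield-style output are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Random weights and zero biases for a fully connected feedforward net.

    `sizes` is a hypothetical layer layout, e.g. [4, 8, 1]."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass: tanh hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# e.g. 4 state/feed inputs -> 8 hidden units -> 1 predicted output
params = init_mlp([4, 8, 1])
y = forward(params, np.zeros((3, 4)))  # batch of 3 input patterns
```

With zero biases and zero inputs the output is exactly zero; in practice the weights would be trained on patterns generated by the detailed model, as the abstract describes.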
A Joint Optimization of Momentum Item and Levenberg-Marquardt Algorithm to Level Up the BPNN’s Generalization Ability
Back propagation neural network (BPNN), a kind of artificial neural network, is widely used in pattern recognition and trend prediction. The standard BPNN has many drawbacks, such as trapping into local optima, oscillation, and long training time, because its training is based on the gradient descent method with a fixed learning rate. The momentum item and the Levenberg-Marquardt (LM) algorithm are two ways to adjust the weights among the neurons and improve the BPNN's performance. However, there is still much room to improve these two algorithms. A hybrid optimization of the damping factor of LM and a dynamic momentum item is proposed in this paper. The improved BPNN is validated on the Fisher Iris and wine data sets, and then used to predict the visit_spend variable in the database provided by Dunnhumby's Shopper Challenge. Compared with two other improved BPNNs, the proposed method achieves better performance, and can therefore be used to perform pattern recognition and time-series prediction more effectively.
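The interplay of an LM damping factor with a momentum item can be illustrated on a small least-squares fit. The model, the damping schedule, and the fixed momentum coefficient `mu` below are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def model(p, x):
    a, b = p
    return a * np.exp(b * x)

def jacobian(p, x):
    a, b = p
    e = np.exp(b * x)
    return np.column_stack([e, a * x * e])

def lm_momentum(x, y, p0, mu=0.2, lam=1e-2, iters=50):
    """Levenberg-Marquardt with an adaptive damping factor `lam`
    plus a simple momentum term on the accepted step."""
    p = np.asarray(p0, float)
    prev_step = np.zeros_like(p)
    for _ in range(iters):
        r = model(p, x) - y
        J = jacobian(p, x)
        # LM normal equations: (J^T J + lam*I) step = -J^T r
        step = np.linalg.solve(J.T @ J + lam * np.eye(len(p)), -J.T @ r)
        step = step + mu * prev_step          # momentum item
        p_new = p + step
        if np.sum((model(p_new, x) - y) ** 2) < np.sum(r ** 2):
            p, prev_step = p_new, step
            lam *= 0.5   # accepted: less damping, more Gauss-Newton-like
        else:
            prev_step = np.zeros_like(p)
            lam *= 2.0   # rejected: more damping, more gradient-like
    return p

x = np.linspace(0.0, 1.0, 40)
y = 2.0 * np.exp(0.5 * x)          # noiseless synthetic data
p = lm_momentum(x, y, p0=[1.0, 1.0])
```

Decreasing the damping on accepted steps and increasing it on rejected ones is the classic LM schedule; the paper's contribution is to make the momentum item dynamic rather than fixed as it is here.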
Global stability of first-order methods for coercive tame functions
We consider first-order methods with constant step size for minimizing
locally Lipschitz coercive functions that are tame in an o-minimal structure on
the real field. We prove that if the method is approximated by subgradient
trajectories, then the iterates eventually remain in a neighborhood of a
connected component of the set of critical points. Under suitable
method-dependent regularity assumptions, this result applies to the subgradient
method with momentum, the stochastic subgradient method with random reshuffling
and momentum, and the random-permutations cyclic coordinate descent method. Comment: 30 pages, 1 figure
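The subgradient method with momentum covered by this result can be sketched in its heavy-ball form on a simple coercive, locally Lipschitz function that is nonsmooth at the origin. The constant step size, momentum coefficient, and test function are illustrative choices:

```python
import numpy as np

def f(x):
    # coercive and locally Lipschitz, nonsmooth at x[0] = 0
    return abs(x[0]) + 0.5 * x[1] ** 2

def subgrad(x):
    # one choice of subgradient (np.sign picks 0 at the kink)
    return np.array([np.sign(x[0]), x[1]])

def subgradient_momentum(x0, alpha=0.01, beta=0.5, iters=2000):
    """Heavy-ball iteration: x_{k+1} = x_k - alpha*g_k + beta*(x_k - x_{k-1})."""
    x = np.asarray(x0, float)
    x_prev = x.copy()
    for _ in range(iters):
        x_next = x - alpha * subgrad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

x = subgradient_momentum([2.0, -1.5])
```

Consistent with the theorem's flavour, the iterates do not converge exactly on the nonsmooth term but eventually remain in a small neighborhood of the critical point at the origin, with the neighborhood size governed by the constant step size.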
Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics
Three recent breakthroughs due to AI in arts and science serve as motivation:
An award winning digital image, protein folding, fast matrix multiplication.
Many recent developments in artificial neural networks, particularly deep
learning (DL), applied and relevant to computational mechanics (solid, fluids,
finite-element technology) are reviewed in detail. Both hybrid and pure machine
learning (ML) methods are discussed. Hybrid methods combine traditional PDE
discretizations with ML methods either (1) to help model complex nonlinear
constitutive relations, (2) to nonlinearly reduce the model order for efficient
simulation (turbulence), or (3) to accelerate the simulation by predicting
certain components in the traditional integration methods. Here, methods (1)
and (2) relied on the Long Short-Term Memory (LSTM) architecture, with method (3)
relying on convolutional neural networks. Pure ML methods to solve (nonlinear)
PDEs are represented by Physics-Informed Neural network (PINN) methods, which
could be combined with attention mechanism to address discontinuous solutions.
Both LSTM and attention architectures, together with modern and generalized
classic optimizers to include stochasticity for DL networks, are extensively
reviewed. Kernel machines, including Gaussian processes, are provided to
sufficient depth for more advanced works such as shallow networks with infinite
width. The review addresses not only experts: readers are assumed to be familiar
with computational mechanics, but not with DL, whose concepts and applications
are built up from the basics, with the aim of bringing first-time learners
quickly to the forefront of research. The history and limitations of AI are recounted and
discussed, with particular attention to pointing out misstatements or
misconceptions of the classics, even in well-known references. Positioning and
pointing control of a large-deformable beam is given as an example. Comment: 275 pages, 158 figures. Appeared online on 2023.03.01 at
CMES-Computer Modeling in Engineering & Science
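Since the abstract highlights the attention mechanism alongside LSTM, a minimal NumPy sketch of scaled dot-product attention may be useful. The shapes and random inputs are arbitrary illustrations:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 8))   # 5 queries of dimension 8
K = rng.normal(size=(5, 8))   # 5 keys
V = rng.normal(size=(5, 8))   # 5 values
out, w = scaled_dot_product_attention(Q, K, V)
```

Each output row is a convex combination of the value rows, with the attention weights in each row summing to one; this soft selection is what allows attention-augmented PINNs to concentrate capacity near discontinuous solution features.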
Empirical Study of Overfitting in Deep FNN Prediction Models for Breast Cancer Metastasis
Overfitting occurs when a model fits a specific data set too closely, which
weakens generalization and ultimately may reduce the accuracy of predictions on
future data. In this research we used an EHR
dataset concerning breast cancer metastasis to study overfitting of deep
feedforward neural network (FNN) prediction models. We included 11
hyperparameters of the deep FNN models and took an empirical approach to study
how each of these hyperparameters affected both prediction performance
and overfitting when given a large range of values. We also studied how some of
the interesting pairs of hyperparameters were interacting to influence the
model performance and overfitting. The 11 hyperparameters we studied include
activation function, weight initializer, number of hidden layers, learning rate,
momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show
that most of the single hyperparameters are either negatively or positively
correlated with model prediction performance and overfitting. In particular, we
found that overfitting overall tends to negatively correlate with learning
rate, decay, batch size, and L2, but tends to positively correlate with
momentum, epochs, and L1. According to our results, learning rate, decay, and
batch size may have a more significant impact on both overfitting and
prediction performance than most of the other hyperparameters, including L1,
L2, and dropout rate, which were designed to minimize overfitting. We also
found some interesting interacting pairs of hyperparameters, such as learning
rate and momentum, learning rate and decay, and batch size and epochs.
Keywords: Deep learning, overfitting, prediction, grid search, feedforward
neural networks, breast cancer metastasis
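The grid-search methodology named in the keywords can be sketched with a toy model: a small logistic-regression grid over learning rate and L2, using the train-validation accuracy gap as an overfitting proxy. The synthetic data, grid values, and gap metric are illustrative assumptions, not the study's EHR setup:

```python
import numpy as np

rng = np.random.default_rng(0)
# tiny synthetic binary-classification set: few training samples make overfitting easy
X_tr = rng.normal(size=(40, 10))
y_tr = (X_tr[:, 0] + 0.5 * rng.normal(size=40) > 0).astype(float)
X_va = rng.normal(size=(200, 10))
y_va = (X_va[:, 0] > 0).astype(float)

def sigmoid(z):
    out = np.empty_like(z, dtype=float)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def train_logreg(X, y, lr, l2, epochs=300):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)  # gradient step + L2 penalty
    return w

def accuracy(w, X, y):
    return float((((X @ w) > 0) == (y > 0.5)).mean())

results = {}
for lr in (0.01, 0.1, 1.0):
    for l2 in (0.0, 0.01, 0.1):
        w = train_logreg(X_tr, y_tr, lr, l2)
        # train-minus-validation accuracy as a crude overfitting indicator
        results[(lr, l2)] = accuracy(w, X_tr, y_tr) - accuracy(w, X_va, y_va)
```

The study's actual grid covers 11 hyperparameters of a deep FNN; the two-parameter grid here only illustrates the mechanics of measuring performance and overfitting per grid point.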
On-the-fly adaptivity for nonlinear twoscale simulations using artificial neural networks and reduced order modeling
A multi-fidelity surrogate model for highly nonlinear multiscale problems is
proposed. It is based on the introduction of two different surrogate models and
an adaptive on-the-fly switching. The two concurrent surrogates are built
incrementally starting from a moderate set of evaluations of the full order
model. To this end, a reduced order model (ROM) is generated. Using a hybrid
ROM-preconditioned FE solver, additional effective stress-strain data is
simulated while the number of samples is kept to a moderate level by using a
dedicated and physics-guided sampling technique. Machine learning (ML) is
subsequently used to build the second surrogate by means of artificial neural
networks (ANN). Different ANN architectures are explored and the features used
as inputs of the ANN are fine-tuned in order to improve the overall quality of
the ML model. Additional ANN surrogates for the stress errors are generated.
Therefore, conservative design guidelines for error surrogates are presented by
adapting the loss functions of the ANN training in pure regression or pure
classification settings. The error surrogates can be used as quality indicators
in order to adaptively select the appropriate -- i.e. efficient yet accurate --
surrogate. Two strategies for the on-the-fly switching are investigated and a
practicable and robust algorithm is proposed that eliminates relevant technical
difficulties attributed to model switching. The provided algorithms and ANN
design guidelines can easily be adopted for different problem settings and,
thereby, they enable generalization of the used machine learning techniques for
a wide range of applications. The resulting hybrid surrogate is employed in
challenging multilevel FE simulations for a three-phase composite with
pseudo-plastic micro-constituents. Numerical examples highlight the performance
of the proposed approach.
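The adaptive on-the-fly switching idea can be sketched in miniature: two surrogates of different fidelity plus an error indicator that selects between them per query. Everything here is an illustrative assumption: a sine function stands in for the full order model, a polynomial fit for the ANN surrogate, and an analytic remainder bound for the trained error surrogate:

```python
import numpy as np

truth = np.sin  # stand-in for the expensive full order model

# Surrogate 1: "ROM-like" linear approximation, cheap but only locally accurate.
rom = lambda x: x
# Surrogate 2: "ANN-like" polynomial fit on sampled data, accurate on a wider range.
xs = np.linspace(-np.pi, np.pi, 50)
coeffs = np.polyfit(xs, truth(xs), 7)
ml = lambda x: np.polyval(coeffs, x)

# Error indicator: Taylor remainder bound |x|^3/6 for sin, playing the role
# that a trained ANN error surrogate plays in the paper.
rom_err = lambda x: np.abs(x) ** 3 / 6.0

def adaptive_eval(x, tol=1e-3):
    """Use the cheap surrogate wherever its predicted error is within tolerance,
    otherwise fall back to the higher-fidelity surrogate."""
    return np.where(rom_err(x) < tol, rom(x), ml(x))

x = np.linspace(-np.pi, np.pi, 201)
err = float(np.max(np.abs(adaptive_eval(x) - truth(x))))
```

The conservative design principle from the abstract corresponds to making the error indicator an over-estimate, so that switching to the accurate surrogate happens early rather than late.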
Single-Frequency Network Terrestrial Broadcasting with 5GNR Numerology
The abstract is in the attachment.