Analyzing the Impacts of Activation Functions on the Performance of Convolutional Neural Network Models
Activation functions are a crucial part of convolutional neural networks (CNNs) because, to a very large extent, they determine the performance of the model. Various activation functions have been developed over the years, and the choice of which to use in a given model is usually a matter of trial and error. In this paper, we evaluate some of the most widely used activation functions and how they impact both the time required to train a CNN model and the performance of the model. We make recommendations for the best activation functions to use based on the results of our experiments.
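The training-speed differences this abstract alludes to are largely driven by the activations' gradients. As a minimal sketch (not the paper's code), the following compares three common CNN activations and their derivatives; the functions and the sample grid are illustrative choices:

```python
import numpy as np

# Three common CNN activation functions and their derivatives.
# Gradient magnitude largely drives how fast error signals propagate
# backwards during training.
def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

x = np.linspace(-4.0, 4.0, 9)
# Sigmoid gradients never exceed 0.25 (saturation), which can slow
# training relative to ReLU, whose gradient is 1 for positive inputs.
print(relu_grad(x).max(), sigmoid_grad(x).max(), tanh_grad(x).max())
```

Swapping one of these functions into a fixed architecture and re-timing training is the kind of controlled comparison the abstract describes.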
Deep frequency principle towards understanding why deeper learning is faster
Understanding the effect of depth in deep learning is a critical problem. In this work, we use Fourier analysis to empirically provide a promising mechanism for understanding why deeper feedforward learning is faster. To this end, we separate a deep neural network, trained by standard stochastic gradient descent, into two parts during analysis: a pre-condition component and a learning component, in which the output of the pre-condition component is the input of the learning component. We use a filtering method to characterize the frequency distribution of a high-dimensional function. Based on experiments with deep networks and real datasets, we propose a deep frequency principle: the effective target function for a deeper hidden layer biases towards lower frequency during training. Therefore, the learning component effectively learns a lower-frequency function if the pre-condition component has more layers. Due to the well-studied frequency principle, i.e., that deep neural networks learn lower-frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster. We believe these empirical studies will be valuable for future theoretical studies of the effect of depth in deep learning.
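The filtering idea above can be illustrated in one dimension. The following sketch (an illustrative assumption, not necessarily the paper's exact filter, which targets high-dimensional functions) measures what fraction of a sampled function's spectral energy lies below a frequency cutoff:

```python
import numpy as np

# Fraction of a sampled signal's power that lies below a frequency cutoff,
# computed from the FFT. A target function dominated by low frequencies
# gives a ratio near 1; the frequency principle says such functions are
# learned faster.
def low_frequency_ratio(values, cutoff):
    spectrum = np.abs(np.fft.rfft(values)) ** 2  # power per frequency bin
    return spectrum[:cutoff].sum() / spectrum.sum()

t = np.linspace(0.0, 1.0, 256, endpoint=False)
smooth = np.sin(2 * np.pi * t)        # energy concentrated at frequency 1
wiggly = np.sin(2 * np.pi * 40 * t)   # energy concentrated at frequency 40

print(low_frequency_ratio(smooth, 5))  # close to 1: low-frequency dominated
print(low_frequency_ratio(wiggly, 5))  # close to 0: high-frequency dominated
```

Tracking such a ratio for the effective target function seen by each hidden layer is the kind of measurement that would reveal the bias towards lower frequency at greater depth.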
Boundary integrated neural networks (BINNs) for 2D elastostatic and piezoelectric problems: Theory and MATLAB code
In this paper, we make the first attempt to apply boundary integrated neural networks (BINNs) to the numerical solution of two-dimensional (2D) elastostatic and piezoelectric problems. BINNs combine artificial neural networks with the well-established boundary integral equations (BIEs) to effectively solve partial differential equations (PDEs). The BIEs are used to map all the unknowns onto the boundary, after which these unknowns are approximated by artificial neural networks and resolved via a training process. In contrast to traditional neural network-based methods, BINNs offer several distinct advantages. First, by embedding BIEs into the learning procedure, BINNs only need to discretize the boundary of the solution domain, which can lead to a faster and more stable learning process (only the boundary conditions need to be fitted during training). Second, the differential operator of the PDEs is replaced by an integral operator, which eliminates the need for additional differentiation of the neural networks (high-order derivatives of neural networks may lead to instability in learning). Third, the loss function of BINNs contains only the residuals of the BIEs, as all the boundary conditions are inherently incorporated within the formulation. Therefore, there is no need for weighting functions, which are commonly used in traditional methods to balance the gradients among different objective terms. Moreover, BINNs can tackle PDEs in unbounded domains, since the integral representation remains valid for both bounded and unbounded domains. Extensive numerical experiments show that BINNs are much easier to train and usually give more accurate solutions than traditional neural network-based methods.
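The loss structure described above can be sketched schematically. The following is not the paper's MATLAB implementation: the unit-circle geometry, the smooth placeholder kernel `K`, and the tiny one-hidden-layer network are all illustrative assumptions. It shows the key point that only boundary points are sampled and the loss contains only the residual of a discretized integral equation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Schematic BINN-style setup: a small network approximates an unknown
# boundary density phi, and the loss is the residual of a discretized
# integral equation  u(x_i) = sum_j K(x_i, x_j) * phi(x_j) * w_j,
# where u is prescribed boundary data.
n = 64
theta = np.linspace(0.0, 2 * np.pi, n, endpoint=False)  # unit-circle boundary
w = 2 * np.pi / n                                       # quadrature weight
u_given = np.cos(theta)                                 # prescribed boundary data

# Placeholder smooth kernel (an assumption; a real BIE uses the problem's
# fundamental solution, e.g. the elastostatic Kelvin solution).
K = np.exp(-(theta[:, None] - theta[None, :]) ** 2)

# One-hidden-layer network phi(theta); note only boundary points appear.
W1 = rng.normal(size=(16, 1))
b1 = rng.normal(size=16)
W2 = rng.normal(size=16) * 0.1

def phi(t, W1, b1, W2):
    return np.tanh(t[:, None] * W1.T + b1) @ W2

def loss(W1, b1, W2):
    # Only the BIE residual: no separate boundary-condition penalty,
    # hence no weighting functions to balance competing terms.
    residual = K @ (phi(theta, W1, b1, W2) * w) - u_given
    return np.mean(residual ** 2)
```

Training would minimize `loss` over the network parameters; no interior collocation points and no derivatives of the network are ever needed, which is the stability advantage the abstract highlights.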
A data driven deep neural network model for predicting boiling heat transfer in helical coils under high gravity
In this article, a deep artificial neural network (ANN) model is proposed to predict boiling heat transfer in helical coils under high-gravity conditions, and its predictions are compared with experimental data. A test rig is set up to provide accelerations of up to 11 g, with heat fluxes up to 15100 W/m² and mass velocities from 40 to 2000 kg m⁻² s⁻¹. In the current work, a total of 531 data samples have been used in the ANN model. The proposed model was developed in a Python Keras environment as a feed-forward back-propagation (FFBP) multi-layer perceptron (MLP) with eight input features (mass flow rate, thermal power, inlet temperature, inlet pressure, direction, acceleration, tube inner surface area, helical coil diameter) and two output features (wall temperature, heat transfer coefficient). A deep ANN model composed of three hidden layers, with a total of 1098 neurons and 300,266 trainable parameters, was found to be optimal according to statistical error analysis. Performance is evaluated using six verification metrics (R², MSE, MAE, MAPE, RMSE and cosine proximity) between the experimental data and the predicted values. The results demonstrate that an 8-512-512-64-2 neural network has the best performance in predicting the helical coil characteristics, with R²=0.853, MSE=0.018, MAE=0.074, MAPE=1.110, RMSE=0.136 and cosine proximity=1.000 in the testing stage. This indicates that, with deep learning, the proposed model is able to successfully predict the heat transfer performance in helical coils, and in particular achieves excellent performance in predicting outputs that span a very wide range of values.
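The 8-512-512-64-2 architecture can be sketched in plain numpy (the authors used Keras; the initialisation and the ReLU hidden activations below are assumptions, not details from the abstract):

```python
import numpy as np

# Forward pass of an 8-512-512-64-2 feed-forward MLP: 8 input features,
# three hidden layers (512, 512, 64 neurons), 2 regression outputs
# (wall temperature, heat transfer coefficient).
layer_sizes = [8, 512, 512, 64, 2]
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.05, size=(m, k))
           for m, k in zip(layer_sizes, layer_sizes[1:])]
biases = [np.zeros(k) for k in layer_sizes[1:]]

def forward(x):
    # ReLU hidden layers (assumed); linear output layer for regression.
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ W + b)
    return x @ weights[-1] + biases[-1]

# Plain dense-layer parameter count for these sizes
# (the article reports 300,266 trainable parameters).
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(n_params)

y = forward(rng.normal(size=(4, 8)))  # 4 samples, 8 features -> shape (4, 2)
```

In practice the weights would be fitted by back-propagation on the 531 experimental samples; the sketch only fixes the shapes the abstract specifies.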