32 research outputs found

    Analyzing the Impacts of Activation Functions on the Performance of Convolutional Neural Network Models

    Get PDF
    Activation functions are a crucial part of convolutional neural networks (CNNs) because, to a large extent, they determine the performance of the CNN model. Various activation functions have been developed over the years, and the choice of activation function for a given model is usually a matter of trial and error. In this paper, we evaluate some of the most widely used activation functions and how they impact both the time required to train a CNN model and the performance of the model. We make recommendations for the best activation functions to use based on the results of our experiments.
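    As a rough illustration of the kind of comparison described above (not the authors' code), the following Python/Keras sketch trains a small CNN with several activation functions and records test accuracy and training time; the architecture, dataset (MNIST), activation list, and epoch count are assumptions made for illustration.

```python
# Minimal sketch: compare activation functions in a small Keras CNN,
# recording test accuracy and wall-clock training time for each.
import time
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

def build_cnn(activation):
    # Small illustrative CNN; not the architecture from the paper.
    return models.Sequential([
        layers.Conv2D(32, 3, activation=activation, input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation=activation),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation=activation),
        layers.Dense(10, activation="softmax"),
    ])

for act in ["relu", "tanh", "sigmoid", "elu", "selu"]:
    model = build_cnn(act)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    start = time.time()
    model.fit(x_train, y_train, epochs=3, batch_size=128, verbose=0)
    train_time = time.time() - start
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{act}: test accuracy {acc:.4f}, training time {train_time:.1f}s")
```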

    Deep frequency principle towards understanding why deeper learning is faster

    Full text link
    Understanding the effect of depth in deep learning is a critical problem. In this work, we utilize Fourier analysis to empirically provide a promising mechanism for understanding why deeper feedforward learning is faster. To this end, we separate a deep neural network, trained by standard stochastic gradient descent, into two parts during analysis, i.e., a pre-condition component and a learning component, where the output of the pre-condition component is the input of the learning component. We use a filtering method to characterize the frequency distribution of a high-dimensional function. Based on experiments with deep networks and real datasets, we propose a deep frequency principle: the effective target function for a deeper hidden layer biases towards lower frequency during training. Therefore, the learning component effectively learns a lower-frequency function if the pre-condition component has more layers. Due to the well-studied frequency principle, i.e., that deep neural networks learn lower-frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster. We believe these empirical studies will be valuable for future theoretical studies of the effect of depth in deep learning.
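    A very simplified, one-dimensional illustration of the kind of quantity such a frequency analysis tracks (not the paper's actual high-dimensional filtering procedure) is the fraction of a function's spectral energy that lies below a cutoff frequency; the function, cutoff, and grid below are illustrative assumptions.

```python
# Simplified 1-D illustration: fraction of spectral energy in the lowest
# frequency bins of a sampled function, computed with an FFT.
import numpy as np

def low_frequency_ratio(y, cutoff_bins=5):
    """Fraction of spectral energy in the lowest `cutoff_bins` frequencies."""
    spectrum = np.abs(np.fft.rfft(y)) ** 2
    return spectrum[:cutoff_bins].sum() / spectrum.sum()

x = np.linspace(0, 1, 512, endpoint=False)
# Toy target mixing a low-frequency and a high-frequency component.
target = np.sin(2 * np.pi * x) + 0.3 * np.sin(2 * np.pi * 30 * x)
print(f"low-frequency energy ratio: {low_frequency_ratio(target):.3f}")

# In the deep-frequency-principle setting, one would evaluate the effective
# target seen by the learning component (the target mapped through the
# pre-condition layers) and check whether its low-frequency ratio grows as
# the pre-condition component gets deeper.
```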

    Boundary integrated neural networks (BINNs) for 2D elastostatic and piezoelectric problems: Theory and MATLAB code

    Full text link
    In this paper, we make the first attempt to apply boundary integrated neural networks (BINNs) to the numerical solution of two-dimensional (2D) elastostatic and piezoelectric problems. BINNs combine artificial neural networks with the well-established boundary integral equations (BIEs) to effectively solve partial differential equations (PDEs). The BIEs are utilized to map all the unknowns onto the boundary, after which these unknowns are approximated using artificial neural networks and resolved via a training process. In contrast to traditional neural network-based methods, the present BINNs offer several distinct advantages. First, by embedding BIEs into the learning procedure, BINNs only need to discretize the boundary of the solution domain, which can lead to a faster and more stable learning process (only the boundary conditions need to be fitted during training). Second, the differential operator of the PDEs is replaced by an integral operator, which eliminates the need for additional differentiation of the neural networks (high-order derivatives of neural networks may lead to instability in learning). Third, the loss function of the BINNs contains only the residuals of the BIEs, as all the boundary conditions are inherently incorporated within the formulation. Therefore, there is no need to employ any weighting functions, which are commonly used in traditional methods to balance the gradients among different objective functions. Moreover, BINNs are able to tackle PDEs in unbounded domains, since the integral representation remains valid for both bounded and unbounded domains. Extensive numerical experiments show that BINNs are much easier to train and usually yield more accurate solutions than traditional neural network-based methods.
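    The paper provides MATLAB code; purely as a schematic outline of the structure described in the abstract, the Python sketch below shows a network that approximates the boundary unknowns and a training loop whose loss contains only the BIE residuals. The function `bie_residual` is a hypothetical placeholder for the discretized boundary integral equation residual and is not defined here; the layer sizes and optimizer settings are assumptions.

```python
# Schematic BINN-style training outline (not the authors' implementation).
import torch
import torch.nn as nn

class BoundaryNet(nn.Module):
    """Approximates the unknown boundary fields (e.g. displacements/tractions)."""
    def __init__(self, in_dim=2, out_dim=2, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, out_dim),
        )

    def forward(self, x):
        return self.net(x)

def train_binn(boundary_pts, bie_residual, steps=5000, lr=1e-3):
    # boundary_pts: collocation points on the boundary only; no interior
    # points are needed because the BIEs map all unknowns onto the boundary.
    model = BoundaryNet()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        fields = model(boundary_pts)
        # The loss contains only the BIE residuals; boundary conditions are
        # built into the integral formulation, so no weighting terms appear.
        loss = (bie_residual(boundary_pts, fields) ** 2).mean()
        loss.backward()
        opt.step()
    return model
```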

    A data driven deep neural network model for predicting boiling heat transfer in helical coils under high gravity

    Get PDF
    In this article, a deep artificial neural network (ANN) model is proposed to predict boiling heat transfer in helical coils under high-gravity conditions, and its predictions are compared with experimental data. A test rig was set up to provide high gravity up to 11 g, with a heat flux of up to 15,100 W/m² and a mass velocity ranging from 40 to 2000 kg m⁻² s⁻¹. In the current work, a total of 531 data samples were used in the ANN model. The proposed model was developed in a Python Keras environment as a feed-forward back-propagation (FFBP) multi-layer perceptron (MLP) using eight features (mass flow rate, thermal power, inlet temperature, inlet pressure, direction, acceleration, tube inner surface area, helical coil diameter) as the inputs and two features (wall temperature, heat transfer coefficient) as the outputs. The deep ANN model, composed of three hidden layers with a total of 1098 neurons and 300,266 trainable parameters, was found to be optimal according to statistical error analysis. Performance evaluation is conducted using six verification statistics (R², MSE, MAE, MAPE, RMSE and cosine proximity) between the experimental data and the predicted values. The results demonstrate that an 8-512-512-64-2 neural network has the best performance in predicting the helical coil characteristics, with R²=0.853, MSE=0.018, MAE=0.074, MAPE=1.110, RMSE=0.136 and cosine proximity=1.000 in the testing stage. The results indicate that, with the use of deep learning, the proposed model can successfully predict the heat transfer performance in helical coils, and it achieves especially good performance in predicting outputs that span a very large range of values.
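    A minimal Keras sketch of the 8-512-512-64-2 feed-forward network described above is given below; the activation function, optimizer, loss, training settings, and the placeholder data standing in for the 531 experimental samples are assumptions, since the abstract does not specify them.

```python
# Minimal sketch of an 8-512-512-64-2 feed-forward MLP in Keras.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_boiling_ann():
    model = models.Sequential([
        layers.Input(shape=(8,)),   # 8 input features (mass flow rate, thermal power, ...)
        layers.Dense(512, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(2),            # outputs: wall temperature, heat transfer coefficient
    ])
    model.compile(optimizer="adam", loss="mse",
                  metrics=["mae", "mape", tf.keras.metrics.CosineSimilarity()])
    return model

# Placeholder data standing in for the 531 experimental samples.
X = np.random.rand(531, 8).astype("float32")
y = np.random.rand(531, 2).astype("float32")
model = build_boiling_ann()
model.fit(X, y, validation_split=0.2, epochs=10, batch_size=32, verbose=0)
```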