Search CORE

8 research outputs found

The Adaptive Quadratic Linear Unit (AQuLU): Adaptive Non Monotonic Piecewise Activation Function

Author: Sui Yuanyuan, Wu Zhandong, Yu Haiye, Zhang Lei
Publication venue: Faculty of Mechanical Engineering in Slavonski Brod; Faculty of Electrical Engineering, Computer Science and Information Technology Osijek; Faculty of Civil Engineering in Osijek
Publication date: 01/01/2023

The activation function plays a key role in influencing the performance and training dynamics of neural networks. There are hundreds of activation functions, such as the widely used rectified linear units (ReLUs), but most of them are applied to complex and large neural networks, which often suffer from exploding and vanishing gradient problems. By studying a variety of non-monotonic activation functions, we propose a method to construct a non-monotonic activation function, x·Φ(x), with Φ(x) ∈ [0, 1]. With the hardening treatment of Φ(x), we propose an adaptive non-monotonic segmented activation function, called the adaptive quadratic linear unit, abbreviated as AQuLU, which ensures the sparsity of the input data and improves training efficiency. In image classification with different state-of-the-art neural network architectures, AQuLUs show significant advantages over various other activation functions on more complex and deeper architectures. The ablation study further validates the compatibility and stability of AQuLUs across different depths, complexities, optimizers, learning rates, and batch sizes. We thus demonstrate the high efficiency, robustness, and simplicity of AQuLUs.
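The x·Φ(x) construction named in the abstract can be illustrated with a minimal Python sketch. The gates used here are assumptions: the logistic sigmoid stands in for a smooth Φ, and a clipped linear ramp (`hard_sigmoid`) stands in for a "hardened" Φ. Neither is AQuLU's actual definition, which the abstract does not give, and all function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Smooth gate with values in (0, 1); written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear gate clipped to [0, 1].

    Assumption: a stand-in for a 'hardened' gate, not the gate used by AQuLU.
    """
    return min(1.0, max(0.0, (x + 3.0) / 6.0))


def gated(x: float, phi) -> float:
    """Activation of the form x * phi(x), where phi(x) lies in [0, 1]."""
    return x * phi(x)


if __name__ == "__main__":
    # Print both variants side by side for a few inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(x, gated(x, sigmoid), gated(x, hard_sigmoid))
```

With either gate, the output dips below zero for moderately negative inputs and returns toward zero further left, which is what makes the function non-monotonic. With the clipped gate, the output is exactly zero below the lower clip point, quadratic in between, and linear above the upper clip point.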

HRČAK - Portal of Croatian Scientific and Professional Journals

An efficient semi-sigmoidal non-linear activation function approach for deep neural networks

Author: Chieng Hock Hung
Publication date: 01/01/2022

A non-linear activation function is one of the key contributing factors to the success of Deep Learning (DL). Since the revival of DL in 2012, the Rectified Linear Unit (ReLU) has been regarded by the community as a de facto standard for many DL models. Despite its popularity, however, ReLU has several shortcomings that could result in inefficient learning of DL models. These shortcomings are: 1) the inherent negative cancellation property of ReLU tends to remove all negative inputs and causes massive information loss in the network; 2) the derivative of ReLU can cause the dead neuron problem in the network; 3) the mean activation generated by ReLU is highly positive and leads to a bias shift effect in the network layers; 4) the inherent multilinear structure of ReLU restricts the nonlinear capability of the network; 5) the predefined nature of ReLU limits the flexibility of the network. To address these shortcomings, this study proposed a new variant of activation function based on the Semi-sigmoidal (Sig) approach. Based on this approach, three variants of activation functions are introduced, namely, Shifted Semi-sigmoidal (SSig), Adaptive Shifted Semi-sigmoidal (ASSig), and Bi-directional Adaptive Shifted Semi-sigmoidal (BiASSig). The proposed activation functions were tested against ReLU (the baseline) and state-of-the-art methods using eight Deep Neural Networks (DNNs) on seven benchmark image datasets. Further, Adaptive Moment Estimation (ADAM) and Stochastic Gradient Descent (SGD) were selected as optimizers to train the DNNs. The baseline comparison score and mean rank were used to consolidate and analyse the experimental results effectively. In terms of the overall baseline comparison score, the experimental results showed that SSig, ASSig, and BiASSig obtained scores of 79, 87, and 86 out of 112, respectively, outperforming ReLU in more than 70% of the cases. In terms of overall mean rank (OMR), ReLU ranked tenth (10th), whereas SSig, ASSig, and BiASSig ranked fifth (5th), first (1st), and second (2nd), respectively, outperforming ReLU and the other compared methods.
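The first three ReLU shortcomings listed in the abstract can be checked numerically with a small Python sketch. The setup is an assumption made for illustration only: inputs are drawn from a standard normal distribution, and the helper name `relu_stats` is hypothetical. It is not the evaluation protocol of the study.

```python
import random


def relu(x: float) -> float:
    """ReLU: negative inputs are mapped to zero (negative cancellation)."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: zero for non-positive inputs, one otherwise."""
    return 1.0 if x > 0 else 0.0


def relu_stats(n: int, seed: int):
    """Return (mean input, mean ReLU output, fraction of zero gradients).

    Assumption: inputs are i.i.d. standard normal samples.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean_in = sum(xs) / n
    mean_out = sum(relu(x) for x in xs) / n
    zero_grad = sum(1 for x in xs if relu_grad(x) == 0.0) / n
    return mean_in, mean_out, zero_grad


if __name__ == "__main__":
    mean_in, mean_out, zero_grad = relu_stats(10_000, seed=0)
    print("mean input:", mean_in)
    print("mean ReLU output:", mean_out)
    print("fraction with zero gradient:", zero_grad)
```

Under this assumption, the inputs are centred near zero while the ReLU outputs have a clearly positive mean (the bias shift), and roughly half of the inputs receive a zero gradient, which is the precondition for dead neurons.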

UTHM Institutional Repository

Improving deep neural network with Multiple Parametric Exponential Linear Units

Author: Bengio, Chunxiao Fan, He, He, Hinton, Huang, Jin, Li, Li, Li, Qiong Wu, Shah, Srivastava, Srivastava, Yang Li, Yong Li, Yue Ming
Publication venue: Elsevier BV

Crossref

Classification & prediction methods and their application

Author: Stutzki Jan
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 24/11/2017