3 research outputs found

    The Adaptive Quadratic Linear Unit (AQuLU): Adaptive Non Monotonic Piecewise Activation Function

    The activation function plays a key role in the performance and training dynamics of neural networks. Hundreds of activation functions have been proposed, with rectified linear units (ReLUs) among the most widely used; however, when applied to complex and large neural networks, most of them suffer from exploding and vanishing gradient problems. By studying a variety of non-monotonic activation functions, we propose a method for constructing a non-monotonic activation function of the form x·Φ(x), with Φ(x) ∈ [0, 1]. By hardening Φ(x), we obtain an adaptive non-monotonic piecewise activation function, called the adaptive quadratic linear unit and abbreviated AQuLU, which ensures the sparsity of the input data and improves training efficiency. In image classification with different state-of-the-art neural network architectures, AQuLUs show significant advantages over various activation functions, particularly for more complex and deeper architectures. An ablation study further validates the compatibility and stability of AQuLUs across different depths, complexities, optimizers, learning rates, and batch sizes. We thus demonstrate the efficiency, robustness, and simplicity of AQuLUs.
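    A minimal sketch of the x·Φ(x) construction described above, assuming a hypothetical hard-sigmoid-style gate Φ(x) with thresholds a and b standing in for the adaptive parameters; this is an illustration of the general idea, not the paper's exact AQuLU formulation.

    ```python
    # Illustrative sketch only (assumed gate, not the paper's exact definition):
    # with a clipped-linear Phi, f(x) = x * Phi(x) is zero for x <= a,
    # piecewise quadratic on (a, b), and the identity for x >= b.
    import numpy as np

    def hard_gate(x, a=-2.0, b=2.0):
        """Hypothetical hardened Phi(x) in [0, 1]: a clipped linear ramp."""
        return np.clip((x - a) / (b - a), 0.0, 1.0)

    def aqulu_like(x, a=-2.0, b=2.0):
        """Non-monotonic activation of the form x * Phi(x)."""
        return x * hard_gate(x, a, b)

    if __name__ == "__main__":
        xs = np.linspace(-4.0, 4.0, 9)
        # Dips slightly below zero for moderately negative inputs,
        # then behaves like the identity for large positive inputs.
        print(aqulu_like(xs))
    ```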

    How important are activation functions in regression and classification? A survey, performance comparison, and future directions

    Inspired by biological neurons, activation functions play an essential part in the learning process of any artificial neural network and are used in many real-world problems. Various activation functions have been proposed in the literature for classification as well as regression tasks. In this work, we survey the activation functions that have been employed in the past as well as the current state of the art. In particular, we present various developments in activation functions over the years, along with their advantages, disadvantages, and limitations. We also discuss classical (fixed) activation functions, including rectifier units, and adaptive activation functions. In addition to a taxonomy of activation functions based on their characterization, a taxonomy based on their applications is presented. To this end, a systematic comparison of various fixed and adaptive activation functions is performed on classification data sets such as MNIST, CIFAR-10, and CIFAR-100. In recent years, a physics-informed machine learning framework has emerged for solving problems related to scientific computations. For this purpose, we also discuss various requirements for activation functions used in the physics-informed machine learning framework. Furthermore, various comparisons are made among different fixed and adaptive activation functions using machine learning libraries such as TensorFlow, PyTorch, and JAX.
    Comment: 28 pages, 15 figures
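    A short sketch of the fixed versus adaptive distinction discussed in this survey, written with PyTorch. The adaptive variant below uses a trainable slope inside a tanh; the module name, the choice of tanh, and the initialization are illustrative assumptions rather than a specific method from the survey.

    ```python
    import torch
    import torch.nn as nn

    class AdaptiveTanh(nn.Module):
        """tanh(a * x) with a trainable per-layer slope a (illustrative)."""
        def __init__(self, init_slope: float = 1.0):
            super().__init__()
            self.slope = nn.Parameter(torch.tensor(init_slope))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return torch.tanh(self.slope * x)

    # Fixed activation: the nonlinearity has no trainable parameters.
    fixed_net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
    # Adaptive activation: the slope is optimized jointly with the weights.
    adaptive_net = nn.Sequential(nn.Linear(2, 16), AdaptiveTanh(), nn.Linear(16, 1))

    x = torch.randn(4, 2)
    print(fixed_net(x).shape, adaptive_net(x).shape)  # torch.Size([4, 1]) twice
    ```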