Learning activation functions from data using cubic spline interpolation

Author: CT Chen
E Trentin
J Schmidhuber
L Ma
L Vecci
M Scarpiniti
M Zhang
P Chandra
S Goh
S Guarnieri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/05/2017

Neural networks require careful design in order to perform properly on a given task. In particular, selecting a good activation function (possibly in a data-dependent fashion) is a crucial step, which remains an open problem in the research community. Despite a large number of investigations, most current implementations simply select one fixed function from a small set of candidates, which is not adapted during training and is shared among all neurons throughout the different layers. However, neither of these two assumptions can be considered optimal in practice. In this paper, we present a principled way to have data-dependent adaptation of the activation functions, which is performed independently for each neuron. This is achieved by leveraging past and present advances in cubic spline interpolation, allowing for local adaptation of the functions around their regions of use. The resulting algorithm is relatively cheap to implement, and overfitting is counterbalanced by the inclusion of a novel damping criterion, which penalizes unwanted oscillations from a predefined shape. Experimental results validate the proposal over two well-known benchmarks. Comment: Submitted to the 27th Italian Workshop on Neural Networks (WIRN 2017).
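The idea of a spline activation with a damping term can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the Catmull-Rom basis, the tanh initial shape, the grid from -2 to 2 with step 0.5, and the squared-deviation penalty are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math

# Catmull-Rom basis (an assumed choice of cubic spline; rows multiply u^3, u^2, u, 1).
CR_BASIS = [
    [-0.5, 1.5, -1.5, 0.5],
    [1.0, -2.5, 2.0, -0.5],
    [-0.5, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]


def make_control_points(lo=-2.0, hi=2.0, step=0.5, shape=math.tanh):
    """Sample a predefined shape (assumed: tanh) on a uniform grid of knots."""
    n = int(round((hi - lo) / step)) + 1
    xs = [lo + k * step for k in range(n)]
    return xs, [shape(x) for x in xs]


def spline_activation(s, xs, q):
    """Evaluate the spline at s using the four control points around it."""
    step = xs[1] - xs[0]
    pos = (s - xs[0]) / step
    # Clamp the span index so that q[i-1] .. q[i+2] always exist;
    # inputs outside the grid reuse the outermost cubic piece.
    i = min(max(int(math.floor(pos)), 1), len(xs) - 3)
    u = pos - i
    powers = [u ** 3, u ** 2, u, 1.0]
    weights = [sum(powers[r] * CR_BASIS[r][c] for r in range(4)) for c in range(4)]
    return sum(w * qk for w, qk in zip(weights, q[i - 1:i + 3]))


def damping_penalty(q, q0):
    """Squared deviation of the control points from their initial shape (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(q, q0))


xs, q0 = make_control_points()
q = list(q0)
q[4] += 0.1  # nudge the control point at x = 0, as a training step might
```

Because each output depends on only four neighbouring control points, the nudge at x = 0 leaves the activation at x = 1.5 unchanged, which is the local adaptation the abstract refers to. In a real network each neuron would hold its own `q`, and the penalty would be added to the training loss.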

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Università di Roma La Sapienza

Universal Activation Function For Machine Learning

Author: Dong Xiaodai
Hoang Minh Tu
Lu Tao
Yuen Brosnan
Publication date: 07/11/2020

This article proposes a Universal Activation Function (UAF) that achieves near optimal performance in quantification, classification, and reinforcement learning (RL) problems. For any given problem, the optimization algorithms are able to evolve the UAF to a suitable activation function by tuning the UAF's parameters. For the CIFAR-10 classification and VGG-8, the UAF converges to the Mish-like activation function, which has near optimal performance $F_{1} = 0.9017 \pm 0.0040$ when compared to other activation functions. For the quantification of simulated 9-gas mixtures in 30 dB signal-to-noise ratio (SNR) environments, the UAF converges to the identity function, which has near optimal root mean square error of $0.4888 \pm 0.0032\ \mu\mathrm{M}$. In the BipedalWalker-v2 RL dataset, the UAF achieves the 250 reward in $961 \pm 193$ epochs, which proves that the UAF converges in the lowest number of epochs. Furthermore, the UAF converges to a new activation function in the BipedalWalker-v2 RL dataset.
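The abstract does not state the UAF's formula, so the sketch below uses a hypothetical five-parameter family built from two softplus terms. It only illustrates how tuning a few parameters can move a single function between familiar shapes, including the identity function mentioned above. The names `parametric_activation`, `a`, `b`, `c`, `d`, and `e` are assumptions for this example.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def parametric_activation(x, a, b, c, d, e):
    """Hypothetical family: softplus(a*(x+b) + c*x^2) - softplus(d*(x-b)) + e."""
    return softplus(a * (x + b) + c * x * x) - softplus(d * (x - b)) + e


# With a=1, b=0, c=0, d=-1, e=0 the two terms combine into the identity,
# because softplus(x) - softplus(-x) = x.
identity_params = dict(a=1.0, b=0.0, c=0.0, d=-1.0, e=0.0)

# With d=0 the second term is the constant log(2); offsetting it with e
# leaves a plain softplus, a smooth ReLU-like shape.
softplus_params = dict(a=1.0, b=0.0, c=0.0, d=0.0, e=math.log(2.0))
```

In training, the five parameters would be updated by the same optimizer as the network weights, which is what lets the function settle on a different shape for each task.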

arXiv.org e-Print Archive

A survey on modern trainable activation functions

Author: Apicella Andrea
Donnarumma Francesco
Isgrò Francesco
Prevete Roberto
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

In the neural network literature, there is strong interest in identifying and defining activation functions that can improve neural network performance. In recent years there has been renewed interest in the scientific community in investigating activation functions that can be trained during the learning process, usually referred to as "trainable", "learnable" or "adaptable" activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation functions have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion on the use of the term "activation function" in the literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive properties of recent and past models, and discuss the main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions and some simple local rule that constrains the corresponding weight layers. Comment: Published in "Neural Networks" journal (Elsevier).
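The equivalence claim in the last sentence can be made concrete with a small example. The sketch below uses a parametric ReLU with a learnable negative slope, chosen here purely as an illustration and not taken from the abstract, and rewrites it as a two-unit sublayer with fixed ReLU activations whose weights are tied by a simple rule.

```python
def relu(x):
    """Fixed (non-trainable) activation."""
    return max(0.0, x)


def prelu(x, a):
    """Trainable activation: the slope `a` on the negative side is learned."""
    return x if x >= 0 else a * x


def prelu_as_sublayer(x, a):
    """The same function built from two fixed ReLU units.

    Input weights are fixed to (+1, -1) and output weights to (1, -a),
    so the only trainable quantity is still `a`.
    """
    hidden = [relu(1.0 * x), relu(-1.0 * x)]
    return 1.0 * hidden[0] + (-a) * hidden[1]
```

The constraint on the weights (fixed input weights, one free output weight) plays the role of the "simple local rule" in this example: without it, the sublayer would be an ordinary two-unit hidden layer.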

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

The Adaptive Quadratic Linear Unit (AQuLU): Adaptive Non Monotonic Piecewise Activation Function

Author: Sui Yuanyuan
Wu Zhandong
Yu Haiye
Zhang Lei
Publication venue: Faculty of Mechanical Engineering in Slavonski Brod; Faculty of Electrical Engineering, Computer Science and Information Technology Osijek; Faculty of Civil Engineering in Osijek
Publication date: 01/01/2023

The activation function plays a key role in influencing the performance and training dynamics of neural networks. There are hundreds of activation functions widely used as rectified linear units (ReLUs), but most of them are applied to complex and large neural networks, which often have gradient explosion and vanishing gradient problems. By studying a variety of non-monotonic activation functions, we propose a method to construct a non-monotonic activation function, x·Φ(x), with Φ(x) ∈ [0, 1]. With the hardening treatment of Φ(x), we propose an adaptive non-monotonic segmented activation function, called the adaptive quadratic linear unit, abbreviated as AQuLU, which ensures the sparsity of the input data and improves training efficiency. In image classification based on different state-of-the-art neural network architectures, the performance of AQuLUs has significant advantages for more complex and deeper architectures with various activation functions. The ablation experimental study further validates the compatibility and stability of AQuLUs with different depths, complexities, optimizers, learning rates, and batch sizes. We thus demonstrate the high efficiency, robustness, and simplicity of AQuLUs.
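The construction x·Φ(x) with a gate Φ(x) in [0, 1] can be sketched as follows. The two gates used here, a logistic sigmoid and a clipped linear ramp, are well-known stand-ins chosen for illustration. They are not the AQuLU definition, which the abstract does not give, and the ramp constants 3 and 6 are assumptions.

```python
import math


def sigmoid(x):
    """Smooth gate with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_gate(x):
    """Piecewise-linear gate clipped to [0, 1] (a hard-sigmoid-style choice)."""
    return min(1.0, max(0.0, (x + 3.0) / 6.0))


def gated(x, gate):
    """Activation of the form x * gate(x)."""
    return x * gate(x)


# The smooth version dips below zero and comes back up, so it is non-monotonic.
dip = gated(-1.0, sigmoid)
far_left = gated(-5.0, sigmoid)

# The hardened version is exactly zero for sufficiently negative inputs
# and exactly the identity for sufficiently positive ones.
hard_left = gated(-5.0, hard_gate)
hard_right = gated(4.0, hard_gate)
```

Hardening the gate turns the smooth curve into a few polynomial segments. The exact zeros on the negative side are one way such a function can produce sparse outputs.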

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Improving classification models with context knowledge and variable activation functions

Author: Apicella Andrea
Publication date: 10/12/2018

This work proposes two methods to boost the performance of a given classifier. The first, which works on a neural network classifier, is a new type of trainable activation function, that is, a function which is adjusted during the learning phase, allowing the network to exploit the data better than it could with a classic fixed-shape activation function. The second provides two frameworks that use an external knowledge base to improve the classification results.

Università degli Studi di Napoli Federico II Open Archive

Learning activation functions from data using cubic spline interpolation

Author: CT Chen
E Trentin
J Schmidhuber
L Ma
L Vecci
M Scarpiniti
M Zhang
P Chandra
S Goh
S Guarnieri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Neural networks require careful design in order to perform properly on a given task. In particular, selecting a good activation function (possibly in a data-dependent fashion) is a crucial step, which remains an open problem in the research community. Despite a large number of investigations, most current implementations simply select one fixed function from a small set of candidates, which is not adapted during training and is shared among all neurons throughout the different layers. However, neither of these two assumptions can be considered optimal in practice. In this paper, we present a principled way to have data-dependent adaptation of the activation functions, which is performed independently for each neuron. This is achieved by leveraging past and present advances in cubic spline interpolation, allowing for local adaptation of the functions around their regions of use. The resulting algorithm is relatively cheap to implement, and overfitting is counterbalanced by the inclusion of a novel damping criterion, which penalizes unwanted oscillations from a predefined shape. Preliminary experimental results validate the proposal.

Crossref

Archivio della ricerca - Università di Roma La Sapienza