Learning activation functions from data using cubic spline interpolation
Neural networks require a careful design in order to perform properly on a
given task. In particular, selecting a good activation function (possibly in a
data-dependent fashion) is a crucial step, which remains an open problem in the
research community. Despite a large number of investigations, most current
implementations simply select one fixed function from a small set of
candidates, which is not adapted during training and is shared among all
neurons throughout the different layers. However, neither of these
assumptions can be assumed optimal in practice. In this paper, we present a
principled way to have data-dependent adaptation of the activation functions,
which is performed independently for each neuron. This is achieved by
leveraging past and present advances in cubic spline interpolation,
allowing for local adaptation of the functions around their regions of use. The
resulting algorithm is relatively cheap to implement, and overfitting is
counterbalanced by the inclusion of a novel damping criterion, which penalizes
unwanted oscillations from a predefined shape. Experimental results validate
the proposal over two well-known benchmarks.
Comment: Submitted to the 27th Italian Workshop on Neural Networks (WIRN 2017)
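The scheme described in this abstract can be sketched in a few lines; the class name, knot count, tanh reference shape, and penalty weight below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np
from scipy.interpolate import CubicSpline

class SplineActivation:
    """Per-neuron activation parameterized by control values at fixed knots.

    A hypothetical sketch: the values at the knots are the trainable
    parameters, and a cubic spline interpolates between them.
    """
    def __init__(self, n_knots=21, x_range=3.0):
        self.x = np.linspace(-x_range, x_range, n_knots)
        self.y0 = np.tanh(self.x)      # predefined reference shape
        self.y = self.y0.copy()        # trainable control values

    def __call__(self, z):
        # evaluate the spline, clamping inputs to the knot range
        z = np.clip(z, self.x[0], self.x[-1])
        return CubicSpline(self.x, self.y)(z)

    def damping_penalty(self, lam=1e-3):
        # penalize oscillation away from the reference shape,
        # in the spirit of the paper's damping criterion
        return lam * np.sum((self.y - self.y0) ** 2)

act = SplineActivation()
out = act(np.array([-1.0, 0.0, 1.0]))   # close to tanh before any training
```

Before training, the penalty is zero and the activation reproduces the reference shape; training would then adjust `act.y` per neuron, with the penalty discouraging unwanted oscillations.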
A survey on modern trainable activation functions
In neural networks literature, there is a strong interest in identifying and
defining activation functions which can improve neural network performance. In
recent years there has been a renewed interest from the scientific community in
investigating activation functions which can be trained during the learning
process, usually referred to as "trainable", "learnable" or "adaptable"
activation functions. They appear to lead to better network performance.
Diverse and heterogeneous models of trainable activation functions have been
proposed in the literature. In this paper, we present a survey of these models.
Starting from a discussion on the use of the term "activation function" in
literature, we propose a taxonomy of trainable activation functions, highlight
common and distinctive properties of recent and past models, and discuss the main
advantages and limitations of this type of approach. We show that many of the
proposed approaches are equivalent to adding neuron layers which use fixed
(non-trainable) activation functions together with a simple local rule that
constrains the corresponding weight layers.
Comment: Published in "Neural Networks" journal (Elsevier)
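A concrete instance of the trainable activation functions this survey covers is PReLU, whose negative-side slope is a learned parameter; a minimal NumPy sketch (names are illustrative):

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for x > 0, trainable slope a for x <= 0."""
    return np.where(x > 0, x, a * x)

def prelu_grad_a(x, upstream):
    """Gradient of the loss w.r.t. the trainable slope a (chain rule):
    d(prelu)/da is x on the negative side and 0 on the positive side."""
    return float(np.sum(upstream * np.where(x > 0, 0.0, x)))

x = np.array([-2.0, -0.5, 1.0])
y = prelu(x, a=0.25)                            # -> [-0.5, -0.125, 1.0]
g = prelu_grad_a(x, upstream=np.ones_like(x))   # -> -2.5
```

Because `a` receives a gradient like any other weight, the activation's shape is adapted during the ordinary training loop, which is what distinguishes this family from fixed activations.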
Application of new adaptive higher order neural networks in data mining
This paper introduces an adaptive Higher Order Neural Network (HONN) model and applies it to data mining tasks such as simulating and forecasting government taxation revenues. The proposed adaptive HONN model offers significant advantages over conventional Artificial Neural Network (ANN) models, including a much smaller network size, faster training, and much lower simulation and forecasting errors. The generalization ability of this HONN model is explored and discussed. A new approach for determining the best number of hidden neurons is also proposed.
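A higher-order network augments its input with product terms so that a single weighted sum can capture polynomial structure; a minimal second-order expansion might look like this (function name and layout are illustrative, not taken from the paper):

```python
import numpy as np

def second_order_features(x):
    """Augment an input vector with all pairwise products x_i * x_j (i <= j),
    the kind of expansion a second-order HONN applies before its weighted sum."""
    n = len(x)
    pairs = [x[i] * x[j] for i in range(n) for j in range(i, n)]
    return np.concatenate([x, np.array(pairs)])

x = np.array([2.0, 3.0])
phi = second_order_features(x)   # -> [2., 3., 4., 6., 9.]
```

A linear layer over `phi` can then represent quadratic functions of the original input, which is why a HONN can be smaller than a conventional ANN fitting the same data.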
NIPUNA: A Novel Optimizer Activation Function for Deep Neural Networks
In recent years, deep neural networks with different learning paradigms have been widely employed in applications including medical diagnosis, image analysis, self-driving vehicles and others. The activation functions employed in a deep neural network have a large impact on training and on the reliability of the resulting model. The Rectified Linear Unit (ReLU) has emerged as the most popular and extensively utilized activation function. ReLU has some flaws: during back-propagation it is active only when its input is positive and zero otherwise, which causes neurons to die (the "dying ReLU" problem) and introduces a bias shift. Unlike ReLU, the Swish activation function is smooth and non-monotonic (it does not move in a single direction). This research proposes a new activation function named NIPUNA for deep neural networks. We test this activation function by training customized convolutional neural networks (CCNN). On benchmark datasets (Fashion-MNIST images of clothing, the MNIST dataset of handwritten digits), the contributions are examined and compared to various activation functions. The proposed activation function can outperform traditional activation functions.
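The dying-ReLU behaviour contrasted here with Swish is visible directly in the gradients; a small illustrative check (this sketches standard ReLU and Swish, not the paper's NIPUNA function):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # zero gradient for every negative input: once a unit's inputs stay
    # negative, it receives no updates -- the "dying ReLU" problem
    return (x > 0).astype(float)

def swish(x, beta=1.0):
    return x / (1.0 + np.exp(-beta * x))

def swish_grad(x, beta=1.0):
    s = 1.0 / (1.0 + np.exp(-beta * x))
    # Swish keeps a small, nonzero gradient on the negative side
    return s + beta * x * s * (1.0 - s)

x = np.array([-2.0])
```

At `x = -2`, `relu_grad` is exactly zero while `swish_grad` is small but nonzero, so a Swish-like unit can still recover from the negative regime.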
Application of Higher-Order Neural Networks to Financial Time-Series Prediction
Financial time series data is characterized by non-linearities, discontinuities and high-frequency, multi-polynomial components. Not surprisingly, conventional Artificial Neural Networks (ANNs) have difficulty in modelling such complex data. A more appropriate approach is to apply Higher-Order ANNs, which are capable of extracting higher-order polynomial coefficients in the data. Moreover, since there is a one-to-one correspondence between network weights and polynomial coefficients, HONNs (unlike ANNs generally) can be considered 'open-box', rather than 'closed-box', solutions, and thus hold more appeal for the financial community. After developing Polynomial and Trigonometric HONNs, we introduce the concept of HONN groups. The latter incorporate piecewise-continuous activation functions and thresholds, and as a result are capable of modelling discontinuous (piecewise-continuous) data, moreover to any degree of accuracy. Several other PHONN variants are also described. The performance of P(T)HONNs and HONN groups on representative financial time series (credit ratings and exchange rates) is described. In short, HONNs offer roughly twice the performance of MLP/BP on financial time-series prediction, and HONN groups around a further 10% improvement.
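The one-to-one correspondence between HONN weights and polynomial coefficients claimed above can be illustrated with a least-squares fit over a polynomial basis (synthetic data; the setup is illustrative, not the paper's experiment):

```python
import numpy as np

# Polynomial "HONN" with basis [1, x, x^2]: after training, the weights
# ARE the polynomial coefficients, making the model an open-box solution.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 0.5 + 2.0 * x - 3.0 * x**2            # synthetic target polynomial
X = np.stack([np.ones_like(x), x, x**2], axis=1)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
# w recovers the generating coefficients [0.5, 2.0, -3.0]
```

Reading the fitted weights directly as polynomial coefficients is exactly the interpretability ("open-box") property that a generic MLP, whose weights have no such direct meaning, lacks.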