92,136 research outputs found

    Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators

    Full text link
    The choice of activation functions and their motivation is a long-standing issue within the neural network community. Neuronal representations within artificial neural networks are commonly understood as logits, representing the log-odds score of the presence of features within the stimulus. We derive logit-space operators equivalent to the probabilistic Boolean logic gates AND, OR, and XNOR for independent probabilities. Such theories are important for formalizing more complex dendritic operations in real neurons, and these operations can be used as activation functions within a neural network, introducing probabilistic Boolean logic as the core operation of the network. Since these functions involve taking multiple exponents and logarithms, they are computationally expensive and not well suited to direct use within neural networks. Consequently, we construct efficient approximations named AND_AIL (the AND operator Approximate for Independent Logits), OR_AIL, and XNOR_AIL, which use only comparison and addition operations, have well-behaved gradients, and can be deployed as activation functions in neural networks. Like MaxOut, AND_AIL and OR_AIL are generalizations of ReLU to two dimensions. While our primary aim is to formalize dendritic computations within a logit-space probabilistic-Boolean framework, we deploy these new activation functions, both in isolation and in conjunction, to demonstrate their effectiveness on a variety of tasks including image classification, transfer learning, abstract reasoning, and compositional zero-shot learning.
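    The exact operators follow directly from the independence assumption: with p = sigmoid(x) and q = sigmoid(y), the AND gate's output probability is pq, so its logit is log(pq / (1 - pq)), and similarly for OR and XNOR. The sketch below shows the exact operators plus piecewise approximations using only comparison and addition that match the exact operators' asymptotic behaviour; note these approximate forms are an illustration consistent with the abstract, not necessarily the paper's exact AIL definitions. Since each operator consumes a pair of pre-activations, using one as an activation halves the channel dimension, much as MaxOut does.

```python
import torch

def logit(p, eps=1e-7):
    """Log-odds of a probability, clamped away from 0 and 1."""
    p = p.clamp(eps, 1 - eps)
    return torch.log(p) - torch.log1p(-p)

def and_exact(x, y):
    """Exact logit-space AND of independent logits: P = pq."""
    p, q = torch.sigmoid(x), torch.sigmoid(y)
    return logit(p * q)

def or_exact(x, y):
    """Exact logit-space OR: P = 1 - (1-p)(1-q)."""
    p, q = torch.sigmoid(x), torch.sigmoid(y)
    return logit(1 - (1 - p) * (1 - q))

def xnor_exact(x, y):
    """Exact logit-space XNOR: P = pq + (1-p)(1-q)."""
    p, q = torch.sigmoid(x), torch.sigmoid(y)
    return logit(p * q + (1 - p) * (1 - q))

# Comparison-and-addition approximations (illustrative piecewise
# forms matching the asymptotics; may differ from the paper's AIL).
def and_ail(x, y):
    # min(x, y) when either logit is positive; x + y when both negative.
    return torch.where(torch.maximum(x, y) >= 0, torch.minimum(x, y), x + y)

def or_ail(x, y):
    # De Morgan duality: OR(x, y) = -AND(-x, -y).
    return -and_ail(-x, -y)

def xnor_ail(x, y):
    # Sign agreement, with magnitude set by the weaker logit.
    return torch.sign(x * y) * torch.minimum(x.abs(), y.abs())
```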

    A survey on modern trainable activation functions

    Full text link
    In the neural network literature, there is strong interest in identifying and defining activation functions which can improve neural network performance. In recent years there has been renewed interest from the scientific community in investigating activation functions which can be trained during the learning process, usually referred to as "trainable", "learnable" or "adaptable" activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation functions have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion of the use of the term "activation function" in the literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive properties of recent and past models, and discuss the main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions together with some simple local rule that constrains the corresponding weight layers.
    Comment: Published in the journal Neural Networks (Elsevier)
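    The survey's closing equivalence claim can be made concrete: a trainable activation such as PReLU is expressible as fixed (non-trainable) activations plus a simple local weight constraint. A minimal PyTorch sketch; the decomposition below is an illustrative example, not taken from the survey:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PReLUAsFixedLayers(nn.Module):
    """PReLU(x) = relu(x) - a * relu(-x): a trainable activation
    rewritten as two fixed ReLU units feeding a weighted sum whose
    weights are tied to (1, -a). The tying is the 'local rule'."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(0.25))  # trainable slope

    def forward(self, x):
        return F.relu(x) - self.a * F.relu(-x)

# Behaves like nn.PReLU: identity for x > 0, slope a for x < 0.
x = torch.linspace(-2, 2, 5)
print(PReLUAsFixedLayers()(x))
```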

    Learning Combinations of Activation Functions

    Full text link
    In the last decade, an active area of research has been devoted to designing novel activation functions that help deep neural networks converge and obtain better performance. The training procedure for these architectures usually involves optimizing only the weights of their layers, while the non-linearities are generally pre-specified and their (possible) parameters are treated as hyper-parameters to be tuned manually. In this paper, we introduce two approaches to automatically learn different combinations of base activation functions (such as the identity function, ReLU, and tanh) during the training phase. We present a thorough comparison of our novel approaches with well-known architectures (such as LeNet-5, AlexNet, and ResNet-56) on three standard datasets (Fashion-MNIST, CIFAR-10, and ILSVRC-2012), showing substantial improvements in overall performance, such as an increase of 3.01 percentage points in top-1 accuracy for AlexNet on ILSVRC-2012.
    Comment: 6 pages, 3 figures. Published as a conference paper at ICPR 2018. Code: https://bitbucket.org/francux/learning_combinations_of_activation_function
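    A sketch of the idea: treat the non-linearity itself as a small trainable module that mixes the base activation functions, and optimize its mixing weights jointly with the layer weights. Here the combination is kept convex via a softmax over learned logits; this is a minimal illustration, not necessarily the paper's exact parameterization:

```python
import torch
import torch.nn as nn

class LearnedActivation(nn.Module):
    """Convex combination of base activations (identity, ReLU, tanh)
    whose mixing weights are learned during training."""
    def __init__(self):
        super().__init__()
        self.bases = [lambda x: x, torch.relu, torch.tanh]
        self.logits = nn.Parameter(torch.zeros(len(self.bases)))

    def forward(self, x):
        w = torch.softmax(self.logits, dim=0)  # weights sum to 1
        return sum(wi * f(x) for wi, f in zip(w, self.bases))

# Drop-in replacement for a fixed non-linearity:
layer = nn.Sequential(nn.Linear(10, 10), LearnedActivation())
print(layer(torch.randn(2, 10)).shape)
```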

    CoCalc as a Learning Tool for Neural Network Simulation in the Special Course "Foundations of Mathematic Informatics"

    Full text link
    The role of neural network modeling in the learning content of the special course "Foundations of Mathematical Informatics" is discussed. The course was developed for students of technical universities, future IT specialists, and aims to bridge the gap between theoretical computer science and its applications in software, systems, and computing engineering. CoCalc is justified as a learning tool for mathematical informatics in general and for neural network modeling in particular. Elements of a technique for using CoCalc to study the topic "Neural networks and pattern recognition" of the special course are shown. The program code is presented in CoffeeScript and implements the basic components of an artificial neural network: neurons, synaptic connections, activation functions (hyperbolic tangent, sigmoid, step) and their derivatives, methods of computing the network's weights, etc. The application of the Kolmogorov-Arnold representation theorem to determining the architecture of multilayer neural networks is discussed. The implementation of a disjunctive logical element and the approximation of an arbitrary function using a three-layer neural network are given as examples. Based on the simulation results, conclusions are drawn about the limits within which the constructed networks retain their adequacy. Framework topics for individual research on artificial neural networks are proposed.
    Comment: 16 pages, 3 figures, Proceedings of the 13th International Conference on ICT in Education, Research and Industrial Applications: Integration, Harmonization and Knowledge Transfer (ICTERI 2018)
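    For readers without CoCalc at hand, the components the abstract lists are easy to reproduce outside CoffeeScript. A minimal Python sketch of the named activation functions, a derivative, and the disjunctive (OR) logical element as a single neuron; the weights here are chosen by hand for illustration rather than trained:

```python
import numpy as np

# Activation functions named in the abstract (hyperbolic tangent,
# sigmoid, step) and derivatives used in weight-update methods.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    s = sigmoid(x)
    return s * (1.0 - s)

tanh, tanh_deriv = np.tanh, lambda x: 1.0 - np.tanh(x) ** 2

def step(x):
    return np.where(x >= 0, 1.0, 0.0)

# Disjunctive (OR) logical element: one neuron that fires when
# at least one input is 1.
w, b = np.array([10.0, 10.0]), -5.0
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = sigmoid(np.dot(w, np.array(x)) + b)
    print(x, round(float(y), 3))
```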

    A Predictive Model for Assessment of Successful Outcome in Posterior Spinal Fusion Surgery

    Get PDF
    Background: Low back pain is a common problem for many people. Neurosurgeons recommend posterior spinal fusion (PSF) surgery as one of the therapeutic strategies for patients with low back pain. Because of the high risk of this type of surgery and the critical importance of making the right decision, accurate prediction of the surgical outcome is one of the main concerns for neurosurgeons.
    Methods: In this study, 12 types of multi-layer perceptron (MLP) networks, 66 radial basis function (RBF) networks, and a logistic regression (LR) model were created and compared to predict satisfaction with PSF surgery, one of the most well-known spinal surgeries.
    Results: Twenty-seven of the most important clinical and radiologic features for 480 patients (150 males, 330 females; mean age 52.32 ± 8.39 years) were used as model inputs, including: age, sex, type of disorder, duration of symptoms, job, walking distance without pain (WDP), walking distance without sensory (WDS) disorders, visual analog scale (VAS) scores, Japanese Orthopaedic Association (JOA) score, diabetes, smoking, knee pain (KP), pelvic pain (PP), osteoporosis, spinal deformity, etc. Indexes such as the area under the receiver operating characteristic curve (ROC-AUC), positive predictive value, negative predictive value, and accuracy were calculated to determine the best model. Postsurgical satisfaction was 77.5% at the 6-month follow-up. The patients were divided into training, testing, and validation data sets.
    Conclusion: The findings showed that the MLP model performed better than the RBF and LR models in predicting the outcome of PSF surgery.
    Keywords: Posterior spinal fusion surgery (PSF); Prediction; Surgical satisfaction; Multi-layer perceptron (MLP); Logistic regression (LR)
    Peer reviewed
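    The comparison protocol (fit several classifiers, score them on held-out data with ROC-AUC and accuracy) follows a standard supervised-learning recipe. A minimal scikit-learn sketch on randomly generated stand-in data, since the study's 480-patient cohort and its 27 features are not public here; RBF networks are omitted because scikit-learn has no built-in implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(480, 27))        # stand-in for 27 clinical features
y = rng.binomial(1, 0.775, size=480)  # stand-in for 77.5% satisfaction rate

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    prob = model.predict_proba(X_te)[:, 1]
    print(name,
          "AUC=%.3f" % roc_auc_score(y_te, prob),
          "acc=%.3f" % accuracy_score(y_te, model.predict(X_te)))
```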