A survey on modern trainable activation functions

Author: Apicella Andrea
Donnarumma Francesco
Isgrò Francesco
Prevete Roberto
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

In the neural network literature, there is strong interest in identifying and defining activation functions that can improve neural network performance. In recent years, the scientific community has shown renewed interest in activation functions that can be trained during the learning process, usually referred to as "trainable", "learnable" or "adaptable" activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation functions have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion of the use of the term "activation function" in the literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive properties of recent and past models, and discuss the main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers that use fixed (non-trainable) activation functions and some simple local rule that constrains the corresponding weight layers. Comment: Published in "Neural Networks" journal (Elsevier
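
The equivalence claimed at the end of this abstract can be illustrated with a small sketch. It is not taken from the survey: the choice of a PReLU-style activation with a trainable negative slope `a`, and all function names, are assumptions made for illustration. The sketch rewrites that trainable activation as two fixed ReLU units whose input weights are constrained to +1 and -1, leaving `a` as the only trainable output weight.

```python
def relu(x: float) -> float:
    """Fixed (non-trainable) activation."""
    return max(x, 0.0)


def prelu(x: float, a: float) -> float:
    """Trainable activation: identity for x >= 0, slope `a` for x < 0."""
    return x if x >= 0 else a * x


def prelu_via_fixed_relu(x: float, a: float) -> float:
    """Same function expressed as a tiny sub-layer with fixed ReLU units.

    Input weights are constrained to (+1, -1); output weights are (1, -a),
    so `a` is the only parameter left to train.
    """
    hidden = (relu(1.0 * x), relu(-1.0 * x))
    return 1.0 * hidden[0] + (-a) * hidden[1]


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, prelu(x, 0.25), prelu_via_fixed_relu(x, 0.25))
```

Both functions return the same value for every input, which is the sense in which a trainable activation can be read as extra neurons with fixed activations and constrained weights.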

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Design of a cognitive neural predictive controller for mobile robot

Author: Al‐Araji Ahmed
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2012

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. In this thesis, a cognitive neural predictive controller system has been designed to guide a nonholonomic wheeled mobile robot during continuous and non-continuous trajectory tracking and to navigate through static obstacles collision-free and with minimum tracking error. The structure of the controller consists of two layers: the first layer is a neural network system that controls the mobile robot actuators in order to track a desired path. The second layer of the controller is a cognitive layer that collects information from the environment and plans the optimal path. In addition, it detects whether there is any obstacle in the path so that it can be avoided by re-planning the trajectory using the particle swarm optimisation (PSO) technique. Two neural network models are used: the first is a modified Elman recurrent neural network model that describes the kinematic and dynamic model of the mobile robot; it is trained in off-line and on-line stages to guarantee that the outputs of the model accurately represent the actual outputs of the mobile robot system. The trained neural model acts as the position and orientation identifier. The second model is a feedforward multi-layer perceptron neural network that describes a feedforward neural controller; it is trained off-line and its weights are adapted on-line to find the reference torques, which control the steady-state outputs of the mobile robot system. The feedback neural controller is based on the posture neural identifier and a quadratic performance index predictive optimisation algorithm for N-step-ahead prediction, in order to find the optimal torque action in the transient state and stabilise the tracking error of the mobile robot system when the robot's trajectory drifts from the desired path.
Three controller methodologies were developed: the first is the feedback neural controller; the second is the nonlinear PID neural feedback controller; and the third is the nonlinear inverse dynamic neural feedback controller, based on the back-stepping method and the Lyapunov criterion. The main advantages of the presented approaches are that the robot plans an optimal path for itself, avoiding obstructions by using the intelligent PSO technique, and that the analytically derived control law, which has high computational accuracy, is combined with a predictive optimisation technique to obtain the optimal torque control action, leading to minimum tracking error of the mobile robot for different types of trajectories. The proposed control algorithm has been applied to monitor a nonholonomic wheeled mobile robot and has demonstrated the capability of tracking different trajectories with continuous gradients (lemniscate and circular) or non-continuous gradients (square) with bounded external disturbances and static obstacles. Simulation results and experimental work showed the effectiveness of the proposed cognitive neural predictive control algorithm; this is demonstrated by the tracking error being minimised to less than 1 cm and by the smoothness of the torque control signal, which remained below the maximum torque (0.236 N.m), especially when external disturbances are applied and when navigating through static obstacles. Results show that the five-steps-ahead prediction algorithm performs better than the one-step-ahead algorithm for all the control methodologies because it has a more complex control structure and takes into account future values of the desired trajectory, not only the current value as in the one-step-ahead method. The mean-square error of each component of the state error vector is used to compare the performance of the control methodologies in order to give better control results
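
The comparison metric mentioned at the end of this abstract, a mean-square error computed separately for each component of the state error vector, can be sketched as follows. This is an illustrative sketch only: the posture layout `(x, y, theta)`, the function name `per_component_mse`, and the sample values are assumptions, and orientation errors are not wrapped to a fixed angular range here.

```python
from typing import List, Sequence


def per_component_mse(
    desired: Sequence[Sequence[float]],
    actual: Sequence[Sequence[float]],
) -> List[float]:
    """Mean-square error of each state component over a trajectory.

    `desired` and `actual` hold one state vector per time step,
    for example (x, y, theta). Returns one MSE value per component.
    """
    if len(desired) != len(actual) or not desired:
        raise ValueError("trajectories must be non-empty and of equal length")
    n_steps = len(desired)
    n_dims = len(desired[0])
    return [
        sum((d[k] - a[k]) ** 2 for d, a in zip(desired, actual)) / n_steps
        for k in range(n_dims)
    ]


if __name__ == "__main__":
    # Hypothetical two-step trajectories, for illustration only.
    desired = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    actual = [(0.0, 0.1, 0.0), (1.0, 0.9, 0.2)]
    print(per_component_mse(desired, actual))
```

Reporting one value per component keeps position and orientation errors separate, so two controllers can be compared on each of them rather than on a single aggregated figure.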

Brunel University Research Archive

Multiresolution FIR neural-network-based learning algorithm applied to network traffic prediction

Author: Alarcon-Aquino V
Barria JA
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Published version

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

A Review of Fault Diagnosing Methods in Power Transmission Systems

Author: Akmal Muhammad
Alquthami Thamer
Benrabah Abdeldjabar
Raza Ali
Publication venue: 'MDPI AG'
Publication date: 14/02/2020

Transient stability is important in power systems. Disturbances such as faults need to be segregated to restore transient stability. A comprehensive review of fault diagnosing methods in the power transmission system is presented in this paper. Typically, voltage and current samples are deployed for analysis. Three tasks (fault detection, classification, and location) are presented separately to convey a more logical and comprehensive understanding of the concepts. Feature extraction and transformations with dimensionality reduction methods are discussed. Fault classification and location techniques largely use artificial intelligence (AI) and signal processing methods. After the discussion of overall methods and concepts, advancements and future aspects are discussed. Generalized strengths and weaknesses of different AI and machine learning-based algorithms are assessed. A comparison of different fault detection, classification, and location methods is also presented, considering features, inputs, complexity, system used and results. This paper may serve as a guideline for researchers to understand different methods and techniques in this field

Multidisciplinary Digital Publishing Institute

Sheffield Hallam University Research Archive

Suitable MLP Network Activation Functions For Breast Cancer And Thyroid Disease Detection.

Author: A. H.
Ahmad K.A.
Isa I.S.
Omar S.
Osman M.K.
Saad Z.
Sakim Mat
Publication date: 01/09/2010

This paper presents a comparison study of various MLP activation functions for detection and classification problems

Crossref

Repository@USM

Intelligent flight control systems

Author: Stengel Robert F.

The capabilities of flight control systems can be enhanced by designing them to emulate functions of natural intelligence. Intelligent control functions fall in three categories. Declarative actions involve decision-making, providing models for system monitoring, goal planning, and system/scenario identification. Procedural actions concern skilled behavior and have parallels in guidance, navigation, and adaptation. Reflexive actions are spontaneous, inner-loop responses for control and estimation. Intelligent flight control systems learn knowledge of the aircraft and its mission and adapt to changes in the flight environment. Cognitive models form an efficient basis for integrating 'outer-loop/inner-loop' control functions and for developing robust parallel-processing algorithms

NASA Technical Reports Server

A Temporally Coherent Neural Algorithm for Artistic Style Transfer

Author: Dushkoff Michael
Publication venue: RIT Scholar Works
Publication date: 01/07/2016

Within the fields of visual effects and animation, humans have historically spent countless painstaking hours mastering the skill of drawing frame-by-frame animations. One such technique, widely used in the animation and visual effects industry, is rotoscoping, which has allowed uniquely stylized animations to capture the motion of real-life action sequences; however, it is a very complex and time-consuming process. Automating this arduous technique would free animators from performing frame-by-frame stylization and allow them to concentrate on their own artistic contributions. This thesis introduces a new artificial system based on an existing neural style transfer method which creates artistically stylized animations that simultaneously reproduce both the motion of the original videos that they are derived from and the unique style of a given artistic work. This system utilizes a convolutional neural network framework to extract a hierarchy of image features used for generating images that appear visually similar to a given artistic style while at the same time faithfully preserving temporal content. The use of optical flow allows the combination of style and content to be integrated directly with the apparent motion over frames of a video to produce smooth and visually appealing transitions. The implementation described in this thesis demonstrates how biologically-inspired systems such as convolutional neural networks are rapidly approaching human-level behavior in tasks that were once thought impossible for computers. Such a complex task elucidates the current and future technical and artistic capabilities of such biologically-inspired neural systems as their horizons expand exponentially. Further, this research provides unique insights into the way that humans perceive and utilize temporal information in everyday tasks.
A secondary implementation that is explored in this thesis seeks to improve existing convolutional neural networks using a biological approach to the way these models adapt to their inputs. This implementation shows how these pattern recognition systems can be greatly improved by integrating recent neuroscience research into already biologically inspired systems. Such a novel hybrid activation function model replicates recent findings in the field of neuroscience and shows significant advantages over existing static activation functions

RIT Scholar Works

Neural network-based emulation of interstellar medium models

The interpretation of observations of atomic and molecular tracers in the galactic and extragalactic interstellar medium (ISM) requires comparisons with state-of-the-art astrophysical models to infer some physical conditions. Usually, ISM models are too time-consuming for such inference procedures, as they call for numerous model evaluations. As a result, they are often replaced by an interpolation of a grid of precomputed models. We propose a new general method to derive faster, lighter, and more accurate approximations of the model from a grid of precomputed models. These emulators are defined with artificial neural networks (ANNs) designed and trained to address the specificities inherent in ISM models. Indeed, such models often predict many observables (e.g., line intensities) from just a few input physical parameters and can yield outliers due to numerical instabilities or physical bistabilities. We propose applying five strategies to address these characteristics: 1) an outlier removal procedure; 2) a clustering method that yields homogeneous subsets of lines that are simpler to predict with different ANNs; 3) a dimension reduction technique that makes it possible to size the network architecture adequately; 4) an augmentation of the physical inputs with a polynomial transform to ease the learning of nonlinearities; and 5) a dense architecture to ease the learning of simple relations. We compare the proposed ANNs with standard classes of interpolation methods to emulate the Meudon PDR code, a representative ISM numerical model. Combinations of the proposed strategies outperform all interpolation methods by a factor of 2 on the average error, reaching 4.5% on the Meudon PDR code. These networks are also 1000 times faster than accurate interpolation methods and require ten to forty times less memory. This work will enable efficient inferences on wide-field multiline observations of the ISM
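
The fourth strategy, augmenting the physical inputs with a polynomial transform, can be sketched as below. The abstract does not specify the exact transform, so this is an assumption: the sketch appends every monomial of the inputs from degree 2 up to a chosen degree. The function name `polynomial_augment` and the sample inputs are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import List, Sequence


def polynomial_augment(inputs: Sequence[float], degree: int = 2) -> List[float]:
    """Return the inputs followed by all their monomials of degree 2..degree.

    For inputs (u, v) and degree 2 the result is [u, v, u*u, u*v, v*v].
    """
    features = list(inputs)
    indices = range(len(inputs))
    for d in range(2, degree + 1):
        # Each combination of indices (with repetition) defines one monomial.
        for combo in combinations_with_replacement(indices, d):
            features.append(prod(inputs[i] for i in combo))
    return features


if __name__ == "__main__":
    # Hypothetical pair of input parameters, for illustration only.
    print(polynomial_augment([2.0, 3.0], degree=2))
    # prints [2.0, 3.0, 4.0, 6.0, 9.0]
```

Feeding such augmented features to a network gives it direct access to products of the inputs, which is one plausible reading of how the transform eases the learning of nonlinearities.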

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Hal - Université Grenoble Alpes