A survey on modern trainable activation functions
In neural networks literature, there is a strong interest in identifying and
defining activation functions which can improve neural network performance. In
recent years there has been a renewed interest within the scientific community in
investigating activation functions which can be trained during the learning
process, usually referred to as "trainable", "learnable" or "adaptable"
activation functions. They appear to lead to better network performance.
Diverse and heterogeneous models of trainable activation functions have been
proposed in the literature. In this paper, we present a survey of these models.
Starting from a discussion on the use of the term "activation function" in
literature, we propose a taxonomy of trainable activation functions, highlight
common and distinctive properties of recent and past models, and discuss the main
advantages and limitations of this type of approach. We show that many of the
proposed approaches are equivalent to adding neuron layers which use fixed
(non-trainable) activation functions and some simple local rule that
constrains the corresponding weight layers.
Comment: Published in "Neural Networks" journal (Elsevier
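The equivalence claimed at the end of this abstract can be illustrated with a minimal sketch. The parametric ReLU (PReLU) used here is an example chosen for illustration, not necessarily one the survey analyzes in this form, and the function names are hypothetical. The sketch shows that a PReLU with trainable negative slope `a` gives the same output as a small sublayer of two fixed ReLU units whose weights are constrained to +1, -1 and 1, leaving `-a` as the only trainable weight.

```python
def relu(x):
    """Fixed (non-trainable) activation."""
    return x if x > 0.0 else 0.0


def prelu(x, a):
    """Trainable activation: slope 1 for x >= 0, trainable slope `a` for x < 0."""
    return x if x >= 0.0 else a * x


def prelu_as_fixed_relu_sublayer(x, a):
    """Same function built from fixed ReLU units plus constrained weights."""
    # Hidden sublayer: two ReLU units whose input weights are fixed to +1 and -1.
    h_pos = relu(1.0 * x)
    h_neg = relu(-1.0 * x)
    # Output weights: the first is fixed to 1; the second, -a, is the only
    # trainable quantity.
    return 1.0 * h_pos + (-a) * h_neg


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        difference = prelu(x, 0.25) - prelu_as_fixed_relu_sublayer(x, 0.25)
        assert abs(difference) < 1e-12
```

The identity behind the sketch is prelu(x) = relu(x) - a * relu(-x), so training the slope of the activation amounts to training one weight of an ordinary layer while the others stay fixed.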
Fleet Prognosis with Physics-informed Recurrent Neural Networks
Services and warranties for large fleets of engineering assets are a very
profitable business. The success of companies in that area is often related to
predictive maintenance driven by advanced analytics. Therefore, accurate
modeling, as a way to understand how the complex interactions between operating
conditions and component capability define useful life, is key for services
profitability. Unfortunately, building prognosis models for large fleets is a
daunting task as factors such as duty cycle variation, harsh environments,
inadequate maintenance, and problems with mass production can lead to large
discrepancies between designed and observed useful lives. This paper introduces
a novel physics-informed neural network approach to prognosis by extending
recurrent neural networks to cumulative damage models. We propose a new
recurrent neural network cell designed to merge physics-informed and
data-driven layers. With that, engineers and scientists have the chance to use
physics-informed layers to model parts that are well understood (e.g., fatigue
crack growth) and use data-driven layers to model parts that are poorly
characterized (e.g., internal loads). A simple numerical experiment is used to
present the main features of the proposed physics-informed recurrent neural
network for damage accumulation. The test problem consists of predicting fatigue
crack length for a synthetic fleet of airplanes subject to different mission
mixes. The model is trained using full observation inputs (far-field loads) and
very limited observation of outputs (crack length at inspection for only a
portion of the fleet). The results demonstrate that our proposed hybrid
physics-informed recurrent neural network is able to accurately model fatigue
crack growth even when the observed distribution of crack length does not match
the (unobservable) fleet distribution.
Comment: Data and codes (including our implementations of the multi-layer
perceptron, the stress intensity and Paris law layers, the cumulative damage
cell, as well as Python driver scripts) used in this manuscript are publicly
available on GitHub at https://github.com/PML-UCF/pinn. The data and code are
released under the MIT Licens
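The cumulative damage idea described in this abstract can be sketched as a recurrent step whose state is the crack length. This is a minimal illustration under stated assumptions, not the authors' implementation from the linked repository: it uses the textbook Paris law, da/dN = C * (delta_K)^m with delta_K = F * delta_S * sqrt(pi * a), and every function name, the linear `internal_stress_range` stand-in for a data-driven layer, and the values of `c`, `m`, and the loads are hypothetical.

```python
import math


def internal_stress_range(far_field_load, weight=1.0):
    """Hypothetical stand-in for a data-driven layer (load -> stress range)."""
    return weight * far_field_load


def stress_intensity_range(delta_s, a, geometry_factor=1.0):
    """Physics-informed layer: delta_K = F * delta_S * sqrt(pi * a)."""
    return geometry_factor * delta_s * math.sqrt(math.pi * a)


def paris_law_increment(delta_k, c, m):
    """Physics-informed layer: crack growth per cycle, C * delta_K**m."""
    return c * delta_k ** m


def cumulative_damage_cell(a_prev, far_field_load, c, m):
    """One recurrent step: the new crack length is the old one plus an increment."""
    delta_s = internal_stress_range(far_field_load)
    delta_k = stress_intensity_range(delta_s, a_prev)
    return a_prev + paris_law_increment(delta_k, c, m)


def predict_crack_length(a0, load_history, c=1.5e-11, m=3.8):
    """Unroll the cell over a load history (c and m are illustrative values)."""
    a = a0
    for load in load_history:
        a = cumulative_damage_cell(a, load, c, m)
    return a


if __name__ == "__main__":
    mild_mission = [100.0] * 1000    # hypothetical far-field loads
    severe_mission = [120.0] * 1000
    print(predict_crack_length(0.005, mild_mission))
    print(predict_crack_length(0.005, severe_mission))
```

In a hybrid model of the kind the abstract describes, the stand-in `internal_stress_range` would be replaced by trainable layers, while the stress intensity and Paris law steps would stay fixed as physics.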
Convolutional Drift Networks for Video Classification
Analyzing spatio-temporal data like video is a challenging task that requires
processing visual and temporal information effectively. Convolutional Neural
Networks have shown promise as baseline fixed feature extractors through
transfer learning, a technique that helps minimize the training cost on visual
information. Temporal information is often handled using hand-crafted features
or Recurrent Neural Networks, but this can be overly specific or prohibitively
complex. Building a fully trainable system that can efficiently analyze
spatio-temporal data without hand-crafted features or complex training is an
open challenge. We present a new neural network architecture to address this
challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines
the visual feature extraction power of deep Convolutional Neural Networks with
the intrinsically efficient temporal processing provided by Reservoir
Computing. In this introductory paper on the CDN, we provide a very simple
baseline implementation tested on two egocentric (first-person) video activity
datasets. We achieve video-level activity classification results on par with
state-of-the-art methods. Notably, performance on this complex spatio-temporal
task was produced by only training a single feed-forward layer in the CDN.
Comment: Published in IEEE Rebooting Computin
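The division of labor described in this abstract (fixed feature extraction, fixed temporal processing, one trained layer) can be sketched with a generic reservoir. This is not the authors' CDN: it is a plain echo-state-style reservoir written under assumptions, with hypothetical class and function names, toy two-dimensional vectors standing in for per-frame CNN features, and a logistic readout trained by stochastic gradient descent as the single trained layer.

```python
import math
import random


class Reservoir:
    """Fixed random recurrent layer; its weights are never trained."""

    def __init__(self, n_inputs, n_units, scale=0.5, seed=0):
        rng = random.Random(seed)
        self.w_in = [[rng.uniform(-scale, scale) for _ in range(n_inputs)]
                     for _ in range(n_units)]
        # Each row of recurrent weights has absolute sum at most `scale`,
        # so with scale < 1 the influence of old frames fades over time.
        self.w_rec = [[rng.uniform(-scale, scale) / n_units
                       for _ in range(n_units)]
                      for _ in range(n_units)]

    def run(self, frames):
        """Feed per-frame feature vectors in order; return the final state."""
        state = [0.0] * len(self.w_rec)
        for u in frames:
            state = [
                math.tanh(sum(wi * ui for wi, ui in zip(w_in_row, u))
                          + sum(wr * s for wr, s in zip(w_rec_row, state)))
                for w_in_row, w_rec_row in zip(self.w_in, self.w_rec)
            ]
        return state


def train_readout(states, labels, epochs=200, lr=0.5):
    """Train the only trainable part: one logistic feed-forward layer."""
    w, b = [0.0] * len(states[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(states, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


if __name__ == "__main__":
    reservoir = Reservoir(n_inputs=2, n_units=20)
    video_a = [[1.0, 0.0]] * 5   # toy stand-ins for CNN frame features
    video_b = [[0.0, 1.0]] * 5
    states = [reservoir.run(video_a), reservoir.run(video_b)]
    w, b = train_readout(states, labels=[1, 0])
    print(len(w), b)
```

Only `train_readout` updates any parameters; the reservoir weights are drawn once from a seeded generator and left untouched, which is what keeps training cheap in this kind of design.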