
    A survey on modern trainable activation functions

    In the neural networks literature, there is strong interest in identifying and defining activation functions which can improve neural network performance. In recent years the scientific community has shown renewed interest in activation functions that can be trained during the learning process, usually referred to as "trainable", "learnable" or "adaptable" activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation functions have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion on the use of the term "activation function" in the literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive properties of recent and past models, and discuss the main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions together with a simple local rule that constrains the corresponding weight layers. Comment: Published in the "Neural Networks" journal (Elsevier).
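
    As a minimal illustration of the general idea (not any particular model from the survey), a PReLU-style unit makes the negative slope of a ReLU a learnable parameter trained alongside the weights; the class name and initial value below are assumptions for this sketch.

        import torch
        import torch.nn as nn

        class TrainablePReLU(nn.Module):
            # Toy trainable activation: the negative slope 'alpha' is a
            # parameter updated by backpropagation like any other weight.
            def __init__(self, init_alpha: float = 0.25):
                super().__init__()
                self.alpha = nn.Parameter(torch.tensor(init_alpha))

            def forward(self, x):
                # Identity on the positive part, learned slope on the negative part.
                return torch.clamp(x, min=0) + self.alpha * torch.clamp(x, max=0)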

    Universal Approximation of Parametric Optimization via Neural Networks with Piecewise Linear Policy Approximation

    Parametric optimization solves a family of optimization problems as a function of parameters. It is a critical component in situations where optimal decisions must be made repeatedly for updated parameter values, but computation becomes challenging when complex problems need to be solved in real time. In this study, we therefore present theoretical foundations for approximating the optimal policy of a parametric optimization problem with neural networks, and we derive conditions under which the Universal Approximation Theorem applies to parametric optimization problems by explicitly constructing a piecewise linear policy approximation. This study fills a gap by formally analyzing the constructed piecewise linear approximation in terms of feasibility and optimality, and shows that neural networks with ReLU activations are valid approximators for this approximation in terms of generalization and approximation error. Furthermore, based on the theoretical results, we propose a strategy to improve the feasibility of the approximated solution and discuss training with suboptimal solutions. Comment: 17 pages, 2 figures, preprint, under review.
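
    As a hedged sketch of the setting (not the paper's construction), the toy parametric problem min_x (x - theta)^2 subject to x >= 0 has the piecewise linear optimal policy x*(theta) = max(theta, 0); a small ReLU network can be fit to this parameter-to-solution map from sampled parameters. The network size, sampling range, and optimizer settings below are illustrative assumptions.

        import torch
        import torch.nn as nn

        torch.manual_seed(0)
        theta = torch.linspace(-2.0, 2.0, 400).unsqueeze(1)  # sampled parameter values
        x_star = torch.clamp(theta, min=0.0)                 # optimal policy x*(theta) = max(theta, 0)

        # Small ReLU network approximating the parameter-to-solution map.
        policy_net = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
        optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-2)

        for step in range(500):
            optimizer.zero_grad()
            loss = nn.functional.mse_loss(policy_net(theta), x_star)
            loss.backward()
            optimizer.step()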

    Greedy Shallow Networks: An Approach for Constructing and Training Neural Networks

    We present a greedy approach to constructing an efficient single-hidden-layer neural network with ReLU activation that approximates a target function. In our approach we obtain a shallow network by applying a greedy algorithm to a prescribed dictionary built from the available training data and a set of possible inner weights. To facilitate the greedy selection process we employ an integral representation of the network, based on the ridgelet transform, that significantly reduces the cardinality of the dictionary and hence makes the greedy selection feasible. Our approach allows for the construction of efficient architectures which can be treated either as improved initializations to be used in place of random ones or, in certain cases, as fully trained networks, thus potentially removing the need for backpropagation training. Numerical experiments demonstrate the viability of the proposed concept and its advantages over conventional techniques for selecting architectures and initializations for neural networks.
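
    A rough sketch of a greedy construction under simplifying assumptions (a hand-picked finite dictionary instead of the paper's ridgelet-based one, and a sine target chosen only for illustration): at each step the ReLU atom most correlated with the current residual is selected, and the outer-layer coefficients are refit by least squares.

        import numpy as np

        xs = np.linspace(-1.0, 1.0, 200)
        target = np.sin(np.pi * xs)                      # target function to approximate

        # Dictionary of candidate hidden units relu(w*x + b) from assumed inner weights.
        ws = np.array([-2.0, -1.0, 1.0, 2.0])
        bs = np.linspace(-1.0, 1.0, 21)
        atoms = [np.maximum(w * xs + b, 0.0) for w in ws for b in bs]

        selected, residual = [], target.copy()
        for _ in range(8):                               # greedily add 8 hidden units
            scores = [abs(a @ residual) / (np.linalg.norm(a) + 1e-12) for a in atoms]
            selected.append(atoms[int(np.argmax(scores))])
            A = np.stack(selected, axis=1)               # design matrix of chosen atoms
            coef, *_ = np.linalg.lstsq(A, target, rcond=None)
            residual = target - A @ coef
        print("approximation error:", np.linalg.norm(residual))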