Scalable and Sustainable Deep Learning via Randomized Hashing

Author: Chen Wenlin
Gionis Aristides
Indyk Piotr
Loosli Gaëlle
Lv Qin
McMahan H. Brendan
Recht Benjamin
Shrivastava Anshumali
Publication date: 04/12/2016

Current deep learning architectures are growing larger in order to learn from complex datasets. These architectures require giant matrix multiplication operations to train millions of parameters. Conversely, there is another growing trend to bring deep learning to low-power, embedded devices. The matrix operations associated with both training and testing of deep networks are very expensive from a computational and energy standpoint. We present a novel hashing-based technique to drastically reduce the amount of computation needed to train and test deep networks. Our approach combines recent ideas from adaptive dropout and randomized hashing for maximum inner product search to efficiently select the nodes with the highest activations. Our new algorithm for deep learning reduces the overall computational cost of forward and back-propagation by operating on significantly fewer (sparse) nodes. As a consequence, our algorithm uses only 5% of the total multiplications while staying, on average, within 1% of the accuracy of the original model. A unique property of the proposed hashing-based back-propagation is that the updates are always sparse. Because the gradient updates are sparse, our algorithm is ideally suited for asynchronous and parallel training, leading to near-linear speedup with an increasing number of cores. We demonstrate the scalability and sustainability (energy efficiency) of our proposed algorithm via rigorous experimental evaluations on several real datasets.
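The node-selection idea in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it swaps in plain signed random projections (SimHash, which targets cosine similarity) for the maximum-inner-product hashing the abstract mentions, uses a single hash table, and every name (`simhash`, `build_table`, `sparse_forward`) and parameter value is an assumption made for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def simhash(vec, planes):
    # One bit per random hyperplane: which side of the plane the vector falls on.
    return tuple(1 if dot(p, vec) >= 0 else 0 for p in planes)

def build_table(weights, planes):
    # Bucket every node by the hash of its weight vector (done once, up front).
    table = {}
    for node, w in enumerate(weights):
        table.setdefault(simhash(w, planes), []).append(node)
    return table

def sparse_forward(x, weights, table, planes):
    # Only nodes whose hash collides with the input's hash are evaluated;
    # all other nodes are left at zero, so their multiplications are skipped.
    active = table.get(simhash(x, planes), [])
    out = [0.0] * len(weights)
    for node in active:
        out[node] = max(0.0, dot(weights[node], x))  # ReLU on the selected nodes
    return out, active

# Hypothetical sizes, chosen only for the demonstration.
rng = random.Random(0)
dim, num_nodes, num_bits = 8, 200, 6
planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]
weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]
x = [rng.gauss(0, 1) for _ in range(dim)]
weights[3] = [2.0 * v for v in x]  # node 3 is perfectly aligned with the input

table = build_table(weights, planes)
out, active = sparse_forward(x, weights, table, planes)
print(f"evaluated {len(active)} of {num_nodes} nodes")
```

Node 3 always lands in the input's bucket because scaling a vector by a positive constant does not change the sign of any projection. In a training loop built on this sketch, only the nodes in `active` would receive gradient updates, which is the sparsity property the abstract highlights.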

arXiv.org e-Print Archive

Crossref

Asymptotic stability for neural networks with mixed time-delays: The discrete-time case

Author: Liu Xiaohui
Liu Yurong
Wang Zidong
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

This is the post-print version of the article. The official published version can be obtained from the link - Copyright 2009 Elsevier Ltd. This paper is concerned with the stability analysis problem for a new class of discrete-time recurrent neural networks with mixed time-delays. Mixed time-delays, consisting of both discrete and distributed time-delays, are addressed for the first time in the asymptotic stability analysis of discrete-time neural networks. The activation functions are not required to be differentiable or strictly monotonic. The existence of the equilibrium point is first proved under mild conditions. By constructing a new Lyapunov–Krasovskii functional, a linear matrix inequality (LMI) approach is developed to establish sufficient conditions for the discrete-time neural networks to be globally asymptotically stable. As an extension, we further consider the stability analysis problem for the same class of neural networks but with state-dependent stochastic disturbances. All the conditions obtained are expressed in terms of LMIs whose feasibility can be easily checked by using the numerically efficient Matlab LMI Toolbox. A simulation example is presented to show the usefulness of the derived LMI-based stability condition. This work was supported in part by the Biotechnology and Biological Sciences Research Council (BBSRC) of the UK under Grants BB/C506264/1 and 100/EGM17735, the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grants GR/S27658/01 and EP/C524586/1, an International Joint Project sponsored by the Royal Society of the UK, the Natural Science Foundation of Jiangsu Province of China under Grant BK2007075, the National Natural Science Foundation of China under Grant 60774073, and the Alexander von Humboldt Foundation of Germany.

aCQUIRe

Brunel University Research Archive

Discrete-time recurrent neural networks with time-varying delays: Exponential stability analysis

Author: Alan Serrano
Xiaohui Liu
Yurong Liu
Zidong Wang
Publication venue: 'Elsevier BV'
Publication date: 01/03/2007

This is the post-print version of the article. The official published version can be obtained from the link below - Copyright 2007 Elsevier Ltd. This Letter is concerned with the analysis problem of exponential stability for a class of discrete-time recurrent neural networks (DRNNs) with time delays. The delay is time-varying, and the activation functions are assumed to be neither differentiable nor strictly monotonic. Furthermore, the description of the activation functions is more general than the Lipschitz conditions commonly used in recent work. Under such mild conditions, we first prove the existence of the equilibrium point. Then, by employing a Lyapunov–Krasovskii functional, a unified linear matrix inequality (LMI) approach is developed to establish sufficient conditions for the DRNNs to be globally exponentially stable. It is shown that the delayed DRNNs are globally exponentially stable if a certain LMI is solvable, where the feasibility of such an LMI can be easily checked by using the numerically efficient Matlab LMI Toolbox. A simulation example is presented to show the usefulness of the derived LMI-based stability condition. This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the Nuffield Foundation of the UK under Grant NAL/00630/G, the Alexander von Humboldt Foundation of Germany, the Natural Science Foundation of Jiangsu Education Committee of China (05KJB110154), the NSF of Jiangsu Province of China (BK2006064), and the National Natural Science Foundation of China (10471119).
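To make the notion of exponential stability under a time-varying delay concrete, here is a minimal simulation sketch. The abstract does not state the network model, so the scalar recursion x(k+1) = a·x(k) + b·f(x(k)) + c·f(x(k − τ(k))), the coefficients, and the function names are all assumptions. The activation is a piecewise-linear saturation, which is non-differentiable at ±1. The condition |a| + |b| + |c| < 1 used here is a crude sufficient condition chosen for the example, not the LMI condition derived in the Letter.

```python
import random

def saturation(s):
    # Piecewise-linear activation: not differentiable at s = -1 and s = 1.
    return 0.5 * (abs(s + 1.0) - abs(s - 1.0))

def simulate(a, b, c, max_delay, steps, x0, seed=0):
    # Hypothetical scalar delayed network; tau(k) is redrawn at every step.
    rng = random.Random(seed)
    history = [x0] * (max_delay + 1)      # constant initial condition
    for _ in range(steps):
        tau = rng.randint(1, max_delay)   # time-varying delay in [1, max_delay]
        x_now = history[-1]
        x_delayed = history[-1 - tau]
        history.append(a * x_now + b * saturation(x_now) + c * saturation(x_delayed))
    return history[max_delay:]            # trajectory x(0), x(1), ..., x(steps)

# |a| + |b| + |c| = 0.9 < 1, so every step contracts the recent-history maximum.
trajectory = simulate(a=0.5, b=0.2, c=0.2, max_delay=3, steps=2000, x0=1.0)
print(f"|x| after {len(trajectory) - 1} steps: {abs(trajectory[-1]):.3e}")
```

Because |f(s)| ≤ |s|, each new state is bounded by 0.9 times the largest magnitude in the last `max_delay + 1` states, so the trajectory decays geometrically to the equilibrium at zero regardless of how the delay is drawn.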

Crossref

Brunel University Research Archive

Gaussian Error Linear Units (GELUs)

Author: Gimpel Kevin
Hendrycks Dan
Publication date: 08/07/2020

We propose the Gaussian Error Linear Unit (GELU), a high-performing neural network activation function. The GELU activation function is $x\Phi(x)$, where $\Phi(x)$ is the standard Gaussian cumulative distribution function. The GELU nonlinearity weights inputs by their value, rather than gating inputs by their sign as in ReLUs ($x\mathbf{1}_{x>0}$). We perform an empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations and find performance improvements across all considered computer vision, natural language processing, and speech tasks. Comment: Trimmed version of 2016 draft; add exact formul
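The definition in this abstract translates directly into code. The sketch below evaluates the GELU as x times the standard Gaussian CDF, with the CDF written through the error function, and places it next to the ReLU for comparison. The function names are chosen here for illustration and are not taken from the paper.

```python
import math

def std_normal_cdf(x):
    # Standard Gaussian CDF expressed with the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu(x):
    # GELU: the input weighted by the probability mass below it.
    return x * std_normal_cdf(x)

def relu(x):
    # ReLU: the input gated by its sign.
    return x if x > 0 else 0.0

for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={v:+.1f}  gelu={gelu(v):+.4f}  relu={relu(v):+.4f}")
```

Unlike the ReLU, the GELU lets small negative inputs through with a reduced weight instead of zeroing them, which is the "weights by value rather than gates by sign" distinction drawn in the abstract.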

arXiv.org e-Print Archive

Design of exponential state estimators for neural networks with mixed time delays

Author: Xiaohui Liu
Yurong Liu
Zidong Wang
Publication venue: 'Elsevier BV'
Publication date: 01/05/2007

This is the post-print version of the article. The official published version can be obtained from the link below - Copyright 2007 Elsevier Ltd. In this Letter, the state estimation problem is addressed for a class of recurrent neural networks (RNNs) with mixed discrete and distributed delays. The activation functions are assumed to be neither monotonic, nor differentiable, nor bounded. We aim to design a state estimator that estimates the neuron states from available output measurements, such that the estimation error dynamics are globally exponentially stable in the presence of mixed time delays. By using a Lyapunov–Krasovskii functional, a linear matrix inequality (LMI) approach is developed to establish sufficient conditions that guarantee the existence of the state estimators. We show that both the existence conditions and the explicit expression of the desired estimator can be characterized in terms of the solution to an LMI. A simulation example is presented to show the usefulness of the derived LMI-based stability conditions. This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the Nuffield Foundation of the UK under Grant NAL/00630/G, the Alexander von Humboldt Foundation of Germany, the Natural Science Foundation of Jiangsu Education Committee of China under Grants 05KJB110154 and BK2006064, and the National Natural Science Foundation of China under Grants 10471119 and 10671172.

Crossref

Brunel University Research Archive