116,855 research outputs found

    Achieve the Minimum Width of Neural Networks for Universal Approximation

    The universal approximation property (UAP) of neural networks is fundamental for deep learning, and it is well known that wide neural networks are universal approximators of continuous functions within both the $L^p$ norm and the continuous/uniform norm. However, the exact minimum width, $w_{\min}$, for the UAP has not been studied thoroughly. Recently, using a decoder-memorizer-encoder scheme, \citet{Park2021Minimum} found that $w_{\min} = \max(d_x+1, d_y)$ for both the $L^p$-UAP of ReLU networks and the $C$-UAP of ReLU+STEP networks, where $d_x, d_y$ are the input and output dimensions, respectively. In this paper, we consider neural networks with an arbitrary set of activation functions. We prove that both the $C$-UAP and the $L^p$-UAP for functions on compact domains share a universal lower bound on the minimal width; that is, $w^*_{\min} = \max(d_x, d_y)$. In particular, the critical width, $w^*_{\min}$, for the $L^p$-UAP can be achieved by leaky-ReLU networks, provided that the input or output dimension is larger than one. Our construction is based on the approximation power of neural ordinary differential equations and the ability to approximate flow maps by neural networks. The cases of nonmonotone or discontinuous activation functions and the one-dimensional case are also discussed.
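
    As an illustration of the architecture the result concerns, the sketch below builds a deep, fixed-width leaky-ReLU network whose hidden width equals $\max(d_x, d_y)$. It assumes PyTorch and is only meant to show the shape of such a narrow network, not the paper's construction (which goes through neural ODEs and flow-map approximation).

```python
# Minimal sketch (assumes PyTorch): a deep leaky-ReLU network whose every
# hidden layer has width max(d_x, d_y), the critical width discussed above.
import torch
import torch.nn as nn

def narrow_leaky_relu_net(d_x: int, d_y: int, depth: int = 8) -> nn.Sequential:
    width = max(d_x, d_y)  # w*_min = max(d_x, d_y)
    layers = [nn.Linear(d_x, width), nn.LeakyReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.LeakyReLU()]
    layers.append(nn.Linear(width, d_y))
    return nn.Sequential(*layers)

net = narrow_leaky_relu_net(d_x=3, d_y=2)
print(net(torch.randn(5, 3)).shape)  # torch.Size([5, 2])
```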

    Discrete synaptic events induce global oscillations in balanced neural networks

    Neural dynamics is triggered by discrete synaptic inputs of finite amplitude. However, the neural response is usually obtained within the diffusion approximation (DA), which represents the synaptic inputs as Gaussian noise. We derive a mean-field formalism encompassing synaptic shot noise for sparse balanced networks of spiking neurons. For low (high) external drives (synaptic strengths), irregular global oscillations emerge via continuous and hysteretic transitions, correctly predicted by our approach but not by the DA. These oscillations display frequencies in biologically relevant bands.
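
    To make the distinction concrete, the toy sketch below (plain NumPy, not the paper's sparse balanced-network mean field) drives a single leaky integrator either with discrete synaptic kicks of finite amplitude (shot noise) or with the Gaussian surrogate used by the diffusion approximation, matched in mean and variance.

```python
# Toy illustration (NumPy): a leaky membrane driven by Poisson shot noise of
# finite amplitude J versus its diffusion approximation (Gaussian noise with
# matched mean r*J and variance r*J**2 per unit time). Not the paper's model.
import numpy as np

rng = np.random.default_rng(0)
tau, J, r = 20.0, 0.5, 2.0   # membrane time constant (ms), kick size, input rate (kHz)
dt, steps = 0.1, 20000

v_shot = np.zeros(steps)
v_da = np.zeros(steps)
for t in range(1, steps):
    kicks = rng.poisson(r * dt)                          # discrete synaptic events
    v_shot[t] = v_shot[t-1] + dt * (-v_shot[t-1] / tau) + J * kicks
    drift = -v_da[t-1] / tau + r * J                     # DA drift: mean input r*J ...
    noise = J * np.sqrt(r * dt) * rng.standard_normal()  # ... plus Gaussian fluctuations
    v_da[t] = v_da[t-1] + dt * drift + noise

print("shot-noise mean/std:", v_shot.mean(), v_shot.std())
print("diffusion  mean/std:", v_da.mean(), v_da.std())
```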

    Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons

    This paper develops simple feed-forward neural networks that achieve the universal approximation property for all continuous functions with a fixed finite number of neurons. These neural networks are simple because they are designed with a simple and computable continuous activation function $\sigma$ leveraging a triangular-wave function and the softsign function. We prove that $\sigma$-activated networks with width $36d(2d+1)$ and depth $11$ can approximate any continuous function on a $d$-dimensional hypercube within an arbitrarily small error. Hence, for supervised learning and its related regression problems, the hypothesis space generated by these networks with a size no smaller than $36d(2d+1)\times 11$ is dense in the continuous function space $C([a,b]^d)$ and therefore dense in the Lebesgue spaces $L^p([a,b]^d)$ for $p\in [1,\infty)$. Furthermore, classification functions arising from image and signal classification are in the hypothesis space generated by $\sigma$-activated networks with width $36d(2d+1)$ and depth $12$, when there exist pairwise disjoint bounded closed subsets of $\mathbb{R}^d$ such that the samples of the same class are located in the same subset. Finally, we use numerical experiments to show that replacing the ReLU activation function with ours would improve the experimental results.
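
    Since the construction hinges on a hand-designed activation, the sketch below shows the two named ingredients in NumPy. The exact way the paper composes them into $\sigma$ is not reproduced here, so treat this only as a pointer to the building blocks.

```python
# Sketch (NumPy) of the two ingredients named above: a periodic triangular
# wave and the softsign function x / (1 + |x|). The paper composes pieces like
# these into its activation sigma; the exact composition is not reproduced here.
import numpy as np

def triangular_wave(x):
    """Period-2 triangular wave taking values in [0, 1]."""
    return np.abs(np.mod(x, 2.0) - 1.0)

def softsign(x):
    """Softsign: a smooth, bounded, monotone squashing function."""
    return x / (1.0 + np.abs(x))

x = np.linspace(-4, 4, 9)
print(triangular_wave(x))
print(softsign(x))
```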

    Simplified neural networks algorithms for function approximation and regression boosting on discrete input spaces

    Function approximation capabilities of feedforward neural networks have been widely investigated over the past couple of decades, and a considerable amount of work has been carried out to prove the 'Universal Approximation Property' of these networks. Most of the work on applying neural networks to function approximation has concentrated on problems where the input variables are continuous. However, there are many real-world problems in which the input variables take only discrete values, or in which a significant number of the input variables are discrete. Most of the learning algorithms proposed so far do not distinguish between the different features of continuous and discrete input spaces and treat them in more or less the same way. For this reason, the corresponding learning algorithms become unnecessarily complex and time-consuming, especially when the inputs consist mainly of discrete variables. More recently, it has been shown that by focusing on the special features of discrete input spaces, simpler and more robust algorithms can be developed. The main objective of this work is to address the function approximation capabilities of artificial neural networks, with particular emphasis on the development, implementation, testing and analysis of new learning algorithms for the simplified neural network approximation scheme for functions defined on discrete input spaces. By developing the corresponding learning algorithms and testing them on different benchmark data sets, it is shown that, compared with conventional multilayer neural networks for approximating functions on discrete input spaces, the proposed simplified neural network architecture and algorithms can achieve similar or better approximation accuracy with a much simpler architecture and fewer parameters. This is particularly the case for high-dimensional, low-sample problems. To investigate the wider implications of simplified neural networks, their application has been extended to the regression boosting framework. By developing, implementing and testing with empirical data, it is shown that these simplified neural network based algorithms also perform well in other neural network based ensembles.

    GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution

    Generative Adversarial Networks (GANs) have limitations when the goal is to generate sequences of discrete elements. The reason is that samples from a distribution over discrete objects, such as the multinomial, are not differentiable with respect to the distribution parameters. This problem can be avoided by using the Gumbel-softmax distribution, which is a continuous approximation to a multinomial distribution parameterized in terms of the softmax function. In this work, we evaluate the performance of GANs based on recurrent neural networks with Gumbel-softmax output distributions on the task of generating sequences of discrete elements.
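
    For readers unfamiliar with the trick, here is a minimal NumPy sketch of sampling from the Gumbel-softmax relaxation; the class probabilities and temperature values are illustrative, and the paper plugs this kind of sample into an RNN generator's output layer rather than using it standalone.

```python
# Sketch (NumPy) of the Gumbel-softmax relaxation: a continuous, differentiable
# approximation to drawing a one-hot sample from a categorical/multinomial
# distribution with class probabilities `probs` and temperature `tau`.
import numpy as np

def gumbel_softmax_sample(probs, tau=0.5, rng=np.random.default_rng()):
    gumbel = -np.log(-np.log(rng.uniform(size=probs.shape)))  # Gumbel(0, 1) noise
    logits = (np.log(probs) + gumbel) / tau
    exp = np.exp(logits - logits.max())                       # numerically stable softmax
    return exp / exp.sum()

probs = np.array([0.7, 0.2, 0.1])
print(gumbel_softmax_sample(probs, tau=0.1))  # nearly one-hot at low temperature
print(gumbel_softmax_sample(probs, tau=5.0))  # smeared toward uniform at high temperature
```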

    Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control

    Classical results in neural network approximation theory show how arbitrary continuous functions can be approximated by networks with a single hidden layer, under mild assumptions on the activation function. However, the classical theory does not give a constructive means to generate the network parameters that achieve a desired accuracy. Recent results have demonstrated that for specialized activation functions, such as ReLUs and some classes of analytic functions, high accuracy can be achieved via linear combinations of randomly initialized activations. These recent works utilize specialized integral representations of target functions that depend on the specific activation functions used. This paper defines mollified integral representations, which provide a means to form integral representations of target functions using activations for which no direct integral representation is currently known. The new construction enables approximation guarantees for randomly initialized networks for a variety of widely used activation functions.
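
    As background for the setting, the sketch below (NumPy, with an illustrative sine target and tanh features) fits only the outer linear layer on top of randomly drawn hidden weights. The paper's contribution is the mollified integral representation that yields guarantees for such fits; this toy does not implement that analysis.

```python
# Sketch (NumPy) of the generic recipe the guarantees concern: freeze randomly
# initialized hidden weights, then fit only a linear combination of the random
# activations by least squares.
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 200, 1, 300

X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(3 * X[:, 0])                  # illustrative target function

W = rng.standard_normal((d, width))      # random hidden weights, never trained
b = rng.uniform(-1, 1, width)
Phi = np.tanh(X @ W + b)                 # randomly initialized activations

coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # only the outer layer is fit
print("train RMSE:", np.sqrt(np.mean((Phi @ coef - y) ** 2)))
```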