Theoretical Properties of Projection Based Multilayer Perceptrons with Functional Inputs
Many real-world data are sampled functions. As shown by Functional Data
Analysis (FDA) methods, spectra, time series, images, gesture recognition data,
etc. can be processed more efficiently if their functional nature is taken into
account during the data analysis process. This is done by extending standard
data analysis methods so that they can apply to functional inputs. A general
way to achieve this goal is to compute projections of the functional data onto
a finite dimensional sub-space of the functional space. The coordinates of the
data on a basis of this sub-space provide standard vector representations of
the functions. The obtained vectors can be processed by any standard method. In
our previous work, this general approach was used to define projection-based
Multilayer Perceptrons (MLPs) with functional inputs. In this paper we study
important theoretical properties of the proposed model. We show in
particular that MLPs with functional inputs are universal approximators: they
can approximate to arbitrary accuracy any continuous mapping from a compact
sub-space of a functional space to R. Moreover, we provide a consistency result
showing that any mapping from a functional space to R can be learned from
examples by a projection-based MLP: the generalization mean square error of
the MLP decreases to the smallest possible mean square error on the data as
the number of examples goes to infinity.
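As a concrete illustration of the projection step, here is a minimal sketch in Python (the Fourier basis, the toy target, and scikit-learn's MLPRegressor are our illustrative choices, not the authors' construction): each sampled curve is projected onto a truncated basis by least squares, and the resulting finite-dimensional coordinates are fed to an ordinary MLP.

```python
# Sketch of a projection-based MLP for functional inputs (illustrative,
# not the paper's exact construction). Each input is a function sampled
# on a grid; we project it onto a truncated Fourier basis and feed the
# basis coefficients to a standard MLP.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)            # sampling grid on [0, 1]

def fourier_basis(t, n_terms=7):
    """Columns: 1, cos(2*pi*k*t), sin(2*pi*k*t) for k = 1..n_terms."""
    cols = [np.ones_like(t)]
    for k in range(1, n_terms + 1):
        cols.append(np.cos(2 * np.pi * k * t))
        cols.append(np.sin(2 * np.pi * k * t))
    return np.column_stack(cols)

B = fourier_basis(t)                      # shape (len(t), n_basis)

# Toy data set: random smooth curves x_i(t), scalar target y_i.
n_samples = 500
true_coefs = rng.normal(size=(n_samples, B.shape[1]))
X_func = true_coefs @ B.T + 0.05 * rng.normal(size=(n_samples, t.size))
y = (X_func ** 2).mean(axis=1)            # y_i ~ integral of x_i(t)^2

# Projection step: least-squares coordinates of each curve on the basis.
X_proj, *_ = np.linalg.lstsq(B, X_func.T, rcond=None)
X_proj = X_proj.T                          # shape (n_samples, n_basis)

# The finite-dimensional coordinates go into an ordinary MLP.
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
mlp.fit(X_proj, y)
print("training R^2:", round(mlp.score(X_proj, y), 3))
```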
Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units
We generalize recent theoretical work on the minimal number of layers of
narrow deep belief networks that can approximate any probability distribution
on the states of their visible units arbitrarily well. We relax the setting of
binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010;
Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the
vanishing approximation error to an arbitrary approximation error tolerance.
For example, we show that a $q$-ary deep belief network with
$L \geq 2 + \frac{q^{\lceil m-\delta \rceil} - 1}{q - 1}$ layers of width
$n \leq m + \log_q(m) + 1$ for some $m \in \mathbb{N}$ can approximate any
probability distribution on $\{0, 1, \dots, q-1\}^n$ without exceeding a
Kullback-Leibler divergence of $\delta$. Our analysis covers discrete
restricted Boltzmann machines and naïve Bayes models as special cases.
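As a numeric illustration of the quoted result (assuming the inequality as reconstructed above; the helper function is our own), the following sketch evaluates the minimal depth L and the width bound for a few choices of q, m, and delta.

```python
# Numeric illustration of the depth bound quoted above (assumed form):
# a q-ary DBN of width at most m + log_q(m) + 1 with
# L >= 2 + (q^ceil(m - delta) - 1)/(q - 1) layers approximates any
# distribution on its visible states within KL divergence delta.
import math

def dbn_depth_bound(q: int, m: int, delta: float):
    # q^k - 1 is divisible by q - 1 (geometric series), so // is exact.
    layers = 2 + (q ** math.ceil(m - delta) - 1) // (q - 1)
    width = m + math.log(m, q) + 1
    return layers, width

for q, m, delta in [(2, 4, 0.0), (2, 4, 1.0), (3, 4, 0.5)]:
    L, n = dbn_depth_bound(q, m, delta)
    print(f"q={q}, m={m}, delta={delta}: L >= {L}, width n <= {n:.2f}")
```

Note how relaxing the tolerance delta shrinks the required depth exponentially, which is the trade-off the abstract describes.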
Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review
The paper characterizes classes of functions for which deep learning can be
exponentially better than shallow learning. Deep convolutional networks are a
special case of these conditions, though weight sharing is not the main reason
for their exponential advantage.
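The function class in question can be illustrated with a toy sketch (our example, not the paper's): a function of eight variables built as a binary tree of bivariate constituents. A deep network can mirror this tree node by node with a number of units linear in the dimension, whereas a generic shallow approximant has no access to the tree structure.

```python
# Toy compositional function in the spirit of the review (illustrative):
# f(x1..x8) = h(h(h(x1,x2), h(x3,x4)), h(h(x5,x6), h(x7,x8))),
# where every constituent h depends on only two arguments.
import numpy as np

def h(a, b):
    """One bivariate constituent node of the tree."""
    return np.tanh(a + 2.0 * b)

def compositional_f(x):
    """Collapse the inputs pairwise, level by level, up the binary tree."""
    level = list(x)
    while len(level) > 1:
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

print(compositional_f(np.arange(8, dtype=float) / 8.0))
```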
Description of spreading dynamics by microscopic network models and macroscopic branching processes can differ due to coalescence
Spreading processes are conventionally monitored on a macroscopic level by
counting the number of incidences over time. The spreading process can then be
modeled either on the microscopic level, assuming an underlying interaction
network, or directly on the macroscopic level, assuming that microscopic
contributions are negligible. The macroscopic characteristics of both
descriptions are commonly assumed to be identical. In this work, we show that
these characteristics of microscopic and macroscopic descriptions can be
different due to coalescence, i.e., a node being activated at the same time by
multiple sources. In particular, we consider a (microscopic) branching network
(probabilistic cellular automaton) with annealed connectivity disorder, record
the macroscopic activity, and then approximate this activity by a (macroscopic)
branching process. In this framework, we analytically calculate the effect of
coalescence on the collective dynamics. We show that coalescence leads to a
universal non-linear scaling function for the conditional expectation value of
successive network activity. This allows us to quantify the difference between
the microscopic model parameter and established macroscopic estimates. To
overcome this difference, we propose a non-linear estimator that correctly
infers the model branching parameter for all system sizes.
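A minimal simulation sketch of this setup follows (model details and parameter values are our illustrative choices, not the paper's exact ones): a branching network with annealed connectivity in which coalescence caps multiply-activated nodes at a single activation, compared against a simple macroscopic regression estimate of the branching parameter.

```python
# Branching network with annealed connectivity disorder (illustrative).
# Coalescence: a node hit by several sources in one step is activated
# only once, so a naive regression of A(t+1) on A(t) underestimates the
# true microscopic branching parameter m when activity is high.
import numpy as np

rng = np.random.default_rng(1)
N, k, m, h = 1_000, 4, 0.9, 0.01   # nodes, out-degree, branching param, drive
p = m / k                           # per-edge activation probability
T = 20_000

active = np.zeros(N, dtype=bool)
A = []
for _ in range(T):
    n_act = int(active.sum())
    A.append(n_act)
    nxt = rng.random(N) < h         # external drive keeps activity alive
    if n_act:
        # Annealed disorder: each active node draws k fresh random targets.
        targets = rng.integers(0, N, size=(n_act, k))
        fired = targets[rng.random((n_act, k)) < p]
        nxt[fired] = True           # coalescence: duplicates count once
    active = nxt

A = np.asarray(A[100:], dtype=float)  # drop the initial transient
x, y = A[:-1], A[1:]
# Conventional macroscopic estimator: slope of the regression of A(t+1)
# on A(t), with the intercept absorbing the external drive.
slope = np.polyfit(x, y, 1)[0]
print(f"true m = {m}, macroscopic estimate = {slope:.3f}")
```

With these parameters the stationary activity is high enough that the regression slope comes out noticeably below the true m = 0.9, which is the coalescence-induced gap the abstract refers to.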
Probability of local bifurcation type from a fixed point: A random matrix perspective
Results regarding probable bifurcations from fixed points are presented in
the context of general dynamical systems (real, random matrices), time-delay
dynamical systems (companion matrices), and a set of mappings known for their
properties as universal approximators (neural networks). The eigenvalue spectra
are considered both numerically and analytically, building on previous work of
Edelman et al. Based on the numerical evidence, various conjectures are presented.
The conclusion is that in many circumstances, most bifurcations from fixed
points of large dynamical systems will be due to complex eigenvalues.
Nevertheless, surprising situations are presented for which the aforementioned
conclusion is not general, e.g. real random matrices with Gaussian elements
with a large positive mean and finite variance.
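A quick numerical sketch of the central question (assuming real Gaussian i.i.d. matrices scaled by 1/sqrt(n); names and parameters are ours): sample random Jacobians, take the eigenvalue with the largest real part, i.e., the one that first crosses the imaginary axis under a uniform shift, and record how often it is complex.

```python
# Fraction of real Gaussian matrices whose rightmost eigenvalue is a
# complex pair (a complex crossing corresponds to a Hopf-type local
# bifurcation; a real crossing to a saddle-node/pitchfork-type one).
import numpy as np

rng = np.random.default_rng(2)

def rightmost_is_complex(n, mean=0.0, n_trials=500):
    complex_count = 0
    for _ in range(n_trials):
        J = mean + rng.normal(size=(n, n)) / np.sqrt(n)
        ev = np.linalg.eigvals(J)
        lead = ev[np.argmax(ev.real)]   # eigenvalue crossing first
        complex_count += abs(lead.imag) > 1e-12
    return complex_count / n_trials

for n in (2, 10, 50):
    print(f"n={n}: fraction complex ~ {rightmost_is_complex(n):.2f}")

# A large positive mean adds a rank-one real outlier ~ n*mean, so the
# leading eigenvalue becomes real: the exception noted in the abstract.
print("n=50, mean=1:", rightmost_is_complex(50, mean=1.0))
```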