
    Theoretical Properties of Projection Based Multilayer Perceptrons with Functional Inputs

    Many real-world data are sampled functions. As shown by Functional Data Analysis (FDA) methods, spectra, time series, images, gesture recognition data, etc., can be processed more efficiently if their functional nature is taken into account during the analysis. This is done by extending standard data analysis methods so that they apply to functional inputs. A general way to achieve this is to project the functional data onto a finite-dimensional sub-space of the functional space; the coordinates of the data on a basis of this sub-space provide standard vector representations of the functions, and the resulting vectors can be processed by any standard method. In our previous work, this general approach was used to define projection-based Multilayer Perceptrons (MLPs) with functional inputs. In this paper, we study important theoretical properties of the proposed model. In particular, we show that MLPs with functional inputs are universal approximators: they can approximate to arbitrary accuracy any continuous mapping from a compact sub-space of a functional space to R. Moreover, we provide a consistency result showing that any mapping from a functional space to R can be learned from examples by a projection-based MLP: the generalization mean square error of the MLP decreases to the smallest possible mean square error on the data as the number of examples goes to infinity.
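    The pipeline described here is easy to sketch. Below is a minimal illustration (not the authors' code; the Fourier basis, basis size, synthetic curves, and target are all assumptions) of projecting sampled functions onto a finite-dimensional sub-space and feeding the basis coordinates to a standard MLP:

```python
# Minimal sketch of a projection-based functional MLP: project each sampled
# curve onto a truncated Fourier basis by least squares, then train an
# ordinary MLP on the resulting coefficient vectors.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)   # sampling grid on [0, 1]
K = 7                            # dimension of the projection sub-space

# Basis matrix: constant plus low-frequency Fourier terms evaluated on the grid.
cols = [np.ones_like(t)]
for k in range(1, K):            # generate enough columns, then truncate to K
    cols.append(np.cos(2 * np.pi * k * t))
    cols.append(np.sin(2 * np.pi * k * t))
B = np.column_stack(cols)[:, :K]

# Synthetic functional data: 200 noisy curves; scalar target = mean of x(t)^2.
freqs = rng.uniform(0.5, 2.0, 200)
X_func = np.array([np.sin(2 * np.pi * a * t) + 0.1 * rng.standard_normal(t.size)
                   for a in freqs])
y = (X_func ** 2).mean(axis=1)

# Projection step: coordinates of every curve on the basis, by least squares.
coords, *_ = np.linalg.lstsq(B, X_func.T, rcond=None)
X_proj = coords.T                # shape (200, K): standard vector representation

# Any standard method now applies to the coordinate vectors; here, an MLP.
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
mlp.fit(X_proj[:150], y[:150])
mse = np.mean((mlp.predict(X_proj[150:]) - y[150:]) ** 2)
print(f"hold-out MSE: {mse:.4f}")
```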

    Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units

    We generalize recent theoretical work on the minimal number of layers of narrow deep belief networks that can approximate any probability distribution on the states of their visible units arbitrarily well. We relax the setting of binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show that a $q$-ary deep belief network with $L \geq 2 + \frac{q^{\lceil m-\delta \rceil}-1}{q-1}$ layers of width $n \leq m + \log_q(m) + 1$ for some $m \in \mathbb{N}$ can approximate any probability distribution on $\{0,1,\ldots,q-1\}^n$ without exceeding a Kullback-Leibler divergence of $\delta$. Our analysis covers discrete restricted Boltzmann machines and naïve Bayes models as special cases. Comment: 19 pages, 5 figures, 1 table
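    To make the bound concrete, the snippet below evaluates the stated expressions for a few parameter choices; the values of q, m, and delta are purely illustrative:

```python
# Illustrative evaluation (assumed parameter values) of the stated depth bound
# L >= 2 + (q^ceil(m - delta) - 1) / (q - 1) for a q-ary deep belief network
# of width n <= m + log_q(m) + 1.
import math

def min_layers(q: int, m: int, delta: float) -> int:
    """Smallest integer L satisfying the depth bound from the abstract."""
    # q^k - 1 is divisible by q - 1, so integer division is exact here.
    return 2 + (q ** math.ceil(m - delta) - 1) // (q - 1)

def max_width(q: int, m: int) -> float:
    """Width bound n <= m + log_q(m) + 1."""
    return m + math.log(m, q) + 1

for q, m, delta in [(2, 4, 0.0), (2, 4, 1.0), (3, 4, 0.5)]:
    print(f"q={q}, m={m}, delta={delta}: "
          f"L >= {min_layers(q, m, delta)}, width n <= {max_width(q, m):.2f}")
```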

    Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review

    The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks satisfy these conditions as a special case, though weight sharing is not the main reason for their exponential advantage.
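    Reviews of this kind center on hierarchically compositional target functions. The display below sketches the flavor of such a function (the binary tree and the constituent functions h are illustrative, not taken from the paper): each constituent depends on only two variables, so a deep network matching the tree can approximate f with far fewer units than a shallow network treating f as a generic 8-variable function.

```latex
% Illustrative hierarchically compositional function on 8 variables,
% built from constituent functions of only 2 variables each.
f(x_1,\ldots,x_8) = h_3\bigl(
    h_{21}\bigl(h_{11}(x_1,x_2),\, h_{12}(x_3,x_4)\bigr),\,
    h_{22}\bigl(h_{13}(x_5,x_6),\, h_{14}(x_7,x_8)\bigr)
\bigr)
```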

    Description of spreading dynamics by microscopic network models and macroscopic branching processes can differ due to coalescence

    Spreading processes are conventionally monitored on a macroscopic level by counting the number of incidences over time. The spreading process can then be modeled either on the microscopic level, assuming an underlying interaction network, or directly on the macroscopic level, assuming that microscopic contributions are negligible. The macroscopic characteristics of both descriptions are commonly assumed to be identical. In this work, we show that these characteristics can differ due to coalescence, i.e., a node being activated at the same time by multiple sources. In particular, we consider a (microscopic) branching network (probabilistic cellular automaton) with annealed connectivity disorder, record the macroscopic activity, and then approximate this activity by a (macroscopic) branching process. In this framework, we analytically calculate the effect of coalescence on the collective dynamics. We show that coalescence leads to a universal non-linear scaling function for the conditional expectation value of successive network activity. This allows us to quantify the difference between the microscopic model parameter and established macroscopic estimates. To overcome this difference, we propose a non-linear estimator that correctly infers the model branching parameter for all system sizes. Comment: 13 pages
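    The mechanism is easy to reproduce numerically. The sketch below (system size, drive, and estimator are assumptions, not the paper's exact setup) simulates a branching network with annealed connectivity, where coalescence collapses simultaneous activations of a node, and then applies a conventional linear branching-ratio estimate to the macroscopic activity:

```python
# Branching network with annealed connectivity disorder: each active node
# draws fresh random targets every step; duplicates collapse (coalescence).
import numpy as np

rng = np.random.default_rng(1)
N = 1000          # number of nodes
m_micro = 2.0     # microscopic branching parameter: mean activations per active node
T = 10_000
h = 0.01          # small external drive to keep the dynamics alive

activity = np.zeros(T, dtype=int)
active = rng.random(N) < 0.05
for t in range(T):
    activity[t] = active.sum()
    # Each active node attempts a Poisson(m_micro) number of activations at
    # random targets; a node hit by several sources is activated only once.
    n_attempts = rng.poisson(m_micro, size=activity[t]).sum()
    targets = rng.integers(0, N, size=n_attempts)
    nxt = np.zeros(N, dtype=bool)
    nxt[targets] = True               # coalescence: duplicates collapse here
    nxt |= rng.random(N) < h          # external drive
    active = nxt

# Conventional macroscopic estimate: slope of A(t+1) vs A(t) through the origin.
a, a_next = activity[:-1], activity[1:]
m_macro = (a * a_next).sum() / (a * a).sum()
print(f"microscopic m = {m_micro}, macroscopic estimate = {m_macro:.3f}")
```

    With a supercritical microscopic parameter the activity saturates, and the linear macroscopic estimate falls well below m_micro; this is the kind of discrepancy the abstract quantifies and corrects with a non-linear estimator.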

    Probability of local bifurcation type from a fixed point: A random matrix perspective

    Results regarding probable bifurcations from fixed points are presented in the context of general dynamical systems (real random matrices), time-delay dynamical systems (companion matrices), and a set of mappings known for their properties as universal approximators (neural networks). The eigenvalue spectra are considered both numerically and analytically, using previous work of Edelman et al. Based upon the numerical evidence, various conjectures are presented. The conclusion is that in many circumstances, most bifurcations from fixed points of large dynamical systems will be due to complex eigenvalues. Nevertheless, surprising situations are presented for which this conclusion does not hold, e.g., real random matrices with Gaussian elements with a large positive mean and finite variance. Comment: 21 pages, 19 figures
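    The headline claim is simple to probe numerically. This small experiment (matrix size, scaling, and trial count are arbitrary choices, not the paper's) samples real Gaussian Jacobians and checks how often the eigenvalue with the largest real part, the one that first crosses the imaginary axis at a bifurcation, belongs to a complex-conjugate pair:

```python
# Fraction of random Jacobians whose leading eigenvalue (largest real part)
# is complex, suggesting a Hopf-type rather than a real-eigenvalue bifurcation.
import numpy as np

rng = np.random.default_rng(2)
n, trials = 200, 200
complex_leading = 0
for _ in range(trials):
    J = rng.standard_normal((n, n)) / np.sqrt(n)   # circular-law scaling
    eig = np.linalg.eigvals(J)
    leading = eig[np.argmax(eig.real)]
    if abs(leading.imag) > 1e-9:
        complex_leading += 1
print(f"leading eigenvalue complex in {complex_leading}/{trials} trials")
```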