Theoretical Properties of Projection Based Multilayer Perceptrons with Functional Inputs
Many real-world data are sampled functions. As shown by Functional Data
Analysis (FDA) methods, spectra, time series, images, gesture recognition data,
etc. can be processed more efficiently if their functional nature is taken into
account during the data analysis process. This is done by extending standard
data analysis methods so that they can apply to functional inputs. A general
way to achieve this goal is to compute projections of the functional data onto
a finite dimensional sub-space of the functional space. The coordinates of the
data on a basis of this sub-space provide standard vector representations of
the functions. The obtained vectors can be processed by any standard method. In
our previous work, this general approach has been used to define projection
based Multilayer Perceptrons (MLPs) with functional inputs. We study in this
paper important theoretical properties of the proposed model. We show in
particular that MLPs with functional inputs are universal approximators: they
can approximate to arbitrary accuracy any continuous mapping from a compact
sub-space of a functional space to R. Moreover, we provide a consistency result
that shows that any mapping from a functional space to R can be learned from
examples by a projection based MLP: the generalization mean square error of
the MLP decreases to the smallest possible mean square error on the data as
the number of examples goes to infinity.
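
As an illustration of the projection step described above, here is a minimal Python sketch (not the authors' code; the choice of a truncated Fourier basis, the sampling grid, and the helper names fourier_basis and project are assumptions for illustration) that turns sampled curves into coordinate vectors that any standard MLP can consume:

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate a truncated Fourier basis on a sampling grid t in [0, 1]."""
    cols = [np.ones_like(t)]
    for k in range(1, (n_basis - 1) // 2 + 1):
        cols.append(np.sqrt(2) * np.sin(2 * np.pi * k * t))
        cols.append(np.sqrt(2) * np.cos(2 * np.pi * k * t))
    return np.column_stack(cols[:n_basis])          # shape (len(t), n_basis)

def project(X, t, n_basis=7):
    """Least-squares projection of sampled curves X (n_curves x n_points)
    onto the span of the basis; each row of the result is the coordinate
    vector of one curve on the finite dimensional sub-space."""
    B = fourier_basis(t, n_basis)
    coef, *_ = np.linalg.lstsq(B, X.T, rcond=None)  # solves B @ coef ~= X.T
    return coef.T                                    # shape (n_curves, n_basis)

# 100 noisy curves sampled at 50 points, reduced to 7 coordinates each;
# the matrix Z can now be fed to any standard (vector-input) MLP.
t = np.linspace(0.0, 1.0, 50)
X = np.sin(2 * np.pi * t) + 0.1 * np.random.randn(100, 50)
Z = project(X, t)
print(Z.shape)  # (100, 7)
```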
Nonparametric regression using deep neural networks with ReLU activation function
Consider the multivariate nonparametric regression model. It is shown that
estimators based on sparsely connected deep neural networks with ReLU
activation function and properly chosen network architecture achieve the
minimax rates of convergence (up to log n-factors) under a general
composition assumption on the regression function. The framework includes many
well-studied structural constraints such as (generalized) additive models.
While there is a lot of flexibility in the network architecture, the tuning
parameter is the sparsity of the network. Specifically, we consider large
networks whose number of potential network parameters exceeds the sample size.
The analysis gives some insights into why multilayer feedforward neural
networks perform well in practice. Interestingly, for the ReLU activation function,
the depth (number of layers) of the neural network architecture plays an
important role, and our theory suggests that for nonparametric regression,
scaling the network depth with the sample size is natural. It is also shown
that under the composition assumption wavelet estimators can only achieve
suboptimal rates.
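
A minimal numpy sketch of the kind of architecture the abstract describes: a deep feedforward ReLU network whose connectivity is sparse, so that the number of active parameters s is far below the number of potential parameters. This is an illustration only, not the paper's construction; the sizes, the random masking scheme, and the helpers sparse_layer and relu_net are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_layer(fan_in, fan_out, density=0.04):
    """Random weight matrix in which only a `density` fraction of the
    potential connections is active (nonzero)."""
    W = rng.standard_normal((fan_in, fan_out))
    mask = rng.random((fan_in, fan_out)) < density
    return W * mask

def relu_net(x, weights):
    """Forward pass of a deep feedforward ReLU network (biases omitted)."""
    for W in weights[:-1]:
        x = np.maximum(x @ W, 0.0)   # ReLU activation
    return x @ weights[-1]            # linear output layer

# Width and depth chosen so the potential parameter count exceeds the
# (hypothetical) sample size n, while the active count s stays below n.
n, width, depth = 1000, 64, 6
dims = [5] + [width] * depth + [1]
weights = [sparse_layer(a, b) for a, b in zip(dims[:-1], dims[1:])]
s = sum(int(np.count_nonzero(W)) for W in weights)
total = sum(W.size for W in weights)
print(f"potential parameters: {total}, active parameters s: {s}, n: {n}")
print(relu_net(rng.standard_normal((8, 5)), weights).shape)  # (8, 1)
```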
A survey on modern trainable activation functions
In neural networks literature, there is a strong interest in identifying and
defining activation functions which can improve neural network performance. In
recent years there has been renewed interest within the scientific community in
investigating activation functions which can be trained during the learning
process, usually referred to as "trainable", "learnable" or "adaptable"
activation functions. They appear to lead to better network performance.
Diverse and heterogeneous models of trainable activation functions have been
proposed in the literature. In this paper, we present a survey of these models.
Starting from a discussion on the use of the term "activation function" in
literature, we propose a taxonomy of trainable activation functions, highlight
common and distinctive properties of recent and past models, and discuss the main
advantages and limitations of this type of approach. We show that many of the
proposed approaches are equivalent to adding neuron layers which use fixed
(non-trainable) activation functions together with a simple local rule that
constrains the corresponding weight layers.
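
As one concrete member of this family, the following PyTorch sketch (not code from the survey) implements a PReLU-style trainable activation whose negative-side slope is learned by backpropagation. It also illustrates the equivalence remark above: f(x) = relu(x) - a * relu(-x), i.e., fixed ReLU units combined through a constrained trainable weight.

```python
import torch
import torch.nn as nn

class PReLU(nn.Module):
    """Trainable activation: identity for x >= 0, learned slope a for x < 0.
    Equivalent to relu(x) - a * relu(-x), fixed ReLUs plus a trainable weight."""
    def __init__(self, init_slope=0.25):
        super().__init__()
        # The slope is an ordinary parameter, updated by backpropagation.
        self.a = nn.Parameter(torch.tensor(init_slope))

    def forward(self, x):
        return torch.where(x >= 0, x, self.a * x)

# The slope receives gradients like any other network weight.
net = nn.Sequential(nn.Linear(10, 32), PReLU(), nn.Linear(32, 1))
loss = net(torch.randn(4, 10)).pow(2).sum()
loss.backward()
print(net[1].a.grad)
```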
Multilayer Networks
In most natural and engineered systems, a set of entities interact with each
other in complicated patterns that can encompass multiple types of
relationships, change in time, and include other types of complications. Such
systems include multiple subsystems and layers of connectivity, and it is
important to take such "multilayer" features into account to try to improve our
understanding of complex systems. Consequently, it is necessary to generalize
"traditional" network theory by developing (and validating) a framework and
associated tools to study multilayer systems in a comprehensive fashion. The
origins of such efforts date back several decades and arose in multiple
disciplines, and now the study of multilayer networks has become one of the
most important directions in network science. In this paper, we discuss the
history of multilayer networks (and related concepts) and review the exploding
body of work on such networks. To unify the disparate terminology in the large
body of recent work, we discuss a general framework for multilayer networks,
construct a dictionary of terminology to relate the numerous existing concepts
to each other, and provide a thorough discussion that compares, contrasts, and
translates between related notions such as multilayer networks, multiplex
networks, interdependent networks, networks of networks, and many others. We
also survey and discuss existing data sets that can be represented as
multilayer networks. We review attempts to generalize single-layer-network
diagnostics to multilayer networks. We also discuss the rapidly expanding
research on multilayer-network models and notions like community structure,
connected components, tensor decompositions, and various types of dynamical
processes on multilayer networks. We conclude with a summary and an outlook.
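
To make the framework concrete, here is a minimal numpy sketch of one standard encoding reviewed in this literature, the supra-adjacency matrix: per-layer adjacency matrices sit on the block diagonal, and each node is coupled to its replicas in the other layers. The uniform coupling omega and the toy layers below are illustrative assumptions:

```python
import numpy as np

def supra_adjacency(layers, omega=1.0):
    """Stack L per-layer adjacency matrices (each n x n) into an nL x nL
    supra-adjacency matrix with diagonal interlayer coupling omega."""
    n = layers[0].shape[0]
    L = len(layers)
    A = np.zeros((n * L, n * L))
    for i, Ai in enumerate(layers):
        A[i*n:(i+1)*n, i*n:(i+1)*n] = Ai            # intralayer block
    for i in range(L):
        for j in range(L):
            if i != j:                               # couple node replicas
                A[i*n:(i+1)*n, j*n:(j+1)*n] = omega * np.eye(n)
    return A

# Two 3-node layers coupled with omega = 0.5.
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A2 = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
print(supra_adjacency([A1, A2], omega=0.5).shape)   # (6, 6)
```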
Post-processing partitions to identify domains of modularity optimization
We introduce the Convex Hull of Admissible Modularity Partitions (CHAMP)
algorithm to prune and prioritize different network community structures
identified across multiple runs of possibly various computational heuristics.
Given a set of partitions, CHAMP identifies the domain of modularity
optimization for each partition (i.e., the parameter-space domain where it
has the largest modularity relative to the input set), discarding partitions
with empty domains to obtain the subset of partitions that are "admissible"
candidate community structures that remain potentially optimal over indicated
parameter domains. Importantly, CHAMP can be used for multi-dimensional
parameter spaces, such as those for multilayer networks where one includes a
resolution parameter and interlayer coupling. Using the results from CHAMP, a
user can more appropriately select robust community structures by observing the
sizes of domains of optimization and the pairwise comparisons between
partitions in the admissible subset. We demonstrate the utility of CHAMP with
several example networks. In these examples, CHAMP focuses attention onto
pruned subsets of admissible partitions that are 20-to-1785 times smaller than
the sets of unique partitions obtained by community detection heuristics that
were input into CHAMP.
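
The geometry behind CHAMP is easiest to state in the single-parameter case: for a fixed partition, modularity is linear in the resolution parameter gamma, Q(gamma) = a - gamma * b (with a the internal edge weight and b the internal null-model weight), so the admissible partitions are exactly those appearing on the upper envelope of these lines. The following is a minimal Python sketch of that 1D picture using a dense grid; CHAMP itself handles this exactly, including multi-dimensional parameter spaces, and the numbers below are made up for illustration:

```python
import numpy as np

def champ_1d(a, b, gamma_min=0.0, gamma_max=4.0, grid=10001):
    """For partitions with modularity lines Q_k(gamma) = a[k] - gamma * b[k],
    return the approximate gamma-interval where each admissible partition
    attains the maximum; partitions with empty domains are omitted."""
    gammas = np.linspace(gamma_min, gamma_max, grid)
    Q = a[:, None] - np.outer(b, gammas)   # one modularity line per row
    winners = np.argmax(Q, axis=0)         # upper envelope on the grid
    return {int(k): (float(gammas[winners == k].min()),
                     float(gammas[winners == k].max()))
            for k in np.unique(winners)}

# Three candidate partitions: the second never reaches the upper envelope,
# so its domain of optimization is empty and CHAMP would discard it.
a = np.array([10.0, 8.0, 6.0])   # internal edge weights per partition
b = np.array([4.0, 3.9, 1.0])    # internal null-model weights per partition
print(champ_1d(a, b))             # {0: (0.0, ~1.33), 2: (~1.33, 4.0)}
```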