AutoGraph: Automated Graph Neural Network
Graphs play an important role in many applications. Recently, Graph Neural
Networks (GNNs) have achieved promising results in graph analysis tasks, and
several state-of-the-art GNN models have been proposed, e.g., Graph
Convolutional Networks (GCNs) and Graph Attention Networks (GATs). Despite
these successes, most GNNs have only a shallow structure, which limits their
expressive power. To fully exploit the power of deep neural networks, several
deep GNNs have been proposed recently. However, designing deep GNNs requires
significant architecture engineering. In this work, we propose a method to
automate the design of deep GNNs. Our method adds a new type of skip
connection to the GNN search space to encourage feature reuse and alleviate
the vanishing-gradient problem, and it allows the evolutionary algorithm to
increase the number of GNN layers during evolution, generating deeper
networks. We evaluate our method on the graph node classification task. The
experiments show that the GNNs generated by our method obtain
state-of-the-art results on the Cora, Citeseer, Pubmed and PPI datasets.

Comment: Accepted by ICONIP 2020
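The abstract describes an evolutionary loop whose mutations can deepen a network and add skip connections. As a minimal sketch of that idea (not the authors' implementation: the gene encoding, the AGGREGATORS list, and the stubbed fitness function are all hypothetical):

```python
import random

AGGREGATORS = ["gcn", "gat", "sage"]  # example layer types in the search space

def random_layer():
    # a layer gene: an aggregator type plus an optional skip connection
    # back to an earlier layer (None = no skip)
    return {"agg": random.choice(AGGREGATORS), "skip_from": None}

def mutate(arch):
    """Return a mutated copy of an architecture (a list of layer genes)."""
    arch = [dict(gene) for gene in arch]
    op = random.choice(["add_layer", "add_skip", "change_agg"])
    if op == "add_layer":
        # depth-increasing mutation: candidates can grow deeper over generations
        arch.insert(random.randrange(len(arch) + 1), random_layer())
    elif op == "add_skip" and len(arch) > 1:
        # connect a later layer to an earlier one to encourage feature reuse
        i = random.randrange(1, len(arch))
        arch[i]["skip_from"] = random.randrange(i)
    else:
        arch[random.randrange(len(arch))]["agg"] = random.choice(AGGREGATORS)
    return arch

def fitness(arch):
    # placeholder: in practice, build the GNN from the genes, train it, and
    # return validation accuracy on the node-classification benchmark
    return random.random()

def evolve(pop_size=10, generations=20):
    population = [[random_layer(), random_layer()] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # keep the fittest half
        children = [mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

print(evolve())
```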
Notes on Hierarchical Splines, DCLNs and i-theory
We define an extension of classical additive splines for multivariate function approximation that we call hierarchical splines. We show that the case of hierarchical, additive, piece-wise linear splines includes present-day Deep Convolutional Learning Networks (DCLNs) with linear rectifiers and pooling (sum or max). We discuss how these observations, together with i-theory, may provide a framework for a general theory of deep networks.

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
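The identification of rectifier networks with piece-wise linear splines can be checked numerically. The sketch below (weights and knot locations chosen arbitrarily for illustration) shows that a one-hidden-layer ReLU network in one dimension is exactly an additive piece-wise linear spline, with one knot per hidden unit:

```python
import numpy as np

# Each hidden unit max(0, w*x + b) is a linear ramp with a kink (knot)
# at x = -b/w; their weighted sum is an additive piece-wise linear spline.
w = np.array([1.0, 1.0, -1.0])   # hypothetical hidden weights
b = np.array([0.0, -1.0, 2.0])   # gives knots at x = 0, 1, and 2
c = np.array([0.5, -1.0, 0.7])   # hypothetical output weights

def relu_net(x):
    # sum of rectified linear units over a grid of inputs
    return np.maximum(0.0, np.outer(x, w) + b) @ c

x = np.linspace(-1.0, 3.0, 9)
print(relu_net(x))  # linear between knots, kinks only at 0, 1, and 2
```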
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Natural gradient descent, which preconditions a gradient descent update with
the Fisher information matrix of the underlying statistical model, is a way to
capture partial second-order information. Several highly visible works have
advocated an approximation known as the empirical Fisher, drawing connections
between approximate second-order methods and heuristics like Adam. We dispute
this argument by showing that the empirical Fisher, unlike the Fisher, does
not generally capture second-order information. We further argue that the
conditions under which the empirical Fisher approaches the Fisher (and the
Hessian) are unlikely to be met in practice, and that, even on simple
optimization problems, the pathologies of the empirical Fisher can have
undesirable effects.
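To make the distinction concrete, here is a small numerical sketch (not from the paper; the linear-regression setup and all constants are illustrative assumptions). For a Gaussian likelihood, the true Fisher coincides with the Hessian of the negative log-likelihood, so one natural-gradient step solves the problem exactly, whereas the empirical Fisher weights each example by its squared residual and so behaves very differently away from the optimum:

```python
import numpy as np

# Toy model: p(y | x, theta) = N(x^T theta, sigma^2). The true Fisher is
# (1/sigma^2) * sum_n x_n x_n^T (the expectation over the model's own outputs
# removes the residuals), while the empirical Fisher plugs in observed labels.
rng = np.random.default_rng(0)
N, D, sigma = 200, 2, 1.0
X = rng.normal(size=(N, D))
y = X @ np.array([1.0, -2.0]) + sigma * rng.normal(size=N)

def nll_grad(theta):
    # gradient of the negative log-likelihood
    return -X.T @ (y - X @ theta) / sigma**2

def fisher():
    # true Fisher; for this model it equals the Hessian of the NLL
    return X.T @ X / sigma**2

def empirical_fisher(theta):
    # sum of outer products of per-example gradients at the observed labels:
    # each example is weighted by its squared residual
    g = (y - X @ theta)[:, None] * X / sigma**2
    return g.T @ g

theta = np.zeros(D)
# natural gradient = Newton here: one step lands on the least-squares solution
step_f = np.linalg.solve(fisher(), nll_grad(theta))
# the empirical-Fisher step is scaled by the large residuals at theta = 0
# and, in this example, badly undershoots
step_ef = np.linalg.solve(empirical_fisher(theta), nll_grad(theta))
print("Fisher step:          ", theta - step_f)
print("empirical-Fisher step:", theta - step_ef)
print("least-squares optimum:", np.linalg.lstsq(X, y, rcond=None)[0])
```

This illustrates the paper's point: far from the optimum the empirical Fisher is inflated by the residuals and does not act like second-order curvature, even though the two matrices can look similar near a well-specified fit.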