AutoGraph: Automated Graph Neural Network
Graphs play an important role in many applications. Recently, Graph Neural
Networks (GNNs) have achieved promising results in graph analysis tasks, and
several state-of-the-art GNN models have been proposed, e.g., Graph
Convolutional Networks (GCNs) and Graph Attention Networks (GATs). Despite
these successes, most GNNs have only a shallow structure, which limits their
expressive power. To fully exploit the power of deep neural networks, several
deep GNNs have been proposed recently. However, designing deep GNNs requires
significant architecture engineering. In this work, we propose a method to
automate the design of deep GNNs. Our method adds a new type of skip
connection to the GNN search space to encourage feature reuse and alleviate
the vanishing-gradient problem, and it allows the evolutionary algorithm to
increase the number of GNN layers during evolution, generating deeper
networks. We evaluate our method on the graph node classification task. The
experiments show that the GNNs generated by our method obtain
state-of-the-art results on the Cora, Citeseer, Pubmed and PPI datasets.

Comment: Accepted by ICONIP 2020
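The abstract describes an evolutionary loop whose mutations can deepen a network and add skip connections. As a minimal sketch of that idea (not the authors' implementation: the gene encoding, the AGGREGATORS list, and the stubbed fitness function are all hypothetical):

```python
import random

AGGREGATORS = ["gcn", "gat", "sage"]  # example layer types in the search space

def random_layer():
    # a layer gene: an aggregator type plus an optional skip connection
    # back to an earlier layer (None = no skip)
    return {"agg": random.choice(AGGREGATORS), "skip_from": None}

def mutate(arch):
    """Return a mutated copy of an architecture (a list of layer genes)."""
    arch = [dict(gene) for gene in arch]
    op = random.choice(["add_layer", "add_skip", "change_agg"])
    if op == "add_layer":
        # depth-increasing mutation: candidates can grow deeper over generations
        arch.insert(random.randrange(len(arch) + 1), random_layer())
    elif op == "add_skip" and len(arch) > 1:
        # connect a later layer to an earlier one to encourage feature reuse
        i = random.randrange(1, len(arch))
        arch[i]["skip_from"] = random.randrange(i)
    else:
        arch[random.randrange(len(arch))]["agg"] = random.choice(AGGREGATORS)
    return arch

def fitness(arch):
    # placeholder: in practice, build the GNN from the genes, train it, and
    # return validation accuracy on the node-classification benchmark
    return random.random()

def evolve(pop_size=10, generations=20):
    population = [[random_layer(), random_layer()] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # keep the fittest half
        children = [mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

print(evolve())
```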
Notes on Hierarchical Splines, DCLNs and i-theory
We define an extension of classical additive splines for multivariate function approximation that we call hierarchical splines. We show that the case of hierarchical, additive, piece-wise linear splines includes present-day Deep Convolutional Learning Networks (DCLNs) with linear rectifiers and pooling (sum or max). We discuss how these observations, together with i-theory, may provide a framework for a general theory of deep networks.

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
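The identification of rectifier networks with piece-wise linear splines can be checked numerically. The sketch below (weights and knot locations chosen arbitrarily for illustration) shows that a one-hidden-layer ReLU network in one dimension is exactly an additive piece-wise linear spline, with one knot per hidden unit:

```python
import numpy as np

# Each hidden unit max(0, w*x + b) is a linear ramp with a kink (knot)
# at x = -b/w; their weighted sum is an additive piece-wise linear spline.
w = np.array([1.0, 1.0, -1.0])   # hypothetical hidden weights
b = np.array([0.0, -1.0, 2.0])   # gives knots at x = 0, 1, and 2
c = np.array([0.5, -1.0, 0.7])   # hypothetical output weights

def relu_net(x):
    # sum of rectified linear units over a grid of inputs
    return np.maximum(0.0, np.outer(x, w) + b) @ c

x = np.linspace(-1.0, 3.0, 9)
print(relu_net(x))  # linear between knots, kinks only at 0, 1, and 2
```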
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Natural gradient descent, which preconditions a gradient descent update with
the Fisher information matrix of the underlying statistical model, is a way to
capture partial second-order information. Several highly visible works have
advocated an approximation known as the empirical Fisher, drawing connections
between approximate second-order methods and heuristics like Adam. We dispute
this argument by showing that the empirical Fisher, unlike the Fisher, does
not generally capture second-order information. We further argue that the
conditions under which the empirical Fisher approaches the Fisher (and the
Hessian) are unlikely to be met in practice, and that, even on simple
optimization problems, the pathologies of the empirical Fisher can have
undesirable effects.
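To make the distinction concrete, here is a small numerical sketch (not from the paper; the linear-regression setup and all constants are illustrative assumptions). For a Gaussian likelihood, the true Fisher coincides with the Hessian of the negative log-likelihood, so one natural-gradient step solves the problem exactly, whereas the empirical Fisher weights each example by its squared residual and so behaves very differently away from the optimum:

```python
import numpy as np

# Toy model: p(y | x, theta) = N(x^T theta, sigma^2). The true Fisher is
# (1/sigma^2) * sum_n x_n x_n^T (the expectation over the model's own outputs
# removes the residuals), while the empirical Fisher plugs in observed labels.
rng = np.random.default_rng(0)
N, D, sigma = 200, 2, 1.0
X = rng.normal(size=(N, D))
y = X @ np.array([1.0, -2.0]) + sigma * rng.normal(size=N)

def nll_grad(theta):
    # gradient of the negative log-likelihood
    return -X.T @ (y - X @ theta) / sigma**2

def fisher():
    # true Fisher; for this model it equals the Hessian of the NLL
    return X.T @ X / sigma**2

def empirical_fisher(theta):
    # sum of outer products of per-example gradients at the observed labels:
    # each example is weighted by its squared residual
    g = (y - X @ theta)[:, None] * X / sigma**2
    return g.T @ g

theta = np.zeros(D)
# natural gradient = Newton here: one step lands on the least-squares solution
step_f = np.linalg.solve(fisher(), nll_grad(theta))
# the empirical-Fisher step is scaled by the large residuals at theta = 0
# and, in this example, badly undershoots
step_ef = np.linalg.solve(empirical_fisher(theta), nll_grad(theta))
print("Fisher step:          ", theta - step_f)
print("empirical-Fisher step:", theta - step_ef)
print("least-squares optimum:", np.linalg.lstsq(X, y, rcond=None)[0])
```

This illustrates the paper's point: far from the optimum the empirical Fisher is inflated by the residuals and does not act like second-order curvature, even though the two matrices can look similar near a well-specified fit.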