Search CORE

56,754 research outputs found

Neural Networks as Paths through the Space of Representations

Author: Kording Konrad P.
Kwok Devin
Lange Richard D.
Matelsky Jordan
Rolnick David S.
Wang Xinyue
Publication venue
Publication date: 27/11/2022
Field of study

Deep neural networks implement a sequence of layer-by-layer operations that are each relatively easy to understand, but the resulting overall computation is generally difficult to understand. We consider a simple hypothesis for interpreting the layer-by-layer construction of useful representations: perhaps the role of each layer is to reformat information to reduce the "distance" to the desired outputs. With this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path through a high-dimensional representation space. We formalize this intuitive idea of a "path" by leveraging recent advances in *metric* representational similarity. We extend existing representational distance methods by computing geodesics, angles, and projections of representations, going beyond mere layer distances. We then demonstrate these tools by visualizing and comparing the paths taken by ResNet and VGG architectures on CIFAR-10. We conclude by sketching additional ways that this kind of representational geometry can be used to understand and interpret network training, and to describe novel kinds of similarities between different models.Comment: 10 pages, submitted to ICLR 202

arXiv.org e-Print Archive

Layer-wise training for self-supervised learning on graphs

Author: Pina Oscar
Vilaplana Verónica
Publication venue
Publication date: 04/09/2023
Field of study

End-to-end training of graph neural networks (GNN) on large graphs presents several memory and computational challenges, and limits the application to shallow architectures as depth exponentially increases the memory and space complexities. In this manuscript, we propose Layer-wise Regularized Graph Infomax, an algorithm to train GNNs layer by layer in a self-supervised manner. We decouple the feature propagation and feature transformation carried out by GNNs to learn node representations in order to derive a loss function based on the prediction of future inputs. We evaluate the algorithm in inductive large graphs and show similar performance to other end to end methods and a substantially increased efficiency, which enables the training of more sophisticated models in one single device. We also show that our algorithm avoids the oversmoothing of the representations, another common challenge of deep GNNs

arXiv.org e-Print Archive

Understanding and Comparing Deep Neural Networks for Age and Gender Classification

Author: Binder Alexander
Lapuschkin Sebastian
Müller Klaus-Robert
Samek Wojciech
Publication venue
Publication date: 25/08/2017
Field of study

Recently, deep neural networks have demonstrated excellent performances in recognizing the age and gender on human face images. However, these models were applied in a black-box manner with no information provided about which facial features are actually used for prediction and how these features depend on image preprocessing, model initialization and architecture choice. We present a study investigating these different effects. In detail, our work compares four popular neural network architectures, studies the effect of pretraining, evaluates the robustness of the considered alignment preprocessings via cross-method test set swapping and intuitively visualizes the model's prediction strategies in given preprocessing conditions using the recent Layer-wise Relevance Propagation (LRP) algorithm. Our evaluations on the challenging Adience benchmark show that suitable parameter initialization leads to a holistic perception of the input, compensating artefactual data representations. With a combination of simple preprocessing steps, we reach state of the art performance in gender recognition.Comment: 8 pages, 5 figures, 5 tables. Presented at ICCV 2017 Workshop: 7th IEEE International Workshop on Analysis and Modeling of Faces and Gesture

arXiv.org e-Print Archive

Fraunhofer-ePrints