Multitask Learning on Graph Neural Networks: Learning Multiple Graph Centrality Measures with a Unified Network
The application of deep learning to symbolic domains remains an active
research endeavour. Graph neural networks (GNNs), consisting of trained neural
modules which can be arranged in different topologies at run time, are sound
alternatives to tackle relational problems which lend themselves to graph
representations. In this paper, we show that GNNs are capable of multitask
learning, which can be naturally enforced by training the model to refine a
single set of multidimensional embeddings and decode them
into multiple outputs by connecting MLPs at the end of the pipeline. We
demonstrate the multitask learning capability of the model in the relevant
relational problem of estimating network centrality measures, focusing
primarily on producing rankings based on these measures, i.e. is vertex $v_1$
more central than vertex $v_2$ given centrality $c$? We then show that a GNN
can be trained to develop a \emph{lingua franca} of vertex embeddings from
which all relevant information about any of the trained centrality measures can
be decoded. The proposed model achieves good accuracy on a test dataset of
random instances with up to 128 vertices and is shown to generalise to larger
problem sizes. The model is also shown to obtain reasonable accuracy on a
dataset of real-world instances with up to 4k vertices, vastly surpassing the
sizes of the largest instances with which the model was trained.
Finally, we believe that our contributions attest to the potential of GNNs in
symbolic domains in general and in relational learning in particular.
Comment: Published at ICANN 2019. 10 pages, 3 figures.
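The pipeline described above can be sketched in a few lines: one shared message-passing core refines a single set of embeddings, and per-task heads decode them into different centrality scores. All dimensions and weights below are hypothetical and untrained; this illustrates the architecture only, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def message_pass(A, H, W, steps=3):
    """Refine node embeddings by repeatedly averaging neighbour states
    and applying a shared learned transformation."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    A_hat = A / deg                     # row-normalised adjacency
    for _ in range(steps):
        H = np.tanh(A_hat @ H @ W)
    return H

# Toy graph on 4 vertices (a cycle plus one chord).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0]], dtype=float)

d = 8                                   # embedding dimension (hypothetical)
H = message_pass(A, rng.normal(size=(4, d)), 0.3 * rng.normal(size=(d, d)))

# Per-task heads decode the SAME embeddings into different centrality
# scores -- the "lingua franca" idea (full MLPs reduced here to linear maps).
degree_scores = (H @ rng.normal(size=(d, 1))).ravel()
betweenness_scores = (H @ rng.normal(size=(d, 1))).ravel()

# The ranking question "is vertex i more central than vertex j?" becomes
# a comparison of decoded scores.
answer = degree_scores[1] > degree_scores[0]
```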
Statistical Mechanics for Network Structure and Evolution
In this thesis, we address problems in complex networks using the methods of statistical mechanics and information theory. We particularly focus on the thermodynamic characterisation of networks and entropic analysis of the statistics and dynamics of network evolution. After a brief introduction to the background and motivation of the thesis in Chapter 1, we review the relevant literature in Chapter 2 and elaborate on the main methods in Chapters 3 to 6.
In Chapter 3, we explore the normalised Laplacian matrix as the Hamiltonian operator of the network, governing particle occupations under Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac statistics. From the corresponding partition functions we derive thermodynamic quantities that reveal structural characteristics of the network. Chapter 4 further decomposes the global network entropy, under each of the three statistics, into edge-connection components. This decomposition reflects the detailed distribution of entropy across the edges of a network, and is particularly useful when analysing non-homogeneous networks with strong community and hub structure.
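For concreteness, the three occupation statistics give the standard partition functions below, with the normalised Laplacian eigenvalues $\lambda_i$ playing the role of energy levels (a sketch of the textbook forms, not necessarily the thesis's exact notation; $\beta$ is the inverse temperature and $\mu$ a chemical potential):

```latex
Z_{\mathrm{MB}} = \operatorname{Tr} e^{-\beta \tilde{L}} = \sum_i e^{-\beta \lambda_i},
\qquad
Z_{\mathrm{BE}} = \prod_i \frac{1}{1 - e^{-\beta(\lambda_i - \mu)}},
\qquad
Z_{\mathrm{FD}} = \prod_i \bigl(1 + e^{-\beta(\lambda_i - \mu)}\bigr).
```

Thermodynamic quantities then follow in the usual way, e.g. average energy $U = -\partial \log Z / \partial \beta$ and entropy $S = \log Z + \beta U$.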
Furthermore, Chapter 5 and Chapter 6 present theoretical approaches to analysing dynamic network evolution and applications to real-world networks. In Chapter 5, we investigate both undirected and directed network evolution using the Euler-Lagrange equation, a variational principle based on the von Neumann entropy of the time-varying network structure. The resulting model not only provides an accurate simulation of the degree statistics in network evolution, but also captures the topological variations that take place when the structure of a network changes abruptly. Chapter 6 studies fMRI regional brain interaction networks using directed graphs. We develop a novel method for characterising networks using Bose-Einstein entropy and the Jensen-Shannon divergence, which discriminates well among patients with suspected Alzheimer's disease. Finally, Chapter 7 concludes the thesis, discusses the limitations of our methodologies, and outlines directions for future research.
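The von Neumann entropy underlying the variational principle can be computed directly from the spectrum of the normalised Laplacian. A minimal sketch (our own, using the common density-matrix convention $\rho = \tilde{L}/\operatorname{Tr}\tilde{L}$; the thesis may use an approximation or variant):

```python
import numpy as np

def von_neumann_entropy(A):
    """Von Neumann entropy of a graph from its normalised Laplacian.

    Density matrix: rho = L_tilde / Tr(L_tilde); entropy:
    S = -Tr(rho log rho) = -sum_i lam_i log lam_i over rho's eigenvalues.
    """
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg, 1.0) ** -0.5
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    lam = np.linalg.eigvalsh(L)
    lam = lam / lam.sum()          # normalise the spectrum to sum to 1
    lam = lam[lam > 1e-12]         # 0 log 0 = 0 by convention
    return float(-(lam * np.log(lam)).sum())

# Complete graph K4: spectrum {0, 4/3, 4/3, 4/3}, so S = log 3.
K4 = np.ones((4, 4)) - np.eye(4)
print(round(von_neumann_entropy(K4), 3))   # → 1.099
```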
Information Flow in Self-Supervised Learning
In this paper, we provide a comprehensive toolbox for understanding and
enhancing self-supervised learning (SSL) methods through the lens of matrix
information theory. Specifically, by leveraging the principles of matrix mutual
information and joint entropy, we offer a unified analysis for both contrastive
and feature decorrelation based methods. Furthermore, we propose the matrix
variational masked auto-encoder (M-MAE) method, grounded in matrix information
theory, as an enhancement to masked image modeling. The empirical evaluations
underscore the effectiveness of M-MAE compared with the state-of-the-art
methods, including a 3.9% improvement in linear probing with ViT-Base, and a 1%
improvement in fine-tuning with ViT-Large, both on ImageNet.
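The matrix-entropy quantities that such analyses revolve around can be illustrated with a small sketch (our own simplification, using the von Neumann entropy of a feature Gram matrix; not the paper's exact estimator):

```python
import numpy as np

def matrix_entropy(Z):
    """Von Neumann entropy of the feature Gram matrix K = Z^T Z.

    Normalise K's spectrum to sum to 1 and take -sum lam log lam.
    Higher values mean the features spread energy over more directions
    (less representational collapse).
    """
    K = Z.T @ Z
    lam = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    lam = lam / lam.sum()
    lam = lam[lam > 1e-12]          # drop numerically-zero modes
    return float(-(lam * np.log(lam)).sum())

rng = np.random.default_rng(0)
Z_spread = rng.normal(size=(1000, 8))                           # isotropic features
Z_collapsed = np.outer(rng.normal(size=1000), rng.normal(size=8))  # rank-1 features

print(matrix_entropy(Z_spread))     # close to log(8) ≈ 2.08
print(matrix_entropy(Z_collapsed))  # close to 0
```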
Converting to Optimization in Machine Learning: Perturb-and-MAP, Differential Privacy, and Program Synthesis
On a mathematical level, most computational problems encountered in machine learning are instances of one of four abstract, fundamental problems: sampling, integration, optimization, and search.
Thanks to the rich history of the respective mathematical fields, disparate methods with different properties have been developed for these four problem classes.
As a result, it can be beneficial to convert a problem from one abstract class into a problem of a different class, because the latter may come with insights, techniques, and algorithms well suited to the particular problem at hand.
In particular, this thesis contributes four new methods and generalizations of existing methods for converting specific non-optimization machine learning tasks into optimization problems with more appealing properties.
The first example is partition function estimation (an integration problem), where an existing algorithm -- the Gumbel trick -- for converting to the MAP optimization problem is generalized into a more general family of algorithms, such that other instances of this family have better statistical properties.
Second, this family of algorithms is further generalized to another integration problem, the problem of estimating Rényi entropies.
The third example shows how an intractable sampling problem arising when wishing to publicly release a database containing sensitive data in a safe ("differentially private") manner can be converted into an optimization problem using the theory of Reproducing Kernel Hilbert Spaces.
Finally, the fourth case study casts the challenging discrete search problem of program synthesis from input-output examples as a supervised learning task that can be efficiently tackled using gradient-based optimization.
In all four instances, the conversions result in novel algorithms with desirable properties.
In the first instance, new generalizations of the Gumbel trick can be used to construct statistical estimators of the partition function that achieve the same estimation error while using up to 40% fewer samples.
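The core identity behind the Gumbel trick can be sketched in a few lines (a minimal illustration of the basic Perturb-and-MAP estimator, not the generalised family contributed by the thesis; the function name and sample count are ours):

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329   # mean of a standard Gumbel variable

def gumbel_log_partition(theta, n_samples=200_000, rng=None):
    """Estimate log Z = log sum_i exp(theta_i) via the Gumbel-max trick.

    If G_i ~ Gumbel(0, 1) i.i.d., then max_i (theta_i + G_i) is Gumbel
    with location log Z, so its mean is log Z + Euler-Mascheroni gamma.
    In large models the max is computed by a MAP solver; here we
    enumerate it directly.
    """
    rng = rng or np.random.default_rng(0)
    u = rng.uniform(size=(n_samples, len(theta)))
    g = -np.log(-np.log(u))                 # standard Gumbel samples
    return float((theta + g).max(axis=1).mean() - EULER_GAMMA)

theta = np.array([1.0, -0.5, 2.0, 0.3])
exact = float(np.log(np.exp(theta).sum()))
approx = gumbel_log_partition(theta)
print(exact, approx)    # the two agree to within sampling error
```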
The second instance shows that unbiased estimators of the Rényi entropy can be constructed in the Perturb-and-MAP framework.
The main contribution of the third instance is theoretical: the conversion shows that it is possible to construct an algorithm for releasing synthetic databases that approximate databases containing sensitive data in a mathematically precise sense, and to prove results about their approximation errors.
Finally, the fourth conversion yields an algorithm for synthesising program source code from input-output examples that is able to solve test problems 1-3 orders of magnitude faster than a wide range of baselines.
Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts
Shapes are generally used to convey meaning. They are used in video games, films and other multimedia, in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text, and can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception, via the concepts of Schelling Points and Image Specificity. Schelling Points are the choices people make when they aim to match what they expect others to choose, but cannot communicate with them to agree on an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but ‘specific’ images yield more consistent descriptions than others; we apply Specificity to 2D fonts. We show that each concept can be learned and predicted, for 3D shapes and fonts respectively, using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept captures similarities between shapes of its respective type, even when the shape geometries themselves differ markedly; the ‘context’ of these similarities lies in some abstract or subjective meaning that is consistent across different people.
Computation in Complex Networks
Complex networks are one of the most challenging research areas across disciplines including physics, mathematics, biology, medicine, engineering, and computer science. Interest in complex networks is growing steadily, due to their ability to model many everyday systems, such as technology networks, the Internet, and communication, chemical, neural, social, political and financial networks. The Special Issue “Computation in Complex Networks” of Entropy offers a multidisciplinary view on how some complex systems behave, providing a collection of original and high-quality papers within the research fields of:
• Community detection
• Complex network modelling
• Complex network analysis
• Node classification
• Information spreading and control
• Network robustness
• Social networks
• Network medicine
Foundations of Node Representation Learning
Low-dimensional node representations, also called node embeddings, are a cornerstone in the modeling and analysis of complex networks. In recent years, advances in deep learning have spurred development of novel neural network-inspired methods for learning node representations which have largely surpassed classical ‘spectral’ embeddings in performance. Yet little work asks the central questions of this thesis: Why do these novel deep methods outperform their classical predecessors, and what are their limitations?
We pursue several paths to answering these questions. To further our understanding of deep embedding methods, we explore their relationship with spectral methods, which are better understood, and show that some popular deep methods are equivalent to spectral methods in a certain natural limit. We also introduce the problem of inverting node embeddings in order to probe what information they contain. Further, we propose a simple, non-deep method for node representation learning, and find it to often be competitive with modern deep graph networks in downstream performance.
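The classical spectral embeddings that serve as the point of comparison can be computed in a few lines (a standard construction using the normalised Laplacian; the precise variant studied in the thesis may differ):

```python
import numpy as np

def spectral_embedding(A, dim=2):
    """Classical spectral node embedding: eigenvectors of the normalised
    Laplacian for its smallest non-trivial eigenvalues, one row per node."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg, 1.0) ** -0.5
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    lam, vec = np.linalg.eigh(L)        # ascending eigenvalues
    return vec[:, 1:dim + 1]            # skip the trivial eigenvector

# Two triangles joined by a bridge edge: the first spectral coordinate
# separates the two communities.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
X = spectral_embedding(A, dim=2)
print(np.sign(X[:3, 0]), np.sign(X[3:, 0]))  # opposite signs per community
```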
To better understand the limitations of node embeddings, we prove some upper and lower bounds on their capabilities. Most notably, we prove that node embeddings are capable of exact low-dimensional representation of networks with bounded max degree or arboricity, and we further show that a simple algorithm can find such exact embeddings for real-world networks. By contrast, we also prove inherent limits on the ability of random graph models, including those derived from node embeddings, to capture key structural properties of networks without simply memorizing a given graph.
Neural function approximation on graphs: shape modelling, graph discrimination & compression
Graphs serve as a versatile mathematical abstraction of real-world phenomena in numerous scientific disciplines. This thesis belongs to the area of Geometric Deep Learning, a family of learning paradigms that capitalise on the increasing volume of non-Euclidean data to solve real-world tasks in a data-driven manner. In particular, we focus on graph function approximation using neural networks, which lies at the heart of many relevant methods. In the first part of the thesis, we contribute to the understanding and design of Graph Neural Networks (GNNs). Initially, we investigate the problem of learning on signals supported on a fixed graph. We show that treating graph signals as elements of general graph spaces is restrictive and that conventional GNNs have limited expressivity in this setting. Instead, we expose a more enlightening perspective by drawing parallels between graph signals and signals on Euclidean grids, such as images and audio. Accordingly, we propose a permutation-sensitive GNN based on an operator analogous to shifts in grids and instantiate it on 3D meshes for shape modelling (Spiral Convolutions). Next, we focus on learning on general graph spaces, and in particular on functions that are invariant to graph isomorphism. We identify a fundamental trade-off between invariance, expressivity and computational complexity, which we address with a symmetry-breaking mechanism based on substructure encodings (Graph Substructure Networks). Substructures are shown to be a powerful tool that provably improves expressivity while controlling computational complexity, and a useful inductive bias in network science and chemistry. In the second part of the thesis, we discuss the problem of graph compression, where we analyse the information-theoretic principles and the connections with graph generative models. We show that another inevitable trade-off surfaces, now between computational complexity and compression quality, due to graph isomorphism.
We propose a substructure-based dictionary coder - Partition and Code (PnC) - with theoretical guarantees, which can be adapted to different graph distributions by estimating its parameters from observations. Additionally, contrary to the majority of neural compressors, PnC is parameter- and sample-efficient and is therefore of wide practical relevance. Finally, within this framework, substructures are further illustrated as a decisive archetype for learning problems on graph spaces.