Robustness, Heterogeneity and Structure Capturing for Graph Representation Learning and its Application
Graph neural networks (GNNs) are potent methods for graph representation learning (GRL), which extract knowledge from complicated (graph) structured data in various real-world scenarios. However, GRL still faces many challenges. Firstly, GNN-based node classification may deteriorate substantially when the possibility of noisy data in graph structures is overlooked, as models wrongly treat the relations among nodes in the input graphs as the ground truth. Secondly, nodes and edges have different types in the real world, and it is essential to capture this heterogeneity in graph representation learning. Thirdly, relations among nodes are not restricted to pairwise relations, and it is necessary to capture these complex relations accordingly. Finally, the absence of structural encodings, such as positional information, degrades the performance of GNNs. This thesis proposes novel methods to address the aforementioned problems:
1. Bayesian Graph Attention Network (BGAT): Developed for situations with scarce data, this method addresses the influence of spurious edges. Incorporating Bayesian principles into the graph attention mechanism enhances robustness, leading to competitive performance against benchmarks (Chapter 3).
2. Neighbour Contrastive Heterogeneous Graph Attention Network (NC-HGAT): By enhancing a cutting-edge self-supervised heterogeneous graph neural network model (HGAT) with neighbour contrastive learning, this method addresses heterogeneity and uncertainty simultaneously. Extra attention to edge relations in heterogeneous graphs also aids in subsequent classification tasks (Chapter 4).
3. A novel ensemble learning framework is introduced for predicting stock price movements. It adeptly captures both group-level and pairwise relations, leading to notable advancements over the existing state-of-the-art. The integration of hypergraph and graph models, coupled with the utilisation of auxiliary data via GNNs before recurrent neural network (RNN), provides a deeper understanding of long-term dependencies between similar entities in multivariate time series analysis (Chapter 5).
4. A novel framework for graph structure learning is introduced, segmenting graphs into distinct patches. By harnessing the capabilities of transformers and integrating other position encoding techniques, this approach robustly captures intricate structural information within a graph. This results in a more comprehensive understanding of its underlying patterns (Chapter 6).
Attribute network models, stochastic approximation, and network sampling and ranking algorithms
We analyze dynamic random network models where younger vertices connect to
older ones with probabilities proportional to their degrees as well as a
propensity kernel governed by their attribute types. Using stochastic
approximation techniques we show that, in the large network limit, such
networks converge in the local weak sense to randomly stopped multitype
branching processes whose explicit description allows for the derivation of
asymptotics for a wide class of network functionals. These asymptotics imply
that while degree distribution tail exponents depend on the attribute type
(already derived by Jordan (2013)), Page-rank centrality scores have the
same tail exponent across attributes. Moreover, the mean behavior of the
limiting Page-rank score distribution can be explicitly described and shown to
depend on the attribute type. The limit results also give explicit formulae for
the performance of various network sampling mechanisms. One surprising
consequence is the efficacy of Page-rank and walk based network sampling
schemes for directed networks in the setting of rare minorities. The results
also allow one to evaluate the impact of various proposed mechanisms to
increase degree centrality of minority attributes in the network, and to
quantify the bias in inferring about the network from an observed sample.
Further, we formalize the notion of resolvability of such models where, owing
to a propagation-of-chaos type phenomenon in the evolution dynamics for such
models, one can set up a correspondence to models driven by continuous time
branching process dynamics.
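The growth mechanism described in this abstract can be illustrated with a minimal simulation sketch. It is not the authors' construction: it assumes each new vertex attaches to exactly one older vertex, that every vertex starts with weight 1, and that the propensity kernel is strictly positive. The names `grow_attributed_network`, `type_probs` and `kernel` are hypothetical.

```python
import random


def grow_attributed_network(n, type_probs, kernel, seed=0):
    """Grow a tree in which vertex v attaches to an older vertex u with
    probability proportional to kernel[type(v)][type(u)] * degree[u].

    Assumptions (not taken from the abstract): one out-edge per new
    vertex, initial weight 1 for every vertex, strictly positive kernel.
    """
    rng = random.Random(seed)
    n_types = len(type_probs)
    types = [rng.choices(range(n_types), weights=type_probs)[0]]
    degree = [1]  # the root gets weight 1 so that it can be selected
    edges = []
    for v in range(1, n):
        a = rng.choices(range(n_types), weights=type_probs)[0]
        # Attachment weight combines the kernel with the current degree.
        weights = [kernel[a][types[u]] * degree[u] for u in range(v)]
        target = rng.choices(range(v), weights=weights)[0]
        edges.append((v, target))  # directed from younger to older
        degree[target] += 1
        types.append(a)
        degree.append(1)  # counts the new vertex's own out-edge
    return types, edges, degree


if __name__ == "__main__":
    # Type 1 is a minority (20%) that prefers its own type.
    kernel = [[1.0, 0.5], [0.5, 2.0]]
    types, edges, degree = grow_attributed_network(500, [0.8, 0.2], kernel)
    for t in (0, 1):
        degs = [d for d, a in zip(degree, types) if a == t]
        print(t, len(degs), max(degs))
```

The quadratic-time weight recomputation keeps the sketch short; it is adequate for a few thousand vertices but not for studying the large-network limit itself.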
Goodness of fit testing based on graph functionals for homogenous Erdős-Rényi graphs
The Erdős-Rényi graph is a popular choice to model network data as it is
parsimoniously parametrized, straightforward to interpret and easy to
estimate. However, it has limited suitability in practice, since it often fails
to capture crucial characteristics of real-world networks. To check the
adequacy of this model, we propose a novel class of goodness-of-fit tests for
homogeneous Erdős-Rényi models against heterogeneous alternatives that allow
for nonconstant edge probabilities. We allow for asymptotically dense and
sparse networks. The tests are based on graph functionals that cover a broad
class of network statistics for which we derive limiting distributions in a
unified manner. The resulting class of asymptotic tests includes several
existing tests as special cases. Further, we propose a parametric bootstrap and
prove its consistency, which allows for performance improvements particularly
for small network sizes and avoids the often tedious variance estimation for
asymptotic tests. Moreover, we analyse the sensitivity of different
goodness-of-fit test statistics that rely on popular choices of subgraphs. We
evaluate the proposed class of tests and illustrate our theoretical findings by
extensive simulations.
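A parametric bootstrap of the kind mentioned in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' test: the statistic (empirical variance of the degrees) is chosen here only because it reacts to heterogeneous edge probabilities, and the function names `sample_er`, `degree_variance` and `bootstrap_pvalue` are hypothetical.

```python
import random
from itertools import combinations


def sample_er(n, p, rng):
    """Draw the edge set of a homogeneous Erdős-Rényi graph G(n, p)."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def degree_variance(n, edges):
    """Empirical variance of the degree sequence (illustrative statistic)."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    mean = sum(deg) / n
    return sum((d - mean) ** 2 for d in deg) / n


def bootstrap_pvalue(n, edges, n_boot=200, seed=0):
    """Parametric bootstrap p-value for the homogeneous null model.

    The edge probability is estimated by the edge density, graphs are
    resampled from G(n, p_hat), and the p-value is the share of
    resampled statistics at least as large as the observed one.
    """
    rng = random.Random(seed)
    p_hat = len(edges) / (n * (n - 1) / 2)
    observed = degree_variance(n, edges)
    exceed = sum(
        degree_variance(n, sample_er(n, p_hat, rng)) >= observed
        for _ in range(n_boot)
    )
    return (1 + exceed) / (n_boot + 1)


if __name__ == "__main__":
    # A star graph is strongly heterogeneous, so the null should be rejected.
    star = {(0, j) for j in range(1, 30)}
    print(bootstrap_pvalue(30, star))
```

Resampling avoids deriving the variance of the statistic analytically, which is the practical advantage the abstract attributes to the bootstrap; the one-sided rejection rule used here is a simplification.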
Foundations of Node Representation Learning
Low-dimensional node representations, also called node embeddings, are a cornerstone in the modeling and analysis of complex networks. In recent years, advances in deep learning have spurred development of novel neural network-inspired methods for learning node representations which have largely surpassed classical 'spectral' embeddings in performance. Yet little work asks the central questions of this thesis: Why do these novel deep methods outperform their classical predecessors, and what are their limitations?
We pursue several paths to answering these questions. To further our understanding of deep embedding methods, we explore their relationship with spectral methods, which are better understood, and show that some popular deep methods are equivalent to spectral methods in a certain natural limit. We also introduce the problem of inverting node embeddings in order to probe what information they contain. Further, we propose a simple, non-deep method for node representation learning, and find it to often be competitive with modern deep graph networks in downstream performance.
To better understand the limitations of node embeddings, we prove some upper and lower bounds on their capabilities. Most notably, we prove that node embeddings are capable of exact low-dimensional representation of networks with bounded max degree or arboricity, and we further show that a simple algorithm can find such exact embeddings for real-world networks. By contrast, we also prove inherent bounds on the ability of random graph models, including those derived from node embeddings, to capture key structural properties of networks without simply memorizing a given graph.
Inference in ERGMs and Ising Models.
Discrete exponential families have drawn a lot of attention in probability, statistics, and machine learning, both classically and in the recent literature. This thesis studies in depth two discrete exponential families of concrete interest, (i) Exponential Random Graph Models (ERGMs) and (ii) Ising Models. In the ERGM setting, this thesis considers a “degree corrected” version of standard ERGMs, and in the Ising model setting, this thesis focuses on Ising models on dense regular graphs, both from the point of view of statistical inference.
The first part of the thesis studies the problem of testing for sparse signals present on the vertices of ERGMs. It proposes computationally efficient tests for a wide class of ERGMs. Focusing on the two star ERGM, it shows that the tests studied are “asymptotically efficient” in all parameter regimes except one, which is referred to as the “critical point”. In the critical regime, it is shown that improved detection is possible. This shows that, contrary to the standard belief, dependence is actually beneficial to the inference problem in this setting. The main proof idea for analyzing the two star ERGM is a correlation estimate between degrees under local alternatives, which is possibly of independent interest.
In the second part of the thesis, we derive the limit of experiments for a class of one parameter Ising models on dense regular graphs. In particular, we show that the limiting experiment is Gaussian in the “low temperature” regime, non Gaussian in the “critical” regime, and an infinite collection of Gaussians in the “high temperature” regime. We also derive the limiting distributions of commonly studied estimators, and study limiting power for tests of hypothesis against contiguous alternatives (whose scaling changes across the regimes). To the best of our knowledge, this is the first attempt at establishing the classical limits of experiments for Ising models (and more generally, Markov random fields).
A Connected World. Social Networks and Organizations
This is the submitted version. The final version is available from Cambridge University Press via the DOI in this record. This Element synthesizes the current state of research on organizational social networks from its early foundations to contemporary debates. It highlights the characteristics that make the social network perspective distinctive in the organizational research landscape, including its emphasis on structure and outcomes. It covers the main theoretical developments and summarizes the research design questions that organizational researchers face when collecting and analyzing network data. Then, it discusses current debates ranging from agency and structure to network volatility and personality. Finally, the Element envisages future research directions on the role of brokerage for individuals and communities, network cognition, and the importance of past ties. Overall, the Element provides an innovative angle for understanding organizational social networks, engaging in empirical network research, and nurturing further theoretical development on the role of social interactions and connectedness in modern organizations.
Learning and reasoning with graph data
Reasoning about graphs and learning from graph data is a field of artificial intelligence that has recently received much attention in the machine learning areas of graph representation learning and graph neural networks. Graphs are also the underlying structures of interest in a wide range of more traditional fields ranging from logic-oriented knowledge representation and reasoning to graph kernels and statistical relational learning. In this review we outline a broad map and inventory of the field of learning and reasoning with graphs that spans the spectrum from reasoning in the form of logical deduction to learning node embeddings. To obtain a unified perspective on such a diverse landscape we introduce a simple and general semantic concept of a model that covers logic knowledge bases, graph neural networks, kernel support vector machines, and many other types of frameworks. Still at a high semantic level, we survey common strategies for model specification using probabilistic factorization and standard feature construction techniques. Based on this semantic foundation we introduce a taxonomy of reasoning tasks that casts problems ranging from transductive link prediction to asymptotic analysis of random graph models as queries of different complexities for a given model. Similarly, we express learning in different frameworks and settings in terms of a common statistical maximum likelihood principle. Overall, this review aims to provide a coherent conceptual framework that provides a basis for further theoretical analyses of respective strengths and limitations of different approaches to handling graph data, and that facilitates combination and integration of different modeling paradigms.