28 research outputs found

    A Survey on Graph Kernels

    Graph kernels have become an established and widely used technique for solving classification tasks on graphs. This survey gives a comprehensive overview of techniques for kernel-based graph classification developed in the past 15 years. We describe and categorize graph kernels based on properties inherent to their design, such as the nature of their extracted graph features, their method of computation and their applicability to problems in practice. In an extensive experimental evaluation, we study the classification accuracy of a large suite of graph kernels on established benchmarks as well as new datasets. We compare the performance of popular kernels with several baseline methods and study the effect of applying a Gaussian RBF kernel to the metric induced by a graph kernel. In doing so, we find that simple baselines become competitive after this transformation on some datasets. Moreover, we study the extent to which existing graph kernels agree in their predictions (and prediction errors) and obtain a data-driven categorization of kernels as a result. Finally, based on our experimental results, we derive a practitioner's guide to kernel-based graph classification.
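
    The transformation mentioned in this abstract (a Gaussian RBF kernel applied to the metric induced by a graph kernel) can be sketched roughly as follows. This is a minimal illustration, not the survey's code: it assumes the standard kernel-induced squared distance d(G, H)^2 = k(G, G) + k(H, H) - 2 k(G, H), and the Gram matrix values, the function names, and the `gamma` parameter are hypothetical.

```python
import math

def kernel_distance_sq(K, i, j):
    """Squared distance induced by a kernel: k(i,i) + k(j,j) - 2*k(i,j).

    Clamped at zero to guard against tiny negative values from rounding.
    """
    return max(K[i][i] + K[j][j] - 2.0 * K[i][j], 0.0)

def rbf_from_kernel(K, gamma=1.0):
    """Build exp(-gamma * d^2) from a precomputed Gram matrix K."""
    n = len(K)
    return [
        [math.exp(-gamma * kernel_distance_sq(K, i, j)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical Gram matrix of some graph kernel evaluated on three graphs.
K = [
    [4.0, 2.0, 0.0],
    [2.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
]

K_rbf = rbf_from_kernel(K, gamma=0.5)
```

    The resulting matrix has ones on the diagonal and could be passed to any learner that accepts a precomputed kernel; `gamma` would normally be chosen by cross-validation.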

    Metrics for Graph Comparison: A Practitioner's Guide

    Comparison of graph structure is a ubiquitous task in data analysis and machine learning, with diverse applications in fields such as neuroscience, cyber security, social network analysis, and bioinformatics, among others. Discovery and comparison of structures such as modular communities, rich clubs, hubs, and trees in data in these fields yields insight into the generative mechanisms and functional properties of the graph. Often, two graphs are compared via a pairwise distance measure, with a small distance indicating structural similarity and vice versa. Common choices include spectral distances (also known as λ distances) and distances based on node affinities. However, there has as yet been no comparative study of the efficacy of these distance measures in discerning between common graph topologies and different structural scales. In this work, we compare commonly used graph metrics and distance measures, and demonstrate their ability to discern between common topological features found in both random graph models and empirical datasets. We put forward a multi-scale picture of graph structure, in which the effect of global and local structure upon the distance measures is considered. We make recommendations on the applicability of different distance measures to empirical graph data problems based on this multi-scale view. Finally, we introduce the Python library NetComp which implements the graph distances used in this work.
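
    As a rough illustration of a spectral (λ) distance, one common form takes the Euclidean distance between the sorted eigenvalues of two adjacency matrices. The sketch below assumes undirected graphs with the same number of nodes and uses NumPy directly; it does not use or reproduce the NetComp API, and the function name is hypothetical.

```python
import numpy as np

def adjacency_spectral_distance(A, B):
    """Euclidean distance between the adjacency spectra of two graphs.

    Assumes both matrices are symmetric (undirected graphs) and have
    the same shape, so the sorted eigenvalues can be compared one to one.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape != B.shape:
        raise ValueError("graphs must have the same number of nodes")
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eig_a = np.linalg.eigvalsh(A)
    eig_b = np.linalg.eigvalsh(B)
    return float(np.linalg.norm(eig_a - eig_b))

# Two small example graphs on three nodes: a path and a triangle.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]

d_same = adjacency_spectral_distance(triangle, triangle)
d_diff = adjacency_spectral_distance(path, triangle)
```

    The same pattern applies to Laplacian spectra by swapping in the Laplacian matrix. Non-isomorphic graphs can share a spectrum, so a distance of zero here does not by itself imply the graphs are identical.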

    Network Representation Learning: From Traditional Feature Learning to Deep Learning

    Network representation learning (NRL) is an effective graph analytics technique that helps users understand the hidden characteristics of graph data. It has been successfully applied in many real-world tasks related to network science, such as social network data processing, biological information processing, and recommender systems. Deep learning is a powerful tool for learning data features. However, it is non-trivial to generalize deep learning to graph-structured data, since such data differs from regular data such as pictures with spatial information and sounds with temporal information. Recently, researchers have proposed many deep learning-based methods in the area of NRL. In this survey, we investigate classical NRL from traditional feature learning methods to deep learning-based models, analyze the relationships between them, and summarize the latest progress. Finally, we discuss open issues concerning NRL and point out future directions in this field.

    Network alignment and similarity reveal atlas-based topological differences in structural connectomes

    The interactions between different brain regions can be modeled as a graph, called a connectome, whose nodes correspond to parcels from a predefined brain atlas. The edges of the graph encode the strength of the axonal connectivity between regions of the atlas, which can be estimated via diffusion Magnetic Resonance Imaging (MRI) tractography. Herein, we aim at providing a novel perspective on the problem of choosing a suitable atlas for structural connectivity studies by assessing how robustly an atlas captures the network topology across different subjects in a homogeneous cohort. We measure this robustness by assessing the alignability of the connectomes, namely the possibility to retrieve graph matchings that provide highly similar graphs. We introduce two novel concepts. First, the graph Jaccard index (GJI), a graph similarity measure based on the well-established Jaccard index between sets; the GJI exhibits natural mathematical properties that are not satisfied by previous approaches. Second, we devise WL-align, a new technique for aligning connectomes obtained by adapting the Weisfeiler-Lehman (WL) graph-isomorphism test. We validated the GJI and WL-align on data from the Human Connectome Project database, inferring a strategy for choosing a suitable parcellation for structural connectivity studies. Code and data are publicly available.
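
    For context, the classical Jaccard index between sets, on which the abstract says the GJI is based, is |A ∩ B| / |A ∪ B|. The sketch below applies it to the edge sets of two small unweighted, undirected graphs. It is not the paper's GJI, whose definition is not given in this listing; the helper names and the convention for two empty sets are assumptions made for illustration.

```python
def jaccard_index(a, b):
    """Classical Jaccard index |A ∩ B| / |A ∪ B| between two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)

def edge_set(edges):
    """Represent undirected edges as frozensets so (u, v) equals (v, u)."""
    return {frozenset(edge) for edge in edges}

# Two hypothetical graphs on the same four labelled nodes.
g1 = edge_set([(0, 1), (1, 2), (2, 3)])
g2 = edge_set([(1, 0), (1, 2), (0, 3)])

similarity = jaccard_index(g1, g2)  # 2 shared edges out of 4 distinct edges
```

    This comparison presumes the node labels of the two graphs already correspond, which is the kind of correspondence an alignment step would need to supply.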

    Learning Interpretable Features of Graphs and Time Series Data

    Graphs and time series are two of the most ubiquitous representations of data in modern times. Representation learning of real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the roles of the features transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of the downstream machine learning models. In this thesis, we leveraged tensor (multidimensional array) decomposition to generate interpretable and low-dimensional feature spaces for graphs and time-series data from three domains: social networks, neuroscience, and heliophysics. We present the theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of multivariate time-series-based flaring and non-flaring solar events.

    Network representation learning: From traditional feature learning to deep learning

    Network representation learning (NRL) is an effective graph analytics technique that helps users understand the hidden characteristics of graph data. It has been successfully applied in many real-world tasks related to network science, such as social network data processing, biological information processing, and recommender systems. Deep learning is a powerful tool for learning data features. However, it is non-trivial to generalize deep learning to graph-structured data, since such data differs from regular data such as pictures with spatial information and sounds with temporal information. Recently, researchers have proposed many deep learning-based methods in the area of NRL. In this survey, we investigate classical NRL from traditional feature learning methods to deep learning-based models, analyze the relationships between them, and summarize the latest progress. Finally, we discuss open issues concerning NRL and point out future directions in this field. © 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.

    Robustness analysis of graph-based machine learning

    Graph-based machine learning is an emerging approach to analysing data that is or can be well-modelled by pairwise relationships between entities. This includes examples such as social networks, road networks, protein-protein interaction networks and molecules. Despite the plethora of research dedicated to designing novel machine learning models, less attention has been paid to the theoretical properties of our existing tools. In this thesis, we focus on the robustness properties of graph-based machine learning models, in particular spectral graph filters and graph neural networks. Robustness is an essential property for dealing with noisy data, protecting a system against security vulnerabilities and, in some cases, necessary for transferability, amongst other things. We focus specifically on the challenging and combinatorial problem of robustness with respect to the topology of the underlying graph. The first part of this thesis proposes stability bounds to help understand to which topological changes graph-based models are robust. Beyond theoretical results, we conduct experiments to verify the intuition this theory provides. In the second part, we propose a flexible and query-efficient method to perform black-box adversarial attacks on graph classifiers. Adversarial attacks can be considered a search for model instability and provide an upper bound between an input and the decision boundary. In the third and final part of the thesis, we propose a novel robustness certificate for graph classifiers. Using a technique that can certify individual parts of the graph at varying levels of perturbation, we provide a refined understanding of the perturbations to which a given model is robust. We believe the findings in this thesis provide novel insight and motivate further research into both understanding stability and instability of graph-based machine learning models.