44 research outputs found

Extending local features with contextual information in graph kernels

Author: CC Aggarwal
G Da San Martino
M Collins
M Collins
N Shervashidze
S Vishwanathan
SVN Vishwanathan
T Gärtner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Graph kernels are usually defined in terms of simpler kernels over local substructures of the original graphs. Different kernels consider different types of substructures. However, in some cases they have similar predictive performance, probably because the substructures can be interpreted as approximations of the subgraphs they induce. In this paper, we propose to associate with each feature a piece of information about the context in which the feature appears in the graph. A substructure appearing in two different graphs will match only if it appears with the same context in both graphs. We propose a kernel based on this idea that considers trees as substructures, and where the contexts are features too. The kernel is inspired by the framework in [6], although it is not part of it. We give an efficient algorithm for computing the kernel and show promising results on real-world graph classification datasets. Comment: To appear in ICONIP 201
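The context-matching idea in this abstract can be illustrated with a toy sketch. This is not the tree-based kernel the paper proposes: as a simplifying assumption, the "substructure" here is a single node label and the "context" is the sorted multiset of its neighbours' labels. The function names and the graph encoding (a label dictionary plus an edge list) are hypothetical choices made for the illustration.

```python
from collections import Counter

def contextual_features(labels, edges):
    """Count (node label, context) pairs for one undirected graph.

    Assumption for this sketch: the context of a node is the sorted
    tuple of its neighbours' labels.
    """
    neighbours = {v: [] for v in labels}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    feats = Counter()
    for v, lab in labels.items():
        context = tuple(sorted(labels[n] for n in neighbours[v]))
        feats[(lab, context)] += 1
    return feats

def contextual_kernel(g1, g2):
    """Dot product of feature counts: a feature contributes only if
    both the label and its context agree in the two graphs."""
    f1 = contextual_features(*g1)
    f2 = contextual_features(*g2)
    return sum(count * f2[key] for key, count in f1.items())

# Two small labelled graphs: (labels, edges)
g1 = ({0: "C", 1: "O", 2: "H"}, [(0, 1), (0, 2)])
g2 = ({0: "C", 1: "O"}, [(0, 1)])

k12 = contextual_kernel(g1, g2)
```

A kernel that compared node labels alone would count two matches between `g1` and `g2` (the `C` nodes and the `O` nodes). With contexts attached, only the `O` nodes match, because the `C` node has different neighbours in the two graphs.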

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

funcGNN: A Graph Neural Network Approach to Program Similarity

Author: Cohen William W
Hunt John
Li Qimai
Phan Anh Viet
Vallee-Rai Raja
Vishwanathan SVN
Walenstein Andrew
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/07/2020

Program similarity is a fundamental concept, central to the solution of software engineering tasks such as software plagiarism, clone identification, code refactoring and code search. Accurate similarity estimation between programs requires an in-depth understanding of their structure, semantics and flow. A control flow graph (CFG) is a graphical representation of a program which captures its logical control flow and hence its semantics. A common approach is to estimate program similarity by analysing CFGs using graph similarity measures, e.g. graph edit distance (GED). However, computing graph edit distance is NP-hard and computationally expensive, making the application of graph similarity techniques to complex software programs impractical. This study examines the effectiveness of graph neural networks in estimating program similarity by analysing the associated control flow graphs. We introduce funcGNN, a graph neural network trained on labeled CFG pairs to predict the GED between unseen program pairs by utilizing an effective embedding vector. To our knowledge, this is the first time graph neural networks have been applied to labeled CFGs for estimating the similarity between high-level language programs. Results: We demonstrate the effectiveness of funcGNN in estimating the GED between programs. Our experimental analysis shows that it achieves a lower error rate (0.00194), runs faster (23 times faster than the quickest traditional GED approximation method) and scales better than state-of-the-art methods. funcGNN possesses the inductive learning ability to infer program structure and generalise to unseen programs. The graph embedding of a program proposed by our methodology could be applied to several related software engineering problems (such as code plagiarism and clone identification), thus opening multiple research directions. Comment: 11 pages, 8 figures, 3 table

arXiv.org e-Print Archive

Crossref

A Jensen-Shannon Divergence Kernel for Directed Graphs

Author: AFT Martins
AL Barabási
B Schölkopf
C Ye
D Berwanger
F Aziz
F Chung
F Escolano
L Bai
L Han
M Cuturi
S Battiston
SVN Vishwanathan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

White Rose Research Online

Probabilistic Clustering of Time-Evolving Distance Data

Author: AK Jain
AY Ng
C Leslie
CP Robert
D Blei
DD Lee
DM Blei
Gunnar Rätsch
H Saigo
J Pitman
Julia E. Vogt
M Bilodeau
Marius Kloft
MB Eisen
MS Srivastava
P McCullagh
P McCullagh
RM Neal
S Sonnenburg
Sandhya Prabhakaran
SN MacEachern
Stefan Stark
Sudhir S. Raman
SVN Vishwanathan
TS Ferguson
TW Anderson
Volker Roth
WJ Ewens
Publication date: 01/01/2015

We present a novel probabilistic clustering model for objects that are represented via pairwise distances and observed at different time points. The proposed method utilizes the information given by adjacent time points to find the underlying cluster structure and obtain a smooth cluster evolution. This approach allows the number of objects and clusters to differ at every time point, and the identities of the objects need not be known. Further, the model does not require the number of clusters to be specified in advance; it is instead determined automatically using a Dirichlet process prior. We validate our model on synthetic data, showing that the proposed method is more accurate than state-of-the-art clustering methods. Finally, we use our dynamic clustering model to analyze and illustrate the evolution of brain cancer patients over time.

arXiv.org e-Print Archive

Crossref

edoc

Hierarchies and Ranks for Persistence Pairs

Author: AJ Zomorodian
B Rieck
C Maria
D Cohen-Steiner
D Cohen-Steiner
G Carlsson
H Carr
H Doraiswamy
H Doraiswamy
H Edelsbrunner
H Saikia
J Milnor
J Tierny
P Bendich
P Bendich
P Bille
S Gerber
SVN Vishwanathan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/07/2019

We develop a novel hierarchy for zero-dimensional persistence pairs, i.e., connected components, which is capable of capturing more fine-grained spatial relations between persistence pairs. Our work is motivated by a lack of spatial relationships between features in persistence diagrams, which limits their expressive power. We build upon a recently introduced hierarchy of pairs in persistence diagrams that augments the pairing stored in persistence diagrams with information about which components merge. Our proposed hierarchy captures differences in branching structure. Moreover, we show how to use our hierarchy to measure the spatial stability of a pairing, define a rank function for persistence pairs, and demonstrate different applications. Comment: Topology-based Methods in Visualization 201

arXiv.org e-Print Archive

Crossref

Probabilistic clustering of time-evolving distance data

Author: AK Jain
AY Ng
C Leslie
CP Robert
D Blei
DD Lee
DM Blei
Gunnar Rätsch
H Saigo
J Pitman
Julia E. Vogt
M Bilodeau
Marius Kloft
MB Eisen
MS Srivastava
P McCullagh
P McCullagh
RM Neal
S Sonnenburg
Sandhya Prabhakaran
SN MacEachern
Stefan Stark
Sudhir S. Raman
SVN Vishwanathan
TS Ferguson
TW Anderson
Volker Roth
WJ Ewens
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Predicting complex traits using a diffusion kernel on genetic markers with an application to dairy cattle and wheat data

Author: AE Hoerl
AJ Lorenz
AJ Smola
C Saunders
CR Henderson
D Gianola
D Gianola
D Gianola
D Gianola
D Gianola
D Gianola
D Habier
Daniel Gianola
F Fouss
G de los Campos
G de los Campos
G Kimeldorf
G Kimeldorf
G Morota
Gota Morota
Guilherme J M Rosa
H Shao
I Strandén
IR Kondor
J Crossa
J Lafferty
J Yang
JP Vert
Kent A Weigel
L Loewe
L Xu
LC Evans
M Gönen
Masanori Koyama
N Long
N Long
O González-Recio
O González-Recio
PM VanRaden
S Tsuruta
SVN Vishwanathan
T Gärtner
TFC Mackay
TH Meuwissen
THE Meuwissen
U Ober
U Ober
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Kernels on Graphs as Proximity Measures

Author: C Lenart
D Boley
D Liben-Nowell
E Estrada
E Estrada
E Estrada
F Chung
F Fouss
F Sommer
I Kivimäki
IJ Schoenberg
IJ Schoenberg
IS Dhillon
IS Dhillon
J Shawe-Taylor
K Avrachenkov
K Avrachenkov
K-R Müller
L Backstrom
L Katz
O Chapelle
P Chebotarev
P Chebotarev
P Chebotarev
PY Chebotarev
PY Chebotarev
PY Chebotarev
RA Horn
SJ Kirkland
SVN Vishwanathan
U von Luxburg
V Ivashkin
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 15/06/2017

Kernels and, broadly speaking, similarity measures on graphs are extensively used in graph-based unsupervised and semi-supervised learning algorithms as well as in the link prediction problem. We analytically study proximity and distance properties of various kernels and similarity measures on graphs. This can potentially be useful for recommending the adoption of one or another similarity measure in a machine learning method. We also numerically compare various similarity measures in the context of spectral clustering and observe that normalized heat-type similarity measures with log modification generally perform the best.
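As a rough illustration of a "normalized heat-type similarity measure with log modification", the sketch below computes the heat kernel of the normalized Laplacian, exp(-t(I - D^{-1/2} A D^{-1/2})), and then takes its element-wise logarithm. Reading "log modification" as an element-wise logarithm is an assumption, as are the function names, the parameter `t`, and the truncated Taylor series used for the matrix exponential (adequate only for small graphs). The graph is assumed to be connected with no isolated nodes, so that every kernel entry is positive.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, terms=40):
    """Matrix exponential via a truncated Taylor series.
    A sketch for small matrices with small norm, not a robust routine."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def log_normalized_heat(adj, t=1.0):
    """Element-wise log of exp(-t * normalized Laplacian)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # -t * (I - D^{-1/2} A D^{-1/2})
    m = [[t * (adj[i][j] / math.sqrt(deg[i] * deg[j]) - (i == j))
          for j in range(n)] for i in range(n)]
    return [[math.log(x) for x in row] for row in expm(m)]

# Path graph 0 - 1 - 2
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
similarity = log_normalized_heat(adj, t=1.0)
```

On this path graph the similarity between adjacent nodes 0 and 1 is larger than between the end nodes 0 and 2, which is the proximity behaviour one would expect from such a measure.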

Crossref

INRIA a CCSD electronic archive server