Estimating Graphlet Statistics via Lifting
Exploratory analysis over network data is often limited by the ability to
efficiently calculate graph statistics, which can provide a model-free
understanding of the macroscopic properties of a network. We introduce a
framework for estimating the graphlet count---the number of occurrences of a
small subgraph motif (e.g. a wedge or a triangle) in the network. For massive
graphs, where accessing the whole graph is not possible, the only viable
algorithms are those that make a limited number of vertex neighborhood queries.
We introduce a Monte Carlo sampling technique for graphlet counts, called {\em
Lifting}, which can simultaneously sample all graphlets of size up to $k$
vertices for arbitrary $k$. This is the first graphlet sampling method that can
provably sample every graphlet with positive probability and can sample
graphlets of arbitrary size $k$. We outline variants of lifted graphlet counts,
including the ordered, unordered, and shotgun estimators, random walk starts,
and parallel vertex starts. We prove that our graphlet count updates are
unbiased for the true graphlet count and have a controlled variance for all
graphlets. We compare the experimental performance of lifted graphlet counts to
the state-of-the-art graphlet sampling procedures: Waddling and the pairwise
subgraph random walk.
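The authors' Lifting estimator itself is not reproduced here, but the query model it operates in can be illustrated with a much simpler case: an unbiased Monte Carlo estimate of the wedge (2-path) count that touches only sampled vertex neighborhoods, never the whole graph. This is a hedged sketch; the toy graph and sample size are made up.

```python
import random

# Toy graph as adjacency sets (hypothetical; any vertex-neighborhood
# oracle would do -- the estimator never reads the whole graph at once).
graph = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
}

def exact_wedge_count(g):
    # A wedge (path on 3 vertices) is counted once per center vertex:
    # vertex v contributes C(deg(v), 2) wedges.
    return sum(d * (d - 1) // 2 for d in map(len, g.values()))

def sampled_wedge_estimate(g, n_samples, rng=random.Random(0)):
    # Unbiased Monte Carlo estimate: sample vertices uniformly, query
    # only their neighborhoods, and rescale by the number of vertices.
    verts = list(g)
    total = 0
    for _ in range(n_samples):
        d = len(g[rng.choice(verts)])
        total += d * (d - 1) // 2
    return len(verts) * total / n_samples
```

Counting larger motifs with controlled variance, as the paper does, is much harder; this only shows why per-vertex neighborhood queries suffice for an unbiased estimate.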
Neural Collective Entity Linking
Entity Linking aims to link entity mentions in texts to knowledge bases, and
neural models have achieved recent success in this task. However, most existing
methods rely on local contexts to resolve entities independently, which often
fail due to the sparsity of local information. To address this issue, we
propose a novel neural model for collective entity linking, named NCEL. NCEL
applies a Graph Convolutional Network to integrate both local
contextual features and global coherence information for entity linking. To
improve the computation efficiency, we approximately perform graph convolution
on a subgraph of adjacent entity mentions instead of those in the entire text.
We further introduce an attention scheme to improve the robustness of NCEL to
data noise and train the model on Wikipedia hyperlinks to avoid overfitting and
domain bias. In experiments, we evaluate NCEL on five publicly available
datasets to verify the linking performance as well as generalization ability.
We also conduct an extensive analysis of time complexity, the impact of key
modules, and qualitative results, which demonstrate the effectiveness and
efficiency of our proposed method.
Comment: 12 pages, 3 figures, COLING201
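NCEL's exact architecture is not given in this excerpt, but its core operation, graph convolution over a small subgraph of adjacent entity mentions, can be sketched in a few lines. Everything below (graph, feature sizes, random weights) is hypothetical, not the authors' implementation.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    # One graph-convolution step: mix each node's features with its
    # neighbours' via a row-normalised adjacency with self-loops,
    # then apply a learned linear map and a ReLU nonlinearity.
    a_hat = adj + np.eye(adj.shape[0])
    a_hat /= a_hat.sum(axis=1, keepdims=True)
    return np.maximum(a_hat @ feats @ weight, 0.0)

# Hypothetical subgraph of three adjacent entity mentions (a chain).
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4))    # stand-in for local contextual features
weight = rng.normal(size=(4, 2))   # learned parameters (random here)
out = gcn_layer(adj, feats, weight)  # one (3, 2) embedding per mention
```

Restricting the adjacency to a few neighbouring mentions, rather than every mention in the text, is what keeps each convolution cheap, which is the efficiency point the abstract makes.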
Matrices of forests, analysis of networks, and ranking problems
The matrices of spanning rooted forests are studied as a tool for analysing
the structure of networks and measuring their properties. The problems of
revealing the basic bicomponents, measuring vertex proximity, and ranking from
preference relations / sports competitions are considered. It is shown that the
vertex accessibility measure based on spanning forests has a number of
desirable properties. An interpretation for the stochastic matrix of
out-forests in terms of information dissemination is given.
Comment: 8 pages. This article draws heavily from arXiv:math/0508171.
Published in Proceedings of the First International Conference on Information
Technology and Quantitative Management (ITQM 2013). This version contains
some corrections and additions.
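The central object here is easy to compute directly. By the matrix-forest theorem (Chebotarev and Shamis), $Q = (I + L)^{-1}$, where $L$ is the graph Laplacian, has entries $q_{ij} = f_{ij}/f$: the share of spanning rooted forests in which $j$ lies in the tree rooted at $i$. A small sketch on a hypothetical 4-vertex graph:

```python
import numpy as np

# Hypothetical 4-vertex undirected graph (adjacency matrix).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

laplacian = np.diag(adj.sum(axis=1)) - adj

# Matrix-forest theorem: Q = (I + L)^{-1}.  Entry q[i, j] is the
# proportion of spanning rooted forests in which vertex j belongs to
# the tree rooted at i; each row sums to 1, so q[i, j] serves as a
# proximity (accessibility) measure from i to j.
q = np.linalg.inv(np.eye(4) + laplacian)
```

Because rows of $Q$ are nonnegative and sum to 1, the entries can be read as relative accessibilities, which is the "desirable properties" claim in the abstract.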
Inference, Learning, and Population Size: Projectivity for SRL Models
A subtle difference between propositional and relational data is that in many
relational models, marginal probabilities depend on the population or domain
size. This paper connects the dependence on population size to the classic
notion of projectivity from statistical theory: Projectivity implies that
relational predictions are robust with respect to changes in domain size. We
discuss projectivity for a number of common SRL systems, and identify syntactic
fragments that are guaranteed to yield projective models. The syntactic
conditions are restrictive, which suggests that projectivity is difficult to
achieve in SRL, and care must be taken when working with different domain
sizes.
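As a toy illustration of the phenomenon (my own example, not from the paper): take one unary predicate S over a domain of size n and the uniform distribution over worlds satisfying the hard constraint "at least one individual has S". The marginal P(S(a)) equals 2^{n-1}/(2^n - 1), which shrinks as n grows, so this model is not projective.

```python
from fractions import Fraction
from itertools import product

def marginal(n):
    # Uniform distribution over the 2**n - 1 worlds on n individuals
    # that satisfy "at least one individual has property S";
    # return the marginal probability that the first individual has S.
    worlds = [w for w in product([0, 1], repeat=n) if any(w)]
    return Fraction(sum(w[0] for w in worlds), len(worlds))

# The marginal drifts with domain size: 1, 2/3, 4/7, 8/15, ...
print([marginal(n) for n in (1, 2, 3, 4)])
```

A projective model would return the same marginal for every n; here the dependence on n is exactly the robustness failure the abstract warns about.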
Subgraphs in preferential attachment models
We consider subgraph counts in general preferential attachment models with
power-law degree exponent $\tau$. For all subgraphs $H$, we find the scaling
of the expected number of subgraphs as a power of the number of vertices. We
prove our results on the expected number of subgraphs by defining an
optimization problem that finds the optimal subgraph structure in terms of the
indices of the vertices that together span it and by using the representation
of the preferential attachment model as a P\'olya urn model.
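The paper's results are asymptotic, but the objects involved are easy to simulate. Below is a hedged sketch (a simplified attachment rule, not the general model analysed in the paper) of a preferential attachment graph plus an exact count of one subgraph, the triangle; the sanity check uses $K_4$, which contains exactly 4 triangles.

```python
import random
from itertools import combinations

def preferential_attachment(n, m, rng=random.Random(0)):
    # Simplified preferential attachment: each new vertex attaches up
    # to m edges to existing vertices drawn from a degree-weighted pool.
    adj = {v: set() for v in range(n)}
    pool = list(range(m))                 # seed vertices
    for v in range(m, n):
        targets = {rng.choice(pool) for _ in range(m)}
        for u in targets:
            adj[v].add(u)
            adj[u].add(v)
            pool += [u, v]                # keeps weights ~ degree
    return adj

def triangle_count(adj):
    # Count each triangle once, from its smallest vertex.
    return sum(1 for u in adj
               for v, w in combinations(sorted(x for x in adj[u] if x > u), 2)
               if w in adj[v])

# Sanity check graph: the complete graph K4 has exactly 4 triangles.
k4 = {v: set(range(4)) - {v} for v in range(4)}
```

Running `triangle_count` on graphs generated with growing `n` is a quick empirical way to eyeball the power-of-n scaling the paper derives.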