On the Localization of the Personalized PageRank of Complex Networks
In this paper new results on personalized PageRank are shown. We consider
directed graphs that may contain dangling nodes. The main result presented
gives an analytical characterization of all the possible values of the
personalized PageRank for any node. We use this result to give a theoretical
justification of a recent model that uses the personalized PageRank to classify
users of Social Network Sites. We introduce new concepts concerning
competitivity and leadership in complex networks. We also present some
theoretical techniques to locate leaders and competitors which are valid for
any personalization vector and by using only information related to the
adjacency matrix of the graph and the distribution of its dangling nodes.
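As a rough illustration of the quantity this abstract discusses, the following Python sketch computes a personalized PageRank vector by power iteration on a small directed graph that contains a dangling node. The function name `personalized_pagerank`, the damping factor 0.85, and the choice to redistribute dangling mass according to a separate distribution (defaulting to the personalization vector) are assumptions made for the example, not details taken from the paper.

```python
def personalized_pagerank(adj, v, alpha=0.85, dangling=None, tol=1e-12, max_iter=1000):
    """Power iteration for personalized PageRank (illustrative sketch).

    adj      : dict mapping each node to a list of out-neighbours
               (every neighbour must also be a key of adj)
    v        : personalization vector as a dict, entries summing to 1
    dangling : distribution used to redistribute the mass of dangling
               nodes; assumed equal to v when not given
    """
    nodes = sorted(adj)
    u = v if dangling is None else dangling
    pi = {n: v.get(n, 0.0) for n in nodes}
    for _ in range(max_iter):
        # Total score currently held by nodes with no out-links.
        dangling_mass = sum(pi[n] for n in nodes if not adj[n])
        # Teleportation term plus redistributed dangling mass.
        new = {
            n: (1 - alpha) * v.get(n, 0.0) + alpha * dangling_mass * u.get(n, 0.0)
            for n in nodes
        }
        # Each non-dangling node splits its score evenly among its out-links.
        for n in nodes:
            if adj[n]:
                share = alpha * pi[n] / len(adj[n])
                for m in adj[n]:
                    new[m] += share
        diff = sum(abs(new[n] - pi[n]) for n in nodes)
        pi = new
        if diff < tol:
            break
    return pi


# Hypothetical toy graph: a -> b -> c, where c is dangling.
graph = {"a": ["b"], "b": ["c"], "c": []}
scores = personalized_pagerank(graph, {"a": 1.0})
```

Changing the `dangling` argument while keeping `v` fixed shows how the scores depend on the distribution assigned to dangling nodes, which is the dependence the abstract refers to.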
Representation Independent Analytics Over Structured Data
Database analytics algorithms leverage quantifiable structural properties of
the data to predict interesting concepts and relationships. The same
information, however, can be represented using many different structures and
the structural properties observed over particular representations do not
necessarily hold for alternative structures. Thus, there is no guarantee that
current database analytics algorithms will still provide the correct insights,
no matter what structures are chosen to organize the database. Because these
algorithms tend to be highly effective over some choices of structure, such as
that of the databases used to validate them, but not so effective with others,
database analytics has largely remained the province of experts who can find
the desired forms for these algorithms. We argue that in order to make database
analytics usable, we should use or develop algorithms that are effective over a
wide range of choices of structural organizations. We introduce the notion of
representation independence, study its fundamental properties for a wide range
of data analytics algorithms, and empirically analyze the amount of
representation independence of some popular database analytics algorithms. Our
results indicate that most algorithms are not generally representation
independent, and they identify the characteristics of heuristics that are more
representation independent under certain representational shifts.
Information Filtering on Coupled Social Networks
In this paper, based on the coupled social networks (CSN), we propose a
hybrid algorithm to nonlinearly integrate both social and behavior information
of online users. The filtering algorithm is based on the coupled social networks
and considers the effects of both social influence and personalized
preference. Experimental results on two real datasets, Epinions and
Friendfeed, show that the hybrid pattern not only provides more accurate
recommendations but also enlarges the recommendation coverage while
adopting a global metric. Further empirical analyses demonstrate that the mutual
reinforcement and rich-club phenomenon can also be found in coupled social
networks where the identical individuals occupy the core position of the online
system. This work may shed some light on the in-depth understanding of the
structure and function of coupled social networks.
Neural Collective Entity Linking
Entity Linking aims to link entity mentions in texts to knowledge bases, and
neural models have achieved recent success in this task. However, most existing
methods rely on local contexts to resolve entities independently and often
fail because local information is sparse. To address this
issue, we propose a novel neural model for collective entity linking, named
NCEL. NCEL applies a Graph Convolutional Network to integrate both local
contextual features and global coherence information for entity linking. To
improve computational efficiency, we approximately perform graph convolution
on a subgraph of adjacent entity mentions instead of those in the entire text.
We further introduce an attention scheme to improve the robustness of NCEL to
data noise and train the model on Wikipedia hyperlinks to avoid overfitting and
domain bias. In experiments, we evaluate NCEL on five publicly available
datasets to verify the linking performance as well as generalization ability.
We also conduct an extensive analysis of time complexity, the impact of key
modules, and qualitative results, which demonstrate the effectiveness and
efficiency of our proposed method.
Comment: 12 pages, 3 figures, COLING201
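To make the graph convolution step mentioned in this abstract more concrete, here is a minimal Python sketch of a single generic graph convolution layer with symmetric normalization and a ReLU activation. It is not NCEL itself: the function name `gcn_layer`, the normalization scheme, and the tiny two-node input are assumptions chosen for illustration, and the abstract does not specify the layer the authors actually use.

```python
import math


def gcn_layer(adj, feats, weight):
    """One generic graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj    : n x n adjacency matrix (list of lists) of a small subgraph
    feats  : n x f node feature matrix H
    weight : f x o weight matrix W (fixed here; learned in a real model)
    """
    n = len(adj)
    # Add self-loops so each node keeps its own (local) features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    n_feat, n_out = len(weight), len(weight[0])
    # Aggregate neighbour features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(n_feat)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_feat)))
             for o in range(n_out)] for i in range(n)]


# Hypothetical subgraph of two adjacent nodes with one feature each.
hidden = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
```

Restricting `adj` to a small neighbourhood rather than the whole document graph keeps the cost of each layer low, which is the kind of approximation the abstract describes.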
Modeling Relation Paths for Representation Learning of Knowledge Bases
Representation learning of knowledge bases (KBs) aims to embed both entities
and relations into a low-dimensional space. Most existing methods only consider
direct relations in representation learning. We argue that multiple-step
relation paths also contain rich inference patterns between entities, and
propose a path-based representation learning model. This model considers
relation paths as translations between entities for representation learning,
and addresses two key challenges: (1) Since not all relation paths are
reliable, we design a path-constraint resource allocation algorithm to measure
the reliability of relation paths. (2) We represent relation paths via semantic
composition of relation embeddings. Experimental results on real-world datasets
show that, as compared with baselines, our model achieves significant and
consistent improvements on knowledge base completion and relation extraction
from text.
Comment: 10 page
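The idea of measuring path reliability by resource allocation can be sketched in Python as follows. This is one plausible reading, under stated assumptions: one unit of resource starts at the head entity, is split evenly among the direct successors at every relation step, and the amount that reaches the tail is taken as the reliability of the path. The function name `path_reliability` and the toy triples are hypothetical, and the abstract does not give the exact formula used in the paper.

```python
from collections import defaultdict


def path_reliability(triples, head, path, tail):
    """Resource that flows from `head` to `tail` along the relation `path`.

    triples : iterable of (head, relation, tail) facts
    path    : list of relation names, followed in order
    """
    # Index direct successors by (entity, relation).
    succ = defaultdict(list)
    for h, r, t in triples:
        succ[(h, r)].append(t)

    resource = {head: 1.0}  # all resource starts at the head entity
    for rel in path:
        nxt = defaultdict(float)
        for ent, amount in resource.items():
            targets = succ.get((ent, rel), [])
            # Split this entity's resource evenly among its successors.
            for t in targets:
                nxt[t] += amount / len(targets)
        resource = dict(nxt)
    return resource.get(tail, 0.0)


# Hypothetical toy knowledge base.
kb = [
    ("a", "born_in", "x"),
    ("a", "born_in", "y"),
    ("x", "in_country", "c"),
    ("y", "in_country", "d"),
]
score = path_reliability(kb, "a", ["born_in", "in_country"], "c")
```

In this sketch, a path that fans out to many alternative tails delivers little resource to any single one, so it receives a low reliability score.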
Bayesian anomaly detection methods for social networks
Learning the network structure of a large graph is computationally demanding,
and dynamically monitoring the network over time for any changes in structure
threatens to be more challenging still. This paper presents a two-stage method
for anomaly detection in dynamic graphs: the first stage uses simple, conjugate
Bayesian models for discrete time counting processes to track the pairwise
links of all nodes in the graph to assess normality of behavior; the second
stage applies standard network inference tools on a greatly reduced subset of
potentially anomalous nodes. The utility of the method is demonstrated on
simulated and real data sets.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS329 by the
Institute of Mathematical Statistics (http://www.imstat.org
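The first stage described in this abstract can be illustrated with a small Python sketch. It assumes a Gamma-Poisson model for the number of contacts on one link per time step, which is only one example of a conjugate model for counts, since the abstract does not name the models used. The function names, the prior parameters, and the example history are hypothetical.

```python
import math


def predictive_pmf(k, shape, rate):
    """Negative binomial predictive probability of observing count k when the
    Poisson mean has a Gamma(shape, rate) posterior."""
    p = rate / (rate + 1.0)
    log_pmf = (
        math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
        + shape * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def upper_tail_p_value(history, new_count, prior_shape=1.0, prior_rate=1.0):
    """Predictive probability of a count at least as large as `new_count`,
    given past per-period counts on the same link."""
    # Conjugate update: Gamma(a, b) prior -> Gamma(a + sum(x), b + n) posterior.
    shape = prior_shape + sum(history)
    rate = prior_rate + len(history)
    below = sum(predictive_pmf(k, shape, rate) for k in range(new_count))
    return max(0.0, 1.0 - below)


# Hypothetical link with about one contact per period, then a burst of ten.
p_burst = upper_tail_p_value([1, 0, 2, 1, 1], 10)
```

Links whose predictive p-value falls below a chosen threshold would be flagged, and only the nodes involved would be passed to the more expensive second-stage network analysis.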