3,033 research outputs found
Term-Specific Eigenvector-Centrality in Multi-Relation Networks
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, eigenvectors representing the distribution of a term after propagating term weights between related data items are computed. The result is an index which takes the document structure into account and can be used with standard document retrieval techniques. As the scheme takes the shape of an index transformation, all necessary calculations are performed during index time.
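The core mechanism this abstract builds on, PageRank-style eigenvector centrality, can be sketched as a power iteration over a column-normalized adjacency matrix. This is a generic illustration only; the paper's term-specific extension of the PageRank matrix and its propagation of term weights are not reproduced here.

```python
import numpy as np

def power_iteration_centrality(A, d=0.85, tol=1e-10, max_iter=1000):
    """PageRank-style eigenvector centrality via power iteration.

    A[i, j] is the weight of the edge j -> i. Columns are normalized so each
    node distributes its weight among its out-neighbors; d is the damping
    factor. A generic sketch, not the paper's term-specific construction.
    """
    n = A.shape[0]
    col_sums = A.sum(axis=0).astype(float)
    col_sums[col_sums == 0] = 1.0        # avoid division by zero for sink nodes
    M = A / col_sums                     # column-stochastic transition matrix
    v = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        v_next = d * (M @ v) + (1 - d) / n
        if np.abs(v_next - v).sum() < tol:
            break
        v = v_next
    return v_next

# A small 4-node directed graph.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 0]], dtype=float)
scores = power_iteration_centrality(A)
print(scores)
```

The damping term guarantees the iteration converges to the unique stationary distribution even when the raw graph is not strongly connected.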
Interbank markets and multiplex networks: centrality measures and statistical null models
The interbank market is considered one of the most important channels of
contagion. Its network representation, where banks and claims/obligations are
represented by nodes and links (respectively), has received a lot of attention
in the recent theoretical and empirical literature, for assessing systemic risk
and identifying systematically important financial institutions. Different
types of links, for example in terms of maturity and collateralization of the
claim/obligation, can be established between financial institutions. Therefore
a natural representation of the interbank structure which takes into account
more features of the market, is a multiplex, where each layer is associated
with a type of link. In this paper we review the empirical structure of the
multiplex and the theoretical consequences of this representation. We also
investigate the betweenness and eigenvector centrality of a bank in the
network, comparing its centrality properties across different layers and with
Maximum Entropy null models.
Comment: To appear in the book "Interconnected Networks", A. Garas and F. Schweitzer (eds.), Springer Complexity Series.
Mathematical Formulation of Multi-Layer Networks
A network representation is useful for describing the structure of a large
variety of complex systems. However, most real and engineered systems have
multiple subsystems and layers of connectivity, and the data produced by such
systems is very rich. Achieving a deep understanding of such systems
necessitates generalizing "traditional" network theory, and the newfound deluge
of data now makes it possible to test increasingly general frameworks for the
study of networks. In particular, although adjacency matrices are useful to
describe traditional single-layer networks, such a representation is
insufficient for the analysis and description of multiplex and time-dependent
networks. One must therefore develop a more general mathematical framework to
cope with the challenges posed by multi-layer complex systems. In this paper,
we introduce a tensorial framework to study multi-layer networks, and we
discuss the generalization of several important network descriptors and
dynamical processes --including degree centrality, clustering coefficients,
eigenvector centrality, modularity, Von Neumann entropy, and diffusion-- for
this framework. We examine the impact of different choices in constructing
these generalizations, and we illustrate how to obtain known results for the
special cases of single-layer and multiplex networks. Our tensorial approach
will be helpful for tackling pressing problems in multi-layer complex systems,
such as inferring who is influencing whom (and by which media) in multichannel
social networks and developing routing techniques for multimodal transportation
systems.
Comment: 15 pages, 5 figures.
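One common way to operationalize such a tensorial representation in code is to flatten the multilayer adjacency tensor into a supra-adjacency matrix and compute eigenvector centrality on it. The sketch below assumes a node-aligned multiplex with uniform inter-layer coupling `omega`; this is one of several coupling choices, not the paper's only construction.

```python
import numpy as np

def supra_adjacency(layers, omega=1.0):
    """Flatten a node-aligned multiplex into a supra-adjacency matrix.

    layers: list of L symmetric (n x n) adjacency matrices over the same nodes.
    omega: weight coupling each node to its replicas in the other layers.
    """
    L, n = len(layers), layers[0].shape[0]
    S = np.zeros((n * L, n * L))
    for a in range(L):
        S[a*n:(a+1)*n, a*n:(a+1)*n] = layers[a]        # intra-layer edges
        for b in range(L):
            if a != b:
                S[a*n:(a+1)*n, b*n:(b+1)*n] = omega * np.eye(n)
    return S

def multiplex_eigenvector_centrality(layers, omega=1.0):
    """Eigenvector centrality of the supra-adjacency matrix, summed per node."""
    S = supra_adjacency(layers, omega)
    vals, vecs = np.linalg.eigh(S)                     # S is symmetric
    v = np.abs(vecs[:, np.argmax(vals)])               # leading eigenvector
    return v.reshape(len(layers), -1).sum(axis=0)      # aggregate over replicas

# A 3-node multiplex with two layers of different link types.
layer1 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
layer2 = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
mux_scores = multiplex_eigenvector_centrality([layer1, layer2])
print(mux_scores)
```

Summing over layer replicas is one aggregation choice; keeping per-layer scores instead exposes how a node's centrality differs across layers.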
Finding community structure in networks using the eigenvectors of matrices
We consider the problem of detecting communities or modules in networks,
groups of vertices with a higher-than-average density of edges connecting them.
Previous work indicates that a robust approach to this problem is the
maximization of the benefit function known as "modularity" over possible
divisions of a network. Here we show that this maximization process can be
written in terms of the eigenspectrum of a matrix we call the modularity
matrix, which plays a role in community detection similar to that played by the
graph Laplacian in graph partitioning calculations. This result leads us to a
number of possible algorithms for detecting community structure, as well as
several other results, including a spectral measure of bipartite structure in
networks and a new centrality measure that identifies those vertices that
occupy central positions within the communities to which they belong. The
algorithms and measures proposed are illustrated with applications to a variety
of real-world complex networks.
Comment: 22 pages, 8 figures, minor corrections in this version.
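The spectral approach described above can be sketched directly: build the modularity matrix B = A - kk^T/2m and assign nodes by the sign of its leading eigenvector. This is the minimal two-way split only; the full method recurses on the resulting groups.

```python
import numpy as np

def leading_eigenvector_split(A):
    """Two-way community split via the modularity matrix B = A - k k^T / 2m.

    Nodes are assigned to communities by the sign of the leading eigenvector
    of B, following the spectral idea described in the abstract.
    """
    k = A.sum(axis=1)                      # degree vector
    m = k.sum() / 2.0                      # number of edges
    B = A - np.outer(k, k) / (2 * m)       # modularity matrix
    vals, vecs = np.linalg.eigh(B)
    leading = vecs[:, np.argmax(vals)]
    return np.where(leading >= 0, 0, 1)    # community labels by sign

# Two triangles (nodes 0-2 and 3-5) joined by a single bridge edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
labels = leading_eigenvector_split(A)
print(labels)  # the two triangles end up in different communities
```

For this toy graph the leading eigenvalue is positive, so a split improves modularity; when the leading eigenvalue is non-positive the method leaves the network undivided.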
Opinion-Based Centrality in Multiplex Networks: A Convex Optimization Approach
Most people simultaneously belong to several distinct social networks, in
which their relations can be different. They have opinions about certain
topics, which they share and spread on these networks, and are influenced by
the opinions of other persons. In this paper, we build upon this observation to
propose a new nodal centrality measure for multiplex networks. Our measure,
called Opinion centrality, is based on a stochastic model representing opinion
propagation dynamics in such a network. We formulate an optimization problem
consisting in maximizing the opinion of the whole network when controlling an
external influence able to affect each node individually. We find a
mathematical closed form of this problem, and use its solution to derive our
centrality measure. According to the opinion centrality, the more a node is
worth investing external influence in, the more central it is. We perform an
empirical study of the proposed centrality over a toy network, as well as a
collection of real-world networks. Our measure is generally negatively
correlated with existing multiplex centrality measures and, in accordance with
its definition, highlights different types of nodes.
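The kind of controlled opinion dynamics this abstract describes can be sketched with a generic DeGroot-style model on a single network. The model, the influence matrix `W`, and the resolvent-based score below are illustrative assumptions; the paper's stochastic multiplex model and its exact closed form are not reproduced here.

```python
import numpy as np

def opinion_sensitivity_scores(W, alpha=0.2):
    """Marginal effect of external influence on total steady-state opinion.

    Illustrative DeGroot-style model with control:
        x_{t+1} = (1 - alpha) * W @ x_t + alpha * u,
    with W row-stochastic. The steady state is
        x* = alpha * inv(I - (1 - alpha) * W) @ u,
    so d(sum x*)/du_j is the j-th column sum of the resolvent, a Katz-like
    score: nodes worth investing external influence in score higher.
    """
    n = W.shape[0]
    R = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * W)
    return R.sum(axis=0)

# Row-stochastic influence matrix for a 3-node network.
W = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
sens = opinion_sensitivity_scores(W)
print(sens)
```

Because the spectral radius of (1 - alpha) W is strictly below one, the resolvent exists and the dynamics converge for any control signal u.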
Predicting Scientific Success Based on Coauthorship Networks
We address the question to what extent the success of scientific articles is
due to social influence. Analyzing a data set of over 100000 publications from
the field of Computer Science, we study how centrality in the coauthorship
network differs between authors who have highly cited papers and those who do
not. We further show that a machine learning classifier, based only on
coauthorship network centrality measures at time of publication, is able to
predict with high precision whether an article will be highly cited five years
after publication. By this we provide quantitative insight into the social
dimension of scientific publishing - challenging the perception of citations as
an objective, socially unbiased measure of scientific success.
Comment: 21 pages, 2 figures, incl. Supplementary Material.
Centrality measures for graphons: Accounting for uncertainty in networks
As relational datasets modeled as graphs keep increasing in size and their
data-acquisition is permeated by uncertainty, graph-based analysis techniques
can become computationally and conceptually challenging. In particular, node
centrality measures rely on the assumption that the graph is perfectly known --
a premise not necessarily fulfilled for large, uncertain networks. Accordingly,
centrality measures may fail to faithfully extract the importance of nodes in
the presence of uncertainty. To mitigate these problems, we suggest a
statistical approach based on graphon theory: we introduce formal definitions
of centrality measures for graphons and establish their connections to
classical graph centrality measures. A key advantage of this approach is that
centrality measures defined at the modeling level of graphons are inherently
robust to stochastic variations of specific graph realizations. Using the
theory of linear integral operators, we define degree, eigenvector, Katz and
PageRank centrality functions for graphons and establish concentration
inequalities demonstrating that graphon centrality functions arise naturally as
limits of their counterparts defined on sequences of graphs of increasing size.
The same concentration inequalities also provide high-probability bounds
between the graphon centrality functions and the centrality measures on any
sampled graph, thereby establishing a measure of uncertainty of the measured
centrality score.
Comment: Authors ordered alphabetically, all authors contributed equally. 21
pages, 7 figures.
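The concentration phenomenon described above can be illustrated numerically for degree centrality. For the illustrative graphon W(x, y) = xy (a choice made here for the sketch, not taken from the paper), the degree centrality function is d(x) = ∫₀¹ W(x, y) dy = x/2, and the normalized degrees of sampled graphs concentrate around it as the graph size grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def W(x, y):
    # Illustrative graphon (not from the paper): W(x, y) = x * y.
    return x * y

def graphon_degree(x):
    # Degree centrality function: d(x) = integral_0^1 W(x, y) dy = x / 2.
    return x / 2.0

def sample_graph(n):
    """Sample an n-node graph from the graphon W."""
    x = rng.uniform(size=n)                  # latent node positions
    P = W(x[:, None], x[None, :])            # edge probabilities
    A = (rng.uniform(size=(n, n)) < P).astype(float)
    A = np.triu(A, 1)
    return x, A + A.T                        # symmetric, no self-loops

# Normalized degrees of a sampled graph concentrate around d(x) as n grows.
n = 2000
x, A = sample_graph(n)
err = np.max(np.abs(A.sum(axis=1) / n - graphon_degree(x)))
print(err)
```

Rerunning with larger n shrinks the worst-case deviation, which is the qualitative content of the concentration inequalities the abstract establishes.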
LexRank: Graph-based Lexical Centrality as Salience in Text Summarization
We introduce a stochastic graph-based method for computing relative
importance of textual units for Natural Language Processing. We test the
technique on the problem of Text Summarization (TS). Extractive TS relies on
the concept of sentence salience to identify the most important sentences in a
document or set of documents. Salience is typically defined in terms of the
presence of particular important words or in terms of similarity to a centroid
pseudo-sentence. We consider a new approach, LexRank, for computing sentence
importance based on the concept of eigenvector centrality in a graph
representation of sentences. In this model, a connectivity matrix based on
intra-sentence cosine similarity is used as the adjacency matrix of the graph
representation of sentences. Our system, based on LexRank, ranked first in
more than one task in the DUC 2004 evaluation. In this paper we
present a detailed analysis of our approach and apply it to a larger data set
including data from earlier DUC evaluations. We discuss several methods to
compute centrality using the similarity graph. The results show that
degree-based methods (including LexRank) outperform both centroid-based methods
and other systems participating in DUC in most of the cases. Furthermore, the
LexRank with threshold method outperforms the other degree-based techniques
including continuous LexRank. We also show that our approach is quite
insensitive to the noise in the data that may result from an imperfect topical
clustering of documents.
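A minimal LexRank sketch following the abstract: build a cosine-similarity graph over sentence vectors, threshold it, and run PageRank on the result. The toy bag-of-words vectors and the threshold value are illustrative; the paper uses tf-idf weighting.

```python
import numpy as np

def lexrank(sentence_vectors, threshold=0.1, d=0.85, tol=1e-10):
    """LexRank sketch: PageRank over a thresholded cosine-similarity graph."""
    norms = np.linalg.norm(sentence_vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    V = sentence_vectors / norms
    sim = V @ V.T                           # cosine similarity matrix
    adj = (sim > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    M = adj / row_sums                      # row-stochastic transition matrix
    n = adj.shape[0]
    p = np.full(n, 1.0 / n)
    for _ in range(1000):
        p_next = d * (M.T @ p) + (1 - d) / n
        if np.abs(p_next - p).sum() < tol:
            break
        p = p_next
    return p_next

# Toy "sentences" as bag-of-words count vectors over a 5-word vocabulary.
S = np.array([[2, 1, 0, 0, 0],
              [1, 2, 1, 0, 0],
              [0, 1, 2, 1, 0],
              [0, 0, 0, 1, 2]], dtype=float)
lex_scores = lexrank(S)
print(lex_scores)
```

Thresholding before the random walk is what distinguishes this variant from continuous LexRank, which keeps the weighted similarity graph instead.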