Searching for superspreaders of information in real-world social media
A number of predictors have been suggested to detect the most influential
spreaders of information in online social media across various domains such as
Twitter or Facebook. In particular, degree, PageRank, k-core and other
centralities have been adopted to rank the spreading capability of users in
information dissemination media. So far, validation of the proposed predictors
has been done by simulating the spreading dynamics rather than following real
information flow in social networks. Consequently, only model-dependent,
contradictory results have been obtained for the best predictor. Here,
we address this issue directly. We search for influential spreaders by
following the real spreading dynamics in a wide range of networks. We find that
the widely-used degree and PageRank fail in ranking users' influence. We find
that the best spreaders are consistently located in the k-core across
dissimilar social platforms such as Twitter, Facebook, Livejournal and
scientific publishing in the American Physical Society. Furthermore, when the
complete global network structure is unavailable, we find that the sum of the
nearest neighbors' degree is a reliable local proxy for a user's influence. Our
analysis provides practical guidance for the optimal design of strategies for
"viral" information dissemination in relevant applications.
Comment: 12 pages, 7 figures
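The k-shell peeling that locates these best spreaders, together with the neighbour-degree-sum proxy for when the global structure is unavailable, can be sketched in a few lines (a pure-Python illustration; the graph and node names are invented for the example):

```python
def core_numbers(adj):
    """k-shell decomposition by iterative peeling: repeatedly remove a
    minimum-degree node; its core number is the running peeling threshold."""
    deg = {v: len(ns) for v, ns in adj.items()}
    alive = set(adj)
    core, k = {}, 0
    while alive:
        v = min(alive, key=lambda u: deg[u])  # O(n^2) sketch; bucket queues give O(m)
        k = max(k, deg[v])
        core[v] = k
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
    return core

def neighbor_degree_sum(adj, v):
    """Local proxy for spreading influence when only the neighbourhood is
    visible: the sum of the degrees of v's nearest neighbours."""
    return sum(len(adj[u]) for u in adj[v])

# A 4-clique (core number 3) with one pendant node "e" (core number 1).
adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a"},
}
```

The clique members share the innermost shell even though the pendant's neighbour "a" has the highest degree, which is the kind of degree-versus-core disagreement the abstract describes.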
Ultimate periodicity of b-recognisable sets: a quasilinear procedure
It is decidable if a set of numbers, whose representation in a base b is a
regular language, is ultimately periodic. This was established by Honkala in
1986.
We give here a structural description of the minimal automata that accept an
ultimately periodic set of numbers. We then show that it can be verified in
linear time whether a given minimal automaton meets this description.
Since a deterministic automaton with n states can be minimised in O(n log n)
time, this yields an O(n log n) procedure for deciding whether a general
deterministic automaton accepts an ultimately periodic set of numbers.
Comment: presented at DLT 201
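The notion being decided can be illustrated directly from its definition. The sketch below brute-forces a threshold/period pair over a finite membership window; it is a didactic check of the definition, not the paper's quasilinear automaton-based procedure:

```python
def is_ultimately_periodic(bits, max_threshold, max_period):
    """Brute-force check of the definition on a finite window: a set S is
    ultimately periodic if there exist n0 and p > 0 such that for all
    n >= n0, n is in S iff n + p is in S. `bits[n]` is the membership
    indicator of n. The check only sees the window, so a positive answer
    is heuristic; returns (n0, p) if a witness is found, else None."""
    N = len(bits)
    for n0 in range(max_threshold + 1):
        for p in range(1, max_period + 1):
            if all(bits[n] == bits[n + p] for n in range(n0, N - p)):
                return (n0, p)
    return None
```

For the multiples of 3 this finds threshold 0 and period 3; for a set with an irregular head, such as {1} together with the even numbers from 4 on, it finds a positive threshold.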
Improving the presentation of search results by multipartite graph clustering of multiple reformulated queries and a novel document representation
The goal of clustering web search results is to reveal the semantics of the retrieved documents. The main challenge is to make the clustering partition relevant to a user's query. In this paper, we describe a method of clustering search results using a similarity measure between documents retrieved by multiple reformulated queries. The method produces clusters of documents that are most relevant to the original query and, at the same time, represent a more diverse set of semantically related queries. In order to cluster thousands of documents in real time, we designed a novel multipartite graph clustering algorithm that has low polynomial complexity and no manually adjusted hyperparameters. The loss of semantics resulting from stem-based document representation is a common problem in information retrieval. To address this problem, we propose an alternative novel document representation, under which words are represented by their synonymy groups.
This work was supported by Yandex grant 110104.
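The abstract does not spell out the similarity measure, so the sketch below uses one plausible stand-in: documents are scored by the Jaccard overlap of the reformulated queries that retrieved them (query and document identifiers are invented for the example, and the paper's actual measure and clustering algorithm may differ):

```python
from collections import defaultdict

def co_retrieval_similarity(query_results):
    """Given {query: set_of_retrieved_doc_ids}, score each document pair by
    the Jaccard overlap of the queries that retrieved them. Documents pulled
    in by the same reformulations end up close, which is the raw material a
    multipartite clustering step could then partition."""
    docs_to_queries = defaultdict(set)
    for q, docs in query_results.items():
        for d in docs:
            docs_to_queries[d].add(q)
    sims = {}
    ids = sorted(docs_to_queries)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            qa, qb = docs_to_queries[a], docs_to_queries[b]
            sims[(a, b)] = len(qa & qb) / len(qa | qb)
    return sims
```

Documents retrieved by exactly the same reformulations get similarity 1.0; documents sharing only some queries get a proportionally lower score.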
Two-dimensional ranking of Wikipedia articles
The Library of Babel, described by Jorge Luis Borges, stores an enormous
amount of information. The Library exists ab aeterno. Wikipedia, a free
online encyclopaedia, has become a modern analogue of such a Library. Information
retrieval and ranking of Wikipedia articles have become a challenge for modern
society. While PageRank highlights very well known nodes with many ingoing
links, CheiRank highlights very communicative nodes with many outgoing links.
In this way the ranking becomes two-dimensional. Using CheiRank and PageRank we
analyze the properties of two-dimensional ranking of all Wikipedia English
articles and show that it gives their reliable classification with rich and
nontrivial features. Detailed studies are done for countries, universities,
personalities, physicists, chess players, Dow-Jones companies and other
categories.
Comment: RevTeX, 9 pages; data and discussion added; more data at
http://www.quantware.ups-tlse.fr/QWLIB/2drankwikipedia
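Two-dimensional ranking can be sketched by running PageRank twice, once on the original link graph and once on the graph with every link inverted (the second run is CheiRank); the toy graph below is illustrative:

```python
def pagerank(links, damping=0.85, iters=100):
    """Power iteration for PageRank on {node: list_of_out_links}.
    Dangling nodes spread their mass uniformly over all nodes."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = links[v]
            if out:
                share = damping * rank[v] / len(out)
                for u in out:
                    new[u] += share
            else:  # dangling node
                for u in nodes:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank

def cheirank(links, **kw):
    """CheiRank is PageRank computed on the graph with all links inverted,
    so it highlights communicative nodes with many outgoing links."""
    inv = {v: [] for v in links}
    for v, outs in links.items():
        for u in outs:
            inv[u].append(v)
    return pagerank(inv, **kw)
```

On a toy graph where "a" links out to everything and "d" only receives links, PageRank tops out at "d" while CheiRank tops out at "a": each article gets a coordinate on both axes, which is the two-dimensional classification the abstract describes.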
Worldwide spreading of economic crisis
We model the spreading of a crisis by constructing a global economic network
and applying the Susceptible-Infected-Recovered (SIR) epidemic model with a
variable probability of infection. The probability of infection depends on the
strength of economic relations between the pair of countries, and the strength
of the target country. It is expected that a crisis which originates in a large
country, such as the USA, has the potential to spread globally, like the recent
crisis. Surprisingly, we show that countries with much lower GDP, such as
Belgium, can also initiate a global crisis. Using the k-shell
decomposition method to quantify the spreading power of a node, we obtain a
measure of "centrality" as a spreader for each country in the economic
network. We thus rank the different countries according to the shell they
belong to, and find the 12 most central countries. These countries are the most
likely to spread a crisis globally. Of these 12 only six are large economies,
while the other six are medium/small ones, a result that could not have been
otherwise anticipated. Furthermore, we use our model to predict the crisis
spreading potential of countries belonging to different shells according to the
crisis magnitude.
Comment: 13 pages, 4 figures and Supplementary Material
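A minimal sketch of such an SIR spread on a weighted economic network is given below; the infection probability p = 1 - exp(-beta * w_ij / s_j), growing with the link weight and shrinking with the target country's strength, is an assumed functional form, not the paper's calibrated one, and the country graph is invented:

```python
import math
import random

def sir_on_weighted_network(weights, strength, seed, beta=0.5, rng=None):
    """SIR with a unit infectious period: each step, every infected country i
    tries to infect each susceptible trade partner j with probability
    p = 1 - exp(-beta * w_ij / s_j), then recovers. `weights` maps each
    country to {partner: trade_weight}; `strength` gives each country's
    economic strength. Returns the set of countries the crisis reached."""
    rng = rng or random.Random(0)
    infected = {seed}
    recovered = set()
    while infected:
        new_infected = set()
        for i in infected:
            for j, w in weights[i].items():
                if j in infected or j in recovered or j in new_infected:
                    continue
                p = 1.0 - math.exp(-beta * w / strength[j])
                if rng.random() < p:
                    new_infected.add(j)
        recovered |= infected
        infected = new_infected
    return recovered
```

With a very large beta the crisis deterministically reaches every connected country, and with beta = 0 it never leaves the seed, which brackets the stochastic regime in between.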
Origins of power-law degree distribution in the heterogeneity of human activity in social networks
The probability distribution of the number of ties of an individual in a social
network follows a scale-free power law. However, how this distribution arises
has not been conclusively demonstrated in direct analyses of people's actions
in social networks. Here, we perform a causal inference analysis and find an
underlying cause for this phenomenon. Our analysis indicates that the heavy-tailed
degree distribution is causally determined by a similarly skewed distribution of
human activity. Specifically, the degree of an individual is entirely random -
following a "maximum entropy attachment" model - except for its mean value
which depends deterministically on the volume of the users' activity. This
relation cannot be explained by interactive models, like preferential
attachment, since the observed actions are not likely to be caused by
interactions with other people.
Comment: 23 pages, 5 figures
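The "degree is random except for its mean" picture can be sketched as follows; the geometric law (a maximum-entropy distribution on the non-negative integers for a given mean) and the linear mean-activity relation are illustrative assumptions, not necessarily the paper's exact attachment model:

```python
import random

def sample_degrees(activities, c=1.0, rng=None):
    """Draw each user's degree from a geometric distribution on {0, 1, 2, ...}
    whose mean is c * activity: the degree is otherwise entirely random, but
    its mean is fixed deterministically by the user's activity volume. If the
    activity distribution is heavy-tailed, the degrees inherit the heavy tail."""
    rng = rng or random.Random(42)
    degrees = []
    for a in activities:
        mean = c * a
        p = 1.0 / (1.0 + mean)   # success prob giving mean failures (1-p)/p = mean
        k = 0
        while rng.random() > p:  # count failures before the first success
            k += 1
        degrees.append(k)
    return degrees
```

Feeding in a skewed activity list (e.g. Pareto-distributed values) produces a correspondingly skewed degree sample without any interaction between users, which is the point of contrast with preferential attachment.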
An output-sensitive algorithm for the minimization of 2-dimensional String Covers
String covers are a powerful tool for analyzing the quasi-periodicity of
1-dimensional data and find applications in automata theory, computational
biology, coding, and the analysis of transactional data. A cover of a
string S is a string C for which every letter of S lies within some
occurrence of C. String covers have been generalized in many ways, leading to
k-covers, λ-covers and approximate covers, and have been
studied in different contexts such as indeterminate strings.
In this paper we generalize string covers to the context of 2-dimensional
data, such as images. We show how they can be used for the extraction of
textures from images and identification of primitive cells in lattice data.
This has interesting applications in image compression, procedural terrain
generation and crystallography.
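The 1-dimensional cover relation underlying these generalizations can be checked naively (the 2-dimensional case replaces substrings with sub-arrays; this sketch handles only the 1-D definition, with quadratic rather than output-sensitive complexity):

```python
def covers(c, s):
    """True iff c is a cover of s: every position of s lies inside some
    occurrence of c in s. Naive O(|s| * |c|) check for illustration."""
    m, n = len(c), len(s)
    if m == 0 or m > n:
        return False
    covered = [False] * n
    for i in range(n - m + 1):
        if s[i:i + m] == c:          # occurrence of c starting at i
            for j in range(i, i + m):
                covered[j] = True
    return all(covered)
```

For example, "aba" covers "ababa" (its two overlapping occurrences reach every position), while "ab" does not (the final "a" is uncovered).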
Studies of the limit order book around large price changes
We study the dynamics of the limit order book of liquid stocks after
experiencing large intra-day price changes. In the data we find large
variations in several microscopic measures, e.g., the volatility, the bid-ask
spread, the bid-ask imbalance, the number of queuing limit orders, and the
activity (number and volume) of limit orders placed and canceled. The
relaxation of these quantities is generally very slow and can be described by
a power law. We introduce a numerical model in order to understand
the empirical results better. We find that with a zero intelligence deposition
model of the order flow the empirical results can be reproduced qualitatively.
This suggests that the slow relaxations might not be the result of agents'
strategic behaviour. Studying the difference between the exponents found
empirically and numerically helps us to better identify the role of strategic
behaviour in the phenomenon.
Comment: 19 pages, 7 figures
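A stripped-down zero-intelligence deposition model, with random limit-order placement and independent cancellation, can be sketched as below; the price grid, the rates, and the fixed reference price are invented simplifications of the paper's model:

```python
import random

def zero_intelligence_spread(steps=2000, band=50, cancel_p=0.05, rng=None):
    """Minimal zero-intelligence order book: each step one buy limit order
    arrives at a uniform random price below a fixed reference price and one
    sell order above it, and every resting order is cancelled independently
    with probability cancel_p. No strategic behaviour enters anywhere.
    Returns the time series of the bid-ask spread."""
    rng = rng or random.Random(1)
    bids, asks = set(), set()        # price levels holding resting orders
    ref = band                       # reference mid price on the grid [0, 2*band]
    spreads = []
    for _ in range(steps):
        bids.add(rng.randrange(0, ref))                # deposition: one buy...
        asks.add(rng.randrange(ref + 1, 2 * band + 1)) # ...and one sell
        bids = {p for p in bids if rng.random() > cancel_p}
        asks = {p for p in asks if rng.random() > cancel_p}
        if bids and asks:
            spreads.append(min(asks) - max(bids))
    return spreads
```

Running this after artificially clearing one side of the book gives a purely mechanical relaxation of the spread, the kind of non-strategic baseline against which the empirical exponents can be compared.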
A complementary view on the growth of directory trees
Trees are a special sub-class of networks with unique properties, such as the
level distribution, which has often been overlooked. We analyse a general tree
growth model proposed by Klemm et al. (2005) to explain the growth of
user-generated directory structures in computers. The model has a single
parameter which interpolates between preferential attachment and random
growth. Our analysis results in three contributions: First, we propose a more
efficient estimation method for this parameter based on the degree distribution, which is
one specific representation of the model. Next, we introduce the concept of a
level distribution and analytically solve the model for this representation.
This allows for an alternative and independent measure of the parameter. We argue that,
to capture real growth processes, the estimations from the degree and the
level distributions should coincide. Thus, we finally apply both
representations to validate the model with synthetically generated tree
structures, as well as with collected data of user directories. In the case of
real directory structures, we show that the parameter values measured from the
level distribution are incompatible with those measured from the degree
distribution.
In contrast to this, we find perfect agreement in the case of simulated data.
Thus, we conclude that the model is an incomplete description of the growth of
real directory structures as it fails to reproduce the level distribution. This
insight can be generalised to point out the importance of the level
distribution for modeling tree growth.
Comment: 16 pages, 7 figures
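A generic one-parameter tree growth of this kind can be sketched as follows; choosing the parent preferentially by degree with probability q and uniformly at random otherwise is an illustrative interpolation, not necessarily Klemm et al.'s exact attachment rule:

```python
import random

def grow_tree(n, q, rng=None):
    """Grow a rooted tree of n nodes. Each new node picks its parent
    preferentially (weight = current degree + 1) with probability q, and
    uniformly over all existing nodes otherwise. Returns (parent, level)
    dicts, where level[v] is the depth of v below the root: the degree
    distribution and the level distribution give two independent views
    of the same growth process."""
    rng = rng or random.Random(7)
    parent = {0: None}
    level = {0: 0}
    pool = [0]                       # node v appears (degree + 1) times
    for v in range(1, n):
        if rng.random() < q:
            p = rng.choice(pool)     # preferential attachment
        else:
            p = rng.randrange(v)     # uniform random attachment
        parent[v] = p
        level[v] = level[p] + 1
        pool.extend([p, v])          # p gained a child; v enters with weight 1
    return parent, level
```

Fitting q separately from the degree histogram and from the level histogram of such simulated trees should agree, which is exactly the consistency check the paper applies to real directory data.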