The detection and effect of social events on Wikipedia data-set for studying human preferences
Several studies have used the Wikipedia (WP) data-set to analyse worldwide human preferences by language. However, those studies could suffer from bias related to exceptional social circumstances. Any massive event that prompts exceptional editing of WP can be defined as a source of bias. In this article, we follow a procedure for detecting outliers. Our study is based on 12 languages and 13 different categories. Our methodology defines a parameter that is language-dependent instead of being externally fixed. We also study the presence of cyclic human behaviour to evaluate apparent outliers. After our analysis, we found that the outliers in our data-set do not significantly affect the analysis of preferences by category among different WP languages. While investigating the possibility of bias related to exceptional social circumstances is always a safe measure before doing any analysis on Big Data, we found that, in the case of the first ten years of the Wikipedia data-set, outliers do not significantly affect the use of the Wikipedia data-set as a digital footprint to analyse worldwide human preferences.
Bounded confidence models generate more secondary clusters when the number of agents is growing
We study the bounded confidence model on a growing population. We compare
simulations of the agent-based model and of its continuous-density version,
using either the standard influence function or a smoother one. We find that
the model on a growing population generates larger secondary clusters, and
does so more systematically, than when the population is fixed. Moreover, our
tests with the smooth influence function suggest that these secondary clusters
can be generated by a different mechanism when the population is growing than
when it is fixed. Comment: 16 pages, 8 figures
Measuring the effect of node aggregation on community detection
The nodes of a complex network are often aggregated, whether deliberately or
not, because of technical, ethical or legal limitations or for privacy reasons. A
common example is the geographic position: one may uncover communities in a
network of places, or of individuals identified with their typical geographical
position, and then aggregate these places into larger entities, such as
municipalities, thus obtaining another network. The communities found in the
networks obtained at various levels of aggregation may exhibit various degrees
of similarity, from full alignment to perfect independence. This is akin to the
problem of ecological and atomistic fallacies in statistics, or to the Modifiable
Areal Unit Problem in geography. We identify the class of community detection
algorithms most suitable to cope with node aggregation, and develop an index
of aggregability, capturing the extent to which the aggregation preserves the
community structure. We illustrate its relevance on real-world examples (mobile
phone and Twitter reply-to networks). Our main message is that any
node-partitioning analysis performed on aggregated networks should be
interpreted with caution, as the outcome may be strongly influenced by the
level of the aggregation. Comment: 12 pages, 5 figures
Continuous opinion model in small world directed networks
In the compromise model of continuous opinions proposed by Deffuant et al.,
the states of two agents in a network can start to converge if they are
neighbors and if their opinions are sufficiently close to each other, below a
given threshold of tolerance $\epsilon$. In directed networks, if agent i is a
neighbor of agent j, j need not be a neighbor of i. In Watts-Strogatz networks
we performed simulations to find the averaged number of final opinions
and their distribution as a function of $\epsilon$ and of the network
structural disorder. In directed networks, this averaged number exhibits a rich
structure, being larger than in undirected networks for higher values of
$\epsilon$, and smaller for lower values of $\epsilon$. Comment: 15 pages, 6 figures
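The compromise rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`directed_small_world`, `deffuant`, `count_clusters`), the convergence rate `mu = 0.5`, the choice to update both agents when agent i looks at an out-neighbor j, the rewiring scheme, and the gap used to separate final opinions are all assumptions made for illustration.

```python
import random


def directed_small_world(n, k, p, rng):
    """Directed ring: each node points to its k nearest nodes (k even, n > k).

    Each link's target is rewired to a random node with probability p
    (an assumed, simplified Watts-Strogatz-style construction).
    """
    neighbors = []
    for i in range(n):
        targets = set()
        for d in range(1, k // 2 + 1):
            for j in ((i + d) % n, (i - d) % n):
                if rng.random() < p:
                    # Rewire to a node that is neither i nor already a target.
                    candidates = [c for c in range(n)
                                  if c != i and c not in targets]
                    j = rng.choice(candidates)
                targets.add(j)
        neighbors.append(sorted(targets))
    return neighbors


def deffuant(neighbors, eps, mu=0.5, steps=200_000, seed=0):
    """Run the compromise dynamics and return the final opinions."""
    rng = random.Random(seed)
    n = len(neighbors)
    x = [rng.random() for _ in range(n)]  # initial opinions in [0, 1)
    for _ in range(steps):
        i = rng.randrange(n)
        j = rng.choice(neighbors[i])  # j is an out-neighbor of i
        diff = x[j] - x[i]
        if abs(diff) < eps:  # opinions closer than the tolerance threshold
            x[i] += mu * diff  # both agents move toward each other
            x[j] -= mu * diff
    return x


def count_clusters(opinions, gap=1e-2):
    """Count groups of final opinions separated by more than `gap`."""
    xs = sorted(opinions)
    return 1 + sum(1 for a, b in zip(xs, xs[1:]) if b - a > gap)


if __name__ == "__main__":
    rng = random.Random(1)
    net = directed_small_world(n=200, k=4, p=0.1, rng=rng)
    for eps in (0.1, 0.2, 0.3, 0.5):
        print(eps, count_clusters(deffuant(net, eps)))
```

Averaging `count_clusters` over many seeds and rewiring probabilities would give a rough analogue of the averaged number of final opinions as a function of $\epsilon$ and of the structural disorder.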
The Detection and effect of social events on Wikipedia data-set for studying human preferences
Several studies have used the Wikipedia (WP) data-set to analyse worldwide human
preferences by language. However, those studies could suffer from bias related
to exceptional social circumstances. Any massive event that prompts
exceptional editing of WP can be defined as a source of bias. In this article,
we follow a procedure for detecting outliers. Our study is based on 12
languages and 13 different categories. Our methodology defines a parameter
that is language-dependent instead of being externally fixed. We also study
the presence of cyclic human behaviour to evaluate apparent outliers. After our
analysis, we found that the outliers in our data set do not significantly
affect the use of the whole Wikipedia data set as a digital footprint to analyse
worldwide human preferences. Comment: 8 pages, 4 figures
A multilevel analysis to systemic exposure: insights from local and system-wide information
In the aftermath of the financial crisis, the growing literature on financial
networks has widely documented the predictive power of topological
characteristics (e.g. degree centrality measures) to explain the systemic
impact or systemic vulnerability of financial institutions. In this work, we
show that considering alternative topological measures based on the local
sub-network environment improves our ability to identify systemic institutions.
To provide empirical evidence, we apply a two-step procedure. First, we recover
network communities (i.e. close-peer environment) on a spillover network of
financial institutions. Second, we regress alternative measures of
vulnerability on three levels of topological measures: the global level (i.e.
firm topological characteristics computed over the whole system), the local level
(i.e. firm topological characteristics computed over the community) and the
aggregated level (i.e. individual characteristics averaged over the community).
The sample includes financial institutions (banks, broker-dealers,
insurance and real-estate companies) listed in the Standard & Poor's 500
index. Our results confirm the informational content of topological metrics
based on the close-peer environment. Such information differs from that
embedded in traditional system-wide topological metrics and proves to be a
predictor of distress for financial institutions in times of crisis. Comment: 12 pages, 3 figures and 3 tables
What Can Wikipedia Tell Us About the Global or Local Character of Burstiness?
In this communication we take advantage of the global coverage of the Wikipedia dataset to analyze how the usual coefficients used to measure burstiness depend on language. Analyzing separately the patterns of single editors over several pages, we show several characteristics of the super-editors in the WPs written in English, Spanish, French and Portuguese. We report for the first time the burstiness and memory-effect coefficients separately for the 4 WPs, showing similarities and differences between all users and the super-editors, the exponent of their averaged inter-event activity and, finally, some statistical traces of their averaged monthly activity.
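As a rough illustration of the coefficients mentioned in this abstract, the sketch below computes a burstiness coefficient and a memory coefficient from a list of edit timestamps. It assumes the widely used Goh-Barabási definitions, B = (σ − μ)/(σ + μ) over inter-event times and M as the Pearson correlation between consecutive inter-event times. The abstract does not state which definitions the authors use, and the function names are hypothetical.

```python
import statistics


def inter_event_times(timestamps):
    """Gaps between consecutive events, after sorting the timestamps."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:])]


def burstiness(taus):
    """B = (sigma - mu) / (sigma + mu); -1 is periodic, near 1 is very bursty."""
    mu = statistics.fmean(taus)
    sigma = statistics.pstdev(taus)
    return (sigma - mu) / (sigma + mu)


def memory(taus):
    """Pearson correlation between each gap and the gap that follows it."""
    a, b = taus[:-1], taus[1:]
    m1, m2 = statistics.fmean(a), statistics.fmean(b)
    s1, s2 = statistics.pstdev(a), statistics.pstdev(b)
    if s1 == 0 or s2 == 0:
        return float("nan")  # correlation is undefined for constant gaps
    cov = sum((u - m1) * (v - m2) for u, v in zip(a, b)) / len(a)
    return cov / (s1 * s2)


if __name__ == "__main__":
    # Hypothetical edit times (in hours) for one editor.
    edits = [0, 1, 2, 3, 50, 51, 52, 53, 120, 121]
    taus = inter_event_times(edits)
    print("B =", burstiness(taus), "M =", memory(taus))
```

Computing these two values per editor, and then separately for all users and for the most active editors in each language edition, would mirror the kind of comparison the abstract describes.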