Continuous Average Straightness in Spatial Graphs
The Straightness is a measure designed to characterize a pair of vertices in
a spatial graph. It is defined as the ratio of the Euclidean distance to the
graph distance between these vertices. It is often used as an average, for
instance to describe the accessibility of a single vertex relative to all the
other vertices in the graph, or even to summarize the graph as a whole. In some
cases, one needs to process the Straightness between not only vertices, but
also any other points constituting the graph of interest. Suppose for instance
that our graph represents a road network and we do not want to limit ourselves
to crossroad-to-crossroad itineraries, but allow any street number to be a
starting point or destination. In this situation, the standard approach
consists in: 1) discretizing the graph edges, 2) processing the
vertex-to-vertex Straightness considering the additional vertices resulting
from this discretization, and 3) performing the appropriate average on the
obtained values. However, this discrete approximation can be computationally
expensive on large graphs, and its precision has not been clearly assessed. In
this article, we adopt a continuous approach to average the Straightness over
the edges of spatial graphs. This allows us to derive 5 distinct measures able
to characterize precisely the accessibility of the whole graph, as well as
individual vertices and edges. Our method is generic and could be applied to
other measures designed for spatial graphs. We perform an experimental
evaluation of our continuous average Straightness measures, and show how they
behave differently from the traditional vertex-to-vertex ones. Moreover, we
also study their discrete approximations, and show that our approach is
globally less demanding in terms of both processing time and memory usage. Our
R source code is publicly available under an open source license.
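As an illustration of the definition given in this abstract, the following minimal Python sketch computes the vertex-to-vertex Straightness (Euclidean distance divided by graph distance) and its average for one vertex. It covers only the traditional discrete measure, not the continuous averages derived in the article. The function names, the data layout, the use of Euclidean edge lengths as weights, and the convention that unreachable pairs score 0 are all assumptions made for this example; they are not taken from the authors' R code.

```python
import heapq
import math


def build_adjacency(coords, edges):
    """Build a weighted adjacency dict; each edge weight is its Euclidean length."""
    adj = {v: [] for v in coords}
    for u, v in edges:
        w = math.dist(coords[u], coords[v])
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def graph_distances(adj, source):
    """Shortest-path lengths from source to every reachable vertex (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def straightness(coords, adj, u, v):
    """Euclidean distance over graph distance for two distinct vertices.

    Assumption of this sketch: unreachable pairs get a Straightness of 0.
    """
    if u == v:
        raise ValueError("Straightness is undefined for a vertex and itself")
    d_graph = graph_distances(adj, u).get(v, math.inf)
    return math.dist(coords[u], coords[v]) / d_graph


def average_straightness(coords, adj, u):
    """Mean Straightness between u and every other vertex of the graph."""
    dist = graph_distances(adj, u)
    others = [v for v in coords if v != u]
    total = sum(
        math.dist(coords[u], coords[v]) / dist.get(v, math.inf) for v in others
    )
    return total / len(others)


# Toy example: an L-shaped path a - b - c.
coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0)}
adj = build_adjacency(coords, [("a", "b"), ("b", "c")])
print(straightness(coords, adj, "a", "c"))
print(average_straightness(coords, adj, "a"))
```

In the toy graph, going from a to c requires a detour through b, so the Straightness of that pair is below 1, while directly connected pairs score exactly 1.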
Accuracy Measures for the Comparison of Classifiers
The selection of the best classification algorithm for a given dataset is a
very widespread problem. It is also a complex one, in the sense that it requires
making several important methodological choices. Among them, in this work we
focus on the measure used to assess the classification performance and rank the
algorithms. We present the most popular measures and discuss their properties.
Despite the numerous measures proposed over the years, many of them turn out to
be equivalent in this specific case, to have interpretation problems, or to be
unsuitable for our purpose. Consequently, classic overall success rate or
marginal rates should be preferred for this specific task.
Comment: The 5th International Conference on Information Technology, Amman,
Jordan (2011)
Evaluation of Performance Measures for Classifiers Comparison
The selection of the best classification algorithm for a given dataset is a
very widespread problem, occurring each time one has to choose a classifier to
solve a real-world problem. It is also a complex task with many important
methodological decisions to make. Among those, one of the most crucial is the
choice of an appropriate measure in order to properly assess the classification
performance and rank the algorithms. In this article, we focus on this specific
task. We present the most popular measures and compare their behavior through
discrimination plots. We then discuss their properties from a more theoretical
perspective. It turns out that several of them are equivalent for classifier
comparison purposes. Furthermore, they can also lead to interpretation problems.
Among the numerous measures proposed over the years, it appears that the
classical overall success rate and marginal rates are the most suitable for the
classifier comparison task.
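To make the recommended measures concrete, here is a minimal Python sketch that computes the overall success rate and per-class rates from a confusion matrix. Reading "marginal rates" as the per-class recall (correct predictions over the row total) and precision (correct predictions over the column total) is an assumption of this example, as are the function names and the convention that rows hold true classes and columns hold predicted classes.

```python
def overall_success_rate(cm):
    """Proportion of correctly classified instances (diagonal over total)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def marginal_rates(cm):
    """Per-class (recall, precision) pairs.

    Assumption of this sketch: cm[i][j] counts instances of true class i
    predicted as class j. A rate with an empty denominator is returned as None.
    """
    k = len(cm)
    rates = []
    for i in range(k):
        row_total = sum(cm[i])                       # instances truly in class i
        col_total = sum(cm[j][i] for j in range(k))  # instances predicted as class i
        recall = cm[i][i] / row_total if row_total else None
        precision = cm[i][i] / col_total if col_total else None
        rates.append((recall, precision))
    return rates


# Toy two-class confusion matrix (hypothetical counts).
cm = [[8, 2],
      [1, 9]]
print(overall_success_rate(cm))
print(marginal_rates(cm))
```

Ranking classifiers then amounts to comparing these values computed on the same test set for each candidate.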
Opinion-Based Centrality in Multiplex Networks: A Convex Optimization Approach
Most people simultaneously belong to several distinct social networks, in
which their relations can be different. They have opinions about certain
topics, which they share and spread on these networks, and are influenced by
the opinions of other persons. In this paper, we build upon this observation to
propose a new nodal centrality measure for multiplex networks. Our measure,
called Opinion centrality, is based on a stochastic model representing opinion
propagation dynamics in such a network. We formulate an optimization problem
consisting in maximizing the opinion of the whole network when controlling an
external influence able to affect each node individually. We find a
mathematical closed form of this problem, and use its solution to derive our
centrality measure. According to Opinion centrality, the more a node is
worth investing external influence in, the more central it is. We perform an
empirical study of the proposed centrality over a toy network, as well as a
collection of real-world networks. Our measure is generally negatively
correlated with existing multiplex centrality measures, and highlights
different types of nodes, in accordance with its definition.
Comparative Evaluation of Community Detection Algorithms: A Topological Approach
Community detection is one of the most active fields in complex networks
analysis, due to its potential value in practical applications. Many works
inspired by different paradigms are devoted to the development of algorithmic
solutions able to reveal the network structure in terms of such cohesive subgroups.
Comparative studies reported in the literature usually rely on a performance
measure considering the community structure as a partition (Rand Index,
Normalized Mutual Information, etc.). However, this type of comparison neglects
the topological properties of the communities. In this article, we present a
comprehensive comparative study of a representative set of community detection
methods, in which we adopt both types of evaluation. Community-oriented
topological measures are used to qualify the communities and evaluate their
deviation from the reference structure. In order to mimic real-world systems,
we use artificially generated realistic networks. It turns out there is no
equivalence between the two approaches: a high performance does not necessarily
correspond to correct topological properties, and vice versa. They can
therefore be considered as complementary, and we recommend applying both of
them in order to perform a complete and accurate assessment.
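For readers unfamiliar with the partition-based measures mentioned in this abstract, the following minimal Python sketch computes the plain (unadjusted) Rand Index between a detected community structure and a reference one. The function name and the encoding of a partition as a list of community labels, one per node, are assumptions made for this example; it does not cover the topological measures studied in the article.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two partitions agree.

    A pair agrees when both partitions put the two nodes in the same
    community, or both put them in different communities.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same nodes")
    if len(labels_a) < 2:
        raise ValueError("at least two nodes are required")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Toy example: the detected partition splits the second reference community.
reference = [0, 0, 1, 1]
detected = [0, 0, 1, 2]
print(rand_index(reference, detected))
```

Such a score only depends on node memberships, which is why two partitions with similar scores can still differ in the topological properties of their communities.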
A community role approach to assess social capitalists visibility in the Twitter network
In the context of Twitter, social capitalists are specific users trying to
increase their number of followers and interactions by any means. These users
are not healthy for the service, because they are either spammers or real users
distorting the notions of influence and visibility. Studying their behavior and
understanding their position in Twitter is thus of great interest. It is
also necessary to analyze how these methods actually affect user visibility.
Based on a recently proposed method for identifying social capitalists, we
tackle both points by studying how they are organized, and how their links
spread across the Twitter follower-followee network. To that aim, we consider
their position in the network w.r.t. its community structure. We use the
concept of community role of a node, which describes its position in a network
depending on its connectivity at the community level. However, the topological
measures originally defined to characterize these roles consider only certain
aspects of the community-related connectivity, and rely on a set of empirically
fixed thresholds. We first show the limitations of these measures, before
extending and generalizing them. Moreover, we use an unsupervised approach to
identify the roles, in order to provide more flexibility relative to the
studied system. We then apply our method to the case of social capitalists and
show they are highly visible on Twitter, due to the specific roles they hold.
Comment: arXiv admin note: substantial text overlap with arXiv:1406.661
Towards realistic artificial benchmark for community detection algorithms evaluation
Assessing the partitioning performance of community detection algorithms is
one of the most important issues in complex network analysis. Artificially
generated networks are often used as benchmarks for this purpose. However,
previous studies showed their level of realism has a significant effect on the
algorithms' performance. In this study, we adopt a thorough experimental
approach to tackle this problem and investigate this effect. To assess the
level of realism, we use consensual network topological properties. Based on
the LFR method, the most realistic generative method to date, we propose two
alternative random models to replace the Configuration Model originally used in
this algorithm, in order to increase its realism. Experimental results show
both modifications allow generating collections of community-structured
artificial networks whose topological properties are closer to those
encountered in real-world networks. Moreover, the results obtained with eleven
popular community identification algorithms on these benchmarks show their
performance decreases on more realistic networks.
Business-oriented Analysis of a Social Network of University Students
Despite the great interest raised by social networks in Business Science, their analysis is rarely performed in both a global and systematic way in this field: most authors focus on parts of the studied network, or on a few nodes considered individually. This could be explained by the fact that practical extraction of social networks is a difficult and costly task, since the specific relational data it requires are often difficult to access and thereby expensive. One may ask if equivalent information could be extracted from less expensive individual data, i.e. data concerning single individuals instead of several ones. In this work, we try to tackle this problem through group detection. We gather both types of data from a population of students, and estimate groups separately using individual and relational data, leading to sets of clusters and communities, respectively. We found that there is no strong overlap between them, meaning the two types of data do not convey the same information in this specific context, and can therefore be considered as complementary. However, a link, even if weak, exists and appears when we identify the most discriminant attributes relative to the communities. Implications in Business Science include community prediction using individual data.
Keywords: Social Networks; Business Science; Cluster Analysis; Community Detection; Community Comparison; Individual Data; Relational Data