Computing Vertex Centrality Measures in Massive Real Networks with a Neural Learning Model
Vertex centrality measures are a multi-purpose analysis tool, commonly used
in many application environments to retrieve information and unveil knowledge
from the graphs and network structural properties. However, the algorithms of
such metrics are expensive in terms of computational resources when running
real-time applications or massive real world networks. Thus, approximation
techniques have been developed and used to compute the measures in such
scenarios. In this paper, we demonstrate and analyze the use of neural network
learning algorithms to tackle this task and compare their performance in terms
of solution quality and computation time with other techniques from the
literature. Our work offers several contributions. We highlight both the pros
and cons of approximating centralities through neural learning. By empirical
means and statistics, we then show that the regression model generated with a
feedforward neural network trained by the Levenberg-Marquardt algorithm is not
only the best option considering computational resources, but also achieves the
best solution quality for relevant applications and large-scale networks.
Keywords: Vertex Centrality Measures, Neural Networks, Complex Network Models,
Machine Learning, Regression Model
Comment: 8 pages, 5 tables, 2 figures, version accepted at IJCNN 2018. arXiv
admin note: text overlap with arXiv:1810.1176
HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget
The neighbourhood function N_G(t) of a graph G gives, for each t, the number of
pairs of nodes (x, y) such that y is reachable from x in fewer than t hops. The
neighbourhood function provides a wealth of information about the graph (e.g.,
it easily allows one to compute its diameter), but it is very expensive to
compute it exactly. Recently, the ANF algorithm (approximate neighbourhood
function) has been proposed with the purpose of approximating N_G(t) on large
graphs. We describe a breakthrough improvement over ANF in terms of speed and
scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters
and combines them efficiently through broadword programming; our implementation
uses overdecomposition to exploit multi-core parallelism. With HyperANF, for
the first time we can compute in a few hours the neighbourhood function of
graphs with billions of nodes with a small error and good confidence using a
standard workstation. Then, we turn to the study of the distribution of the
shortest paths between reachable nodes (that can be efficiently approximated by
means of HyperANF), and discover the surprising fact that its index of
dispersion provides a clear-cut characterisation of proper social networks vs.
web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a
graph as a new, informative statistic that is able to discriminate between the
above two types of graphs. We believe this is the first proposal of a
significant new non-local structural index for complex networks whose
computation is highly scalable.
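As a rough illustration of the quantities named in this abstract, the sketch below computes a neighbourhood function and a shortest-paths index of dispersion exactly, by breadth-first search on a tiny graph. It is not HyperANF: it uses no HyperLogLog counters and does not scale. The function names, the adjacency-dict input, the choice to count pairs at distance at most t, and the exclusion of zero-length paths from the dispersion index are all assumptions made for this example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_counts(adj):
    """counts[d] = number of ordered pairs (x, y), x != y, at distance d."""
    counts = {}
    for x in adj:
        for d in bfs_distances(adj, x).values():
            if d > 0:
                counts[d] = counts.get(d, 0) + 1
    return counts


def neighbourhood_function(adj, t):
    """Pairs (x, y) with y reachable from x in at most t hops (assumed convention).

    Every node must appear as a key of adj; each node reaches itself in 0 hops.
    """
    counts = distance_counts(adj)
    return len(adj) + sum(c for d, c in counts.items() if d <= t)


def spid(adj):
    """Variance-to-mean ratio of the shortest-path length distribution."""
    counts = distance_counts(adj)
    total = sum(counts.values())
    mean = sum(d * c for d, c in counts.items()) / total
    variance = sum(c * (d - mean) ** 2 for d, c in counts.items()) / total
    return variance / mean


# Hypothetical toy input: an undirected path 0 - 1 - 2.
path_graph = {0: [1], 1: [0, 2], 2: [1]}
```

The exact computation runs one breadth-first search per node, which is why approximate counters are needed on graphs with billions of nodes.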
Injecting Uncertainty in Graphs for Identity Obfuscation
Data collected nowadays by social-networking applications create fascinating
opportunities for building novel services, as well as expanding our
understanding about social structures and their dynamics. Unfortunately,
publishing social-network graphs is considered an ill-advised practice due to
privacy concerns. To alleviate this problem, several anonymization methods have
been proposed, aiming at reducing the risk of a privacy breach on the published
data, while still allowing analysts to study them and draw relevant conclusions. In
this paper we introduce a new anonymization approach that is based on injecting
uncertainty in social graphs and publishing the resulting uncertain graphs.
While existing approaches obfuscate graph data by adding or removing edges
entirely, we propose using a finer-grained perturbation that adds or removes
edges partially: this way we can achieve the same desired level of obfuscation
with smaller changes in the data, thus maintaining higher utility. Our
experiments on real-world networks confirm that at the same level of identity
obfuscation our method provides higher usefulness than existing randomized
methods that publish standard graphs.
Comment: VLDB201
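The idea of adding or removing edges partially can be pictured as a graph whose edges carry existence probabilities. The sketch below only illustrates that data structure. It is not the paper's obfuscation algorithm, and the edge probabilities, node names, and helper functions are hypothetical.

```python
import random

# Hypothetical toy uncertain graph: each undirected edge has an existence probability.
uncertain_edges = {
    ("a", "b"): 0.9,  # an original edge, partially removed
    ("b", "c"): 0.8,  # an original edge, partially removed
    ("a", "c"): 0.2,  # an original non-edge, partially added
}


def expected_degree(edges, node):
    """Expected degree of node: the sum of probabilities of its incident edges."""
    return sum(p for (u, v), p in edges.items() if node in (u, v))


def sample_world(edges, rng):
    """Draw one ordinary graph by keeping each edge independently with its probability."""
    return [edge for edge, p in edges.items() if rng.random() < p]


world = sample_world(uncertain_edges, random.Random(0))
```

Statistics such as degree can then be estimated in expectation or by averaging over sampled worlds.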
Principal manifolds and graphs in practice: from molecular biology to dynamical systems
We present several applications of non-linear data modeling, using principal
manifolds and principal graphs constructed using the metaphor of elasticity
(elastic principal graph approach). These approaches are generalizations of
Kohonen's self-organizing maps, a class of artificial neural networks. Using
several examples, we show the advantages of non-linear objects over linear ones
for data approximation. We propose four numerical
criteria for comparing linear and non-linear mappings of datasets into the
spaces of lower dimension. The examples are taken from comparative political
science, from the analysis of high-throughput data in molecular biology, and
from the analysis of dynamical systems.
Comment: 12 pages, 9 figures
Correlation of Automorphism Group Size and Topological Properties with Program-size Complexity Evaluations of Graphs and Complex Networks
We show that numerical approximations of Kolmogorov complexity (K) applied to
graph adjacency matrices capture some group-theoretic and topological
properties of graphs and empirical networks ranging from metabolic to social
networks. That K and the size of the group of automorphisms of a graph are
correlated opens up interesting connections to problems in computational
geometry, and thus connects several measures and concepts from complexity
science. We show that approximations of K characterise synthetic and natural
networks by their generating mechanisms, assigning lower algorithmic randomness
to complex network models (Watts-Strogatz and Barabasi-Albert networks) and
high Kolmogorov complexity to (random) Erdos-Renyi graphs. We derive these
results via two different Kolmogorov complexity approximation methods applied
to the adjacency matrices of the graphs and networks. The methods used are the
traditional lossless compression approach to Kolmogorov complexity, and a
normalised version of a Block Decomposition Method (BDM) measure, based on
algorithmic probability theory.
Comment: 15 2-column pages, 20 figures. Forthcoming in Physica A: Statistical
Mechanics and its Applications
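The lossless-compression approach mentioned in this abstract can be sketched as follows: serialise the adjacency matrix and take the compressed length as a crude upper-bound proxy for Kolmogorov complexity. The abstract does not name a compressor, so zlib is an assumption here, as are the serialisation, the graph sizes, and the helper names. The Block Decomposition Method is not shown.

```python
import random
import zlib


def adjacency_bytes(n, edges):
    """Serialise an undirected graph's adjacency matrix row by row as '0'/'1' bytes."""
    rows = [["0"] * n for _ in range(n)]
    for u, v in edges:
        rows[u][v] = rows[v][u] = "1"
    return "".join("".join(row) for row in rows).encode("ascii")


def compressed_size(n, edges):
    """Compressed length in bytes, used as a rough complexity estimate."""
    return len(zlib.compress(adjacency_bytes(n, edges), 9))


def ring_lattice(n):
    """A highly regular graph: each node is linked to its successor on a cycle."""
    return [(i, (i + 1) % n) for i in range(n)]


def erdos_renyi(n, p, rng):
    """A random graph: each possible edge is present independently with probability p."""
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]


n = 64
ring_size = compressed_size(n, ring_lattice(n))
random_size = compressed_size(n, erdos_renyi(n, 0.5, random.Random(0)))
```

On this toy input the regular lattice compresses to far fewer bytes than the random graph. The two graphs also differ in edge density, so the comparison is only indicative.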