Developments in the theory of randomized shortest paths with a comparison of graph node distances
There have lately been several suggestions for parametrized distances on a
graph that generalize the shortest path distance and the commute time or
resistance distance. The need for developing such distances has arisen from the
observation that the above-mentioned common distances in many situations fail
to take into account the global structure of the graph. In this article, we
develop the theory of one family of graph node distances, known as the
randomized shortest path dissimilarity, which has its foundation in statistical
physics. We show that the randomized shortest path dissimilarity can be easily
computed in closed form for all pairs of nodes of a graph. Moreover, we come up
with a new definition of a distance measure that we call the free energy
distance. The free energy distance can be seen as an upgrade of the randomized
shortest path dissimilarity as it defines a metric, in addition to which it
satisfies the graph-geodetic property. The derivation and computation of the
free energy distance are also straightforward. We then make a comparison
between a set of generalized distances that interpolate between the shortest
path distance and the commute time, or resistance distance. This comparison
focuses on the applicability of the distances in graph node clustering and
classification. The comparison, in general, shows that the parametrized
distances perform well in the tasks. In particular, we see that the results
obtained with the free energy distance are among the best in all the
experiments.
Comment: 30 pages, 4 figures, 3 tables
Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph
As the security landscape evolves over time, where thousands of species of
malicious codes are seen every day, antivirus vendors strive to detect and
classify malware families for efficient and effective responses against malware
campaigns. To enrich this effort, and by capitalizing on ideas from the social
network analysis domain, we build a tool that can help classify malware
families using features derived from the graph structure of their system calls.
To achieve that, we first construct a system call graph that consists of system
calls found in the execution of the individual malware families. To explore
distinguishing features of various malware species, we study social network
properties as applied to the call graph, including the degree distribution,
degree centrality, average distance, clustering coefficient, network density,
and component ratio. We utilize features derived from those properties to build
a classifier for malware families. Our experimental results show that
influence-based graph metrics such as the degree centrality are effective for
classifying malware, whereas general structural metrics are less
effective. Our experiments demonstrate that the
proposed system performs well in detecting and classifying malware families
within each malware class with accuracy greater than 96%.
Comment: Mathematical Problems in Engineering, Vol 201
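Three of the graph properties named in the abstract above (degree centrality, network density, clustering coefficient) can be illustrated with a minimal sketch. This is not the authors' tool: the toy graph, its node names, and the choice to treat the call graph as undirected and unweighted are all assumptions made for illustration.

```python
from itertools import combinations


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge is counted twice
    return 2 * m / (n * (n - 1))


def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


# Hypothetical toy graph; the node names are illustrative only.
edges = [("open", "read"), ("open", "write"), ("read", "write"), ("write", "close")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

print(degree_centrality(adj))
print(density(adj))
print(clustering_coefficient(adj, "write"))
```

Per-graph values such as these could then serve as a feature vector for a classifier, which is the general idea the abstract describes.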
A Survey on Centrality Metrics and Their Implications in Network Resilience
Centrality metrics have been used in various networks, such as communication,
social, biological, geographic, or contact networks. In particular, they have
been used to study and analyze targeted attack behaviors and to
investigate their effect on network resilience. Although a rich volume of
centrality metrics has been developed over decades, only a limited set of them
has been in common use. This paper aims to introduce various
existing centrality metrics and discuss their applicability and performance
based on the results obtained from extensive simulation experiments to
encourage their use in solving various computing and engineering problems in
networks.
Comment: Main paper: 36 pages, 2 figures. Appendix: 23 pages, 45 figures
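The link between a centrality metric and network resilience under targeted attack can be sketched as follows. This is an illustrative setup, not the survey's experimental protocol: it assumes degree is the centrality used to pick targets and that resilience is measured by the size of the largest connected component, and the example graph is hypothetical.

```python
from collections import deque


def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def remove_node(adj, target):
    """Return a copy of the graph without `target` and its incident edges."""
    return {v: nbrs - {target} for v, nbrs in adj.items() if v != target}


def degree_attack(adj, steps):
    """Remove the current highest-degree node `steps` times.

    Returns the largest-component size before the attack and after each removal.
    """
    sizes = [largest_component_size(adj)]
    for _ in range(steps):
        if not adj:
            break
        target = max(adj, key=lambda v: len(adj[v]))
        adj = remove_node(adj, target)
        sizes.append(largest_component_size(adj))
    return sizes


# Hypothetical graph: hub 0 connected to four nodes, plus one extra edge 1-2.
star = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
print(degree_attack(star, 2))
```

Swapping the `key` function in `degree_attack` for another centrality score would let different metrics be compared on the same graph.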
City Indicators for Mobility Data Mining
Classifying cities and other geographical units is a classical task in
urban geography, typically carried out through manual analysis
of specific characteristics of the area. The primary objective of
this paper is to contribute to this process through the definition
of a wide set of city indicators that capture different aspects
of the city, mainly based on human mobility and automatically
computed from a set of data sources, including mobility traces
and road networks. The secondary objective is to prove that such a
set of characteristics is indeed rich enough to support a simple
task of geographical transfer learning, namely identifying which
groups of geographical areas can share with each other a basic
traffic prediction model. The experiments show that similarity in
terms of our city indicators also means better transferability of
predictive models, opening the way to the development of more
sophisticated solutions that leverage city indicators.
Reconstructing networks
Complex networks datasets often come with the problem of missing information:
interactions data that have not been measured or discovered, may be affected by
errors, or are simply hidden because of privacy issues. This Element provides
an overview of the ideas, methods and techniques to deal with this problem and
that together define the field of network reconstruction. Given the extent of
the subject, we shall focus on the inference methods rooted in statistical
physics and information theory. The discussion will be organized according to
the different scales of the reconstruction task, that is, whether the goal is
to reconstruct the macroscopic structure of the network, to infer its mesoscale
properties, or to predict the individual microscopic connections.
Comment: 107 pages, 25 figures
Reconstructing networks
Complex networks datasets often come with the problem of missing information: interactions data that have not been measured or discovered, may be affected by errors, or are simply hidden because of privacy issues. This Element provides an overview of the ideas, methods and techniques to deal with this problem and that together define the field of network reconstruction. Given the extent of the subject, the authors focus on the inference methods rooted in statistical physics and information theory. The discussion is organized according to the different scales of the reconstruction task, that is, whether the goal is to reconstruct the macroscopic structure of the network, to infer its mesoscale properties, or to predict the individual microscopic connections.