
    Developments in the theory of randomized shortest paths with a comparison of graph node distances

    Several parametrized distances on graphs have recently been proposed that generalize the shortest path distance and the commute time, or resistance, distance. The need for such distances arises from the observation that these common distances often fail to take into account the global structure of the graph. In this article, we develop the theory of one family of graph node distances, known as the randomized shortest path dissimilarity, which has its foundation in statistical physics. We show that the randomized shortest path dissimilarity can be computed in closed form for all pairs of nodes of a graph. Moreover, we introduce a new distance measure that we call the free energy distance. The free energy distance can be seen as an upgrade of the randomized shortest path dissimilarity: it defines a metric and also satisfies the graph-geodetic property. Its derivation and computation are also straightforward. We then compare a set of generalized distances that interpolate between the shortest path distance and the commute time, or resistance, distance. This comparison focuses on the applicability of the distances in graph node clustering and classification. In general, it shows that the parametrized distances perform well in these tasks. In particular, the results obtained with the free energy distance are among the best in all the experiments.
    Comment: 30 pages, 4 figures, 3 tables

    Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph

    As the security landscape evolves and thousands of species of malicious code are seen every day, antivirus vendors strive to detect and classify malware families so that they can respond efficiently and effectively to malware campaigns. To enrich this effort, and by capitalizing on ideas from the social network analysis domain, we build a tool that can help classify malware families using features derived from the graph structure of their system calls. To achieve that, we first construct a system call graph that consists of the system calls found in the execution of the individual malware families. To explore distinguishing features of various malware species, we study social network properties as applied to the call graph, including the degree distribution, degree centrality, average distance, clustering coefficient, network density, and component ratio. We use features derived from those properties to build a classifier for malware families. Our experimental results show that influence-based graph metrics such as the degree centrality are effective for classifying malware, whereas general structural metrics are less effective. Our experiments demonstrate that the proposed system performs well in detecting and classifying malware families within each malware class, with accuracy greater than 96%.
    Comment: Mathematical Problems in Engineering, Vol 201

    A Survey on Centrality Metrics and Their Implications in Network Resilience

    Centrality metrics have been used in various networks, such as communication, social, biological, geographic, or contact networks. In particular, they have been used to study and analyze targeted attack behaviors and to investigate their effect on network resilience. Although a rich volume of centrality metrics has been developed over decades, only a limited set of them is in common use. This paper aims to introduce various existing centrality metrics and to discuss their applicability and performance, based on the results of extensive simulation experiments, in order to encourage their use in solving various computing and engineering problems in networks.
    Comment: Main paper: 36 pages, 2 figures. Appendix: 23 pages, 45 figures

    City Indicators for Mobility Data Mining

    Classifying cities and other geographical units is a classical task in urban geography, typically carried out through manual analysis of specific characteristics of the area. The primary objective of this paper is to contribute to this process by defining a wide set of city indicators that capture different aspects of the city, mainly based on human mobility and automatically computed from a set of data sources, including mobility traces and road networks. The secondary objective is to show that such a set of characteristics is rich enough to support a simple task of geographical transfer learning, namely identifying which groups of geographical areas can share a basic traffic prediction model with each other. The experiments show that similarity in terms of our city indicators also means better transferability of predictive models, opening the way to the development of more sophisticated solutions that leverage city indicators.

    Reconstructing networks

    Complex network datasets often come with the problem of missing information: interaction data may not have been measured or discovered, may be affected by errors, or may simply be hidden because of privacy issues. This Element provides an overview of the ideas, methods and techniques for dealing with this problem, which together define the field of network reconstruction. Given the extent of the subject, we shall focus on the inference methods rooted in statistical physics and information theory. The discussion will be organized according to the different scales of the reconstruction task, that is, whether the goal is to reconstruct the macroscopic structure of the network, to infer its mesoscale properties, or to predict the individual microscopic connections.
    Comment: 107 pages, 25 figures

    Reconstructing networks

    Complex network datasets often come with the problem of missing information: interaction data may not have been measured or discovered, may be affected by errors, or may simply be hidden because of privacy issues. This Element provides an overview of the ideas, methods and techniques for dealing with this problem, which together define the field of network reconstruction. Given the extent of the subject, the authors focus on the inference methods rooted in statistical physics and information theory. The discussion is organized according to the different scales of the reconstruction task, that is, whether the goal is to reconstruct the macroscopic structure of the network, to infer its mesoscale properties, or to predict the individual microscopic connections.