
    Assessing Percolation Threshold Based on High-Order Non-Backtracking Matrices

    The percolation threshold of a network is the critical value such that, when nodes or edges are randomly selected with probability below this value, the network is fragmented, but when the probability is above it, a giant component connecting a large portion of the network emerges. Assessing the percolation threshold of networks has wide applications in network reliability, information spread, epidemic control, etc. The theoretical approaches so far to assess the percolation threshold are mainly based on the spectral radius of the adjacency matrix or the non-backtracking matrix, which are limited to dense graphs or locally treelike graphs and are less effective for sparse networks with a non-negligible number of triangles and loops. In this paper, we study high-order non-backtracking matrices and their application to assessing the percolation threshold. We first define high-order non-backtracking matrices and study the properties of their spectral radii. Then we focus on the 2nd-order non-backtracking matrix and demonstrate analytically that the reciprocal of its spectral radius gives a tighter lower bound than those of the adjacency and standard non-backtracking matrices. We further build a smaller matrix with the same largest eigenvalue as the 2nd-order non-backtracking matrix to improve computational efficiency. Finally, we use both synthetic networks and 42 real networks to illustrate that the 2nd-order non-backtracking matrix does give a better lower bound for assessing the percolation threshold than the adjacency and standard non-backtracking matrices.

    Comment: to appear in the proceedings of the 26th International World Wide Web Conference (WWW 2017).
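
    The two baseline estimates that the paper tightens are straightforward to compute. Below is a minimal sketch (of the standard bounds only, not the paper's 2nd-order construction) of the adjacency-matrix bound 1/λ(A) and the non-backtracking bound 1/λ(B), using networkx and numpy; the karate-club example graph is illustrative, not one of the paper's datasets:

```python
import numpy as np
import networkx as nx

def percolation_lower_bounds(G):
    """Classical lower bounds on the percolation threshold:
    1/lambda(A) from the adjacency matrix and 1/lambda(B) from the
    standard non-backtracking matrix (the baselines the paper improves)."""
    # Adjacency-matrix estimate: reciprocal of the spectral radius of A.
    A = nx.to_numpy_array(G)
    lam_A = max(abs(np.linalg.eigvals(A)))

    # Non-backtracking matrix B, indexed by directed edges (u, v):
    # B[(u,v),(v,w)] = 1 whenever w != u (the walk never doubles back).
    edges = [(u, v) for u, v in G.edges()] + [(v, u) for u, v in G.edges()]
    idx = {e: i for i, e in enumerate(edges)}
    B = np.zeros((len(edges), len(edges)))
    for u, v in edges:
        for w in G.neighbors(v):
            if w != u:
                B[idx[(u, v)], idx[(v, w)]] = 1.0
    lam_B = max(abs(np.linalg.eigvals(B)))

    # 1/lam_A <= 1/lam_B <= p_c: the non-backtracking bound is tighter.
    return 1.0 / lam_A, 1.0 / lam_B

G = nx.karate_club_graph()
print(percolation_lower_bounds(G))
```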

    Network Features in Complex Applications

    The aim of this thesis is to show the potential of Graph Theory and Network Science applied in real-case scenarios. Indeed, there is a gap in the state of the art in combining mathematical theory with more practical applications, such as helping Law Enforcement Agencies (LEAs) conduct their investigations, or Deep Learning techniques that enable Artificial Neural Networks (ANNs) to work more efficiently. In particular, three main case studies were considered on which to evaluate the effectiveness of Social Network Analysis (SNA) tools: (i) Criminal Network Analysis, (ii) Network Resilience, and (iii) ANN topology.

    We addressed two typical problems in dealing with criminal networks: (i) how to efficiently slow down information spreading within a criminal organisation through prompt and targeted investigative operations by LEAs, and (ii) what the impact of missing data is during LEA investigations. In the first case, we identified the appropriate centrality metric for selecting the criminals to be arrested, showing that neutralising only the top-ranking 5% of affiliates reduced network connectivity by 70% (a sketch of this experiment follows this abstract). In the second case, we simulated the missing-data problem by pruning criminal networks (removing nodes or links) and compared the pruned networks against the originals using four graph-similarity metrics. We found that only a negligible error (i.e., a 30% difference from the real network) arose when, for example, some wiretaps were missing. On the other hand, it is crucial to investigate suspects in a timely fashion, since excluding suspects from an investigation may lead to significant errors (i.e., an 80% difference).

    Next, we defined a new approach for simulating network resilience through a probabilistic failure model. While the classical approach assumes that node removal always succeeds, this assumption is not realistic; we therefore defined models simulating scenarios in which nodes resist removal. After identifying the centrality metric that, on average, causes the greatest damage to the connectivity of the networks under scrutiny, we compared our outcomes against the classical node-removal approach, ranking nodes by the same centrality metric, which confirmed our intuition.

    Lastly, we adopted SNA techniques to analyse ANNs. In particular, we moved a step beyond earlier works: not only did our experiments confirm the efficiency gains from training sparse ANNs, but they also exploited sparsity further through a better-tuned algorithm, achieving increased speed at a negligible accuracy loss. We focused on the role of the parameter used to fine-tune the training phase of sparse ANNs; our intuition was that this step can be avoided, since the accuracy loss is negligible and, as a consequence, the execution time is significantly reduced. It is evident that Network Science algorithms, by maintaining sparsity in ANNs, are a promising direction for accelerating their training. All these studies pave the way for a range of unexplored possibilities for an effective use of Network Science at the service of society.

    PhD Scholarship (Data Science Research Centre, University of Derby)
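
    The targeted-disruption experiment referenced above (remove the top-ranking 5% of nodes by a centrality metric, then measure the drop in connectivity) can be illustrated in a few lines. This is a minimal sketch only: betweenness centrality and the Barabási–Albert test graph are assumptions here, since the thesis evaluates several metrics on real criminal networks:

```python
import networkx as nx

def connectivity_drop(G, fraction=0.05, centrality=nx.betweenness_centrality):
    """Remove the top-ranking `fraction` of nodes by the given centrality
    and report the relative shrinkage of the largest connected component."""
    before = len(max(nx.connected_components(G), key=len))
    ranking = sorted(centrality(G).items(), key=lambda kv: kv[1], reverse=True)
    k = max(1, int(fraction * G.number_of_nodes()))
    H = G.copy()
    H.remove_nodes_from(n for n, _ in ranking[:k])  # "arrest" the top-k nodes
    after = len(max(nx.connected_components(H), key=len))
    return 1 - after / before

# Illustrative scale-free stand-in for a criminal network.
G = nx.barabasi_albert_graph(200, 2, seed=1)
print(f"connectivity drop: {connectivity_drop(G):.0%}")
```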

    Developing Robust Models, Algorithms, Databases and Tools With Applications to Cybersecurity and Healthcare

    As society and technology become increasingly interconnected, so does the threat landscape. Once-isolated threats now pose serious concerns to highly interdependent systems, highlighting the fundamental need for robust machine learning. This dissertation contributes novel tools, algorithms, databases, and models, through the lens of robust machine learning, in a research effort to solve large-scale societal problems affecting millions of people in the areas of cybersecurity and healthcare. (1) Tools: We develop TIGER, the first comprehensive graph robustness toolbox; and our ROBUSTNESS SURVEY identifies critical yet missing areas of graph robustness research. (2) Algorithms: Our survey and toolbox reveal that existing work has overlooked lateral attacks on computer authentication networks. We develop D2M, the first algorithmic framework to quantify and mitigate network vulnerability to lateral attacks by modeling lateral attack movement from a graph-theoretic perspective. (3) Databases: To prevent lateral attacks altogether, we develop MALNET-GRAPH, the world's largest cybersecurity graph database, containing over 1.2M graphs across 696 classes, and show the first large-scale results demonstrating the effectiveness of malware detection through a graph medium. We extend MALNET-GRAPH by constructing the largest binary-image cybersecurity database (MALNET-IMAGE), containing 1.2M images, 133× more images than the only other public database, enabling new discoveries in malware detection and classification research previously restricted to a few industry labs. (4) Models: To protect systems from adversarial attacks, we develop UNMASK, the first model that flags semantic incoherence in computer vision systems; it detects up to 96.75% of attacks and defends the model by correctly classifying up to 93% of attacks. Inspired by UNMASK's ability to protect computer vision systems from adversarial attack, we develop REST, which creates noise-robust models through a novel combination of adversarial training, spectral regularization, and sparsity regularization. In the presence of noise, our method improves state-of-the-art sleep stage scoring by 71%, allowing us to diagnose sleep disorders earlier and in the home environment, while using 19× fewer parameters and 15× fewer MFLOPS. Our work has made a significant impact on industry and society: the UNMASK framework laid the foundation for a multi-million dollar DARPA GARD award; the TIGER toolbox for graph robustness analysis is part of the Nvidia Data Science Teaching Kit, available to educators around the world; we released MALNET, the world's largest graph classification database with 1.2M graphs; and the D2M framework has had a major impact on Microsoft products, inspiring changes to the product's approach to lateral attack detection.

    Ph.D.
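
    The graph-theoretic view of lateral attack movement that D2M builds on can be illustrated with simple reachability on a directed authentication graph. This is a toy sketch only; the hostnames and the reachability-based model are illustrative assumptions, not D2M itself:

```python
import networkx as nx

def lateral_reach(auth_graph, compromised):
    """Toy illustration of lateral movement as graph reachability: the set
    of machines an attacker could reach from an initially compromised host
    by chaining authentication edges (nx.descendants gives all nodes
    reachable via directed paths). D2M models and mitigates this far more
    carefully; this only shows the underlying graph idea."""
    return nx.descendants(auth_graph, compromised) | {compromised}

# Hypothetical authentication edges: u -> v means u authenticates to v.
G = nx.DiGraph([("laptop1", "server1"), ("server1", "dc1"),
                ("laptop2", "server1"), ("dc1", "server2")])
print(lateral_reach(G, "laptop1"))  # {'laptop1', 'server1', 'dc1', 'server2'}
```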