Assessing Percolation Threshold Based on High-Order Non-Backtracking Matrices
The percolation threshold of a network is the critical value such that when nodes
or edges are randomly selected with probability below this value the network is
fragmented, while above it a giant component connecting a large portion of the
network emerges. Assessing the percolation threshold of networks has wide
applications in network reliability, information spread, epidemic control, etc.
Theoretical approaches to date rely mainly on the spectral radius of the adjacency
matrix or the non-backtracking matrix; these are limited to dense or locally
treelike graphs and are less effective for sparse networks with a non-negligible
number of triangles and loops. In this paper, we study high-order non-backtracking
matrices and their application to assessing the percolation threshold. We first
define high-order non-backtracking matrices and study the properties of their
spectral radii. We then focus on the 2nd-order non-backtracking matrix and
demonstrate analytically that the reciprocal of its spectral radius gives a
tighter lower bound than those of the adjacency and standard non-backtracking
matrices. We further build a smaller matrix with the same largest eigenvalue as
the 2nd-order non-backtracking matrix to improve computational efficiency.
Finally, we use both synthetic networks and 42 real networks to illustrate that
the 2nd-order non-backtracking matrix does give a better lower bound for
assessing the percolation threshold than the adjacency and standard
non-backtracking matrices.
Comment: to appear in proceedings of the 26th International World Wide Web Conference (WWW2017)
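As background for the construction the abstract builds on, here is a minimal sketch of the standard (1st-order) non-backtracking, or Hashimoto, matrix and the resulting threshold lower bound. The undirected simple graph and the dense eigensolver are assumptions for illustration; the paper's 2nd-order variant, which indexes pairs of consecutive edges rather than single edges, is not reproduced here.

```python
import numpy as np
import networkx as nx

def non_backtracking_matrix(G):
    """Rows/columns index directed edges; entry ((u, v), (v, w)) is 1
    whenever the walk u -> v -> w does not immediately backtrack (w != u)."""
    directed = [(u, v) for u, v in G.edges()] + [(v, u) for u, v in G.edges()]
    idx = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for u, v in directed:
        for w in G.neighbors(v):
            if w != u:
                B[idx[(u, v)], idx[(v, w)]] = 1.0
    return B

def percolation_lower_bound(G):
    """Reciprocal of the spectral radius of B: a lower bound on the threshold."""
    return 1.0 / max(abs(np.linalg.eigvals(non_backtracking_matrix(G))))

# On a d-regular graph the spectral radius of B is d - 1 (constant row sums),
# so for d = 3 the bound is exactly 1/2.
G = nx.random_regular_graph(3, 60, seed=1)
print(percolation_lower_bound(G))  # ≈ 0.5
```

The dense matrix has one row per directed edge, so this sketch only scales to small graphs; a sparse eigensolver would be needed in practice.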
Network Features in Complex Applications
The aim of this thesis is to show the potential of Graph Theory and Network Science applied to real-case scenarios. Indeed, there is a gap in the state of the art in combining mathematical theory with more practical applications, such as helping Law Enforcement Agencies (LEAs) conduct their investigations, or Deep Learning techniques that enable Artificial Neural Networks (ANNs) to work more efficiently. In particular, we considered three main case studies on which to evaluate the effectiveness of Social Network Analysis (SNA) tools: (i) Criminal Network Analysis, (ii) Network Resilience, and (iii) ANN topology.
We addressed two typical problems in dealing with criminal networks: (i) how to efficiently slow down information spreading within a criminal organisation through prompt and targeted investigative operations by LEAs, and (ii) what the impact of missing data is during LEA investigations.
In the first case, we identified the centrality metric best suited to selecting the criminals to be arrested, showing that neutralising only 5% of the top-ranking affiliates reduced network connectivity by 70%.
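The neutralisation experiment can be sketched as a generic centrality-based attack. The toy scale-free graph standing in for a criminal network, the 5% budget, and the choice of betweenness centrality are all assumptions for illustration; the thesis evaluates several metrics.

```python
import networkx as nx

def connectivity_drop(G, fraction=0.05):
    """Remove the top `fraction` of nodes ranked by betweenness centrality
    and return the relative shrinkage of the largest connected component."""
    before = len(max(nx.connected_components(G), key=len))
    scores = nx.betweenness_centrality(G)
    ranked = sorted(scores, key=scores.get, reverse=True)
    H = G.copy()
    H.remove_nodes_from(ranked[: max(1, int(fraction * G.number_of_nodes()))])
    after = len(max(nx.connected_components(H), key=len))
    return 1.0 - after / before

# Toy stand-in for a criminal network: hubs play the top-ranking affiliates.
G = nx.barabasi_albert_graph(300, 2, seed=7)
print(f"giant component shrinks by {connectivity_drop(G):.0%}")
```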
In the second case, we simulated the missing-data problem by pruning criminal networks (removing nodes or links) and compared the pruned networks against the originals using four graph-similarity metrics. We found that a negligible error (i.e., a 30% difference from the real network) was observed when, for example, some wiretaps are missing. On the other hand, it is crucial to investigate suspects in a timely fashion, since excluding suspects from an investigation may lead to significant errors (i.e., an 80% difference).
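The pruning experiment can be sketched with one simple similarity measure. Jaccard similarity of edge sets, the random graph, and the 20% pruning rate are assumptions for illustration; the thesis does not name its four metrics here.

```python
import random
import networkx as nx

def edge_jaccard(G, H):
    """Jaccard similarity of the two graphs' edge sets."""
    a = {frozenset(e) for e in G.edges()}
    b = {frozenset(e) for e in H.edges()}
    return len(a & b) / len(a | b)

# Simulate missing wiretaps by deleting 20% of the links at random.
G = nx.erdos_renyi_graph(100, 0.05, seed=3)
H = G.copy()
lost = random.Random(3).sample(list(H.edges()), int(0.2 * H.number_of_edges()))
H.remove_edges_from(lost)
print(f"similarity after pruning: {edge_jaccard(G, H):.2f}")  # ≈ 0.80
```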
Next, we defined a new approach to simulating network resilience via a probabilistic failure model. The classical approach, in which node removal always succeeds, is not realistic, so we defined models simulating scenarios in which nodes oppose resistance to removal. After identifying the centrality metric that, on average, causes the greatest damage to the connectivity of the networks under scrutiny, we compared our outcomes against the classical node-removal approach, ranking nodes by the same centrality metric; the comparison confirmed our intuition.
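A probabilistic failure model of this kind can be sketched as follows. Degree centrality, the uniform 50% resistance, and the attempt budget are assumptions for illustration, not the thesis's calibrated models.

```python
import random
import networkx as nx

def resistant_removal(G, budget=10, p_success=0.5, seed=0):
    """Attempt `budget` removals in descending degree-centrality order;
    each targeted node resists and survives with probability 1 - p_success."""
    rng = random.Random(seed)
    scores = nx.degree_centrality(G)
    targets = sorted(scores, key=scores.get, reverse=True)[:budget]
    H = G.copy()
    for node in targets:
        if rng.random() < p_success:  # the removal attempt succeeds
            H.remove_node(node)
    return H

G = nx.karate_club_graph()
H = resistant_removal(G, budget=10)
print(G.number_of_nodes() - H.number_of_nodes())  # between 0 and 10 succeed
```

Averaging the resulting connectivity damage over many random seeds is what makes this model comparable to the deterministic removal baseline.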
Lastly, we adopted SNA techniques to analyse ANNs. In particular, we moved a step beyond earlier works: not only did our experiments confirm the efficiency gains from training sparse ANNs, but they also further exploited sparsity through a better-tuned algorithm, achieving increased speed at a negligible accuracy loss. We focused on the role of the parameter used to fine-tune the training phase of sparse ANNs. Our intuition was that this step can be avoided, since the accuracy loss is negligible and, as a consequence, the execution time is significantly reduced. Overall, it is evident that Network Science algorithms that maintain sparsity in ANNs are a promising direction for accelerating their training.
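The sparsity-maintaining evolution step that such training revolves around can be sketched as magnitude-based pruning plus random regrowth. This is a generic sketch of the idea, not the thesis's tuned algorithm; the 30% rewiring fraction and the regrowth scale are assumptions.

```python
import numpy as np

def prune_and_regrow(W, zeta=0.3, seed=0):
    """One evolution step on a sparse weight matrix: drop the fraction
    `zeta` of nonzero weights with the smallest magnitude, then regrow
    the same number of connections at random currently-empty positions."""
    rng = np.random.default_rng(seed)
    W = W.copy()
    nz = np.flatnonzero(W)
    k = int(zeta * nz.size)
    if k == 0:
        return W
    drop = nz[np.argsort(np.abs(W.flat[nz]))[:k]]  # smallest magnitudes
    W.flat[drop] = 0.0
    grow = rng.choice(np.flatnonzero(W == 0), size=k, replace=False)
    W.flat[grow] = 0.01 * rng.standard_normal(k)   # small fresh weights
    return W

rng = np.random.default_rng(1)
W = np.where(rng.random((32, 32)) < 0.1, rng.standard_normal((32, 32)), 0.0)
W2 = prune_and_regrow(W)
print(np.count_nonzero(W), np.count_nonzero(W2))  # sparsity level preserved
```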
All these studies pave the way for a range of unexplored possibilities for an effective use of Network Science at the service of society.
PhD Scholarship (Data Science Research Centre, University of Derby)
Analysis, Modeling, and Control of Dynamic Processes in Networks
Dynamic network processes have surrounded people for millennia. Information spread through social networks, alliance formation in financial and organizational networks, heat diffusion through material networks, and distributed synchronization in robotic networks are just a few examples. Network processes are studied along three dimensions: analysis of network processes through the data they produce; design of complex, plausible, yet tractable mathematical models for network processes; and design of control mechanisms that guide network processes towards desirable evolution patterns. This thesis advances the frontier of knowledge about network processes along each of these three dimensions, emphasizing applications to social networks.
The first part of the thesis is dedicated to the design of a method for model-driven analysis of a polar opinion formation process in social networks. The core of the method is a distance measure quantifying the likelihood of a social network's transitioning between different states with respect to a chosen opinion dynamics model characterizing the expected evolution of the network's state. I describe how to design such a distance measure relying upon the classical transportation problem, compute it in linear time, and use it in applications.
In the second part of the thesis, I focus on designing a model for polar opinion formation in social networks and define a class of non-linear models that capture the dependence of the users' opinion formation behavior upon the opinions themselves. The obtained models are connected to socio-psychological theories, and their behavior is theoretically analyzed using tools from non-smooth analysis and a generalization of the LaSalle Invariance Principle.
The third part of the thesis targets the problem of defense against social control.
While the existing socio-psychological theories as well as influence maximization techniques expose the opinion formation process in social networks to external attacks, I propose an algorithm that nullifies the effect of such attacks by strategically recommending a small number of new edges to the network's users. The optimization problem underlying the algorithm is NP-hard, and I provide a pseudo-linear time heuristic, drawing upon the theory of Markov chains, that solves the problem approximately and performs well in experiments.
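The state-evolution viewpoint behind these models can be illustrated with the classical linear DeGroot update, a deliberately simpler baseline than the thesis's non-linear polar-opinion models; the influence matrix below is an arbitrary example.

```python
import numpy as np

def degroot(W, x, steps=100):
    """Iterate the linear opinion update x <- W x, where W is a
    row-stochastic influence matrix and x a vector of opinions."""
    for _ in range(steps):
        x = W @ x
    return x

W = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.5, 0.5]])
x0 = np.array([1.0, 0.0, -1.0])
print(degroot(W, x0))  # converges to consensus; here 0 by symmetry
```

An external attack in this picture shifts the opinions the process converges to, and recommending new edges corresponds to editing W; the thesis's contribution is doing the latter strategically.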
Developing Robust Models, Algorithms, Databases and Tools With Applications to Cybersecurity and Healthcare
As society and technology become increasingly interconnected, so does the threat landscape. Once-isolated threats now pose serious concerns to highly interdependent systems, highlighting the fundamental need for robust machine learning. This dissertation contributes novel tools, algorithms, databases, and models—through the lens of robust machine learning—in a research effort to solve large-scale societal problems affecting millions of people in the areas of cybersecurity and healthcare.
(1) Tools: We develop TIGER, the first comprehensive graph robustness toolbox; and our ROBUSTNESS SURVEY identifies critical yet missing areas of graph robustness research.
(2) Algorithms: Our survey and toolbox reveal existing work has overlooked lateral attacks on computer authentication networks. We develop D2M, the first algorithmic framework to quantify and mitigate network vulnerability to lateral attacks by modeling lateral attack movement from a graph theoretic perspective.
(3) Databases: To prevent lateral attacks altogether, we develop MALNET-GRAPH, the world’s largest cybersecurity graph database—containing over 1.2M graphs across 696 classes—and show the first large-scale results demonstrating the effectiveness of malware detection through a graph medium. We extend MALNET-GRAPH by constructing the largest binary-image cybersecurity database—containing 1.2M images, 133× more images than the only other public database—enabling new discoveries in malware detection and classification research previously restricted to a few industry labs (MALNET-IMAGE).
(4) Models: To protect systems from adversarial attacks, we develop UNMASK, the first model that flags semantic incoherence in computer vision systems, which detects up to 96.75% of attacks and defends the model by correctly classifying up to 93% of attacks. Inspired by UNMASK’s ability to protect computer vision systems from adversarial attack, we develop REST, which creates noise-robust models through a novel combination of adversarial training, spectral regularization, and sparsity regularization. In the presence of noise, our method improves state-of-the-art sleep stage scoring by 71%—allowing us to diagnose sleep disorders earlier and in the home environment—while using 19× fewer parameters and 15× fewer MFLOPS. Our work has made significant impact on industry and society: the UNMASK framework laid the foundation for a multi-million dollar DARPA GARD award; the TIGER toolbox for graph robustness analysis is part of the Nvidia Data Science Teaching Kit, available to educators around the world; we released MALNET, the world’s largest graph classification database with 1.2M graphs; and the D2M framework has had major impact on Microsoft products, inspiring changes to the product’s approach to lateral attack detection.
Ph.D.