53 research outputs found

    Developing Robust Models, Algorithms, Databases and Tools With Applications to Cybersecurity and Healthcare

    As society and technology become increasingly interconnected, so does the threat landscape. Once-isolated threats now pose serious concerns to highly interdependent systems, highlighting the fundamental need for robust machine learning. This dissertation contributes novel tools, algorithms, databases, and models, through the lens of robust machine learning, in a research effort to solve large-scale societal problems affecting millions of people in the areas of cybersecurity and healthcare. (1) Tools: We develop TIGER, the first comprehensive graph robustness toolbox; and our ROBUSTNESS SURVEY identifies critical yet missing areas of graph robustness research. (2) Algorithms: Our survey and toolbox reveal that existing work has overlooked lateral attacks on computer authentication networks. We develop D2M, the first algorithmic framework to quantify and mitigate network vulnerability to lateral attacks by modeling lateral attack movement from a graph theoretic perspective. (3) Databases: To prevent lateral attacks altogether, we develop MALNET-GRAPH, the world’s largest cybersecurity graph database (containing over 1.2M graphs across 696 classes), and show the first large-scale results demonstrating the effectiveness of malware detection through a graph medium. We extend MALNET-GRAPH by constructing the largest binary-image cybersecurity database (MALNET-IMAGE), containing 1.2M images, 133× more images than the only other public database, enabling new discoveries in malware detection and classification research previously restricted to a few industry labs. (4) Models: To protect systems from adversarial attacks, we develop UNMASK, the first model that flags semantic incoherence in computer vision systems, which detects up to 96.75% of attacks, and defends the model by correctly classifying up to 93% of attacks. 
Inspired by UNMASK’s ability to protect computer vision systems from adversarial attack, we develop REST, which creates noise-robust models through a novel combination of adversarial training, spectral regularization, and sparsity regularization. In the presence of noise, our method improves state-of-the-art sleep stage scoring by 71%, allowing us to diagnose sleep disorders earlier and in the home environment, while using 19× fewer parameters and 15× fewer MFLOPS. Our work has made significant impact on industry and society: the UNMASK framework laid the foundation for a multi-million dollar DARPA GARD award; the TIGER toolbox for graph robustness analysis is a part of the Nvidia Data Science Teaching Kit, available to educators around the world; we released MALNET, the world’s largest graph classification database with 1.2M graphs; and the D2M framework has had major impact on Microsoft products, inspiring changes to the product’s approach to lateral attack detection. Ph.D.

    Network-based methods for biological data integration in precision medicine

    The vast and continuously increasing volume of available biomedical data produced during the last decades opens new opportunities for large-scale modelling of disease biology, facilitating a more comprehensive and integrative understanding of its processes. Nevertheless, this type of modelling requires highly efficient computational systems capable of dealing with such data volumes. Computational approaches commonly used in machine learning and data analysis, namely dimensionality reduction and network-based approaches, have been developed with the goal of effectively integrating biomedical data. Among these methods, network-based machine learning stands out for its biomedical interpretability. These methodologies provide a highly intuitive framework for the integration and modelling of biological processes. This PhD thesis explores the potential of integrating complementary available biomedical knowledge with patient-specific data to provide novel computational approaches for biomedical scenarios characterized by data scarcity. The primary focus is on studying how high-order graph analysis (i.e., community detection in multiplex and multilayer networks) may help elucidate the interplay of different types of data in contexts where statistical power is heavily impacted by small sample sizes, such as rare diseases and precision oncology. The thesis illustrates how network biology, among the several data integration approaches with the potential to achieve this task, can play a pivotal role in addressing this challenge given its advantages in molecular interpretability. 
Through its insights and methodologies, it shows how network biology, and in particular models based on multilayer networks, helps bring the vision of precision medicine to these complex scenarios, providing a natural approach for discovering new biomedical relationships that overcomes the difficulties of studying cohorts with limited sample sizes (data-scarce scenarios). Exploring the potential of current artificial intelligence (AI) and network biology applications to address data granularity issues in precision medicine, this PhD thesis presents two research works, based on multilayer networks, that analyse two rare disease scenarios with specific data granularities, overcoming the classical constraints hindering rare disease and precision oncology research. The first research article presents a personalized medicine study of the molecular determinants of severity in congenital myasthenic syndromes (CMS), a group of rare disorders of the neuromuscular junction (NMJ). The analysis of severity in rare diseases, despite its importance, is typically neglected due to limited data availability. In this study, modelling biomedical knowledge via multilayer networks made it possible to understand the functional implications of individual mutations in the cohort under study, as well as their relationships with the causal mutations of the disease and the different levels of severity observed. Moreover, the study presents experimental evidence of the role of a previously unsuspected gene in NMJ activity, validating the hypothetical role predicted using the newly introduced methodologies. The second research article focuses on the applicability of multilayer networks for gene prioritization. 
Building on concepts for the analysis of different data granularities first introduced in the previous article, the presented research provides a methodology based on the persistence of network community structures across a range of modularity resolutions, effectively providing a new framework for gene prioritization for patient stratification. In summary, this PhD thesis presents major advances in the use of multilayer network-based approaches for the application of precision medicine to data-scarce scenarios, exploring the potential of integrating extensive available biomedical knowledge with patient-specific data.
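
The abstract above refers to community structure evaluated over a range of modularity resolution. As a rough illustration only, and not the thesis's method, the sketch below computes Newman modularity with a resolution parameter on a single-layer toy graph and compares two candidate partitions as the resolution changes. The toy graph, the partition names, and the `modularity` helper are all assumptions made for this example; the thesis works with multilayer networks, which this sketch does not model.

```python
from collections import defaultdict


def modularity(edges, community, gamma=1.0):
    """Newman modularity of an undirected, unweighted graph at resolution gamma.

    Uses Q = sum_c [ L_c / m - gamma * (d_c / (2m))^2 ], where L_c is the number
    of edges inside community c, d_c its total degree, and m the edge count.
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # summed node degree per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - gamma * (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}   # one community per triangle
merged = {node: "all" for node in range(6)}                # everything in one community

# Sweep the resolution and record which candidate partition scores higher.
preferred = {
    gamma: "split" if modularity(edges, split, gamma) > modularity(edges, merged, gamma) else "merged"
    for gamma in (0.1, 0.2, 0.5, 1.0, 2.0)
}
```

In this toy case the two-triangle split is preferred over most of the sweep and the merged partition only at low resolution, which is the kind of stability across resolutions that a persistence-based analysis looks for.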

    Enhancing Network Resilience through Machine Learning-powered Graph Combinatorial Optimization: Applications in Cyber Defense and Information Diffusion

    With the rapid advancement of computing and network communication technologies, network infrastructures and their application environments have become increasingly complex. Due to this increased complexity, networks are more prone to hardware faults and highly susceptible to cyber-attacks. Therefore, for rapidly growing network-centric applications, network resilience is essential to minimize the impact of attacks and to ensure that the network provides an acceptable level of service during attacks, faults or disruptions. In this regard, this thesis focuses on developing effective approaches for enhancing network resilience. Existing approaches for enhancing network resilience emphasize determining bottleneck nodes and edges in the network and designing proactive responses to safeguard the network against attacks. However, existing solutions generally consider broader application domains and have limited applicability to specific application areas such as cyber defense and information diffusion, which are frequent targets of cyber attackers. These solutions often prioritize general security measures and may not be able to address complex targeted cyberattacks [147, 149]. Cyber defense and information diffusion application domains usually consist of sensitive networks that attackers target to gain unauthorized access, potentially causing significant financial and reputational loss. This thesis aims to design effective, efficient and scalable techniques for discovering bottleneck nodes and edges in the network to enhance network resilience in cyber defense and information diffusion application domains. We first investigate a cyber defense graph optimization problem, i.e., hardening active directory systems by discovering bottleneck edges in the network. We then study the problem of identifying bottleneck structural hole spanner nodes, which are crucial for information diffusion in the network. 
We transform both problems into graph-combinatorial optimization problems and design machine learning based approaches for discovering bottleneck points vital for enhancing network resilience. This thesis makes the following four contributions. We first study defending active directories by discovering bottleneck edges in the network and make the following two contributions. (1) To defend active directories by discovering and blocking bottleneck edges in the graphs, we first prove that deriving an optimal defensive policy is #P-hard. We design a kernelization technique that reduces the active directory graph to a much smaller condensed graph. We propose an effective edge-blocking defensive policy that combines a neural network-based dynamic program with evolutionary diversity optimization to defend active directory graphs. The key idea is to accurately train the attacking policy to obtain an effective defensive policy. The experimental evaluations on synthetic AD attack graphs demonstrate that our defensive policy generates an effective defense. (2) To harden large-scale active directory graphs, we propose a reinforcement learning based policy that uses evolutionary diversity optimization to generate edge-blocking defensive plans. The main idea is to train the attacker’s policy on multiple independent defensive plan environments simultaneously so as to obtain an effective defensive policy. The experimental results on synthetic AD graphs show that the proposed defensive policy is highly effective, scales better and generates better defensive plans than our previously proposed neural network-based dynamic program and evolutionary diversity optimization approach. We then investigate discovering bottleneck structural hole spanner nodes in the network and make the following two contributions. (3) To discover bottleneck structural hole spanner nodes in large-scale and diverse networks, we propose two graph neural network models, GraphSHS and Meta-GraphSHS. 
The main idea is to transform the SHS identification problem into a learning problem and use the graph neural network models to learn the bottleneck nodes. In addition, the Meta-GraphSHS model learns generalizable knowledge from diverse training graphs to create a customized model that can be fine-tuned to discover SHSs in new, unseen, diverse graphs. Our experimental results show that the proposed models are highly effective and efficient. (4) To identify bottleneck structural hole spanner nodes in dynamic networks, we propose a decremental algorithm and a graph neural network model. The key idea of our proposed algorithm is to reduce re-computation by identifying the nodes affected by updates in the network and performing re-computation for the affected nodes only. Our graph neural network model considers the dynamic network as a series of snapshots and learns to discover SHS nodes in these snapshots. Our experiments demonstrate that the proposed approaches achieve significant speedup over re-computation for dynamic graphs. Thesis (Ph.D.) -- University of Adelaide, School of Computer and Mathematical Sciences, 202
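
The edge-blocking idea in contribution (1) of the abstract above can be pictured with a deliberately small brute-force sketch. It is not the thesis's neural network or evolutionary approach: it simply tries each blockable edge in a hypothetical attack graph and keeps the one that most lengthens, or cuts, the attacker's shortest path. The node names, the graph, and both helper functions are assumptions made for illustration.

```python
from collections import deque


def shortest_path_length(adj, source, target, blocked=frozenset()):
    """BFS hop count from source to target, skipping blocked directed edges.

    Returns None when the target cannot be reached.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in adj.get(node, ()):
            if (node, nxt) in blocked or nxt in dist:
                continue
            dist[nxt] = dist[node] + 1
            queue.append(nxt)
    return None


def best_single_block(adj, source, target, blockable):
    """Brute force: the blockable edge whose removal hurts the attacker most."""
    def score(edge):
        length = shortest_path_length(adj, source, target, frozenset([edge]))
        return float("inf") if length is None else length
    return max(blockable, key=score)


# Hypothetical attack graph: an entry point, two workstations, a server,
# and a domain admin account ("da") as the attacker's goal.
adj = {
    "entry": ["ws1", "ws2"],
    "ws1": ["srv"],
    "ws2": ["srv"],
    "srv": ["da"],
}
blockable = [("entry", "ws1"), ("ws1", "srv"), ("srv", "da")]

baseline = shortest_path_length(adj, "entry", "da")
choice = best_single_block(adj, "entry", "da", blockable)
```

Exhaustive search like this only works on tiny graphs with a budget of one edge, which is consistent with the abstract's motivation for kernelization and learned policies on larger graphs.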

    Complex networks: Structure and dynamics


    Module hierarchy and centralisation in the anatomy and dynamics of human cortex

    Systems neuroscience has recently unveiled numerous fundamental features of the macroscopic architecture of the human brain, the connectome, and we are beginning to understand how characteristics of brain dynamics emerge from the underlying anatomical connectivity. The current work utilises complex network analysis on a high-resolution structural connectivity of the human cortex to identify generic organisation principles, such as centralised, modular and hierarchical properties, as well as specific areas that are pivotal in shaping cortical dynamics and function. After confirming its small-world and modular architecture, we characterise the cortex’s multilevel modular hierarchy, which appears to be reasonably centralised towards the brain’s strong global structural core. The potential functional importance of the core and hub regions is assessed by various complex network metrics, such as integration measures, network vulnerability and motif spectrum analysis. Dynamics facilitated by the large-scale cortical topology are explored by simulating coupled oscillators on the anatomical connectivity. The results indicate that cortical connectivity appears to favour high dynamical complexity over high synchronizability. Taking the ability to entrain other brain regions as a proxy for the threat posed by a potential epileptic focus in a given region, we also show that epileptic foci in topologically more central areas should pose a higher epileptic threat than foci in more peripheral areas. To assess the influence of macroscopic brain anatomy in shaping global resting state dynamics on slower time scales, we compare empirically obtained functional connectivity data with data from simulating dynamics on the structural connectivity. Despite considerable micro-scale variability between the two functional connectivities, our simulations are able to approximate the profile of the empirical functional connectivity. 
Our results outline the combined characteristics of a hierarchically modular and reasonably centralised macroscopic architecture of the human cerebral cortex, which, through these topological attributes, appears to facilitate highly complex dynamics and fundamentally shape brain function.
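
The abstract above mentions simulating coupled oscillators on the anatomical connectivity without naming the model. As a hedged sketch, the code below assumes Kuramoto phase oscillators and forward Euler integration on a tiny hypothetical connectivity matrix, and measures global synchrony with the standard order parameter. The matrix, frequencies, coupling strength, and step size are all illustrative assumptions, not values from the paper.

```python
import cmath
import math


def kuramoto_step(theta, omega, weights, coupling, dt):
    """One forward Euler step of d(theta_i)/dt = omega_i + K * sum_j W_ij * sin(theta_j - theta_i)."""
    n = len(theta)
    updated = []
    for i in range(n):
        drive = sum(weights[i][j] * math.sin(theta[j] - theta[i]) for j in range(n))
        updated.append(theta[i] + dt * (omega[i] + coupling * drive))
    return updated


def order_parameter(theta):
    """Kuramoto order parameter r in [0, 1]; r = 1 means all phases coincide."""
    return abs(sum(cmath.exp(1j * t) for t in theta)) / len(theta)


def simulate(theta, omega, weights, coupling, dt=0.01, steps=2000):
    """Integrate the phases for a fixed number of Euler steps and return the final phases."""
    for _ in range(steps):
        theta = kuramoto_step(theta, omega, weights, coupling, dt)
    return theta


# Hypothetical 3-region connectivity (symmetric, no self-connections).
weights = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
omega = [1.0, 1.0, 1.0]   # identical natural frequencies
theta0 = [0.0, 1.0, 2.0]  # initial phases in radians

r_start = order_parameter(theta0)
r_coupled = order_parameter(simulate(theta0, omega, weights, coupling=1.0))
r_uncoupled = order_parameter(simulate(theta0, omega, weights, coupling=0.0))
```

With identical frequencies the coupled run should drift towards full synchrony while the uncoupled run keeps its initial phase spread. On an empirical connectome the interesting regime is in between, where synchrony is partial and depends on the topology.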

    Topological Complexity of the Electricity Transmission Network. Implications in the Sustainability Paradigm

    This thesis explores the structure, dynamics and evolution of the electricity transmission network from a complex systems perspective, its main objective being the definition of new criteria and tools to help design a more efficient and sustainable transmission power grid. To this end, two data sets have been explored and analyzed. The first is the network of the Union for the Coordination of Transport of Electricity (UCTE), which associates most of the national power grid operators of continental Europe in order to coordinate the production and demand of some 2,300 TWh of energy per year for 450 million customers in 24 countries. The second is the historical evolution of the transport network of the Gestionaire du Réseau du Transport d'Electricité (RTE), which is responsible for operating, maintaining and developing the largest national network in Europe, the French electricity transmission network. The results obtained so far show statistically significant differences in the structure of the power grids, clearly defining particular dynamic behaviors that allow European networks to be separated into two groups, namely fragile and robust. Fragile networks are characterized by more structured, meshed, non-random topologies, while robust ones, counterintuitively, have much more random structures. The consequences of these findings for the sustainability of infrastructure networks are significant in terms of cost and of impact and risk assessment. A model for the temporal and spatial evolution of a power grid is also presented. We suggest that global topological fragility increases when local connectivity schemes are adopted in order to increase reliability at the regional scale. These outcomes call for new power grid design methods and tools capable of including these topological aspects in the assessment of the grid's efficiency and reliability.
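
The abstract above classifies grids as fragile or robust but does not state the metric used. One common way to quantify topological robustness, assumed here purely for illustration, is the fraction of nodes that remain in the largest connected component after some nodes are removed. The toy star and ring graphs and the helper name are hypothetical and are not taken from the UCTE or RTE data.

```python
def largest_component_fraction(n, edges, removed=()):
    """Fraction of the original n nodes in the largest connected component after node removal."""
    removed = set(removed)
    # Adjacency lists restricted to the surviving nodes.
    adj = {v: [] for v in range(n) if v not in removed}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first search over one component
            node = stack.pop()
            size += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best / n


# Hypothetical 5-node topologies: a hub-and-spoke star and a ring.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

star_after_hub_loss = largest_component_fraction(5, star, removed=[0])
ring_after_node_loss = largest_component_fraction(5, ring, removed=[0])
```

Losing the hub shatters the star while the ring degrades gently, which shows how the same number of failures can have very different effects depending on topology. Real grid studies typically repeat this over many removal sequences, both random and targeted.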

    On Business Analytics: Dynamic Network Analysis for Descriptive Analytics and Multicriteria Decision Analysis for Prescriptive Analytics.

    Ferry Jules. Collèges communaux. — Classement des professeurs. In: Bulletin administratif de l'instruction publique. Tome 24 n°467, 1881. pp. 836-842