9,548 research outputs found

    A heuristic optimization method for mitigating the impact of a virus attack

    Taking precautions before or during the start of a virus outbreak can greatly reduce the number of infected individuals. The question of which individuals should be immunized in order to mitigate the impact of the virus on the rest of the population has received considerable attention in the literature. The dynamics of a virus spreading through a population are often represented as information spread over a complex network. The strategies commonly proposed to determine which nodes to select for immunization typically involve only one centrality measure at a time, while the topology of the network often suggests that a single metric is insufficient to fully capture the influence of a node. In this work we present a generic method based on a genetic algorithm (GA) which does not rely explicitly on any centrality measures during its search, but only exploits this type of information to narrow the search space. The fitness of an individual is defined as the estimated expected number of infections of a virus following SIR dynamics. The proposed method is evaluated on two contact networks: Goodreau's Faux Mesa high school and the US air transportation network. The GA method outperforms the most common single-metric strategies on the air transportation network, and its performance is comparable with the best-performing strategy on the high school network.
    Comment: To appear in the proceedings of the International Conference on Computational Science (ICCS) in Barcelona. 11 pages, 5 figures
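    As a rough illustration of the approach this abstract describes, the sketch below is an assumption-laden toy, not the authors' implementation: it evolves a fixed-size set of nodes to immunize, scoring each candidate set by the average outbreak size over repeated discrete-time SIR simulations. The immunization budget, SIR rates, GA settings, and the top-degree candidate pool used to narrow the search space are all illustrative assumptions.

```python
import random
import networkx as nx

BETA, GAMMA = 0.2, 0.1   # assumed per-step SIR infection / recovery probabilities
BUDGET = 5               # assumed immunization budget (nodes we may vaccinate)
N_SIM = 20               # SIR runs averaged per fitness evaluation

def outbreak_size(G, immunized):
    """One discrete-time SIR run from a random seed; returns the ever-infected count."""
    susceptible = set(G) - set(immunized)
    if not susceptible:
        return 0
    infected = {random.choice(sorted(susceptible))}
    susceptible -= infected
    recovered = set()
    while infected:
        newly = {v for u in infected for v in G.neighbors(u)
                 if v in susceptible and random.random() < BETA}
        susceptible -= newly
        done = {u for u in infected if random.random() < GAMMA}
        recovered |= done
        infected = (infected | newly) - done
    return len(recovered)

def fitness(G, individual):
    """Estimated expected number of infections (lower is better)."""
    return sum(outbreak_size(G, individual) for _ in range(N_SIM)) / N_SIM

def evolve(G, pop_size=30, generations=20):
    # Centrality is used only to narrow the search space: candidates are the
    # top ~20% of nodes by degree (an illustrative choice).
    pool = sorted(G, key=G.degree, reverse=True)[:max(BUDGET, len(G) // 5)]
    pop = [random.sample(pool, BUDGET) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(G, ind))
        survivors = pop[:pop_size // 2]
        pop = survivors[:]
        while len(pop) < pop_size:
            a, b = random.sample(survivors, 2)
            child = random.sample(list(set(a) | set(b)), BUDGET)  # crossover
            if random.random() < 0.2:                             # mutation
                child[random.randrange(BUDGET)] = random.choice(pool)
            if len(set(child)) < BUDGET:                          # repair duplicates
                child = random.sample(pool, BUDGET)
            pop.append(child)
    return min(pop, key=lambda ind: fitness(G, ind))

G = nx.erdos_renyi_graph(100, 0.05, seed=1)
best = evolve(G)
print("immunize:", sorted(best), "-> est. expected infections:", fitness(G, best))
```

    Because each fitness evaluation averages stochastic SIR runs, the GA trades accuracy against cost through N_SIM; a larger value gives a less noisy ranking of candidate sets at proportionally higher runtime.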

    Network inference and community detection, based on covariance matrices, correlations and test statistics from arbitrary distributions

    In this paper we propose a methodology for inferring binary-valued adjacency matrices from various measures of the strength of association between pairs of network nodes, or more generally pairs of variables. This strength of association can be quantified by sample covariance and correlation matrices, and more generally by test statistics and hypothesis-test p-values from arbitrary distributions. Community detection methods such as block modelling typically require binary-valued adjacency matrices as a starting point; hence, a main motivation for the proposed methodology is to obtain such matrices from pairwise measures of the strength of association between variables. The methodology is applicable to large high-dimensional datasets and is based on computationally efficient algorithms. We illustrate its utility in a range of contexts and datasets.
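    A pipeline of this kind can be sketched concretely. The toy below is an assumption, not the paper's algorithm: it derives two-sided p-values from pairwise sample correlations and thresholds them with a Benjamini-Hochberg FDR correction to produce the binary adjacency matrix that block-modelling-style community detection expects. The correction rule and the 5% level are illustrative choices.

```python
import numpy as np
from scipy import stats

def adjacency_from_correlations(X, alpha=0.05):
    """X: (n_samples, p_variables) data matrix -> p x p binary adjacency matrix."""
    n, p = X.shape
    r = np.corrcoef(X, rowvar=False)                    # sample correlations
    iu = np.triu_indices(p, k=1)                        # each variable pair once
    t = r[iu] * np.sqrt((n - 2) / (1.0 - r[iu] ** 2))   # t-statistic per pair
    pvals = 2.0 * stats.t.sf(np.abs(t), df=n - 2)       # two-sided p-values
    # Benjamini-Hochberg: largest k with p_(k) <= alpha * k / m sets the cutoff
    m = pvals.size
    order = np.argsort(pvals)
    passed = np.nonzero(pvals[order] <= alpha * np.arange(1, m + 1) / m)[0]
    keep = np.zeros(m, dtype=bool)
    if passed.size:
        keep[order[:passed.max() + 1]] = True
    A = np.zeros((p, p), dtype=int)
    A[iu] = keep
    return A + A.T                                       # symmetric, 0/1-valued

# Example: the resulting A can feed a block model or other community detector.
X = np.random.default_rng(0).normal(size=(200, 50))
A = adjacency_from_correlations(X)
print("edges kept:", A.sum() // 2)
```

    Because the thresholding operates on a flat vector of p-values, the same skeleton applies unchanged to test statistics or p-values from any other distribution, which is the generality the abstract emphasizes.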

    A Parameterized Centrality Metric for Network Analysis

    A variety of metrics have been proposed to measure the relative importance of nodes in a network. One of these, alpha-centrality [Bonacich, 2001], measures the number of attenuated paths that exist between nodes. We introduce a normalized version of this metric and use it to study network structure, specifically to rank nodes and find the community structure of the network. To this end, we extend the modularity-maximization method [Newman and Girvan, 2004] for community detection to use this metric as the measure of node connectivity. Normalized alpha-centrality is a powerful tool for network analysis, since it contains a tunable parameter that sets the length scale of interactions. Studying how rankings and discovered communities change as this parameter is varied allows us to identify locally and globally important nodes and structures. We apply the proposed method to several benchmark networks and show that it leads to better insight into network structure than alternative methods.
    Comment: 11 pages, submitted to Physical Review
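    For concreteness, a minimal sketch of alpha-centrality and of varying its parameter follows. The linear solve implements Bonacich's x = (I - alpha * A^T)^{-1} e; the sum-to-one normalization and the parameter values are illustrative assumptions, not necessarily the paper's definitions.

```python
import numpy as np
import networkx as nx

def alpha_centrality(G, alpha=0.1, e=None):
    """Solve x = (I - alpha * A^T)^{-1} e; requires alpha < 1/lambda_max(A)."""
    A = nx.to_numpy_array(G)
    n = A.shape[0]
    if e is None:
        e = np.ones(n)                        # uniform exogenous status
    lam = max(abs(np.linalg.eigvals(A)))      # spectral radius of A
    assert alpha < 1 / lam, "alpha must lie below 1/lambda_max for convergence"
    x = np.linalg.solve(np.eye(n) - alpha * A.T, e)
    return x / x.sum()                        # assumed normalization

G = nx.karate_club_graph()
# The tunable parameter sets the interaction length scale: small alpha weights
# short paths (local importance), larger alpha weights longer paths (global).
for a in (0.01, 0.05, 0.1):
    ranks = alpha_centrality(G, alpha=a)
    print(a, np.argsort(-ranks)[:5])          # top-5 nodes at this length scale
```

    Comparing the printed rankings across values of alpha is the kind of parameter sweep the abstract uses to separate locally from globally important nodes.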

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., the genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
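    As a toy illustration of these challenges in an "early integration" setting, the sketch below is an assumption throughout (synthetic data, and modelling choices that are not the review's prescriptions): it concatenates two omics blocks, imputes missing values, and uses class weighting plus L2 regularization to cope with imbalance and high dimensionality.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
genome = rng.normal(size=(120, 300))                    # e.g., SNP-derived features
transcriptome = rng.normal(size=(120, 500))             # e.g., expression features
transcriptome[rng.random(transcriptome.shape) < 0.1] = np.nan  # missing data
y = (rng.random(120) < 0.15).astype(int)                # imbalanced labels (~15% cases)

X = np.hstack([genome, transcriptome])                  # naive early integration
clf = make_pipeline(
    SimpleImputer(strategy="median"),                   # missing-data challenge
    StandardScaler(),
    LogisticRegression(penalty="l2", C=0.1, max_iter=1000,
                       class_weight="balanced"),        # imbalance + dimensionality
)
clf.fit(X, y)
```

    Simple concatenation like this ignores data heterogeneity across modalities; intermediate and late integration strategies, which the review's scope covers, model each block separately before combining.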