32,379 research outputs found

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    Scalable BGP Prefix Selection for Effective Inter-domain Traffic Engineering

    Full text link
    Inter-domain Traffic Engineering for multi-homed networks faces a scalability challenge, as the size of BGP routing table continue to grow. In this context, the choice of the best path must be made potentially for each destination prefix, requiring all available paths to be characterised (e.g., through measurements) and compared with each other. Fortunately, it is well-known that a few number of prefixes carry the larger part of the traffic. As a natural consequence, to engineer large volume of traffic only few prefixes need to be managed. Yet, traffic characteristics of a given prefix can greatly vary over time, and little is known on the dynamism of traffic at this aggregation level, including predicting the set of the most significant prefixes in the near future. %based on past observations. Sophisticated prediction methods won't scale in such context. In this paper, we study the relationship between prefix volume, stability, and predictability, based on recent traffic traces from nine different networks. Three simple and resource-efficient methods to select the prefixes associated with the most important foreseeable traffic volume are then proposed. Such proposed methods allow to select sets of prefixes with both excellent representativeness (volume coverage) and stability in time, for which the best routes are identified. The analysis carried out confirm the potential benefits of a route decision engine
    • …
    corecore