    Extreme-scaling Applications 24/7 on JUQUEEN Blue Gene/Q

    Jülich Supercomputing Centre has offered Extreme Scaling Workshops since 2009, with the latest edition in February 2015 giving seven international code teams an opportunity to (im)prove the scaling of their applications to all 458752 cores of the JUQUEEN IBM BlueGene/Q. Each of them successfully adapted their application codes and datasets to the restricted compute-node memory and exploited the massive parallelism with up to 1.8 million processes or threads. They thereby qualified to become members of the High-Q Club, which now has over 24 codes demonstrating extreme scalability. Achievements in both strong and weak scaling are compared, and complemented with a review of programming languages and parallelisation paradigms, exploitation of hardware threads, and file I/O requirements.

    Exploring Scientific Application Performance Using Large Scale Object Storage

    One of the major performance and scalability bottlenecks in large scientific applications is parallel reading and writing to supercomputer I/O systems. The use of parallel file systems and the consistency requirements of POSIX, to which all the traditional HPC parallel I/O interfaces adhere, limit the scalability of scientific applications. Object storage is a widely used storage technology in cloud computing and is increasingly proposed for HPC workloads to improve the scalability and performance of I/O in scientific applications. While object storage is a promising technology, it is still unclear how scientific applications will use object storage and what the main performance benefits will be. This work addresses these questions by emulating an object store used by a traditional scientific application and evaluating potential performance benefits. We show that scientific applications can benefit from the use of object storage at large scale. Comment: Preprint submitted to WOPSSS workshop at ISC 201

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.