5 research outputs found

    Butler enables rapid cloud-based analysis of thousands of human genomes.

    Get PDF
    We present Butler, a computational tool that facilitates large-scale genomic analyses on public and academic clouds. Butler includes innovative anomaly detection and self-healing functions that improve the efficiency of data processing and analysis by 43% compared with current approaches. Butler enabled processing of a 725-terabyte cancer genome dataset from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project in a time-efficient and uniform manner

    Butler enables rapid cloud-based analysis of thousands of human genomes

    Get PDF
    We present Butler, a computational tool that facilitates large-scale genomic analyses on public and academic clouds. Butler includes innovative anomaly detection and self-healing functions that improve the efficiency of data processing and analysis by 43% compared with current approaches. Butler enabled processing of a 725-terabyte cancer genome dataset from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project in a time-efficient and uniform manner

    Butler enables rapid cloud-based analysis of thousands of human genomes

    No full text
    Cloud computing offers easy and economical access to computational capacity at a scale that had previously been available to only the largest research institutions. To take advantage, large biological datasets are increasingly analyzed on various cloud computing platforms, using public, private and hybrid clouds1 with the aid of workflow systems. When employed in global projects, such systems must be flexible in their ability to operate in different environments, including academic clouds, to allow researchers to bring their computational pipelines to the data, especially in cases where the raw data themselves cannot be moved. The recently developed cloud-based scientific workflow frameworks Nextflow2, Toil3 and GenomeVIP4 focus their support largely on individual commercial cloud computing environments—mostly Amazon Web Services—and lack complete functionality for other major providers. This limits their use in studies that require multi-cloud operation due to practical and regulatory requirements5,6. Butler, in contrast, provides full support for operation on OpenStack-based commercial and academic clouds, Amazon Web Services, Microsoft Azure and Google Compute Platform, and can thus enable international collaborations involving the analysis of hundreds of thousands of samples where distributed cloud-based computation is pursued in different jurisdictions5,6,7.ISSN:1546-1696ISSN:1087-015
    corecore