36 research outputs found

    Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services

    Get PDF
    ABSTRACT We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. This system achieves a high degree of end-to-end automation that encompasses every stage of data analysis including initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on-demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). The system allows biomedical researchers to perform rapid analysis of large NGS datasets in a fully automated manner, without software installation or a need for any local computing infrastructure. We report performance and cost results for some representative workloads

    GWAS Meta-Analysis of Suicide Attempt: Identification of 12 Genome-Wide Significant Loci and Implication of Genetic Risks for Specific Health Factors

    Get PDF

    Dissecting the Shared Genetic Architecture of Suicide Attempt, Psychiatric Disorders, and Known Risk Factors

    Get PDF
    Background Suicide is a leading cause of death worldwide, and nonfatal suicide attempts, which occur far more frequently, are a major source of disability and social and economic burden. Both have substantial genetic etiology, which is partially shared and partially distinct from that of related psychiatric disorders. Methods We conducted a genome-wide association study (GWAS) of 29,782 suicide attempt (SA) cases and 519,961 controls in the International Suicide Genetics Consortium (ISGC). The GWAS of SA was conditioned on psychiatric disorders using GWAS summary statistics via multitrait-based conditional and joint analysis, to remove genetic effects on SA mediated by psychiatric disorders. We investigated the shared and divergent genetic architectures of SA, psychiatric disorders, and other known risk factors. Results Two loci reached genome-wide significance for SA: the major histocompatibility complex and an intergenic locus on chromosome 7, the latter of which remained associated with SA after conditioning on psychiatric disorders and replicated in an independent cohort from the Million Veteran Program. This locus has been implicated in risk-taking behavior, smoking, and insomnia. SA showed strong genetic correlation with psychiatric disorders, particularly major depression, and also with smoking, pain, risk-taking behavior, sleep disturbances, lower educational attainment, reproductive traits, lower socioeconomic status, and poorer general health. After conditioning on psychiatric disorders, the genetic correlations between SA and psychiatric disorders decreased, whereas those with nonpsychiatric traits remained largely unchanged. Conclusions Our results identify a risk locus that contributes more strongly to SA than other phenotypes and suggest a shared underlying biology between SA and known risk factors that is not mediated by psychiatric disorders.Peer reviewe

    The Case for Optimized Edge-Centric Tractography at Scale.

    No full text
    The anatomic validity of structural connectomes remains a significant uncertainty in neuroimaging. Edge-centric tractography reconstructs streamlines in bundles between each pair of cortical or subcortical regions. Although edge bundles provides a stronger anatomic embedding than traditional connectomes, calculating them for each region-pair requires exponentially greater computation. We observe that major speedup can be achieved by reducing the number of streamlines used by probabilistic tractography algorithms. To ensure this does not degrade connectome quality, we calculate the identifiability of edge-centric connectomes between test and re-test sessions as a proxy for information content. We find that running PROBTRACKX2 with as few as 1 streamline per voxel per region-pair has no significant impact on identifiability. Variation in identifiability caused by streamline count is overshadowed by variation due to subject demographics. This finding even holds true in an entirely different tractography algorithm using MRTrix. Incidentally, we observe that Jaccard similarity is more effective than Pearson correlation in calculating identifiability for our subject population

    FAIR for AI: An interdisciplinary and international community building perspective

    No full text
    A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022
    corecore