
    Benefits of the Snakemake Workflow Management Software in Comparison to Traditional Programming (Paper)

    Tools surrounding bioinformatics have increased data acquisition and accuracy significantly, especially with near-real-time results using nanopore DNA sequencing. With large amounts of data, reproducibility is of high importance, and long workflows can become convoluted. Snakemake, built on the Common Workflow Language and Python, aims to alleviate this with readable formatting, reproducibility, and portability for any machine. Using 97 FASTQ files, the usability of these three traits was compared between a Bash and a Snakemake workflow using a range of one to twelve threads. In every test, Snakemake was faster than Bash. At its fastest, Snakemake was 27% faster than Bash. Reproducibility of both workflows was verified using an MD5 hash of results. The hashes differed between the workflows; this may be a result of executing the workflows in two different terminal environments. Despite this, hashing is a valid method of validating reproducibility between tests within individual workflows. Outside speed tests, Snakemake offers quality-of-life features that allow it to pull ahead of Bash. Containerization of workflows using Conda is one example of this. The ability to require specific versions of software within a workflow boosts reproducibility. Additionally, portability is increased because the container can be deployed almost anywhere, and the required software can be downloaded on an as-needed basis. With readability comes maintainability. Snakemake will almost always pull ahead of Bash in this regard with its simple input, output, and shell fields. The field of bioinformatics is moving very quickly, and it can be difficult for traditional Bash scripts to keep up in certain aspects. While Bash is paramount in the execution of some software, more powerful tools like Snakemake are required to handle the execution of an entire, complex workflow.
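The abstract does not say how the MD5 hash of the results was computed. As a minimal sketch of the general idea, assuming each run writes its results to a directory, a check along the following lines could compare two runs. The function names (`md5_of_file`, `md5_of_results`, `runs_match`) and the directory layout are hypothetical and are not taken from the paper.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_results(results_dir):
    """Fold every file's relative path and MD5 digest into one digest.

    Files are visited in sorted order so the result does not depend on
    the order in which the file system lists them.
    """
    root = Path(results_dir)
    combined = hashlib.md5()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        combined.update(path.relative_to(root).as_posix().encode("utf-8"))
        combined.update(md5_of_file(path).encode("ascii"))
    return combined.hexdigest()


def runs_match(dir_a, dir_b):
    """True when two result directories hold identical files (hypothetical helper)."""
    return md5_of_results(dir_a) == md5_of_results(dir_b)
```

A check of this kind only reports byte-for-byte equality, so differences that do not affect the scientific result, such as embedded timestamps or line endings introduced by a different environment, would also change the hash.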

    Digits: Two Reports on New Units of Scholarly Publication

    The Digits team (Matt Burton, Matthew J. Lavin, Jessica Otis, and Scott B. Weingart) convened around the question of how we might share, preserve, and legitimize scholarship freed from the affordances of print. For the A.W. Mellon-funded Digits Planning Grant (2016-2018), the PIs had three goals:
    - Investigate the use of software containers for research in the sciences, social sciences, and humanities.
    - Assess the infrastructural needs of digital humanists around publishing and preserving web-centric scholarship.
    - Gather a team of experts to guide the above activities and plan how they might inform a beneficial intervention into the scholarly ecosystem.
    Through our investigation into the scholarly uses of containers, we discovered that the technical infrastructure needed to connect containers with digital publications is underdeveloped. We see potential for container technologies to facilitate existing digital scholarly publications and afford new forms of computational scholarship, but this process would first require a series of infrastructural bridges. The digital scholarship needs assessment we conducted, as well as our advisory board meetings, made it clear that a targeted technological intervention alone would not be enough to welcome web-first publications into the scholarly ecosystem; in-tandem cultural and institutional changes are also necessary.