18 research outputs found

    <i>The cowboy</i> wrangles data into publication without analyzing it properly.

    <p>Researchers should beware of potential confounding effects and statistical biases that could lead to inappropriate conclusions. <i>In silico</i> and mechanistic validations can also overcome cowboy tendencies. Image credit: Dan Madsen.</p>

    <i>The jailer</i> guards research data and tools under lock and key to maintain her competitive advantage, even though sharing would advance general scientific progress.

    <p>Having published, researchers should openly share their methods and data with the community. Image credit: Dan Madsen and Devika Joglekar.</p>

    <i>The gold miner</i> keeps digging until a “significant” result surfaces.

    <p>Researchers should stay true to their original experimental design, use positive and negative control experiments, and be open about the approaches that were attempted but failed. Image credit: Dan Madsen.</p>

    <i>The farmer</i> builds a vast storehouse of genomic data but falls short on experimental design.

    <p>Prior to “planting,” researchers should define clear objectives, identify suitable analytical approaches, and consider sample-size requirements, confounding variables, and evaluation measurements. Image credit: Dan Madsen.</p>

    <i>The master</i> has unreasonable expectations about the expertise and time required to complete genomics research tasks, and <i>the servant</i> submits too willingly to those expectations.

    <p>Front-line researchers should insist on adequate training and supervision, whereas mentors should take the long view on scientific training needs. Image credit: Dan Madsen.</p>

    <i>The hermit</i> insists on scientific isolation and fails to realize that, in most cases, success in genomics research hinges upon collaboration among a broad range of scientists.

    <p>Open-mindedness toward the conventions and idiosyncrasies of researchers from other domains is key to avoiding the hermit's existence. Image credit: Dan Madsen and Devika Joglekar.</p>

    OmniScope: a Computational Pipeline for Metagenomic Species Identification Using Reference and de novo Assembly

    Metagenomics has revolutionized the field of microbiology and promises to impact clinical practice as well. While the number of genomes available for reference-based metagenomic pathogen identification keeps increasing, it is still difficult to classify most of the reads from a metagenomic experiment because of intra-species diversity and uncharacterized pathogens. Here, we propose to combine reference-based metagenomic profiling (faster) with de novo metagenomic assembly (more accurate) to maximize the number of reads used and to allow the discovery of novel species in the data that are not identified by reference-based methods. We take advantage of the fact that homologous sequences among related but different species form detectable peaks in coverage. Reads belonging to those peaks are then extracted and assembled into contigs. Finally, using a de novo strategy that involves storing the de Bruijn graph in Bloom filters, we take the unmapped reads and, together with the contigs, create a hybrid assembly that increases the number of species discovered. We provide a proof of concept and discuss potential applications for both clinical and environmental samples. Test data and code are freely available on GitHub at www.github.com/mjmiossec/omniscope.
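
The idea of storing a de Bruijn graph in a Bloom filter can be illustrated with a minimal Python sketch. It is not taken from the OmniScope code: the class `KmerBloomFilter`, the helper names, the SHA-256 based hashing, and the filter size are all assumptions chosen for illustration. Each k-mer is a node, and edges are left implicit: a neighbor exists if the filter reports one of the four possible one-base extensions as present.

```python
import hashlib


class KmerBloomFilter:
    """Hypothetical Bloom filter over k-mers (illustrative, not OmniScope's)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, kmer):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )


def kmers(read, k):
    """All overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_filter(reads, k):
    """Insert every k-mer of every read into a fresh filter."""
    bloom = KmerBloomFilter()
    for read in reads:
        for kmer in kmers(read, k):
            bloom.add(kmer)
    return bloom


def successors(bloom, kmer):
    """Implicit de Bruijn edges: query the four one-base extensions."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]


bloom = build_filter(["ACGTAC"], k=3)
next_nodes = successors(bloom, "ACG")  # contains "CGT"; may also hold false positives
```

Because membership queries can yield false positives, a real assembler needs an extra step to prune spurious edges; that step is omitted here.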

    Evaluation of Computational Methods for Human Microbiome Analysis Using Simulated Data

    <p>Our understanding of the composition, function, and health implications of human microbiota has been advanced by high-throughput sequencing and the development of new genomic analyses. However, tradeoffs among alternative strategies for the acquisition and analysis of sequence data remain understudied. How do sequencing layout, sample complexity, and analysis pipeline affect taxonomic profiles? To address this question, we simulated metagenomic datasets reflecting different read lengths (75-1000 bp), sequencing depths (100 k-10 M), and numbers of species (10-426). Likewise, we simulated different database composition scenarios, including presence/absence of dominant microbes in the database. The resulting simulation design yielded ~144 datasets, which were analyzed using six pipelines (MetaPhlAn2, metaMix, PathoScope2, Sigma, Kraken, and ConStrains).</p><p>We evaluated pipeline performance based on ROC analysis (specificity/sensitivity), relative root mean square error, and average relative error. Our study enables researchers to make informed decisions about the strengths and weaknesses of current taxonomic profiling methods, and to adjust their sequencing experiments accordingly. All datasets and parameter values used in the study are freely available to ensure reproducibility and future pipeline benchmarking.</p>
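
The three evaluation measures named in this abstract can be sketched in a few lines of Python. The abstract does not give formulas, so the definitions below are assumptions: relative errors are computed per taxon as (estimated - true) / true over taxa with nonzero true abundance, and sensitivity/specificity are computed from presence/absence calls against a hypothetical set of candidate taxa. Function and variable names are illustrative only.

```python
import math


def relative_errors(true_abund, est_abund):
    """Per-taxon (estimated - true) / true, for taxa truly present."""
    errs = [
        (est_abund.get(taxon, 0.0) - truth) / truth
        for taxon, truth in true_abund.items()
        if truth > 0
    ]
    if not errs:
        raise ValueError("no taxa with nonzero true abundance")
    return errs


def average_relative_error(true_abund, est_abund):
    """Mean of the absolute per-taxon relative errors."""
    errs = relative_errors(true_abund, est_abund)
    return sum(abs(e) for e in errs) / len(errs)


def relative_rmse(true_abund, est_abund):
    """Square root of the mean squared per-taxon relative error."""
    errs = relative_errors(true_abund, est_abund)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


def sensitivity_specificity(true_taxa, called_taxa, candidate_taxa):
    """Presence/absence performance over an assumed candidate set."""
    true_taxa, called_taxa = set(true_taxa), set(called_taxa)
    negatives = set(candidate_taxa) - true_taxa
    if not true_taxa or not negatives:
        raise ValueError("need at least one true and one absent taxon")
    tp = len(true_taxa & called_taxa)
    tn = len(negatives - called_taxa)
    return tp / len(true_taxa), tn / len(negatives)


# Toy profile: taxon A is underestimated and taxon C is a false call.
truth = {"A": 0.5, "B": 0.5}
estimate = {"A": 0.25, "B": 0.5, "C": 0.25}
are = average_relative_error(truth, estimate)   # 0.25
rrmse = relative_rmse(truth, estimate)          # sqrt(0.125)
sens, spec = sensitivity_specificity(truth, estimate, {"A", "B", "C", "D"})
```

A full ROC analysis would repeat the sensitivity/specificity calculation while sweeping an abundance threshold on the calls; only a single operating point is shown here.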

    PathoScope reads alignment summaries.

    <p>EBV, Epstein-Barr virus; CDV, canine distemper virus.</p><p><sup>a</sup>Control pool generated from laboratory-raised <i>An. gambiae</i> mosquitoes that fed upon sheep’s blood.</p><p><sup>b</sup>Denotes percentage after PathoQC.</p><p><sup>c</sup>Denotes reads aligning to the sheep reference library.</p><p><sup>d</sup>Denotes reads aligned to EBV strain B95–8 (GenBank V01555.2).</p><p><sup>e</sup>Denotes reads aligned to CDV strain Uy251 (GenBank KM280689.1).</p>

    Xenosurveillance: A Novel Mosquito-Based Approach for Examining the Human-Pathogen Landscape

    <div><p>Background</p><p>Globally, regions at the highest risk for emerging infectious diseases are often the ones with the fewest resources. As a result, implementing sustainable infectious disease surveillance systems in these regions is challenging. The cost of these programs and the difficulties associated with collecting, storing, and transporting relevant samples have hindered them in the regions where they are most needed. Therefore, we tested the sensitivity and feasibility of a novel surveillance technique called xenosurveillance. This approach utilizes the host feeding preferences and behaviors of <i>Anopheles gambiae</i>, which are highly anthropophilic and rest indoors after feeding, to sample viruses in human beings. We hypothesized that mosquito bloodmeals could be used to detect vertebrate viral pathogens within realistic field collection timeframes and at clinically relevant concentrations.</p><p>Methodology/Principal Findings</p><p>To validate this approach, we used our laboratory model to examine variables influencing virus detection, such as the duration between mosquito blood feeding and mosquito processing, the stability of pathogen nucleic acid in the mosquito gut, and the pathogen load present in the host’s blood at the time of bloodmeal ingestion. Our findings revealed that viral nucleic acids, at clinically relevant concentrations, could be detected from engorged mosquitoes for up to 24 hours post feeding by qRT-PCR. Subsequently, we tested this approach in the field by examining blood from engorged mosquitoes from two field sites in Liberia. Using next-generation sequencing and PCR, we were able to detect the genetic signatures of multiple viral pathogens, including Epstein-Barr virus and canine distemper virus.</p><p>Conclusions/Significance</p><p>Together, these data demonstrate the feasibility of xenosurveillance and, in doing so, validate a simple and non-invasive surveillance tool that could be used to complement current biosurveillance efforts.</p></div>