892 research outputs found

    Bioinformatics tools for analysing viral genomic data

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

    Spaced seeds improve k-mer-based metagenomic classification

    Metagenomics is a powerful approach to studying the genetic content of environmental samples, and it has been strongly promoted by NGS technologies. To cope with the massive data involved in modern metagenomic projects, recent tools [4, 39] rely on the analysis of k-mers shared between the read to be classified and sampled reference genomes. Within this general framework, we show in this work that spaced seeds provide a significant improvement in classification accuracy over traditional contiguous k-mers. We support this thesis through a series of different computational experiments, including simulations of large-scale metagenomic projects. Scripts and programs used in this study, as well as supplementary material, are available from http://github.com/gregorykucherov/spaced-seeds-for-metagenomics. Comment: 23 pages.
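    The difference between contiguous k-mers and spaced seeds can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy sequences, and the mask `110011` are assumptions chosen for illustration. A mask marks which positions of a sliding window are kept (`1`) or ignored (`0`), and a contiguous k-mer is the special case of an all-ones mask.

```python
from typing import Iterator


def spaced_kmers(seq: str, mask: str) -> Iterator[str]:
    """Yield one spaced k-mer per window: the characters at '1' positions of mask."""
    keep = [i for i, c in enumerate(mask) if c == "1"]
    span = len(mask)
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        yield "".join(window[i] for i in keep)


def shared_fraction(read: str, ref: str, mask: str) -> float:
    """Fraction of the read's spaced k-mers that also occur in the reference."""
    ref_kmers = set(spaced_kmers(ref, mask))
    read_kmers = list(spaced_kmers(read, mask))
    if not read_kmers:
        return 0.0
    return sum(k in ref_kmers for k in read_kmers) / len(read_kmers)


if __name__ == "__main__":
    # Hypothetical toy data: the read differs from the reference at one position.
    ref = "ACGTTGCAAT"
    read = "ACGTAGCAAT"
    for mask in ("1111", "110011"):  # contiguous 4-mer vs. a weight-4 spaced seed
        print(mask, shared_fraction(read, ref, mask))
```

    Both masks have weight 4, so they index the same number of positions per window. A toy pair like this one says nothing about which mask classifies better; the accuracy comparison is what the paper's experiments address.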

    Whole genome metagenomic analysis of the gut microbiome of differently fed infants identifies differences in microbial composition and functional genes, including an absent CRISPR/Cas9 gene in the formula-fed cohort

    Background: Advancements in sequencing capabilities have enhanced the study of the human microbiome. There are limited studies focused on the gastro-intestinal (gut) microbiome of infants, particularly on the impact of diet in breast-fed (BF) versus formula-fed (FF) infants. It is unclear what effect, if any, early feeding has on the short-term or long-term composition and function of the gut microbiome. Results: Using a shotgun metagenomics approach, differences in the gut microbiome between BF (n = 10) and FF (n = 5) infants were detected. A Jaccard distance principal coordinate analysis was able to cluster BF versus FF infants based on the presence or absence of species identified in their gut microbiome. Thirty-two genera were identified as statistically different in the gut microbiome sequenced between BF and FF infants. Furthermore, the computational workflow identified 371 bacterial genes that were statistically different in abundance between the BF and FF cohorts. Only seven genes were lower in abundance (or absent) in the FF cohort compared to the BF cohort, including CRISPR/Cas9, whereas the remaining candidates, including autotransporter adhesins, were higher in abundance in the FF cohort compared to the BF cohort. Conclusions: These studies demonstrated that FF infants have, at an early age, a significantly different gut microbiome with potential implications for the function of the fecal microbiota. Interactions between the fecal microbiota and host hinted at here have been linked to numerous diseases. Determining whether these non-abundant or more abundant genes have biological consequence related to infant feeding may aid in understanding the adult gut microbiome, and the pathogenesis of obesity.
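    The presence/absence comparison underlying a Jaccard distance analysis can be sketched as follows. This is a minimal illustration and not the study's workflow: the sample labels and species names are hypothetical, and the ordination step (principal coordinate analysis) that would take the resulting matrix as input is not shown.

```python
from typing import Dict, List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(samples: Dict[str, Set[str]]) -> List[List[float]]:
    """Pairwise Jaccard distances, rows and columns in the dict's insertion order."""
    names = list(samples)
    return [[jaccard_distance(samples[x], samples[y]) for y in names] for x in names]


if __name__ == "__main__":
    # Hypothetical species detected per sample (presence/absence only).
    samples = {
        "BF_1": {"species_a", "species_b", "species_c"},
        "BF_2": {"species_a", "species_b"},
        "FF_1": {"species_c", "species_d", "species_e"},
    }
    for name, row in zip(samples, distance_matrix(samples)):
        print(name, [round(d, 2) for d in row])
```

    Because the measure uses only which species are present, two samples with the same species list have distance 0 regardless of read abundance.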

    BusyBee Web : towards comprehensive and differential composition-based metagenomic binning

    Despite recent methodology and reference database improvements for taxonomic profiling tools, metagenomic assembly and genomic binning remain important pillars of metagenomic analysis workflows. When reference information is lacking, genomic binning is considered to be a state-of-the-art method in mixed culture metagenomic data analysis. In this light, our previously published tool BusyBee Web implements a composition-based binning method efficient enough to function as a rapid online utility. Handling assembled contigs and long nanopore-generated reads alike, the webserver provides a wide range of supplementary annotations and visualizations. Half a decade after the initial publication, we revisited existing functionality, added comprehensive visualizations, and increased the number of data analysis customization options for further experimentation. The webserver now allows for visualization-supported differential analysis of samples, which is computationally expensive and typically only performed in coverage-based binning methods. Further, users may now optionally check their uploaded samples for plasmid sequences using PLSDB as a reference database. Lastly, a new application programming interface with a supporting Python package was implemented to allow power users fully automated access to the resource and integration into existing workflows.