
    Combination of Whole Genome Sequencing and Metagenomics for Microbiological Diagnostics

    Whole genome sequencing (WGS) provides the highest resolution for genome-based species identification and can provide insight into the antimicrobial resistance and virulence potential of a single microbiological isolate during the diagnostic process. In contrast, metagenomic sequencing allows the analysis of DNA segments from multiple microorganisms within a community, using either an amplicon- or a shotgun-based approach. However, WGS and shotgun metagenomic data are rarely combined, although such an approach may generate additive or synergistic information that is critical for, e.g., patient management, infection control, and pathogen surveillance. To produce a combined workflow with actionable outputs, we need to understand the pre-to-post analytical process of both technologies. This will require specific databases storing interlinked sequencing data and metadata, as well as customized bioinformatic analytical pipelines. This review article provides an overview of the critical steps and potential clinical applications of combining WGS and metagenomics for microbiological diagnosis.

    Literature on applied machine learning in metagenomic classification: A scoping review

    Applied machine learning in bioinformatics is growing as computer science slowly invades all research spheres. With the arrival of modern next-generation DNA sequencing algorithms, metagenomics is becoming an increasingly interesting research field as it finds countless practical applications exploiting the vast amounts of generated data. This study aims to scope the scientific literature in the field of metagenomic classification in the time interval 2008–2019 and provide an evolutionary timeline of data processing and machine learning in this field. This study follows the scoping review methodology and PRISMA guidelines to identify and process the available literature. Natural Language Processing (NLP) is deployed to ensure an efficient and exhaustive search of the literary corpus of three large digital libraries: IEEE, PubMed, and Springer. The search is based on keywords and properties looked up using the digital libraries’ search engines. The scoping review results reveal an increasing number of research papers related to metagenomic classification over the past decade. The research is mainly focused on metagenomic classifiers, identifying scope-specific metrics for model evaluation, data set sanitization, and dimensionality reduction. Of these subproblems, data preprocessing is the least researched, with considerable potential for improvement.

    Use of Whole Genome Shotgun Sequencing for the Analysis of Microbial Communities in Arabidopsis thaliana Leaves

    Microorganisms, such as all Bacteria, Archaea, and some Eukaryotes, inhabit all imaginable habitats on the planet, from water vents in the deep ocean to extreme environments of high temperature and salinity. Microbes also constitute the most diverse group of organisms in terms of genetic information, metabolic function, and taxonomy. Furthermore, many of these microbes establish complex interactions with each other and with many multicellular organisms. The collection of microbes that share a body space with a plant or animal is called the microbiota, and their genetic information is called the microbiome. The microbiota has emerged as a crucial determinant of a host’s overall health, and understanding it has become crucial in many biological fields. In mammals, the gut microbiota has been linked to important diseases such as diabetes, inflammatory bowel disease, and dementia. In plants, the microbiota can provide protection against certain pathogens or confer resistance against harsh environmental conditions such as drought. Furthermore, the leaves of plants represent one of the largest surface areas that can potentially be colonized by microbes. The advent of sequencing technologies has allowed researchers to study microbial communities at unprecedented resolution and scale. By targeting individual loci such as the 16S rDNA locus in bacteria, many species can be studied simultaneously, along with properties such as their relative abundance, without the need to isolate individual target taxa. Decreasing costs of DNA sequencing have also led to whole shotgun sequencing, where instead of targeting a single locus or a number of loci, random fragments of DNA are sequenced. This effectively renders the entire microbiome accessible to study, an approach referred to as metagenomics. Consequently, many more areas of investigation are open, such as the exploration of within-host genetic diversity, functional analysis, or assembly of individual genomes from metagenomes.
In this study, I describe the analysis of metagenomic sequencing data from microbial communities in leaves of wild Arabidopsis thaliana individuals from southwest Germany. As a model organism, A. thaliana not only is accessible in the wild but also has a rich body of previous research in plant-microbe interactions. In the first section, I describe how whole shotgun sequencing of leaf DNA extracts can be used to accurately describe the taxonomic composition of the microbial community of individual hosts. The nature of whole shotgun sequencing is used to estimate true microbial abundances, which cannot be done with amplicon sequencing. I show how this community varies across hosts, although some trends are seen, such as the dominance of the bacterial genera Pseudomonas and Sphingomonas. Moreover, even though there is variation between individuals, I explore the influence of site of origin and host genotype. Finally, metagenomic assembly is applied to individual samples, showing the limitations of WGS in plant leaves. In the second section, I explore the genomic diversity of the most abundant genera: Pseudomonas and Sphingomonas. I use a core genome approach where a set of common genes is obtained from previously sequenced and assembled genomes. Thereafter, the gene sequences of the core genome are used as a reference for short genome mapping. Based on these mappings, individual strain mixtures are inferred from the frequency distribution of non-reference bases at each detected single nucleotide polymorphism (SNP). Finally, SNPs are used to derive the population structure of strain mixtures across samples and with known reference genomes. In conclusion, this thesis provides insights into the use of metagenomic sequencing to study microbial populations in wild plants. I identify the strengths and weaknesses of using whole genome sequencing for this purpose, as well as a way to study strain-level dynamics of prevalent taxa within a single host.

    Analytical Tools and Databases for Metagenomics in the Next-Generation Sequencing Era

    Metagenomics has become one of the indispensable tools in microbial ecology over the last few decades, and a new revolution in metagenomic studies is now about to begin, with the help of recent advances in sequencing techniques. The massive data production and substantial cost reduction in next-generation sequencing have led to the rapid growth of metagenomic research both quantitatively and qualitatively. It is evident that metagenomics will be a standard tool for studying the diversity and function of microbes in the near future, as fingerprinting methods were previously. As the speed of data accumulation accelerates, bioinformatic tools and associated databases for handling those datasets have become more urgent and necessary. To facilitate the bioinformatics analysis of metagenomic data, we review some recent tools and databases that are widely used in this field and give insights into the current challenges and future of metagenomics from a bioinformatics perspective.

    Tailoring bioinformatics strategies for the characterization of the human microbiome in health and disease

    The human microbiome is a very active area of research due to its potential to explain health and disease. Advances in high-throughput DNA sequencing in the last decade have catalyzed the growth of microbiome research; DNA sequencing allows for a cost-effective method to characterize entire microbial communities directly, including unculturable microbes which were previously difficult to study. 16S rRNA sequencing and shotgun metagenomics, coupled with bioinformatics methods, have powered the characterization of the human microbiome in different parts of the body. This has led to the discovery of novel links between the microbiome and diseases such as allergies, cancer, and autoimmune diseases. This thesis focuses on the application of both 16S rRNA sequencing and shotgun metagenomics for the characterization of the human microbiome and its relationship with health and disease. We established two methodologies to address these questions. The first methodology is a bench-to-bioinformatics pipeline to discover putative viral pathogens involved in disease using shotgun metagenomics technology. In Paper I, we apply the proposed pipeline to explore the hypothesis of viral infection as a putative cause of childhood Acute Lymphoblastic Leukemia. In Paper II, we propose a complementary method to the pipeline to improve the detection of unknown viruses, especially those with little or no homology to currently known viruses. We applied this method to a collection of viral-enriched libraries, which resulted in the characterization of a new viral-like genome. The second methodology was developed to explore and generate hypotheses from a human skin microbiome dataset of Psoriasis and Atopic Dermatitis patients. The results of the analysis are presented in Paper III and Paper IV. Paper III is a purely data-driven exploration of the dataset to discover different aspects of how the microbiome is linked to both diseases.
Paper IV follows up on the results of Paper III but focuses on characterizing the skin site microbiome variability in Atopic Dermatitis.

    Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies

    Advances in next-generation sequencing (NGS) have allowed significant breakthroughs in microbial ecology studies. This has led to the rapid expansion of research in the field and the establishment of “metagenomics”, often defined as the analysis of DNA from microbial communities in environmental samples without prior need for culturing. Many statistical and computational tools and databases for metagenomics have been developed to allow the exploitation of the huge influx of data. In this review article, we provide an overview of the sequencing technologies and how they are uniquely suited to various types of metagenomic studies. We focus on the currently available bioinformatics techniques, tools, and methodologies for performing each individual step of a typical metagenomic dataset analysis. We also discuss future trends in the field with respect to tools and technologies currently under development. Moreover, we discuss data management, distribution, and integration tools that are capable of performing comparative metagenomic analyses of multiple datasets using well-established databases, as well as commonly used annotation standards.

    The challenges of defining the human nasopharyngeal resistome

    The nasopharynx is an important microbial reservoir for the emergence and spread of antibiotic-resistant organisms. The nasopharyngeal resistome is an extensive, adaptable reservoir of antibiotic-resistance genes (ARGs) within this niche. Metagenomic sequencing decodes the genetic material of all organisms within a sample using next-generation technologies, permitting unbiased discovery of novel ARGs and associated mobile genetic elements (MGEs). The challenges of sequencing a low-biomass bacterial sample have limited exploration of the nasopharyngeal resistome. Here, we review the current understanding of the nasopharyngeal resistome, particularly the role of MGEs in propagating antimicrobial resistance (AMR), examine the advantages and limitations of metagenomic sequencing technologies and bioinformatic pipelines for nasopharyngeal resistome analysis, and highlight the key outstanding questions for future research.

    Understanding host-microbe interactions in maize kernel and sweetpotato leaf metagenomic profiles.

    Functional and quantitative metagenomic profiling remains challenging and limits our understanding of host-microbe interactions. This body of work aims to mitigate these challenges by using a novel quantitative reduced representation sequencing strategy (OmeSeq-qRRS), developing fully automated software for quantitative metagenomic/microbiome profiling (Qmatey: quantitative metagenomic alignment and taxonomic identification using exact-matching), and implementing these tools for understanding plant-microbe-pathogen interactions in maize and sweetpotato. The next-generation sequencing-based OmeSeq-qRRS leverages the strengths of shotgun whole genome sequencing and costs less than the more affordable amplicon sequencing method. The novel FASTQ data compression/indexing and enhanced multithreading of MegaBLAST in Qmatey allow for computational speeds several thousand-fold faster than typical runs. Regardless of sample number, the analytical pipeline can be completed within days for genome-wide sequence data and provides broad-spectrum taxonomic profiling (viruses to eukaryotes). As a proof of concept, these protocols and novel analytical pipelines were implemented to characterize the viruses within the leaf microbiome of a sweetpotato population that represents the global genetic diversity, and the kernel microbiomes of genetically modified (GMO) and non-GMO maize hybrids. The metagenome profiles and high-density SNP data were integrated to identify host genetic factors (disease resistance and intracellular transport candidate genes) that underpin sweetpotato-virus interactions. Additionally, microbial community dynamics were observed in the presence of pathogens, leading to the identification of multipartite interactions that modulate disease severity through co-infection and species competition.
This study highlights a low-cost, quantitative, and strain/species-level metagenomic profiling approach, new tools that complement the assay’s novel features and provide fast computation, and the potential for advancing functional metagenomic studies.