11 research outputs found

    CoMet—a web server for comparative functional profiling of metagenomes

    Analyzing the functional potential of newly sequenced genomes and metagenomes has become a common task in biomedical and biological research. With the advent of high-throughput sequencing technologies, comparative metagenomics opens the way to elucidate the genetically determined similarities and differences of complex microbial communities. We developed the web server ‘CoMet’ (http://comet.gobics.de), which provides an easy-to-use comparative metagenomics platform that is well suited for the analysis of large collections of metagenomic short read data. CoMet combines the ORF finding and subsequent assignment of protein sequences to Pfam domain families with a comparative statistical analysis. Besides comprehensive tabular data files, the CoMet server also provides visually interpretable output in terms of hierarchical clustering and multi-dimensional scaling plots and thus allows a quick overview of a given set of metagenomic samples.

    Mixture models for analysis of the taxonomic composition of metagenomes

    Motivation: Inferring the taxonomic profile of a microbial community from a large collection of anonymous DNA sequencing reads is a challenging task in metagenomics. Because existing methods for taxonomic profiling of metagenomes are all based on the assignment of fragmentary sequences to phylogenetic categories, the accuracy of results largely depends on fragment length. This dependence complicates comparative analysis of data originating from different sequencing platforms or resulting from different preprocessing pipelines.

    Exploring Neighborhoods in the Metagenome Universe

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata is still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods, we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that, for a query metagenome from a particular habitat, on average nine out of ten nearest neighbors represent the same habitat category, independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.

    A 20-kb lineage-specific genomic region tames virulence in pathogenic amphidiploid Verticillium longisporum

    Amphidiploid fungal Verticillium longisporum strains Vl43 and Vl32 colonize the plant host Brassica napus but differ in their ability to cause disease symptoms. These strains represent two V. longisporum lineages derived from different hybridization events of haploid parental Verticillium strains. Vl32 and Vl43 carry same-sex mating-type genes derived from both parental lineages. Vl32 and Vl43 similarly colonize and penetrate plant roots, but asymptomatic Vl32 proliferation in planta is lower than that of virulent Vl43. The highly conserved Vl43 and Vl32 genomes include less than 1% unique genes, and the karyotypes of 15 or 16 chromosomes display changed genetic synteny due to substantial genomic reshuffling. A 20 kb Vl43 lineage-specific (LS) region apparently originating from the Verticillium dahliae-related ancestor is specific for symptomatic Vl43 and encodes seven genes, including two putative transcription factors. Either partial or complete deletion of this LS region in Vl43 did not reduce virulence but led to induction of even more severe disease symptoms in rapeseed. This suggests that the LS insertion in the genome of symptomatic V. longisporum Vl43 mediates virulence-reducing functions, limits damage on the host plant, and therefore tames Vl43 from being even more virulent.
