
    TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes

    In support of the international effort to obtain a reference sequence of the bread wheat genome and to provide plant communities dealing with large and complex genomes with a versatile, easy-to-use online automated tool for annotation, we have developed the TriAnnot pipeline. Its modular architecture allows for the annotation and masking of transposable elements, the structural and functional annotation of protein-coding genes with an evidence-based quality indexing, and the identification of conserved non-coding sequences and molecular markers. The TriAnnot pipeline is parallelized on a 712 CPU computing cluster that can run a 1-Gb sequence annotation in less than 5 days. It is accessible through a web interface for small scale analyses or through a server for large scale annotations. The performance of TriAnnot was evaluated in terms of sensitivity, specificity, and general fitness using curated reference sequence sets from rice and wheat. In less than 8 h, TriAnnot was able to predict more than 83% of the 3,748 CDS from rice chromosome 1 with a fitness of 67.4%. On a set of 12 reference Mb-sized contigs from wheat chromosome 3B, TriAnnot predicted and annotated 93.3% of the genes, among which 54% were perfectly identified in accordance with the reference annotation. It also allowed the curation of 12 genes based on new biological evidence, increasing the percentage of perfect gene prediction to 63%. TriAnnot systematically showed a higher fitness than other annotation pipelines that are not improved for wheat. As it is easily adaptable to the annotation of other plant genomes, TriAnnot should become a useful resource for the annotation of large and complex genomes in the future.

    Assessing the Diversity and Specificity of Two Freshwater Viral Communities through Metagenomics

    Transitions between saline and fresh waters have been shown to be infrequent for microorganisms. Based on host-specific interactions, the presence of specific clades among hosts suggests the existence of freshwater-specific viral clades. Yet little is known about the composition and diversity of temperate freshwater viral communities, and even if freshwater lakes and marine waters harbor distinct clades for particular viral sub-families, this distinction remains to be demonstrated on a community scale.

    A conserved lysine residue of plant Whirly proteins is necessary for higher order protein assembly and protection against DNA damage

    All organisms have evolved specialized DNA repair mechanisms in order to protect their genome against detrimental lesions such as DNA double-strand breaks. In plant organelles, such damage is repaired either through recombination or through a microhomology-mediated break-induced replication pathway. Whirly proteins are modulators of this second pathway in both chloroplasts and mitochondria. In this pathway, tetrameric Whirly proteins are believed to bind single-stranded DNA and prevent spurious annealing of resected DNA molecules with other regions in the genome. In this study, we add a new layer of complexity to this model by showing through atomic force microscopy that tetramers of the potato Whirly protein WHY2 further assemble into hexamers of tetramers, or 24-mers, upon binding long DNA molecules. This process depends on tetramer–tetramer interactions mediated by K67, a highly conserved residue among plant Whirly proteins. Mutation of this residue abolishes the formation of 24-mers without affecting the protein structure or the binding to short DNA molecules. Importantly, we show that an Arabidopsis Whirly protein mutated for this lysine is unable to rescue the sensitivity of a Whirly-less mutant plant to a DNA double-strand break inducing agent.

    ExploreMetabar: a user-friendly Shiny application to explore the drivers of microbial communities

    A free and easily accessible web application; R package for local installation; use of phyloseq objects (output from rANOMALY, FROGS); interactive filters to focus on a specific group of samples. Complete diversity analysis with advanced visualizations and statistical tests, differential analysis and more features. Metabarcoding is widely used for community composition studies of complex samples (food, gut, environment), but existing tools for the analysis of these data require some coding knowledge. We present a Shiny application for metabarcoding data with interactive features that allow users to explore and analyse them.

    ANOMALY: AmplicoN wOrkflow for Microbial community AnaLYsis

    Bioinformatic tools for amplicon sequencing data analysis are continuously and rapidly evolving. We present an R workflow for 16S and ITS amplicon-based sequencing. It is mainly based on the Dada2 and Phyloseq R packages. This workflow consists of several scripts that carry an analysis from fastq sequence files through to the final statistical analysis. The objective was to automate bioinformatic analyses to ensure reproducibility between projects, while remaining versatile and making it simple to integrate new bioinformatic tools or statistical techniques. ANOMALY uses the Amplicon Sequence Variant (ASV, from the Dada2 package) as its taxonomic unit, allowing easy and relevant sequence tracking between different environments and/or projects. The Decontam package is included for accurate and consistent detection of contaminant ASVs, and the taxonomic assignment step relies on the IDTAXA method. Our workflow is able to merge and check annotations from two taxonomic databases to uncover misannotation, discordance or inconsistency. The well-known Phyloseq package provides the most common graphical representations, with additional statistics to assess the significance of the impact of tested factors on microbial communities. The workflow incorporates multiple differential analyses (DESeq2, etc.) to reveal subtle community contrasts between conditions. Finally, we are able to combine those results for cross-validation and finer interpretation. ANOMALY is a simple and customizable R workflow that uses the ASV level for community characterization and integrates the assets of up-to-date methods, such as better sequence tracking, decontamination, merged taxonomic annotation, statistical tests, and cross-validated differential analysis.
