8 research outputs found

    Sensitivity and accuracy of HHPRED+CODD and HMMER+CODD using the known Pfam domain occurrences for certifications.

    This figure reports the number of new domains (x-axis) certified by HHPRED+CODD (orange and green for the phylum-specific and non-specific approaches, respectively) and HMMER+CODD (blue) using local (left) and global (right) alignments for various FDR thresholds (y-axis).

    Number of sequenced genomes and domain coverage in the Eukaryote tree.

    This figure reports the number of fully sequenced genomes in each of the 5 supergroups of the Eukaryote tree [58]. In each group, a few sequenced genomes are given as examples, along with statistics on Pfam domains (release 26): the proportion of proteins in which at least one Pfam domain has been identified using the recommended Pfam score thresholds (above), and the proportion of amino acids covered by a Pfam domain (below). Most of the genomes sequenced to date belong to the Unikont (241) and plant (60) supergroups. Protein domain coverage differs markedly between these two groups and the three others: while the proportion of proteins in which at least one known Pfam domain has been identified is usually above 70% in Unikonts and plants, it lies between 50% and 60% in the other groups. Similarly, while the proportion of amino acids covered by a Pfam domain is often above 40% in plants and Unikonts, it is around 22% in the other supergroups.

    Sensitivity and accuracy of HHPRED and HMMER for P. falciparum and L. major.

    Number of new domains (x-axis) identified by HHPRED (green) and HMMER (blue) using local (left) and global (right) alignments for various FDRs (y-axis). For each approach, the two solid lines represent an upper and a lower FDR estimate (see Methods for details). Dashed lines represent the standard error associated with these two estimates. For the sake of clarity, only the standard error above (resp. below) the upper (resp. lower) FDR estimate is represented here.

    Cross-validation experiments on P. falciparum and L. major.

    The test was run on the 561 P. falciparum proteins and the 913 L. major proteins that have at least two known Pfam domains. The table reports the number of domains identified by HMMER and HHPRED that are certified by CODD at 3% FDR. Columns “# total certif.” and “# recovered domains” report the total number of certified domains and the number of discarded domains that are recovered, respectively. Column “# overlaps” reports the number of newly certified domains that overlap a discarded domain.

    New Pfam domains (release 26) identified at 5% and 10% FDR.

    The table reports the number of new domains identified by HHPRED (local mode, phylum non-specific approach) and CODD for the three certification types: known Pfam domains (Pfam), known InterPro non-Pfam domains (Interp), and potential domains (Pot). “All”: results achieved when combining the 3 types. “# dom.”: number of new domains identified. “new fam.”: number of domain families that were not previously known in any protein of the organism. In each cell, the left and right numbers report the result at 5% and 10% FDR, respectively. Column “All”: the number in parentheses reports the proportion of already known domains or families this represents. *For the certifications by InterPro domains, this is the number of domains identified at 12% FDR, because no FDR below 10% can be achieved by this certification type.

    New GO annotations at 5% FDR.

    “# known GO” is the number of known GO annotations from EuPathDB; “# GO known dom.” is the number of GO annotations that can be deduced from already known domains; “#GO new dom.” is the number of new GO annotations that can be deduced from new domains. Numbers in parentheses report the number of annotations that confirm already known annotations or annotations deduced from known domains.

    DataSheet_1_Taxonomic composition and carbohydrate-active enzyme content in microbial enrichments from pulp mill anaerobic granules after cultivation on lignocellulosic substrates.zip

    Metagenomes of lignocellulose-degrading microbial communities are reservoirs of carbohydrate-active enzymes relevant to biomass processing. Whereas several metagenomes of natural digestive systems have been sequenced, the current study analyzes metagenomes originating from an industrial anaerobic digester that processes effluent from a cellulose pulp mill. Both 16S ribosomal DNA and metagenome sequences were obtained following anaerobic cultivation of the digester inoculum on cellulose and pretreated (steam-exploded) poplar wood chips. The community composition and profile of predicted carbohydrate-active enzymes were then analyzed in detail. Recognized lignocellulose degraders were abundant in the resulting cultures, including populations belonging to the Clostridiales and Bacteroidales orders. Poorly defined taxonomic lineages previously identified in other lignocellulose-degrading communities were also detected, including the uncultivated Firmicutes lineage OPB54, which represented nearly 10% of the cellulose-fed enrichment even though it was not detected in the bioreactor inoculum. In total, 3580 genes encoding carbohydrate-active enzymes were identified through metagenome sequencing. Similar to earlier enrichments of animal digestive systems, the profile encoded by the bioreactor inoculum following enrichment on pretreated wood was distinguished from the cellulose counterpart by a higher occurrence of enzymes predicted to act on pectin. The majority (>93%) of carbohydrate-active enzymes predicted to act on plant polysaccharides were identified in the metagenome-assembled genomes, permitting taxonomic assignment. The taxonomic assignment revealed that only a small selection of organisms directly participates in plant polysaccharide deconstruction and supports the rest of the community.

    Additional file 1: of Xylan degradation by the human gut Bacteroides xylanisolvens XB1AT involves two distinct gene clusters that are linked at the transcriptional level

    Table S1. RNA-seq mapping assessment. Table S2. Xylanase specific activity of B. xylanisolvens XB1AT. Table S3. Proteins identified by MALDI-TOF MS or LC-ESI-MS/MS over-produced upon growth of B. xylanisolvens XB1AT on OSX relative to xylose. Table S4. Composition of the commercial oat-spelt xylan (SERVA, France) used in this study. Table S5. Primers used for RT-PCR (to amplify the intergenic regions between two consecutive ORFs within PUL 43). Table S6. Primers used for relative RT-qPCR. Table S7. Primers used for insertion mutagenesis into PUL 43 HTCS gene (BXY_29350). Figure S1. Growth of B. xylanisolvens XB1AT (Wt) and PUL 43 HTCS (BXY_29350) mutant on glucose, xylose, wheat arabinoxylan (WAX) and oat-spelt xylan (OSX). Figure S2. B. xylanisolvens XB1AT gene expression in response to oat-spelt xylan (OSX) and xylose relative to glucose obtained from RNA-seq analysis. Figure S3. B. xylanisolvens XB1AT PUL expression in response to xylose relative to glucose at late-log phase obtained from RNA-seq analysis. Figure S4. Schematic layout of the mutant construction and validation of pGERM:HTCS insertion into PUL 43 HTCS gene (BXY_29350) of B. xylanisolvens XB1A genome. (XLSX 454 kb)