
    Summary of text linked to residue mentions for a random sample of high-quality predictions.

    (a) Family-level NSM annotation available. (b) Family-level NSM-valid annotation available.

    Residue-wise performance of functional site predictions.

    (A) Fraction of NSM sites satisfying various thresholds of recall. (B) Fraction of DPA predictions satisfying various thresholds of precision. Performance is compared using NSM vs. NSM-valid sites, and with or without text matches. Recall and precision are defined precisely in the Methods (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0032171#s3).

    Availability of text-extracted residue mentions in the corpus of abstracts C.

    The number of abstracts with residue mentions remains comparable when the mentions are further restricted to residues that can be mapped to physical residues in protein structures.

    Availability of annotations for protein domains.

    Residues in protein domains were annotated using the following sources: 1) NSM, 2) NSM-valid, 3) CSA, and 4) text residue. A domain is labeled as annotated if one or more residues in the domain have an appropriate annotation. Stacked bars for each of the sources include cumulative numbers for direct annotations, annotations transferred at the protein level, and annotations transferred at the family level. The vertical order of the legend reflects the vertical order of the bars.

    Illustration of the process for extracting residue mentions and mapping them to physical residues in protein structures.

    Text mining for residue mentions is performed on the abstracts of the primary references cited in the PDB. A text residue is represented by a residue number and one or more 3-letter amino acid codes. An amino acid for a text residue is marked with an asterisk if a text occurrence suggests that it is a mutation from the wild type. Physical residues corresponding to a text residue are indicated by a PDB ID, chain identifier, residue number, and 3-letter amino acid code.
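
The representation described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the regular expression, the function names `extract_text_residues` and `map_to_physical`, and the dictionary layout of a structure are all assumptions made for illustration, and mutation detection (the asterisk marking) is not attempted.

```python
import re

# The 20 standard amino acids as 3-letter codes.
AMINO_ACIDS = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}

# Assumed mention pattern: a 3-letter code, an optional space or hyphen,
# then a residue number (e.g. "Ser195", "His 57", "Asp-102").
MENTION = re.compile(r"\b([A-Z][a-z]{2})[ -]?(\d+)\b")


def extract_text_residues(abstract):
    """Return {residue number: set of upper-case 3-letter codes}."""
    residues = {}
    for code, number in MENTION.findall(abstract):
        if code in AMINO_ACIDS:  # discard matches such as "The 3"
            residues.setdefault(int(number), set()).add(code.upper())
    return residues


def map_to_physical(text_residues, structure):
    """Match text residues against a structure.

    `structure` is a hypothetical mapping
    {(pdb_id, chain, residue_number): 3-letter code}.
    Returns (pdb_id, chain, residue_number, code) tuples whose number
    and amino acid both agree with a text residue.
    """
    return [
        (pdb_id, chain, number, code)
        for (pdb_id, chain, number), code in sorted(structure.items())
        if code in text_residues.get(number, set())
    ]
```

For example, calling `extract_text_residues` on a sentence that mentions "Ser195" and "His 57" yields one entry per residue number, and `map_to_physical` keeps only those entries whose number and amino acid both occur in the supplied structure.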

    Correspondence of DPA predictions to annotations (a).

    (a) Each entry in the table gives the number of DPA predictions that have a given type of annotation. Percentages are with respect to the numbers in the four subsets of DPA predictions indicated at the head of each column. The subsets are increasingly restrictive from left to right: predictions in a domain for which a transfer MSA is available; predictions in a domain for which a conservation MSA is available; predictions for which the conservation P-value is no greater than 10^−2; and predictions for which the conservation P-value is no greater than 10^−3. Annotations are organized into three major sections that are increasingly expansive from top to bottom, as described in the text: direct annotations, protein-level annotations, and family-level annotations. The meaning of the NSM, NSM-valid, CSA, and Text annotations is as described in the text.