14 research outputs found

    M-BISON: Microarray-based integration of data sources using networks-10

    B) DE. We measured performance using AUC of the ROC curve, plotted as a function of and . Pseudocolor represents AUC magnitude, with dark blue the lowest and dark red the highest (best performance). All simulated data runs contain at least one parameter combination that scores better than the B statistic with microarray data alone (lower left-hand corner of each plot), except for AUC= 0.91/10% DE/= 1.64/= 0.91.
    Copyright information: Taken from "M-BISON: Microarray-based integration of data sources using networks", http://www.biomedcentral.com/1471-2105/9/214. BMC Bioinformatics 2008;9:214. Published online 25 Apr 2008. PMCID:PMC2396182.

    M-BISON: Microarray-based integration of data sources using networks-11

    Set, with varying AUC, , , and number of DE genes. (A) 10% of genes are considered DE; (B) 20% of genes are considered DE. The first number on each dataset is the AUC for single parameter M-BISON (MB1); the second is the AUC for empirical M-BISON (MBe). Colors are used to clarify the difference in performance between using M-BISON and using the B statistic with microarray data only (MA): Green – MBe yields the highest AUC, followed by MB1 and finally MA; Yellow – MB1 yields the highest AUC, followed by MBe and finally MA; Red – MB1 yields the highest AUC, followed by MA and finally MBe. *MA AUC (AUC) is slightly higher than MB1 AUC.
    Taken from "M-BISON: Microarray-based integration of data sources using networks" (PMCID:PMC2396182).

    Summary performance on PROSITE true positives (TP), false positives (FP), and false negative (FN) test sites

    We summarize total numbers of predicted true positives, false negatives, and false positives for PROSITE and SeqFEATURE at 100%, 99%, and 95% specificity cutoffs. SeqFEATURE (at the default 99% specificity cutoff) misses about 18% of the PROSITE true positives on average, but it also predicts 60% fewer false positives and 78% fewer false negatives than PROSITE. The three different specificity cutoffs also show tradeoffs in the numbers of true positives and false predictions made by SeqFEATURE, demonstrating that one can adjust the cutoff to fit desired performance. For example, one can attain a very high positive predictive value by using SeqFEATURE's 100% specificity cutoffs: although sensitivity decreases to about 50%, almost no false positive predictions are made.
    Copyright information: Taken from "The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation", http://genomebiology.com/content/9/1/R8. Genome Biology 2008;9(1):R8. Published online 16 Jan 2008. PMCID:PMC2395245.
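The cutoff tradeoff described in this caption can be illustrated with a small Python sketch. It assumes that each site receives a numeric score, that higher scores indicate a more likely positive, and that the cutoff is derived from the scores of known negative sites. The function names and toy scores are hypothetical and are not taken from SeqFEATURE.

```python
import math
from typing import Dict, Sequence


def cutoff_for_specificity(negative_scores: Sequence[float], specificity: float) -> float:
    """Return a cutoff such that at least `specificity` of the negatives
    score at or below it (a site is called positive only if score > cutoff)."""
    ordered = sorted(negative_scores)
    # Number of negatives that must fall at or below the cutoff.
    k = math.ceil(specificity * len(ordered))
    if k == 0:
        return float("-inf")
    return ordered[k - 1]


def evaluate(pos_scores: Sequence[float], neg_scores: Sequence[float], cutoff: float) -> Dict[str, float]:
    """Count TP/FN/FP/TN at a cutoff and derive sensitivity, specificity, and PPV."""
    tp = sum(s > cutoff for s in pos_scores)
    fn = len(pos_scores) - tp
    fp = sum(s > cutoff for s in neg_scores)
    tn = len(neg_scores) - fp
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "sensitivity": tp / len(pos_scores),
        "specificity": tn / len(neg_scores),
        # PPV is undefined when nothing is predicted positive.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Toy data (hypothetical): 20 negative scores and 5 positive scores.
negatives = [float(i) for i in range(20)]
positives = [15.0, 18.5, 19.5, 25.0, 30.0]

at_95 = evaluate(positives, negatives, cutoff_for_specificity(negatives, 0.95))
at_100 = evaluate(positives, negatives, cutoff_for_specificity(negatives, 1.0))
print(at_95)
print(at_100)
```

On these toy scores, moving from the 95% cutoff to the 100% cutoff removes the single false positive at the cost of one true positive, which is the same kind of tradeoff the caption describes.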

    Performance on PROSITE true positives, false positives, and false negative test sites

    We show the true positive (TP), false negative (FN), and false positive (FP) prediction rates for SeqFEATURE (at 95%, 99%, and 100% specificity) and PROSITE on test sites derived from the corresponding PROSITE patterns. The PROSITE values represent the maximum possible for each category. Not all patterns had a false negative or false positive test set. Most of SeqFEATURE's incorrect predictions at 95% and 99% specificity cutoffs arise from poor performance on a small subset of the patterns.
    Taken from "The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation" (PMCID:PMC2395245).

    Sensitivity trends of SeqFEATURE, Gene3D, Pfam, and HMMPanther at low sequence identities

    We compared the sensitivity of SeqFEATURE at three specificity cutoffs against the sensitivity of Gene3D, Pfam, and HMMPanther on test sets filtered for low sequence identity. We evaluated each method on the subset of the original test set that had less than the specified sequence identity to the training sets. As sequence identity decreases, the sequence-based methods show a clear trend towards lower sensitivity. In contrast, SeqFEATURE at all three cutoffs shows no such downward trend, indicating robust detection of function even when sequence identity is very low.
    Taken from "The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation" (PMCID:PMC2395245).

    Distribution of AUC and sensitivity for all SeqFEATURE models listed in Table 2

    Distribution of model AUC. Most models have AUC greater than 0.8, with 47% having AUC >0.95 and a few poor performers less than 0.5. Distribution of model sensitivity. We plot the sensitivity of each model at the default score cutoff of 99% specificity based on training data. Most models have a sensitivity greater than 0.6-0.7 at this cutoff, and many have a sensitivity greater than 0.8.
    Taken from "The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation" (PMCID:PMC2395245).

    Local environments of SeqFEATURE predictions for three TargetDB structures with unknown function

    Three predicted functional sites from TargetDB structures are shown compared to known examples of the predicted function. The predicted and known sites are shown in yellow, positively charged atoms (nitrogens) are shown in blue, and negatively charged atoms (oxygens) are shown in red. Carbons and secondary structure are shown in grey. All atoms within 7.5 angstroms of the site are shown. The active site of a known zinc protease (1KAP) is shown to the left of a zinc protease site in 2EJQ predicted by SeqFEATURE. Note the presence of two histidine residues (one can be seen clearly above each site) and a number of negative charges distributed throughout both local environments. Note also the similarity in secondary structure. The local structure of 1FI6 (left), which contains a known EF hand calcium-binding motif, is compared to SeqFEATURE's predicted calcium-binding site in 2OGF. Note the similar distribution of negative charges and closely matching loop structures. The calcium is visible as a brown sphere in 1FI6, surrounded by oxygen atoms. 1K8U is another known EF hand-containing protein, shown to the left of the uncharacterized protein structure 2OX6, for which SeqFEATURE predicts calcium binding. These figures were created using VMD [43].
    Taken from "The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation" (PMCID:PMC2395245).
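Selecting "all atoms within 7.5 angstroms of the site" amounts to a simple distance filter. The Python sketch below illustrates the idea under stated assumptions: the `Atom` record, the `environment` helper, and the coordinates are hypothetical and do not come from any of the structures named in the caption.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    name: str                          # e.g. "ND1" (hypothetical label)
    element: str                       # e.g. "N", "O", "C"
    xyz: Tuple[float, float, float]    # coordinates in angstroms


def environment(center: Tuple[float, float, float],
                atoms: List[Atom],
                radius: float = 7.5) -> List[Atom]:
    """Return the atoms whose Euclidean distance to `center` is at most `radius`."""
    return [a for a in atoms if math.dist(center, a.xyz) <= radius]


# Toy coordinates (hypothetical), with the site placed at the origin.
site = (0.0, 0.0, 0.0)
atoms = [
    Atom("ND1", "N", (3.0, 4.0, 0.0)),   # 5.0 angstroms away: inside
    Atom("OD1", "O", (7.5, 0.0, 0.0)),   # exactly 7.5 angstroms: inside
    Atom("CA", "C", (5.0, 5.0, 5.0)),    # about 8.66 angstroms: outside
]

nearby = environment(site, atoms)
print([a.name for a in nearby])
```

A brute-force scan like this is adequate for a single site; a spatial index would be the usual choice when many sites are queried against a large structure.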

    Example ROC curves, precision-recall curves, and Z-score distributions for SeqFEATURE models

    Sample performance plots for ADH_ZINC.2.HIS.ND1 (top), ASP_PROTEASE.4.ASP.OD1 (middle), and ZINC_FINGER_C2H2_1.9.HIS.ND1 (bottom) are shown, representing models with excellent, good, and somewhat satisfactory performance, respectively. The leftmost plot in each row gives the ROC curve in red and random performance in blue, the middle plot shows the precision versus recall (sensitivity) curve, and the rightmost plot shows the distribution of scores for positive sites (red) and negative sites (blue) from training. Because there are many more negative sites than positive sites, the score distributions on the right are normalized to Z-scores.
    Taken from "The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation" (PMCID:PMC2395245).
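For readers unfamiliar with the quantities plotted here, the following Python sketch computes the area under the ROC curve as the probability that a randomly chosen positive site outscores a randomly chosen negative site, and standardizes scores to Z-scores. The caption does not say which mean and standard deviation were used for the normalization, so standardizing against a caller-supplied reference set is an assumption; the function names and scores are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def z_scores(scores: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Standardize `scores` using the mean and population standard deviation
    of `reference` (the choice of reference set is an assumption)."""
    mu = mean(reference)
    sigma = pstdev(reference)
    return [(s - mu) / sigma for s in scores]


# Toy scores (hypothetical).
positives = [3.0, 4.0, 5.0]
negatives = [1.0, 2.0, 3.0]

print(auc(positives, negatives))          # 8.5 of 9 pairs favour the positive
print(z_scores(positives, negatives))
```

An AUC of 0.5 corresponds to the random-performance diagonal in a ROC plot, and 1.0 to perfect separation of positive and negative scores.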

    Clustering protein environments for function prediction: finding PROSITE motifs in 3D-4

    he text are shown. The structures were oriented by superimposing the PROSITE patterns, and the arrows indicate the atoms around which the microenvironments were centered. All residues containing atoms within the 7.5-Angstrom environment are depicted. The three comparisons show varying degrees of similarities among environments in the same cluster, ranging from nearly identical (a) to somewhat diverse (c). (a) The environments in the cluster containing residues from the PROTEIN_KINASE_TYR PROSITE motif are quite similar (top: PDB identifier 1fvr; bottom: 1luf). (b, c) The UBIQUITIN_CONJUGAT_1 (top: 1ayz; bottom: 1wzv) and STAPH_STREP_TOXIN_2 (top: 1aw7; bottom: 1ck1) clusters show greater degrees of structural variability. These images were produced using PyMol [28].
    Copyright information: Taken from "Clustering protein environments for function prediction: finding PROSITE motifs in 3D", http://www.biomedcentral.com/1471-2105/8/S4/S10. BMC Bioinformatics 2007;8(Suppl 4):S10. Published online 22 May 2007. PMCID:PMC1892080.

    Clustering protein environments for function prediction: finding PROSITE motifs in 3D-2

    e mean and median sizes are 437.2 and 232, respectively, and the standard deviation is 589.8. As discussed in the text, the long tail may represent internal hydrophobic environments.
    Taken from "Clustering protein environments for function prediction: finding PROSITE motifs in 3D" (PMCID:PMC1892080).