6 research outputs found

    An improved method for identifying functionally linked proteins using phylogenetic profiles-0

    Copyright information: Taken from "An improved method for identifying functionally linked proteins using phylogenetic profiles", http://www.biomedcentral.com/1471-2105/8/S4/S7. BMC Bioinformatics 2007;8(Suppl 4):S7-S7. Published online 22 May 2007. PMCID: PMC1892086.

    …s ("matches") in three runs while genes 3 and 4 have four matches in a single run. We hypothesize that genes 1 and 2 are more likely to be truly co-evolving, while the matches between genes 3 and 4 are likely to be merely lineage-specific.
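The distinction between matches and runs in this caption can be illustrated with a minimal sketch. It assumes that a profile is a 0/1 presence vector over genomes in a fixed (for example phylogenetic) order, that a "match" is a genome in which both genes are present, and that a "run" is a maximal block of consecutive matches. The paper's exact definitions may differ; the function name and the toy profiles are hypothetical and are not taken from the figure.

```python
def count_matches_and_runs(profile_a, profile_b):
    """Return (matches, runs) for two equal-length 0/1 presence profiles.

    Assumption: a match is a position where both genes are present, and a
    run is a maximal stretch of consecutive matching positions.
    """
    matches = 0
    runs = 0
    in_run = False
    for a, b in zip(profile_a, profile_b):
        is_match = bool(a) and bool(b)
        if is_match:
            matches += 1
            if not in_run:
                runs += 1  # a new run starts at this position
        in_run = is_match
    return matches, runs


# Hypothetical toy profiles (not the data behind the figure).
scattered = count_matches_and_runs([1, 0, 1, 1, 0, 1], [1, 1, 1, 1, 0, 1])
clustered = count_matches_and_runs([0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 1])
print(scattered, clustered)
```

Both toy pairs have four matches, but the first spreads them over three runs while the second packs them into one, which is the kind of contrast the caption describes.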

    An improved method for identifying functionally linked proteins using phylogenetic profiles-4

    … but poorly in the runs-informed metric, as determined by the smallest ratios of the unweighted hypergeometric p-value without runs to our runs-using score (taking both on a linear and not on a logarithmic scale). Not surprisingly, the matches between these profiles are concentrated in a few runs. We find that the protein pairs here are not closely related functionally according to our snapshot of GO, and so they are likely false positives for the runs-oblivious unweighted hypergeometric model.

    An improved method for identifying functionally linked proteins using phylogenetic profiles-2

    …g to the unweighted hypergeometric metric without runs and one from our runs-employing two-term model. We see that the unweighted hypergeometric network contains many more nodes of high degree. In particular, nodes with more than 40 edges are almost completely absent from the runs network while being abundant in the unweighted hypergeometric network. This suggests that the runs-informed network contains smaller and more interpretable clusters.
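The degree comparison described in this caption can be sketched as follows. This is only an illustration: it assumes each network is available as a list of undirected edges with no duplicates, and the function names and the toy star graph are hypothetical rather than taken from the paper.

```python
from collections import Counter


def node_degrees(edges):
    """Map each node to its degree, given undirected (u, v) edge pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def count_hubs(edges, threshold=40):
    """Count nodes with strictly more than `threshold` edges."""
    return sum(1 for d in node_degrees(edges).values() if d > threshold)


# Hypothetical toy network: one hub linked to 41 leaves.
star = [("hub", f"leaf{i}") for i in range(41)]
print(count_hubs(star))
```

Applying `count_hubs` to the edge lists of the two networks would give the number of nodes above the 40-edge mark in each.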

    An improved method for identifying functionally linked proteins using phylogenetic profiles-1

    …e unweighted hypergeometric distribution for the probability of the observed or a greater number of matches between two profiles. The second (red) ranks by mutual information, the entropy of the first profile plus the entropy of the second profile minus the entropy of the joint profile viewed one genome at a time [3]. The third (orange) uses the weighted hypergeometric distribution that considers the occupancy of each genome across all genes. The fourth (yellow) is the same as the third but on a reduced set of organisms. The fifth (green) combines the weighted hypergeometric p-value and a p-value for the observed or a smaller number of runs in the observed matches. Methods are benchmarked against the GO cellular localization and biological process ontologies. The GO p-value for each pair of proteins is the probability for the genes of that pair to share a GO term at least as specific as their most specific shared term, and we compute the cumulative average log GO p-value for the top pairs as ranked by each metric. Introducing runs into the calculations improves results by tending to yield more significant GO p-values. The inset compares the fifth method (green) to a full tree-based method (blue). Due to the computational difficulty of evaluating Pagel's method, we only compared it to our novel method on a random subset of 100,000 benchmarkable pairs. Each such sampled pair represents approximately 35 pairs in a full all-versus-all run. The average log GO p-value over all benchmarkable pairs is approximately -0.40 and is shown in the inset (but lies above the top of the main plot).
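Two of the scores named in this caption can be written down in a few lines. The sketch below covers only the unweighted hypergeometric tail probability and the entropy-based score; it does not attempt the weighted or runs-based variants, whose details are not given here. It assumes 0/1 presence profiles of equal length, and all function names are hypothetical.

```python
from collections import Counter
from math import comb, log2


def hypergeom_tail(n_genomes, n_a, n_b, k):
    """P(X >= k) for the number of matches X when gene A occurs in n_a and
    gene B in n_b of n_genomes genomes, all placements equally likely."""
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, i) * comb(n_genomes - n_a, n_b - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_genomes, n_b)


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    total = len(symbols)
    return -sum((c / total) * log2(c / total) for c in Counter(symbols).values())


def mutual_information(profile_a, profile_b):
    """H(A) + H(B) - H(A, B), treating each genome as one observation."""
    joint = list(zip(profile_a, profile_b))
    return entropy(profile_a) + entropy(profile_b) - entropy(joint)


def profile_pvalue(profile_a, profile_b):
    """Unweighted hypergeometric p-value for the observed number of matches."""
    k = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    return hypergeom_tail(len(profile_a), sum(profile_a), sum(profile_b), k)


# Hypothetical toy profiles over four genomes.
print(profile_pvalue([1, 1, 0, 0], [1, 1, 0, 0]))
print(mutual_information([1, 1, 0, 0], [1, 0, 1, 0]))
```

Neither score looks at where along the genome ordering the matches fall, which is the information the runs-based term in the fifth method is meant to add.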

    An improved method for identifying functionally linked proteins using phylogenetic profiles-3

    …ghted hypergeometric network that does not use runs. The phylogenetic profiles of the corresponding genes are shown. Significant edges are also shown, with blue edges identified by both methods and green edges belonging only to the runs-informed network. We note that the only elements of this network detected by the non-runs-using method are two highly homologous nitrate reductase complex subunits; the other members are less homologous and are missed by it. Even though these genes occur in relatively few genomes, the genomes in which they occur are widely scattered and form many runs in the profiles, leading to their correct inclusion in the runs-informed network.