Density plot for correlation scores using Jones-Taylor-Thornton matrix for common interacting and non-interacting protein pairs from Dataset 1 (A) and the corresponding ROC plot (B).
Average correlation vs. evolutionary span for Dataset 3.
<p>A). Interacting protein pairs. B). Non-interacting protein pairs. The evolutionary span is defined as the time since the last common ancestor of the most distantly related species in the data subset. Correlation scores are mean values for each evolutionary span; error bars show the standard deviation of the correlation scores within the respective range. The range of conservation is defined by the range of the relevant OMA orthology sets. Time since the last common ancestor is derived from the TimeTree database <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0081100#pone.0081100-Hedges1" target="_blank">[35]</a>. The mean score is lower and the standard deviation is larger for data subsets that contain only closely related species.</p>
Plot of sensitivity, specificity, and MCC vs. threshold for binary classification using Dataset 1.
<p>In this case the peak of the MCC (dashed vertical line) occurs where the specificity is somewhat larger than the sensitivity. A user may wish to use a threshold either larger or smaller than the position of the MCC peak, depending on whether specificity or sensitivity is more highly valued.</p>
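The threshold sweep described in this caption can be illustrated with a minimal sketch. It assumes that a pair is called "interacting" when its correlation score is at or above the threshold; the function names, the toy scores, and the candidate thresholds are hypothetical and are not taken from the article.

```python
import math

def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called positive.
    (Assumption: higher correlation score means 'interacting'.)"""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def sens_spec_mcc(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC); undefined ratios become 0.0."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

def best_mcc_threshold(scores, labels, thresholds):
    """Pick the candidate threshold with the highest MCC."""
    return max(
        thresholds,
        key=lambda t: sens_spec_mcc(*confusion_counts(scores, labels, t))[2],
    )

# Hypothetical toy data: two non-interacting and two interacting pairs.
toy_scores = [0.1, 0.2, 0.8, 0.9]
toy_labels = [False, False, True, True]
print(best_mcc_threshold(toy_scores, toy_labels, [0.0, 0.5, 1.0]))
```

Moving the threshold above the MCC peak trades sensitivity for specificity, and moving it below does the reverse, which is the choice the caption leaves to the user.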
Matthews correlation coefficient (MCC) vs. choice of binary classification threshold for Datasets 1, 2, 3.
<p>There is a much higher and more distinct peak for Dataset 1, supporting the inference derived from the relative AUC scores (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0081100#pone-0081100-g001" target="_blank">Figures 1</a>, <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0081100#pone-0081100-g002" target="_blank">2</a>, and <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0081100#pone-0081100-g003" target="_blank">3</a>) that Dataset 1 provides the best differentiation between the interacting and non-interacting pairs.</p>
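For readers unfamiliar with the AUC scores mentioned here, the sketch below computes the area under the ROC curve through its rank interpretation: the fraction of (interacting, non-interacting) score pairs in which the interacting pair scores higher, with ties counted as one half. This is a generic illustration with a hypothetical function name and toy numbers, not the authors' code.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count 0.5). Quadratic in sample size; fine for a sketch."""
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical toy scores: perfectly separated classes give an AUC of 1.0.
print(roc_auc([0.8, 0.9], [0.1, 0.2]))
```

An AUC near 0.5 means the score does not separate the two classes, while values approaching 1.0 indicate better differentiation.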
Within-ortholog-set degree of conservation of protein sequences (mean pairwise fraction identity for all orthologs in each set) vs. protein pair correlation score for Dataset 1.
<p>A). Scatter plot of degree of conservation vs. protein pair correlation score for interacting protein pairs. B). Scatter plot of degree of conservation vs. protein pair correlation score for non-interacting protein pairs. C). Mean degree of conservation vs. protein pair correlation score for interacting pairs, with standard deviation as error bars. D). Mean degree of conservation vs. protein pair correlation score for non-interacting pairs, with standard deviation as error bars.</p>
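The degree of conservation used in this figure is the mean pairwise fraction identity over all orthologs in a set. A minimal sketch follows, assuming the orthologs are already aligned to equal length and that columns with a gap in either sequence are skipped; both the gap handling and the function names are assumptions made for illustration, since the caption does not specify them.

```python
from itertools import combinations

def fraction_identity(seq_a, seq_b, gap="-"):
    """Fraction of compared alignment columns with identical residues.
    Assumption: columns where either sequence has a gap are ignored."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    return identical / compared if compared else 0.0

def mean_pairwise_identity(aligned_seqs):
    """Average fraction identity over all unordered pairs in one ortholog set."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(fraction_identity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical toy alignment of three short orthologs.
print(mean_pairwise_identity(["ACDE", "ACDF", "ACGF"]))
```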
Correlation density plot for interacting (A) and non-interacting (B) protein pairs of different evolutionary span from Dataset 3.
<p>In this plot we separately consider the protein pairs that are conserved only in chordates, the pairs that are conserved across the metazoans but not elsewhere in the eukaryotes, and finally the protein pairs that are distributed across the eukaryotes beyond the metazoans. C). The corresponding ROC plots for the correlation analysis of these 3 sub-datasets.</p>
Heat maps and statistical analysis of the p-values for the number of orthologs for each species and each set.
<p>Each row of the heat maps in Figs 3a and 3b represents one of the honey bee experimental sets. Each column represents one species. Fig 3a is based on the InParanoid ortholog database and Fig 3b is based on the OrthoMCL database. The white column represents the honey bee. The species are ordered along the x-axis by evolutionary distance from the honey bee based on the NCBI taxonomy common tree (<a href="http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi" target="_blank">http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi</a>). The order is further refined based on the tree from Flybase (<a href="http://flybase.org/blast/species_tree.png" target="_blank">http://flybase.org/blast/species_tree.png</a>), WormBook [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.ref042" target="_blank">42</a>] and UCSC Genome Browser [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.ref043" target="_blank">43</a>]. The evolutionary relationships are illustrated by the cladogram along the base of each heat map. The species to the left of the honey bee (the white vertical column) belong to lineages that diverged from the insects earlier than the insects diverged from the lineage leading to the mammals. The species immediately to the right of the honey bee are insects and other arthropods. The species at the far right are mammals. Between the insects and the mammals are marine invertebrates, marine chordates, fish, amphibians, and birds. The shading code (vertical bar on the right-hand side of the figure) represents the p-value for statistical significance of enrichment of orthology in each species relative to the degree of orthology of all the genes on the microarray. The numerical data for these plots are in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.s004" target="_blank">S1 Table</a>.
Panel c uses boxplots to show the enhanced orthology between the honey bee alarm pheromone set and the placental mammals. Reading from left to right, the boxplots show: i) the distribution of individual species p-values for orthology enhancement between the honey bees and the placental mammals using the InParanoid database, ii) the distribution of individual species p-values for orthology enhancement between the honey bees and their fellow arthropods using the InParanoid database, iii) the same as i) but using the OrthoMCL database, and iv) the same as ii) but using the OrthoMCL database. The p-values for orthology enhancement of the honey bee alarm pheromone set are much lower (and therefore more favorable) for placental mammals than for arthropods. This is in spite of the fact that overall the honey bee is much more closely related to the other arthropods than to the mammals, as indicated by the cladograms in Fig 3a and 3b. The relative positions of the boxplots in Fig 3c are the opposite of what would be expected if the degree of orthology followed the species relationship. Fig 3d shows the Kolmogorov-Smirnov cumulative fraction difference plot for the distributions of p-values for orthology enrichment of the Alarm Pheromone gene set against the placental mammals (solid trace) and against the arthropods, using the InParanoid orthology database. The horizontal axis is the p-value. The vertical position at each point on a trace is the fraction of p-values in that trace that fall below the indicated value. One important feature of such a plot is the maximum vertical distance between the two traces, D. In this figure, D is 0.75. P, the likelihood that this difference would be achieved by chance for a distribution with this number of data points, is 0.001. Fig 3e shows the same graph as Fig 3d using the OrthoMCL orthology database. In this graph the value of D is 1.0, since there is no overlap between the distributions at all.
P, the likelihood that this degree of separation would occur by chance among members of the same underlying distribution, is vanishingly small. Based on the Kolmogorov-Smirnov statistics depicted in Fig 3d and 3e, we can say with confidence that the Alarm Pheromone honey bee gene set is relatively more enriched in orthologs to placental mammals than in orthologs to other arthropods. Fig 3f shows the Kolmogorov-Smirnov cumulative fraction plot for the differences between the p-values for orthology enrichment for the soldier cg and soldier wg sets. The degree of conservation is dramatically higher for the cg than the wg set, which can also be inferred qualitatively from the shading in the heat map of Fig 3a. The value of D, the greatest vertical distance between the two traces shown by the double-headed arrow, is 0.81. P, the probability that the difference between the two traces is due to chance, is < 0.0005.</p>
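The statistic D described in this caption is the maximum vertical distance between two empirical cumulative fraction curves. The sketch below computes only D for two samples of p-values; it does not compute the significance P. The function name and the sample values are hypothetical and are not the article's data.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the largest absolute difference
    between the two empirical cumulative fraction curves."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The curves only change at observed values, so checking those suffices.
    for x in sorted(set(a) | set(b)):
        frac_a = bisect_right(a, x) / len(a)  # fraction of a at or below x
        frac_b = bisect_right(b, x) / len(b)  # fraction of b at or below x
        d = max(d, abs(frac_a - frac_b))
    return d

# Hypothetical p-values with no overlap between the groups, so D is 1.0,
# the same situation the caption describes for Fig 3e.
group_one = [0.001, 0.002, 0.003]
group_two = [0.2, 0.5, 0.9]
print(ks_statistic(group_one, group_two))
```

If SciPy is available, `scipy.stats.ks_2samp` returns both the statistic and a p-value, which covers the significance step this sketch leaves out.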
Examples of Behavior/neural-related "Eutheria-conserved" Alarm_Pheromone Genes' Human Orthologs (Part 1).
Statistics of the ortholog count data of sets of differentially expressed honey bee genes.
Normalized distribution of the number of orthologs in 54 species for all of the "Array-Unspotted" honey bee genes.
<p>"Array-Unspotted" means that these genes are present in the InParanoid database but not spotted on the Honey Bee Oligonucleotide Microarray. There are 1631 such honey bee genes. The x-axis is the number of orthologs in 54 species (53 metazoan species + yeast). The y-axis is the percentage of these 1631 honey bee genes that have the corresponding number of orthologs. (Note the vertical scale difference between Figs <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.g001" target="_blank">1</a> and 2).</p>