# Heat maps and statistical analysis of the p-values of number of orthologs for each species and each set.


## Abstract

<p>Each row of the heat maps in Figs 3a and 3b represents one of the honey bee experimental sets. Each column represents one species. Fig 3a is based on the InParanoid ortholog database and Fig 3b is based on the OrthoMCL database. The white column represents the honey bee. The species are ordered along the x-axis by evolutionary distance from the honey bee based on the NCBI Taxonomy Common Tree (<a href="http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi" target="_blank">http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi</a>). The order is further refined based on the trees from FlyBase (<a href="http://flybase.org/blast/species_tree.png" target="_blank">http://flybase.org/blast/species_tree.png</a>), WormBook [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.ref042" target="_blank">42</a>] and the UCSC Genome Browser [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.ref043" target="_blank">43</a>]. The evolutionary relationships are illustrated by the cladogram along the base of each heat map. The species to the left of the honey bee (the white vertical column) belong to lineages that diverged from the insects earlier than the insects diverged from the lineage leading to the mammals. The species immediately to the right of the honey bee are insects and other arthropods. The species at the far right are mammals. Between the insects and the mammals are marine invertebrates, marine chordates, fish, amphibians, and birds. The shading code (vertical bar on the right-hand side of the figure) represents the p-value for statistical significance of enrichment of orthology in each species relative to the degree of orthology of all the genes on the microarray. The numerical data for these plots are in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004921#pcbi.1004921.s004" target="_blank">S1 Table</a>.
Fig 3c uses boxplots to show the enhanced orthology between the honey bee alarm pheromone set and the placental mammals. Reading from left to right, the boxplots show: i) the distribution of individual species p-values for orthology enhancement between the honey bee and the placental mammals using the InParanoid database, ii) the distribution of individual species p-values for orthology enhancement between the honey bee and its fellow arthropods using the InParanoid database, iii) the same as i) but using the OrthoMCL database, and iv) the same as ii) but using the OrthoMCL database. The p-values for orthology enhancement of the honey bee alarm pheromone set are much lower (and therefore more favorable) for placental mammals than for arthropods. This is in spite of the fact that overall the honey bee is much more closely related to the other arthropods than to the mammals, as indicated by the cladograms in Figs 3a and 3b. The relative positions of the boxplots in Fig 3c are the opposite of what would be expected if the degree of orthology followed the species relationship. Fig 3d shows the Kolmogorov-Smirnov cumulative fraction plot for the distributions of p-values for orthology enrichment of the Alarm Pheromone gene set against the placental mammals (solid trace) and against the arthropods, using the InParanoid orthology database. The horizontal axis is the p-value. The height of each trace at a given horizontal position is the fraction of the p-values in that distribution that fall below the indicated value. One important feature of such a plot is the maximum vertical distance between the two traces, D. In this figure, D is 0.75. P, the likelihood that this difference would be achieved by chance for a distribution with this number of data points, is 0.001. Fig 3e shows the same graph as Fig 3d using the OrthoMCL orthology database. In this graph the value of D is 1.0, since there is no overlap between the distributions at all.
P, the likelihood that this degree of separation would occur by chance among members of the same underlying distribution, is vanishingly small. Based on the Kolmogorov-Smirnov statistics depicted in Figs 3d and 3e, we can say with confidence that the Alarm Pheromone honey bee gene set is relatively more enriched in orthologs to placental mammals than in orthologs to other arthropods. Fig 3f shows the Kolmogorov-Smirnov cumulative fraction plot for the differences between the p-values for orthology enrichment for the soldier cg and soldier wg sets. The degree of conservation is dramatically higher for the cg set than for the wg set, which can also be inferred qualitatively from the shading in the heat map of Fig 3a. The value of D, the greatest vertical distance between the two traces (shown by the double-headed arrow), is 0.81. P, the probability that the difference between the two traces is due to chance, is &lt; 0.0005.</p>
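
The statistic D described in the legend (the largest vertical gap between two cumulative fraction traces) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `ecdf` and `ks_distance` are hypothetical, and the input p-values are invented for the example rather than taken from S1 Table. The sketch computes only D; it does not compute the significance value P.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F(x): the fraction of values in `sample` that are <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cumulative_fraction(x):
        return bisect_right(ordered, x) / n

    return cumulative_fraction


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum vertical distance between the two cumulative
    fraction curves. Both curves are step functions that change only
    at observed values, so evaluating them there is sufficient.
    """
    f_a = ecdf(sample_a)
    f_b = ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(f_a(x) - f_b(x)) for x in points)


# Hypothetical p-values, invented for illustration only.
mammal_p = [0.001, 0.002, 0.004, 0.01]
arthropod_p = [0.05, 0.2, 0.4, 0.6]

# The two groups do not overlap, so the curves separate completely.
d = ks_distance(mammal_p, arthropod_p)
print(d)
```

When every value in one group lies below every value in the other, as in this invented input, one curve reaches 1 before the other leaves 0, so D is 1.0. That is the situation the legend describes for Fig 3e.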