29 research outputs found

    Example of a phylogenetic tree with a topology that supports the hybridization scenario.

    No full text
    Example of a phylogenetic tree in which two copies were retained after the formation of the tetraploid. One copy shows topology A, while the second copy shows topology C. The putative duplication event is indicated in red. Support for the topology is indicated as aLRT values.

    Assessment of hybridization parental lineages.

    No full text
    (A) Schematic example of how the pre-KLE node is found in the common ancestor of the two parents, whereas the tetraploid was formed afterwards. (B) Schematic example of duplication inference at the pre-KLE position from a gene tree with genes coming from the two parents. (C) Top: maximum likelihood species tree representing the evolution of S. pastorianus. The tree was obtained using the same approach as the tree in Fig 1: 215 alignments from genes present in single copy in S. pastorianus and with orthologs in all the species considered were concatenated and analysed using maximum likelihood. Bootstrap support was maximal (100%) in all branches. The red dot marks the branch where the duplication peak is found. Bottom: graph showing the duplication density (duplications per gene per branch) found at three different branches in the species tree. (D) Schematic representation of the inferred positions of the putative parents, relative to the main fungal groups considered. The most likely position of the two parents is marked with a black dashed line, while a second possible position is marked with a grey dashed line. Data on which this figure is based are provided in S1 Data.

    Topological analysis of polyploids.

    No full text
    (A) Phylogenetic representation of the three possible topologies regarding the placement of the post-WGD and the two parental sequences (ZT and KLE). The pie chart on the left shows the average percentage of trees in all the S. cerevisiae reduced phylomes that supported each topology. The average was calculated from the results of the different reduced phylomes (see S9 Fig). The pie chart on the right shows the same percentages, but using only those trees within the reduced phylomes that contain S. cerevisiae proteins with a conserved ohnolog. The average was calculated from the results of the different reduced phylomes considering only trees in which the S. cerevisiae sequence has a conserved ohnolog (see S10 Fig). Numbers below the pie charts indicate the average number of trees that passed the filters and the percentage this represents of the total. (B) Same pie charts as in A but for two genomes that underwent a WGD. (C) Same pie charts as in A but for two genomes that underwent a hybridization. (D) Same pie charts as in A but for two genomes that have not been duplicated. (E) Same pie chart as in A but for the simulated phylome. Data on which this figure is based are provided in S1 Data.

    Evidence of a duplication peak pre-dating the WGD.

    No full text
    (A) Evolutionary relationships of the analysed species. The tree was built using a maximum likelihood approach on a concatenated alignment of 516 widespread orthologs. All branches had maximal bootstrap support (100%). The WGD and the pre-KLE (Kluyveromyces, Lachancea, and Eremothecium) branches are marked with coloured circles. Branches in the lineage leading from S. cerevisiae to the root are numbered from most ancestral (n1) to most recent (n8). (B) Duplication densities (duplications per gene per branch) calculated for each annotated branch, using either the entire set of gene trees (green dots) or only the ohnologs (yellow dots). (C) Sequence divergence between yeast sequences belonging to two populations: duplications mapped at the WGD branch (blue) and duplications mapped at the pre-KLE branch (red). Graphs show frequencies of normalized BLAST scores, Kimura distances, and estimated divergence ages, respectively. The normalized BLAST score is the BLAST score obtained when aligning the seed yeast protein to its ohnolog pair, divided by the BLAST score obtained when aligning the seed yeast protein to itself. The Kimura distance between the two sequences was calculated using protdist as implemented in the PHYLIP package after aligning the two sequences. PL-R8s [14] was used to assess the divergence times in individual trees that contained two ohnologous genes. Data on which this figure is based are provided in S1 Data.
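    The two ratios named in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the input layout, and all numbers are hypothetical, and the denominator used for the duplication density (the number of gene trees considered for a branch) is an assumed reading of "duplications per gene per branch", not a confirmed description of the authors' pipeline.

```python
from collections import Counter


def normalized_blast_score(pair_score, self_score):
    """BLAST score of seed vs. ohnolog divided by the score of seed vs. itself."""
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    return pair_score / self_score


def duplication_density(duplication_branches, trees_per_branch):
    """Duplications mapped to each branch divided by the number of gene
    trees considered for that branch (assumed denominator)."""
    counts = Counter(duplication_branches)
    return {
        branch: counts[branch] / n_trees
        for branch, n_trees in trees_per_branch.items()
        if n_trees > 0
    }


# Made-up numbers, for illustration only.
score = normalized_blast_score(pair_score=412.0, self_score=824.0)
density = duplication_density(
    ["WGD", "WGD", "pre-KLE", "pre-KLE", "pre-KLE"],
    {"WGD": 10, "pre-KLE": 10, "n1": 10},
)
print(score, density)
```

    A normalized score near 1 means the ohnolog is almost as similar to the seed protein as the seed is to itself; lower values indicate more divergence.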

    Genome rearrangements of syntenic blocks.

    No full text
    Average number of genome rearrangements as calculated by MGR (Multiple Genome Rearrangements) [43] for each syntenic block inferred from Gordon et al. [11]. Orange dots represent the number of rearrangements between the S. cerevisiae block and the orthologs found in the ZT genomes, while the green dots show the same value for the comparison between the S. cerevisiae genome and the KLE genomes. Data on which this figure is based are provided in S1 Data.

    Signatures of gene expression levels in branch lengths of protein evolutionary trees

    No full text
    Introduction. Over 3500 microbial genomes have been sequenced and this number is rising rapidly, leading to new insight into pathogenicity and drug resistance, host-microbiome interactions, survival in extreme environments, etc.
    Microbes may adapt to diverse environments by changes in gene expression, encompassing changes in basal expression levels and changes in transcriptional regulation. Predicting such changes will prove useful in elucidating various aspects of microbial physiology and ecology.
    Motivation, goals. It is known that the expression level of a gene influences the rate of evolutionary change in the corresponding protein: highly expressed proteins evolve more slowly. We aim to systematically investigate whether this correlation is strong enough to be predictive of gene/protein levels in practical terms. Furthermore, a tremendous amount of high-throughput phylogenetics data is available, e.g. in PhylomeDB, presenting a unique opportunity for unbiased large-scale screens. We aim to further develop a data mining methodology for systematic screens for evolutionary signatures in phylogenomics data, in theory applicable to any gene functional property.
    Methods. Reconstructing phylogenetic trees for all proteins across 19 diverse bacterial genomes (phylomes) allowed us to compare the terminal branch lengths in the trees to experimentally measured mRNA levels. In addition, we have systematically examined a number of other features extracted from the phylogenetic trees: (A) topological similarity to the consensus tree, (B) tree length, root-to-tips, (C) terminal branch lengths, (D) gene family distribution breadth across genomes, and (E) number and age of duplications. As a baseline, we compared against the prediction accuracy of codon biases, a widely accepted sequence signature of basal expression levels.
    The predictive power of each set of features was evaluated using Random Forests (ensembles of M5' regression trees in Weka).
    Results. The signatures of evolutionary history at the protein sequence level captured by our phylogenetic tree descriptors predict gene expression as well as the codon biases at the DNA level. The two sources of expression-related evolutionary signal complement each other to some extent.
    A combination of only two sets of phylogenetic features was highly informative: (i) the terminal branch lengths, and (ii) the number and age of duplications. The first finding is consistent with the slower evolution rate of highly expressed proteins, while the second might reflect the divergence in expression levels after duplications.
    The predictive ability of the tree features varies greatly between the 19 bacterial genomes examined. Additionally, the correlation coefficients between the codon bias and the tree features are themselves correlated.
    (Poster presented at the ECCB 2012 conference.)
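    The basic relationship this abstract builds on (longer terminal branches going with lower expression) can be checked with a rank correlation. The sketch below is not the poster's method, which used Random Forests in Weka; it swaps in a plain Spearman correlation as a simpler illustration. The function names and the five data points are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up values: terminal branch length per protein and its mRNA level.
branch_lengths = [0.05, 0.10, 0.20, 0.40, 0.80]
mrna_levels = [900.0, 500.0, 300.0, 120.0, 40.0]
rho = spearman(branch_lengths, mrna_levels)
print(rho)
```

    With these toy values the ranks are perfectly reversed, so the coefficient comes out at (or within rounding error of) -1; real data would give a much weaker signal.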

    Additional file 2: Table S1. of Phylogenomics of the olive tree (Olea europaea) reveals the relative contribution of ancient allo- and autopolyploidization events

    No full text
    List of species included in the reconstruction of the eight phylomes used in this study. Columns indicate, in this order, the species code for each species, the species name, the source for the protein and the coding DNA sequences, and the phylome in which the species was used (O. europaea var. europaea-215, F. excelsior-216, M. guttatus-217, S. indicum-218, U. gibba-219, S. miltiorrhiza-220, O. e. var. europaea-221, and O. e. var. sylvestris-222).
    Table S2. List of the GO terms enriched in the expanded protein families and at each evolutionary period as described in Fig. 1b. The first column shows the GO term, the second the term level, the third the p value, and the fourth the term name.
    Table S3. List of parsimony scores for each of the different hypotheses shown in Additional file 9: Figure S8, considering the two sets of trees with EST data. Nodes are named as shown in Fig. 3.
    Table S4. Syntenic regions between coffee and olive used in Fig. 4. The first column gives the letter of the graph. The second and sixth columns show the scaffold names used in the graph (names starting with “C” are for coffee and “O” are for olive). The third and seventh columns show the scaffold names of the genome in coffee and olive, respectively. The fourth and fifth columns show the start and end of the region in coffee. The eighth and ninth columns show the start and end of the syntenic region in olive. (XLS 698 kb)

    Additional file 10: Figure S9. of Phylogenomics of the olive tree (Olea europaea) reveals the relative contribution of ancient allo- and autopolyploidization events

    No full text
    Example gene tree that shows the three events we have described in olive: the species-specific duplication and the two allopolyploidizations. The whole-genome duplication previously described in non-Oleaceae Lamiales and the species-specific duplications in U. gibba can also be seen. (TIFF 1402 kb)

    Additional file 11: Figure S10. of Phylogenomics of the olive tree (Olea europaea) reveals the relative contribution of ancient allo- and autopolyploidization events

    No full text
    Species tree of the family Oleaceae, including P. angustifolia, F. excelsior, J. sambac, Olea europaea subsp. europaea var. europaea, and Olea europaea subsp. europaea var. sylvestris. The duplication rates are shown in red for set 1 (gene trees that included genes of J. sambac and P. angustifolia) and in blue for set 2 (gene trees that have a monophyletic clade of the family Oleaceae). The bars on the right show the taxonomic classification. (TIFF 494 kb)