
    A probable example of ILS visible on a subtree of an Ensembl gene family.

    <p>The monophyly of the chimpanzee and gorilla genes (ENSPTRP00000033018 and ENSGGOP00000011432) is well supported by the sequences (left tree, constructed by PhyML, with aLRT supports), while synteny argues for orthology of both with the human genes (ENSP00000414208 and ENSP00000378687) (right tree, constructed by ProfileNJ followed by ParalogyCorrector), so that a scenario of duplications and losses compatible with the left tree is unlikely.</p>

    Sequence likelihood, ancestral genome content and ancestral chromosome linearity for ProfileNJ, Synteny and Ensembl trees.

    <p><b>(A)</b> Proportion of trees with a significantly better likelihood computed with PhyML. AU tests were computed for the three trees of each family. If the first-ranked tree was significantly better than the second, it was recorded as having the best likelihood; otherwise the family was recorded as “no significant difference at the first rank”. <b>(B)</b> Gene content computed with DeCo. Gene content has one value for each node of the phylogeny of 63 species, except for extant genomes, for which it has one value for each leaf. <b>(C)</b> Genome linearity computed with DeCo. Genome linearity is represented by a graph whose <i>x</i> axis is the number of neighbors a gene can have and whose <i>y</i> axis shows the proportion of genes having this number of neighbors. Parameters from extant genomes are given as a reference in (B) and (C). Statistics for ancestral genomes are assumed to be better when they are close to the extant ones.</p>
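The linearity statistic in panel (C) can be illustrated with a minimal sketch. It assumes a genome is given as a list of gene names plus a list of adjacency pairs; the function name `neighbor_distribution` and the toy data are hypothetical and are not taken from DeCo.

```python
from collections import Counter

def neighbor_distribution(genes, adjacencies):
    """Map each neighbor count k to the proportion of genes having k neighbors.

    genes: iterable of gene names (hypothetical input format).
    adjacencies: iterable of (gene_a, gene_b) pairs, one per adjacency.
    """
    degree = {gene: 0 for gene in genes}
    for gene_a, gene_b in adjacencies:
        degree[gene_a] += 1
        degree[gene_b] += 1
    counts = Counter(degree.values())
    total = len(degree)
    return {k: counts[k] / total for k in sorted(counts)}

# Toy ancestral genome: g2 has three neighbors, which breaks linearity.
toy_genes = ["g1", "g2", "g3", "g4"]
toy_adjacencies = [("g1", "g2"), ("g2", "g3"), ("g2", "g4")]
distribution = neighbor_distribution(toy_genes, toy_adjacencies)
```

In a linear or circular chromosome every gene has at most two neighbors, so any mass at three or more neighbors signals a non-linear reconstruction.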

    A general view of RefineTree when run on the Ensembl Compara gene families.

    <p>An example is given for a species tree <i>S</i> of four fish species, a gene family of six genes (a gene is represented by the picture of the species it belongs to, and two paralogs belonging to the same species are distinguished by a different frame color), a rooted gene tree <i>G</i> (although it can be unrooted in general) with branch support, and a given threshold for branch contraction. Data framed in black are the input and those framed in blue are the output of the correction algorithm labeling the edge linking the considered frames. Black arrows depict the use we make of RefineTree on the Ensembl gene trees. The green arrow and the green “or” are alternative uses avoiding one or both of the correction tools ParalogyCorrector and Unduplicator. Any framed set of data can be alternatively provided to the pipeline as input. For example, orthology constraints obtained from various sources can be directly provided as input to ParalogyCorrector. The method for inferring orthology constraints from synteny blocks is described in the text.</p>
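Branch contraction under a support threshold can be sketched as follows. This is an illustration only, not the RefineTree implementation: the tree encoding (a leaf is a string, an internal node is a `(support, children)` pair) and the function name `contract` are assumptions made for the example.

```python
def contract(node, threshold):
    """Collapse every internal branch whose support is below the threshold.

    A leaf is a string; an internal node is (support, [children]).
    The children of a collapsed node are attached to its parent,
    which produces a polytomy. The root itself is never collapsed.
    """
    if isinstance(node, str):
        return node
    support, children = node
    kept = []
    for child in children:
        child = contract(child, threshold)
        if not isinstance(child, str) and child[0] < threshold:
            kept.extend(child[1])  # weak branch: splice grandchildren in
        else:
            kept.append(child)
    return (support, kept)

# Toy gene tree with one weakly supported branch (support 0.4).
toy_tree = (1.0, [(0.4, ["a", "b"]), (0.9, ["c", "d"])])
contracted = contract(toy_tree, 0.5)
```

The resulting polytomies are the places where a correction step is free to choose a resolution.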

    Topology accuracy of RAxML, TreeFix and ProfileNJ trees, measured by RF distance to the true tree, on ∼2500 simulated trees from the fungal dataset.

    <p>We use a sample of trees simulated under four different DL rates: (1<i>r</i><sub><i>D</i></sub>—1<i>r</i><sub><i>L</i></sub>), (2<i>r</i><sub><i>D</i></sub>—2<i>r</i><sub><i>L</i></sub>), (4<i>r</i><sub><i>D</i></sub>—4<i>r</i><sub><i>L</i></sub>) and (4<i>r</i><sub><i>D</i></sub>—1<i>r</i><sub><i>L</i></sub>). The plot shows the percentage of reconstructed trees (y-axis) with a given RF distance (x-axis) to the true tree. TreeFix and ProfileNJ have similar reconstruction accuracy (75% of trees match the true trees), while the input trees (RAxML) have the lowest accuracy. The graph is truncated on the right but contains more than 99% of the data.</p>
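For readers unfamiliar with the RF (Robinson-Foulds) distance, here is a minimal sketch of the rooted, clade-based variant: the number of clades present in exactly one of the two trees. The caption does not say whether the rooted or unrooted variant was used, so this choice, the nested-tuple tree encoding, and the function names are assumptions.

```python
def clades(tree):
    """Return the clades of a rooted tree as a set of frozensets of leaf names.

    A leaf is a string; an internal node is a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        under = frozenset().union(*(leaves(child) for child in node))
        found.add(under)
        return under

    leaves(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Count the clades found in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))

# Toy example on four leaves; both trees must share the same leaf set.
true_tree = (("A", "B"), ("C", "D"))
estimated_tree = (("A", "C"), ("B", "D"))
distance = rf_distance(true_tree, estimated_tree)
```

A distance of 0 means the reconstructed topology matches the true tree, which is the case the 75% figure in the caption refers to.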

    RefineTree web interface.

    <p>The input is a species tree (or by default the Ensembl species tree) and a gene tree (or an Ensembl gene tree ID), gene sequences and additional options such as the branch contraction threshold, the request to test all roots, the maximum number of trees to be output by ProfileNJ and sorted by likelihood, etc. The integrated algorithms are ProfileNJ and ParalogyCorrector. Using this second algorithm requires, in addition, the input of a set of orthology constraints.</p>

    The unduplication principle (figure redrawn from [33]).

    <p>A non-linearity is detected in an ancestral genome (gene <i>g</i> has three neighbors). Two of its neighbors, <i>g</i><sub>1</sub> and <i>g</i><sub>2</sub>, descend from a possibly dubious node labeled as a duplication. The tree is rearranged so that its root is labeled with a speciation instead of a duplication. In the resulting configuration <i>g</i><sub>1</sub> and <i>g</i><sub>2</sub> are in two different species, so that <i>g</i> can have only one neighbor in this family, and linearity is recovered.</p>

    Maps of viral siRNAs and their contigs.

    <p>The graphs plot the number of 20–25 nt viral siRNA reads (redundant and non-redundant) at each nucleotide position of the genomes of CaMV, ORMV and CaLCuV (DNA-A and DNA-B). Bars above the axis represent sense reads starting at the respective positions; those below represent antisense reads ending at the respective positions. The circular DNA genomes of CaMV and CaLCuV and the linear RNA genome of ORMV are shown below the graphs, with the siRNA contigs covering the genomes depicted as green lines with arrowheads. Mismatches between the ORMV contig and the reference genome are indicated.</p>

    VIGS phenotypes and accumulation of primary and secondary siRNAs in L2 transgenic plants infected with CaLCuV::GFP viruses targeting the <i>GFP</i> transcribed region.

    <p>(<b>A</b>) The L2 T-DNA region containing the 35S-GFP transgene is shown schematically. Positions of the duplicated CaMV 35S enhancer and core promoter elements, <i>GFP</i> mRNA elements including 5′UTR, translation start (AUG) and stop (UAA) codons and 3′UTR with poly(A) signal (AAUAAA), and 35S terminator sequences are indicated. Numbering is from the T-DNA left border (LB). The VIGS target sequences, inserted in the CaLCuV::GFP viruses <i>Lead</i>, <i>CodB, CodM, CodE, Trail</i> and <i>polyA</i>, are indicated with dotted boxes; (<b>B</b>) Pictures under UV light of L2 transgenic plants infected with the above viruses; (<b>C</b>) Blot hybridization analysis of total RNA isolated from plants shown in Panel B. The blot was successively hybridized with short DNA probes specific for CaLCuV <i>AC4</i> gene (AC4_s) and 35S::GFP transgene sequences inserted in the CaLCuV::GFP viruses (<i>Lead, CodB, CodM, CodE, Trail</i> and <i>polyA</i>), the <i>GFP</i> mRNA 3′UTR non-target sequence (3′UTR) and <i>Arabidopsis</i> miR173 and Met-tRNA (the latter two serve as loading controls).</p>