13 research outputs found
Additional file 1: of New genomic data and analyses challenge the traditional vision of animal epithelium evolution
Figure S1A. Comparison of p120 sequences. Residues involved in interaction with E-cadherin are boxed in red. Most of them are conserved. Figure S1B. Comparison of β-catenin sequences. A single β-catenin gene copy was identified in every studied species except for calcareous sponges, which exhibit a duplication. All residues essential for E-cadherin interaction are boxed in pink and are highly conserved except for the R386 and N387 residues (replaced by L and T, respectively) in two hexactinellids and a more anecdotal change from A656 to S in placozoans. Residues boxed in blue are involved in α-catenin binding and those boxed in orange form the DTDL PDZ-binding motif. Figure S1C. Analyses of α-catenin and vinculin sequences. Sequences of α-catenins and vinculins were aligned based on the structural domains helix0 to helix5 in Mus musculus α-catenin and vinculin. Helices are boxed and the numbers at the end of each sequence indicate the range encompassed in the alignment. Secondary structure prediction by JNet (Jalview option) identified six helices in all sponge α-catenin sequences except for A. queenslandica (missing the first four helices) and A. vastus (missing helix0). All species analyzed in this study have one copy of α-catenin and one copy of vinculin, well separated in the Bayesian tree with high support (pp = 1) (bottom). Figure S2. Structure of Par3 proteins in metazoans. Par3 exhibits a conserved N-terminal domain (CR1), three central PDZ domains, and a C-terminal region containing multiple protein binding sites including the aPKC-binding motif. Figure S5. Domain composition of PatJ (D. melanogaster), INADL and MUPP1 (M. musculus) and Multiple PDZ containing protein (MPDZ) (O. lobularis, S. ciliatum, A. queenslandica and O. minuta). Note that only O. lobularis exhibits an MPDZ with a well-detected L27 domain (E-value = 8.5 × 10⁻⁴), as in bilaterians. A. queenslandica and S. ciliatum MPDZ have a low-scoring L27 domain (shaded in grey) according to the HMM profile search. 
There is no recognizable similarity to the L27 domain in the N-terminal region of O. minuta MPDZ. Tables S1 and S2. Information on the domain structures of the Lethal giant larvae (LGL) and Scribble (Src) proteins in various metazoans. Spreadsheet containing Tables S3 and S4. The spreadsheet contains information on the characteristics of the new private databases and public databases (nature = genome/transcriptome) and links for new sequences used in this study. Spreadsheet containing Tables S5 and S6. The spreadsheet contains information on the accession numbers or contig/scaffold references where candidate genes were identified. Accession numbers of sequences annotated from our new transcriptomic and genomic sponge datasets are in bold. Links for new sequences used in this study are provided. (PDF 2610 kb)
DoS statistics as a function of GC3 and expression level.
<p>Correlation between GC3 and DoS computed on WS changes (left panel) or between expression level (measured through RPKM) and DoS computed on UP changes (right panel). Pearson correlation coefficients are given for each species (red: significant at the 5% level, blue: non-significant).</p>
Patterns of codon preference among the 11 studied species.
<p>The colour scale indicates the magnitude of ΔRSCU, the difference in the Relative Synonymous Codon Usage between highly and lowly expressed genes. The greenest codons are the most preferred and the reddest the least preferred. Codons ending in G or C are in red and those ending in A or T in blue.</p>
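As a rough illustration of the quantity behind this colour scale, the sketch below computes RSCU with its standard definition (observed codon count divided by the mean count over the synonymous codons of the same amino acid) and then takes the difference between a highly and a lowly expressed gene set. The two-amino-acid codon table, the function names and the toy counts are assumptions made for the example; they are not taken from the study.

```python
# Toy synonymous families (assumption: only two amino acids, for brevity).
SYNONYMS = {
    "Lys": ["AAA", "AAG"],
    "Phe": ["TTT", "TTC"],
}

def rscu(codon_counts):
    """RSCU = observed count / mean count over the synonymous codons."""
    values = {}
    for codons in SYNONYMS.values():
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent from this gene set
        expected = total / len(codons)
        for c in codons:
            values[c] = codon_counts.get(c, 0) / expected
    return values

def delta_rscu(high_counts, low_counts):
    """Difference in RSCU between highly and lowly expressed genes."""
    high, low = rscu(high_counts), rscu(low_counts)
    return {c: high[c] - low[c] for c in high if c in low}

# Hypothetical pooled codon counts for the two expression classes.
high = {"AAA": 20, "AAG": 80, "TTT": 30, "TTC": 70}
low = {"AAA": 50, "AAG": 50, "TTT": 60, "TTC": 40}
delta = delta_rscu(high, low)
for codon, value in sorted(delta.items()):
    print(codon, round(value, 2))
```

In this toy input, codons with a positive difference (AAG, TTC) would sit at the preferred end of the scale and those with a negative difference at the other end.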
Combined effect of GC3 and expression level on DoS statistics.
<p>The DoS statistic was computed on W/S (gBGC) or U/P (SCU) changes for four gene categories: GC-rich and highly expressed, GC-rich and lowly expressed, GC-poor and highly expressed, GC-poor and lowly expressed.</p>
List of studied species and datasets characteristics.
Schematic presentation of the method to estimate recent and ancestral gBGC or SCU.
<p>In addition to polymorphic derived mutations used to infer recent gBGC or selection (<i>B</i><sub>1</sub>/<i>S</i><sub>1</sub>) as in [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006799#pgen.1006799.ref038" target="_blank">38</a>], we also consider substitutions (<i>i</i>.<i>e</i>. fixed derived mutations) on the branch leading to the focal species. Each box corresponds to a site position in a sequence alignment. Both kinds of mutations are polarized with the same two outgroups and are thus sensitive to the same probability of polarization error. We assume that gBGC and selection may have changed, so that fixed mutations may have undergone a different intensity. Note that these two <i>B</i> or <i>S</i> values correspond to averages of potentially more complex variations over the two periods.</p>
Phylogeny of the species used in this study.
<p>Phylogenetic relationship of the species used in this study. The phylogeny was computed with PhyML [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006799#pgen.1006799.ref075" target="_blank">75</a>] on a set of 33 1–1 orthologous protein clusters obtained with SiLiX [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006799#pgen.1006799.ref076" target="_blank">76</a>] and the resulting tree was made ultrametric (see untransformed trees in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006799#pgen.1006799.s015" target="_blank">S5</a> and <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006799#pgen.1006799.s016" target="_blank">S6</a> Figs). Images for <i>S</i>. <i>bicolor</i>, <i>T</i>. <i>monococcum</i>, <i>D</i>. <i>abyssinica</i> and <i>O</i>. <i>europaea</i> come from the pixabay website. Images for <i>S</i>. <i>pimpinellifolium</i> and <i>M</i>. <i>acuminata</i> are provided by the authors. All other images come from the Wikimedia website.</p>
Separated estimations of recent and ancestral gBGC (<i>B</i> = 4<i>N</i><sub><i>e</i></sub><i>b</i>) and SCU (<i>S</i> = 4<i>N</i><sub><i>e</i></sub><i>s</i>).
Relationship between the frequency of optimal codons (FOP) and expression in the 11 studied species.
<p>For each species, genes have been split into eight expression categories (based on RPKM) of equal size, and the mean FOP for each category is plotted with its 95% confidence interval.</p>
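The binning described in this caption can be sketched as follows: rank genes by RPKM, cut them into eight groups of (near-)equal size, and summarise FOP in each group. The function name, the input layout and the synthetic data are hypothetical. The caption does not say how the 95% confidence interval was obtained, so the normal approximation (1.96 × standard error) used here is an assumption.

```python
import math
import statistics

def fop_by_expression_bin(genes, n_bins=8):
    """genes: list of (rpkm, fop) pairs, with at least n_bins entries.

    Returns one (mean_fop, ci_half_width) pair per bin, ordered from the
    lowest to the highest expression category.
    """
    ranked = sorted(genes, key=lambda g: g[0])
    n = len(ranked)
    summaries = []
    for i in range(n_bins):
        # Integer arithmetic keeps bin sizes within one gene of each other.
        chunk = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        fops = [fop for _, fop in chunk]
        mean = statistics.mean(fops)
        if len(fops) > 1:
            # Assumption: normal-approximation 95% CI, 1.96 * standard error.
            half = 1.96 * statistics.stdev(fops) / math.sqrt(len(fops))
        else:
            half = float("nan")
        summaries.append((mean, half))
    return summaries

# Synthetic data: 80 genes whose FOP rises slightly with expression.
genes = [(float(i), 0.4 + 0.001 * i) for i in range(80)]
result = fop_by_expression_bin(genes)
for rank, (mean, half) in enumerate(result, start=1):
    print(f"bin {rank}: mean FOP = {mean:.4f} +/- {half:.4f}")
```

Plotting the per-bin means against bin rank, with the half-widths as error bars, would give one curve per species in the style the caption describes.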
Best model for the joined estimations of recent and ancestral gBGC (<i>B</i> = 4<i>N</i><sub><i>e</i></sub><i>b</i>) and SCU (<i>S</i> = 4<i>N</i><sub><i>e</i></sub><i>s</i>).