14 research outputs found
Oxytricha trifallax macronuclear PCAP 2.1.8 assembly
Oxytricha trifallax macronuclear PCAP 2.1.8 assembly
Oxytricha trifallax macronuclear IDBA assembly
Contigs in gzipped fasta format
Oxytricha trifallax macronuclear PE-Assembler/SSAKE assembly
Contigs in gzipped fasta format
Oxytricha trifallax macronuclear genome fosmids
Oxytricha trifallax macronuclear genome fosmids
Development of the <i>Oxytricha</i> macronuclear genome from the micronuclear genome.
<p>During conjugation of <i>Oxytricha</i> cells, segments of the micronuclear genome (MDSs) are excised and stitched together to form the nanochromosomes of the new macronuclear genome, and the remainder of the micronuclear genome is eliminated (including the IESs interspersed between MDSs). The old macronuclear genome is also degraded during development. The segments that are stitched together may be either in order (e.g., forming nanochromosome 1, on the left) or out of order or inverted (e.g., forming the two forms of nanochromosome 2), in which case they need to be “unscrambled.” Two rounds of DNA amplification produce nanochromosomes at an average copy number of ~1,900 <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Prescott1" target="_blank">[2]</a>. Alternative fragmentation of DNA during nanochromosome development may also occur, irrespective of unscrambling, giving rise to longer (2a) and shorter (2b) nanochromosome isoforms. The mature nanochromosomes are capped on both ends with telomeres.</p>
Comparison of key ciliate macronuclear genomes.
<p>The phylogeny represents the bootstrap consensus of 100 replicates from PhyML (with the HKY85 substitution model) based on a MUSCLE multiple sequence alignment of 18S rRNA genes from seven ciliates (<i>Oxytricha trifallax</i>: FJ545743; <i>Stylonychia lemnae</i>: AJJRB310497; <i>Euplotes crassus</i>: AJJRB310492; <i>Nyctotherus ovalis</i>: AJ222678; <i>Tetrahymena thermophila</i>: M10932; <i>Ichthyophthirius multifiliis</i>: IMU17354; and <i>Paramecium tetraurelia</i>: AB252009) rooted with two other alveolates (<i>Perkinsus marinus</i>: X75762 and <i>Plasmodium falciparum</i>: NC_004325). All bootstrap values are ≥80, except for the node between <i>Nyctotherus</i> and <i>Oxytricha</i>/<i>Stylonychia</i>/<i>Euplotes</i>, which has a bootstrap value of 60. <i>Euplotes</i> and <i>Nyctotherus</i> both have nanochromosomes, like <i>Oxytricha</i>. Other than the genome statistics for <i>Oxytricha trifallax</i>, which were determined in this study, table statistics were obtained from the following sources: <sup>a</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Prescott1" target="_blank">[2]</a>, <sup>b</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Duerr1" target="_blank">[22]</a>,<a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Lipps1" target="_blank">[116]</a>, <sup>c</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Nock1" target="_blank">[117]</a>, <sup>d</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Bender1" target="_blank">[99]</a>, <sup>e</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Ricard1" target="_blank">[94]</a>, <sup>f</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Eisen1" target="_blank">[56]</a> (the number of chromosomes is an estimate), <sup>g</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Coyne3" target="_blank">[118]</a>, <sup>h</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-White1" target="_blank">[119]</a>, <sup>i</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Austerberry1" target="_blank">[120]</a>, <sup>j</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Coyne2" target="_blank">[64]</a> (for a single stage of the <i>Ichthyophthirius</i> life cycle), <sup>k</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Aury1" target="_blank">[121]</a>, <sup>l</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Duret1" target="_blank">[69]</a>, <sup>m</sup> - <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Gardner1" target="_blank">[122]</a>. Table statistics for <i>Perkinsus marinus</i> are for the current assembly deposited in GenBank (GCA_000006405.1).</p>
Key features of <i>Oxytricha</i> protein-coding nanochromosomes.
<p>Representative nanochromosome features are not drawn to scale, but their lengths are indicated. UTR, untranslated region; UTS, untranscribed region. 3′ UTRs and the subtelomeric signal overlap. The subtelomeric base composition bias signal found on either end of the nanochromosome is shown above the nanochromosome diagram.</p>
Telomere end-binding protein-α paralogs in ciliates.
<p>The phylogeny is an ML tree generated by PhyML <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Guindon1" target="_blank">[123]</a> with a single substitution rate category and the JTT substitution model, optimized for tree topology and branch length. Bootstrap percentages for 1,000 replicates are indicated at the tree nodes. The multiple sequence alignments underlying the phylogeny were produced with MAFFT (v 6.418b <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Katoh1" target="_blank">[124]</a>) (default parameters; BLOSUM 62 substitution matrix) and were trimmed with trimal1.2 <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-CapellaGutierrez1" target="_blank">[125]</a> with the “-automated1” parameter to remove excess gaps and poorly aligned regions. GenBank accessions are provided for the taxa unless otherwise indicated. <i>Euplotes crassus</i> is indicated in blue (Q06184 and Q06183), and an additional match from our preliminary <i>Euplotes</i> genome assembly is EUP_contig393834_f1_1. <i>Perkinsus marinus</i> is purple (EER00428) and <i>Oxytricha nova</i> is light green (P29549). <i>Tetrahymena thermophila</i> (salmon color) accessions are from the <i>Tetrahymena</i> genome database <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Stover1" target="_blank">[126]</a>: TTHERM_00378980 and TTHERM_00378990; <i>Paramecium tetraurelia</i>'s TeBP-α protein (pink) is from ParameciumDB <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473-Arnaiz2" target="_blank">[127]</a> (GSPATP00001065001). All the nodes beginning with “Contig” are <i>Oxytricha trifallax</i> TeBP-α paralogs (dark green) and Contig22209.0.g66 is TeBP-α1, the original TeBP-α. 
The tree is rooted at the midpoint of the branch between <i>Arabidopsis thaliana</i> (Pot1a: AAX78213 and Pot1b: AAS99712) and <i>Homo sapiens</i> (Pot1: EAW83616; black) and the rest of the phylogeny. Gene expression levels, given as normalized RNA-seq counts (see <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio.1001473.s059" target="_blank">Text S1</a>; Supporting Materials and Methods), before (“fed”) and during conjugation (0–60 h) are shown for the <i>Oxytricha trifallax</i> TeBP-α paralogs; coding sequence lengths are also indicated (in bp) for each of these paralogs.</p>
Comparison of <i>Oxytricha</i> macronuclear genome assemblies.
<p>The 2-telomere contigs have both 5′ (CCCCAAAACCCC, with degenerate bases; see <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#s3" target="_blank">Materials and Methods</a>) and 3′ (GGGGTTTTGGGG, with degenerate bases) telomeric repeats. Note that 2-telomere contigs are mostly complete nanochromosomes but may also be alternatively fragmented nanochromosomes with one or more additional missing ends, and that multitelomere contigs may be either alternatively fragmented nanochromosomes or nanochromosomes with internal telomere-like repeats. Raw read coverage is calculated from LAST (default parameters; version 159; contig telomeres were masked) matches (≥70 bp long and ≥90% identical) to the assemblies. Read coverage was calculated for the total high quality PE sequence data set and one of the three lanes of SE sequence data. Of the PE reads, 13% were telomere bearing, as opposed to 4.7% of the SE reads.</p>
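<p>The telomere classification described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: it matches only the exact repeats CCCCAAAACCCC and GGGGTTTTGGGG (the degenerate-base rules are in the Materials and Methods and are not reproduced here), and the function name <code>count_terminal_telomeres</code> and the 30 bp search window are hypothetical choices made for illustration.</p>

```python
# Minimal sketch: count how many contig ends carry a telomeric repeat.
# Assumptions (not from the source): exact-match repeats only, and a
# hypothetical search window of 30 bp at each contig end.
FIVE_PRIME_REPEAT = "CCCCAAAACCCC"
THREE_PRIME_REPEAT = "GGGGTTTTGGGG"


def count_terminal_telomeres(contig: str, window: int = 30) -> int:
    """Return 0, 1, or 2 depending on how many ends show a telomeric repeat."""
    seq = contig.upper()
    has_five_prime = FIVE_PRIME_REPEAT in seq[:window]
    has_three_prime = THREE_PRIME_REPEAT in seq[-window:]
    return int(has_five_prime) + int(has_three_prime)


if __name__ == "__main__":
    body = "ACGT" * 10
    print(count_terminal_telomeres(FIVE_PRIME_REPEAT + body + THREE_PRIME_REPEAT))
    print(count_terminal_telomeres(FIVE_PRIME_REPEAT + body))
    print(count_terminal_telomeres(body))
```

<p>A contig scoring 2 under this sketch corresponds to a "2-telomere contig"; detecting internal telomere-like repeats in multitelomere contigs would need a scan of the whole sequence, which is omitted here.</p>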
Nanochromosomal variant frequencies.
<p>(A) Normalized to form a probability density (cumulative frequency of 1) and (B) unnormalized median nanochromosomal variant frequencies for six increasing ranges of mean SNP heterozygosity. Variant frequencies were determined for nanochromosomes with no non-self matches to the genome assembly (the same nanochromosomes underlying the SNP heterozygosity histogram for “matchless” nanochromosomes in <a href="http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.1001473#pbio-1001473-g004" target="_blank">Figure 4</a>), with variant positions called at the same minimum variant frequency as that used to determine potentially heterozygous sites (5% for sites with ≥20× read coverage). To exclude potentially paralogous mapped reads, we only analyzed nanochromosomes with ≤4 reads mapped to other contigs (using all nanochromosomes does not substantially change the form of the distributions). Variant frequency bins are labeled by their lower bounds. Variant frequencies ≥40 bp from either nanochromosome end were counted to avoid possible incorrect variant calling resulting from telomeric bases that were not masked (due to sequencing errors).</p>
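<p>The site-level thresholds stated in this caption (minimum variant frequency of 5%, read coverage of at least 20×, and at least 40 bp from either nanochromosome end) can be sketched as a simple filter. This is an illustration under assumptions, not the authors' code: the <code>Site</code> record, the 0-based position convention, and the way distance to the nearest end is measured are all hypothetical.</p>

```python
# Minimal sketch of the per-site thresholds named in the caption.
# Assumptions (not from the source): the Site record layout, 0-based
# positions, and distance measured as bases between the site and the end.
from dataclasses import dataclass


@dataclass
class Site:
    position: int       # 0-based position on the nanochromosome
    coverage: int       # total reads covering the site
    variant_reads: int  # reads supporting the variant allele


def is_counted_variant(site: Site, contig_length: int,
                       min_freq: float = 0.05,
                       min_coverage: int = 20,
                       end_margin: int = 40) -> bool:
    """Apply the coverage, end-distance, and frequency thresholds in turn."""
    if site.coverage < min_coverage:
        return False
    distance_to_nearest_end = min(site.position,
                                  contig_length - 1 - site.position)
    if distance_to_nearest_end < end_margin:
        return False
    return site.variant_reads / site.coverage >= min_freq


if __name__ == "__main__":
    print(is_counted_variant(Site(100, 20, 2), contig_length=1000))
    print(is_counted_variant(Site(10, 50, 10), contig_length=1000))
```

<p>The nanochromosome-level filter (at most 4 reads mapped to other contigs) would be applied before this step and is not shown.</p>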