7 research outputs found

    RNA Editome in Rhesus Macaque Shaped by Purifying Selection

    No full text
    <div><p>Understanding of the RNA editing process has been broadened considerably by next-generation sequencing technology; however, several issues regarding this regulatory step remain unresolved – the strategies to accurately delineate the editome, the mechanism by which its profile is maintained, and its evolutionary and functional relevance. Here we report an accurate and quantitative profile of the RNA editome for rhesus macaque, a close relative of humans. By combining genome and transcriptome sequencing of multiple tissues from the same animal, we identified 31,250 editing sites, of which 99.8% are A-to-G transitions. We verified 96.6% of editing sites in coding regions and 97.5% of randomly selected sites in non-coding regions, as well as the corresponding levels of editing, by multiple independent means, demonstrating the feasibility of our experimental paradigm. Several lines of evidence supported the notion that adenosine deamination is associated with the macaque editome – A-to-G editing sites were flanked by sequences with the attributes of <i>ADAR</i> substrates, and both the sequence context and the expression profile of <i>ADARs</i> are relevant factors in determining the quantitative variance of RNA editing across different sites and tissue types. In support of the functional relevance of some of these editing sites, a substitution valley of decreased divergence was detected around the editing site, suggesting evolutionary constraint in maintaining some of these editing substrates with their double-stranded structure. These findings thus complement the “continuous probing” model that postulates tinkering-based origination of a small proportion of functional editing sites. In conclusion, the macaque editome reported here highlights RNA editing as a widespread functional regulation in primate evolution, and provides an informative framework for further understanding RNA editing in humans.</p></div>

    Experimental and computational strategies for accurate editome identification in rhesus macaque.

    No full text
    <p>Potential false-positives in the RNA editing calling workflow were minimized by a more thorough pipeline design. (<b>A</b>) Two discrepancies between RNA and genomic-DNA sequences (highlighted by blue boxes) were located in a <i>cis</i>-natural antisense region where both DNA strands could be transcribed. Strand-specific RNA-Seq clearly distinguished the sequence reads transcribed from the two strands and correctly assigned this site as A-to-G editing, as no discrepancy was detected in the plus-strand transcribed gene. (<b>B</b>) Based on the macaque gene structures defined in-house (<b>RhesusBase Structure</b>), one of the exon-intron boundaries of <i>ENSMMUT00000021567</i> was incorrectly defined by a previous annotation (<b>Ensembl Structure</b>). Two T-to-A DNA-RNA discrepancies (highlighted by blue boxes) would be incorrectly identified as T-to-A RNA editing if the RNA-Seq reads were aligned to the mis-annotated transcript structure. (<b>C</b>) The genotype of the site highlighted in the blue boxes was incorrectly recognized as homozygous in DNA and heterozygous in RNA, since only 1 out of 28 sequence reads supported the mutant allele T in DNA, leading to incorrect assignment of a C-to-T editing event. Both Sequenom mass array and Sanger sequencing validations excluded such false-positives, which may arise due to low sequencing coverage and biased allele capture efficiency in the exome-Seq assay.</p>
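The strand issue in panel (A) can be illustrated with a minimal Python sketch. It assumes that mismatches are reported on the genome's plus strand and that each read carries a strand label; the function name `classify_mismatch` is hypothetical and is not taken from the authors' pipeline.

```python
# Complement table for DNA bases (uppercase A/C/G/T only).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_mismatch(ref_base, rna_base, strand):
    """Return the mismatch type as seen on the transcribed strand.

    ref_base and rna_base are given relative to the genome's plus strand.
    strand is '+' or '-' and says which strand the read was transcribed from.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if strand == "-":
        # A minus-strand transcript carries the complement of the plus-strand bases.
        ref_base = COMPLEMENT[ref_base]
        rna_base = COMPLEMENT[rna_base]
    return f"{ref_base}-to-{rna_base}"


print(classify_mismatch("T", "C", "-"))  # A-to-G
print(classify_mismatch("T", "C", "+"))  # T-to-C
```

Without the strand label, the same plus-strand T/C discrepancy could not be told apart from a genuine A-to-G event on the antisense transcript.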

    <i>ADAR</i>-mediated enzymatic reactions are associated with the macaque editome.

    No full text
    <p>(<b>A</b>) The enriched (above the top line) and depleted (below the bottom line) nucleotides near the focal editing sites are displayed in a Two-Sample Logo, with the level of preference/depletion shown in height proportional to the scale. (<b>B</b>) The editing sites were divided into four categories on the basis of the local sequence context near the editing site, as described in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004274#s4" target="_blank"><b>Materials and Methods</b></a>. For each category, levels of RNA editing are shown in boxplots according to the tissue types. (<b>C</b>) Distribution of the percentages of editing sites whose editing levels across tissues are positively correlated with the expression of <i>ADARs</i> (Spearman's rank correlation coefficient ≥0.5), for 10,000 permutation datasets neglecting tissue relationships for the tissue expression profile. The percentage for the real data is indicated by the arrow with the <i>Monte Carlo p-value</i>. (<b>D</b>) Distributions of <i>R<sup>2</sup></i> values in models assuming association of editing level with <i>ADARs</i> expression are shown for the <b>Real Data</b>, as well as for the <b>Background</b>, which corresponds to randomly shuffled profiles. (<b>E, F</b>) The tissue expression profiles of <i>ADAR1</i> or <i>ADAR2</i> were ordered based on RNA expression levels, and normalized editing levels of A-to-G sites were aligned accordingly. These A-to-G editing sites showed similar trends in the distribution of editing levels along the ordered tissue expression profile of <i>ADAR1</i> (<b>E</b>) or <i>ADAR2</i> (<b>F</b>).</p>
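The permutation analysis in panel (C) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the null distribution is built by shuffling the tissue order of the expression profile, and the p-value uses the common (k + 1) / (n + 1) convention, which the caption does not specify.

```python
import random


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # rho is undefined for constant input; treated here as 0
    return cov / (vx * vy) ** 0.5


def fraction_correlated(editing, expression, cutoff=0.5):
    """Fraction of sites whose editing levels reach rho >= cutoff."""
    hits = sum(1 for site in editing if spearman(site, expression) >= cutoff)
    return hits / len(editing)


def monte_carlo_p(editing, expression, n_perm=10000, cutoff=0.5, seed=0):
    """Observed fraction and its Monte Carlo p-value under tissue shuffling."""
    rng = random.Random(seed)
    observed = fraction_correlated(editing, expression, cutoff)
    shuffled = list(expression)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the tissue-to-expression pairing
        if fraction_correlated(editing, shuffled, cutoff) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)
```

Here `editing` is a list of per-site editing levels (one value per tissue) and `expression` is the <i>ADAR</i> expression level in the same tissue order.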

    Contribution of purifying selection to the RNA editome in primates.

    No full text
    <p>(<b>A</b>) The percentages of macaque editing sites with corresponding editing sites in human and/or chimpanzee (red bars), or genomically encoded in the two species (blue bars), are shown for the total editome (top), or for editing sites in different genomic regions (bottom). (<b>B</b>) The genomic sequences near the macaque editing sites were compiled according to their distances to the editing sites. For each 6-nucleotide window, the proportion of divergent sites between human and rhesus macaque is shown for different genomic categories. (<b>C</b>) Distribution of human-macaque synonymous divergent sites near the A-to-G editing sites. The codons with RNA-editing sites are highlighted in yellow and each synonymous divergent site in purple. The distribution of synonymous divergence (<i>dS</i>) values near the RNA-editing site, calculated using a 6-codon window, is shown in the lower panel, with the genome-wide <i>dN</i> and <i>dS</i> between human and rhesus macaque indicated by the dotted line.</p>
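The windowed divergence in panel (B) can be sketched in Python as below. The caption does not say whether the windows overlap or how alignment gaps are handled, so this sketch assumes non-overlapping windows over a pairwise alignment and skips gapped columns; `window_divergence` is a hypothetical name.

```python
def window_divergence(seq_a, seq_b, window=6):
    """Proportion of divergent sites per non-overlapping window.

    seq_a and seq_b are two aligned sequences of equal length.
    Columns with a gap ('-') in either sequence are ignored.
    A window with no comparable columns yields None.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    proportions = []
    for start in range(0, len(seq_a) - window + 1, window):
        pairs = [
            (a, b)
            for a, b in zip(seq_a[start:start + window], seq_b[start:start + window])
            if a != "-" and b != "-"
        ]
        if not pairs:
            proportions.append(None)
        else:
            divergent = sum(1 for a, b in pairs if a != b)
            proportions.append(divergent / len(pairs))
    return proportions


# One mismatch in the first 6-nt window, none in the second.
print(window_divergence("AAAAAAGGGGGG", "AAAAATGGGGGG"))
```

A "substitution valley" would show up as a run of windows near the editing site whose proportions fall below those of the flanking windows.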

    Genome-wide identification and verification of RNA editome in one rhesus macaque.

    No full text
    <p>(<b>A</b>) Overview of the experimental design – genome-wide identification, and medium- or low-throughput verification of RNA-editing sites. (<b>B</b>) An example showing the genotyping results for the genomic DNA (gDNA) and complementary DNA (cDNA) of one verified RNA-editing site (chr11:5028364, <i>KCNA1</i>). The levels of RNA editing were estimated from high-throughput, medium-throughput and low-scale data on the basis of read number, signal intensity contrast and peak height ratio between the edited and wild-type alleles, respectively. The primer peak and the genotype peak on the mass spectrum are indicated by dotted lines in red. (<b>C</b>) Comparison of the levels of RNA editing estimated by high-throughput (H), medium-throughput (M) and low-scale (L) platforms. The example in (<b>B</b>) is highlighted in red. Pearson correlation coefficients between different platforms are shown on the right.</p>
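As a rough illustration of panels (B) and (C), the sketch below estimates an editing level from read counts and compares two platforms with a Pearson correlation. The caption only says the estimate is based on read number, so the formula edited / (edited + wild-type) is an assumption, and both function names are hypothetical.

```python
def editing_level(edited_reads, wildtype_reads):
    """Fraction of reads carrying the edited allele (assumed estimator)."""
    total = edited_reads + wildtype_reads
    if total == 0:
        return None  # no coverage, level cannot be estimated
    return edited_reads / total


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Made-up read counts for three sites: (edited, wild-type) per site.
high_throughput = [editing_level(e, w) for e, w in [(30, 70), (5, 95), (60, 40)]]
# Made-up levels for the same sites from a second platform.
medium_throughput = [0.28, 0.07, 0.63]
print(pearson(high_throughput, medium_throughput))
```

The same `pearson` call applies to any pair of platforms, provided the levels are listed for the same sites in the same order.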

    Characteristics of the rhesus macaque editome.

    No full text
    <p>(<b>A</b>) For editing sites in each type of tissue, the distribution of the levels of RNA editing is shown in a boxplot. (<b>B</b>) Hierarchical clustering of editing levels of all editing sites across multiple macaque tissues and animals. Editing levels were estimated on the basis of RNA-Seq data in this study (Testis, Lung, Kidney, Heart, Muscle, Prefrontal cortex) and other public RNA-Seq data [Brain (1–6), Cerebellum (1–2), Muscle (1–8), Heart (1–5), Kidney (1–3), Lung (1–3), Testis (1–3)], with missing data shown in dark cyan. (<b>C</b>) Hierarchical clustering of editing levels is shown for selected RNA editing sites located in coding regions. Editing levels were estimated on the basis of mass array-based genotyping in seven tissues derived from the same macaque (Testis, Lung, Kidney, Heart, Muscle, Cerebellum, Prefrontal Cortex), as well as five muscle and four brain samples obtained from different macaque animals [Muscles (A–E), Whole Brains (A–D)], with missing data shown in dark cyan. (<b>D</b>) The distribution of pair-wise comparisons of intra-population and cross-tissue coefficient of variation (CV) values is shown in a boxplot.</p>
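The CV comparison in panel (D) rests on a simple per-site statistic, sketched below. The caption does not state whether a sample or population standard deviation was used, so the `sample` switch is an assumption, as are the function name and the example values.

```python
import statistics


def coefficient_of_variation(levels, sample=True):
    """CV = standard deviation / mean of a site's editing levels.

    Missing values (None) are dropped. Returns None when fewer than
    two values remain or when the mean is zero.
    """
    values = [v for v in levels if v is not None]
    if len(values) < 2:
        return None
    mean = statistics.fmean(values)
    if mean == 0:
        return None
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    return sd / mean


# Made-up editing levels of one site across tissues and across animals.
cross_tissue = [0.10, 0.45, 0.30, 0.05, None, 0.60]
intra_population = [0.28, 0.31, 0.30, 0.27]
print(coefficient_of_variation(cross_tissue))
print(coefficient_of_variation(intra_population))
```

Computing both values for each site gives the pairs of intra-population and cross-tissue CVs that panel (D) compares.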