Localization of ENOR Transcripts
<p>qRT-PCR was carried out using total and cytoplasmic RNA from mouse whole
brain and the corresponding primer pairs (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020037#pgen-0020037-st003" target="_blank">Table
S3</a>). ENORs are listed in increasing order based on the
estimated length of each region. Apart from the results shown, we also
examined the localization of other mRNAs (<i>β-actin</i> and
<i>GAPDH</i>) and additional regions of
<i>Rian</i> and other ENORs, and these results were
consistent with those shown (unpublished data).</p>
Northern Blot Analysis of ENOR Transcripts
<p>Mouse whole brain total RNA (10 μg/lane) was used for the analysis except
for ENOR2 and ENOR61, where mouse thymus total RNA was used. DNA
fragments without any predicted repeated sequences were PCR-amplified
from cDNAs in ENORs (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020037#pgen-0020037-st003" target="_blank">Table S3</a>), labeled with
<sup>32</sup>P-dCTP (Amersham Biosciences), and then used as probes.
RNA size was estimated with an RNA ladder (Invitrogen). ENORs are listed
in increasing order based on the estimated length of each region.</p>
Discovery Pipeline for ENORs
<p>FANTOM and public transcripts were clustered into 37,348 TUs by grouping
any two or more transcripts that shared genomic coordinates. Then, the
following procedures were applied. (1) Protein-coding TUs were excluded
by removing any whose transcripts had an open reading frame of either
150 amino acids or more (RIKEN/MGC cDNAs) or one amino acid or more
(non-RIKEN/MGC cDNAs). (2) TUs wholly encompassed within introns of
protein-coding TUs were excluded to avoid possible pre-mRNA intronic
transcripts. (3) Intron-containing TUs were excluded to select for
unspliced transcripts. (4) TUs lacking adjunct adenine-rich regions or
containing polyA signals were excluded to select for internally primed
transcripts. (5) Remaining UNA TUs that mapped within 100 Kb of one
another on the mouse genome (mm5) were clustered together, provided they
did not overlap the genomic coordinates of a protein-coding TU/NCBI
RefSeq/Ensembl gene model with a CDS of 150 amino acids or more or a
noncoding TU with a polyA signal within 100 bp of the 3′ end and without
an adjunct adenine-rich region. (6) Reliably expressed UNA TU clusters
were selected by identifying those with at least ten supporting ESTs.
(7) Selected UNA TU clusters were then manually screened and separated
based upon evidence of possible internal transcription start sites
(inferred from CpG islands, CAGE tags, and EST clusters), resulting in the
identification of 66 ENORs.</p>
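<p>The automated part of the procedure above (steps 1 to 4 and steps 5 and 6 in simplified form) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the <code>TU</code> record and all of its field names are hypothetical, the check in step 5 against overlapping protein-coding gene models is omitted, the EST count is assumed to be summed over a cluster, and the manual screening of step 7 is not modeled.</p>

```python
from dataclasses import dataclass


@dataclass
class TU:
    """Hypothetical transcriptional-unit record; every field name is an assumption."""
    name: str
    chrom: str
    start: int
    end: int
    est_count: int
    is_riken_mgc: bool = False
    longest_orf_aa: int = 0
    inside_coding_intron: bool = False
    has_intron: bool = False
    has_a_rich_region: bool = True
    has_polya_signal: bool = False


def is_una(tu):
    """Apply filters (1)-(4) to a single TU."""
    orf_cutoff = 150 if tu.is_riken_mgc else 1
    if tu.longest_orf_aa >= orf_cutoff:      # (1) protein-coding
        return False
    if tu.inside_coding_intron:              # (2) possible pre-mRNA intronic transcript
        return False
    if tu.has_intron:                        # (3) spliced
        return False
    if not tu.has_a_rich_region or tu.has_polya_signal:  # (4) not internally primed
        return False
    return True


def cluster_by_distance(tus, max_gap=100_000):
    """(5, simplified) Group TUs on the same chromosome lying within max_gap bp."""
    clusters = []
    for tu in sorted(tus, key=lambda t: (t.chrom, t.start)):
        if clusters:
            last = clusters[-1]
            same_chrom = last[0].chrom == tu.chrom
            gap = tu.start - max(t.end for t in last)
            if same_chrom and gap <= max_gap:
                last.append(tu)
                continue
        clusters.append([tu])
    return clusters


def candidate_clusters(tus, min_ests=10):
    """(6) Keep clusters supported by at least min_ests ESTs in total."""
    una = [tu for tu in tus if is_una(tu)]
    return [c for c in cluster_by_distance(una)
            if sum(tu.est_count for tu in c) >= min_ests]
```

<p>Calling <code>candidate_clusters</code> on a list of such records returns the clusters that would then go forward to manual screening.</p>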
Presence of Transcription between Adjacent cDNAs
<p>PCR was carried out with and without reverse transcription (RT[+] and
RT[−], respectively) using midbrain total RNA and the corresponding
primer pairs (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020037#pgen-0020037-st003" target="_blank">Table S3</a>). PCR using genomic DNA was
also carried out as a control. A DNA ladder (Promega; <a href="http://www.promega.com" target="_blank">http://www.promega.com</a>) was used as a
size marker. The identity of the amplified fragments was confirmed by
analyzing their digestion patterns with several restriction enzymes. The
lower band observed in the RT(+) lane for amplified fragment C
appears to be nonspecific, because it was amplified using only the right
primer and because its restriction digestion pattern differed markedly
from those of the upper band and the genomic DNA band
(unpublished data).</p>
qRT-PCR Analysis
<p>Analysis of (A) <i>Air,</i> (B) ENOR28, and (C) ENOR31 loci.
Above in each panel, screen shots of the GEV featuring the loci around
<i>Air,</i> ENOR28, and ENOR31 are shown. The orange bars
indicate the regions for <i>Air,</i> ENOR28, and ENOR31. cDNA
sequences from the RIKEN and public databases are shown. Sequences
mapped on the plus strand and minus strand are brown and purple,
respectively. Predicted genes from Ensembl, NCBI, and RefSeq databases
are shown in gray. For RIKEN imprinted transcripts, imprinted cDNA
candidates identified previously [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020037#pgen-0020037-b038" target="_blank">38</a>] are shown. CpG islands as defined
by the UCSC Genome Browser are shown. Positions of primer pairs are
marked by small vertical arrows. Below in each panel, qRT-PCR results
for midbrain, hippocampus, thalamus, striatum, and testis using the
corresponding primer pairs are shown.</p>
Snapshots of the GEV Showing Transcription
<div><p>(A) The <i>Air</i>/<i>Igf2r</i> locus (Chromosome 17:
12,091,531–12,258,195).</p>
<p>(B) The <i>Xist</i>/<i>Tsix</i> locus (X chromosome:
94,835,096–94,888,536).</p>
<p>(C) The dystrophin <i>(Dmd)</i> locus (X chromosome:
76,500,000–76,754,601).</p>
<p>For the transcripts, cDNA sequences from the RIKEN and public databases
are shown, and are colored in brown and purple depending upon their
chromosomal strand of origin. Predicted genes from Ensembl, NCBI, and
RefSeq databases are shown in gray. CpG islands as defined by the UCSC
Genome Browser are shown. Blue circles indicate unspliced, noncoding
RIKEN cDNAs with adjunct adenine-rich regions. Red circles indicate
RIKEN imprinted cDNA candidates [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020037#pgen-0020037-b038" target="_blank">38</a>].</p></div>
TU Pairs Searched For
<p>We defined a <i>cis–</i>antisense pair as two oppositely transcribed TUs that share at least 20 bp of exon sequence, a non-exon-overlapping antisense pair as two oppositely transcribed TUs that overlap by at least 20 bp, but not within exons, and a bidirectionally promoted pair as two divergently transcribed TUs that overlap by less than 20 bp and are less than 1,000 bp apart.</p>
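<p>These three definitions can be expressed as a small classifier. The sketch below is illustrative only and rests on several assumptions: the <code>TU</code> record and its fields are hypothetical, coordinates are 0-based and half-open, exons within one TU do not overlap each other, a pair whose spans overlap by at least 20 bp but whose exons share fewer than 20 bp is treated as non-exon-overlapping, and the distance between divergent TUs is measured between their 5′ ends.</p>

```python
from dataclasses import dataclass


@dataclass
class TU:
    """Hypothetical TU record: strand is '+' or '-', exons are (start, end) pairs."""
    strand: str
    exons: list

    @property
    def start(self):
        return min(s for s, _ in self.exons)

    @property
    def end(self):
        return max(e for _, e in self.exons)


def exon_overlap(a, b):
    """Total number of bases shared between the exons of a and b."""
    return sum(max(0, min(e1, e2) - max(s1, s2))
               for s1, e1 in a.exons for s2, e2 in b.exons)


def span_overlap(a, b):
    """Number of bases shared between the genomic spans of a and b."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def classify_pair(a, b, min_overlap=20, max_distance=1000):
    """Return the pair type, or None if the pair fits none of the definitions."""
    if a.strand == b.strand:
        return None
    if exon_overlap(a, b) >= min_overlap:
        return "cis-antisense"
    if span_overlap(a, b) >= min_overlap:
        return "non-exon-overlapping antisense"
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    # Distance from the minus-strand 5' end to the plus-strand 5' end;
    # slightly negative values mean the divergent TUs overlap by a few bases.
    gap = plus.start - minus.end
    if -min_overlap < gap < max_distance:
        return "bidirectional"
    return None
```

<p>The order of the tests matters in this sketch: exon overlap is checked first, so a pair is only considered for the other two categories once it has failed the <i>cis–</i>antisense criterion.</p>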
TSS Variability at the <i>Ddx49</i>/<i>Cope</i> Bidirectional Promoter in Mouse
<div><p>(A) The charts show the distribution of CAGE tag 5′-ends over the first five exons of each of the two genes <i>Ddx49</i> and <i>Cope,</i> and over their intergenic region. CAGE tag mappings indicate that transcription of <i>Cope</i> can start within two wide regions in the first exon of the gene. The initial part of this first exon (hatched) has support from several ESTs, but no cDNA sequences. The three large TCs at the <i>Ddx49</i>/<i>Cope</i> locus span 79, 114, and 150 bp, indicating great variability of transcriptional initiation within each cluster. To confirm the existence of such variability by qRT-PCR, primers (connected boxes) were designed to measure expression of selected regions of the <i>Ddx49</i> (primer pairs A1–A4) and <i>Cope</i> (primer pairs B1–B5) transcripts.</p>
<p>(B) Detailed view of CAGE tag frequencies and primer locations over the three transcription initiation regions indicated by CAGE tags. Gray lines show cumulative CAGE tag frequencies.</p>
<p>(C) Expression levels of different regions of the <i>Ddx49</i> and <i>Cope</i> transcripts in adult brain RNA as measured by qRT-PCR. Primer pairs A1 and A2 confirmed the low expression level of the longest <i>Ddx49</i> transcripts indicated by CAGE (copy numbers in 12.5 ng of total RNA were 3.2 [standard deviation = 1.1] and 5.1 [standard deviation = 3.0] for A1 and A2, respectively). Primer pair B1 confirmed transcription of <i>Cope</i> from upstream of the canonical initiation region. Primer pairs B2–B4 supported variability of transcriptional initiation within the canonical region.</p></div>
Estimating the Extent and Conservation of Antisense Transcription
<div><p>(A and B) Estimation of proportion of TUs involved in <i>cis–</i>antisense pairs. Open circles indicate the fraction of all human TUs on the plus strand (A) and all mouse TUs on the plus strand (B) that were found to be involved in <i>cis–</i>antisense pairs when the minus-strand TUs were recomputed starting from random transcript sequence samples of different sizes. Filled circles represent the full datasets based on all available transcript sequences. The saturation curves (see Equation 1) indicated by the lines fit almost perfectly to the sampled data. Fitted human and mouse saturation curves approach 0.45 and 0.43, respectively, as the number of transcript sequences increases, indicating that more than 40% of all TUs might be involved in <i>cis–</i>antisense pairs. Similar estimates were obtained by other sampling approaches (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020047#pgen-0020047-sg003" target="_blank">Figure S3</a>).</p>
<p>(C) Estimation of the proportion of human <i>cis–</i>antisense pairs that are conserved in mouse. Open circles indicate the proportion of human <i>cis–</i>antisense pairs found to be conserved in mouse when the full human dataset was compared to mouse datasets recomputed from random mouse transcript sequence samples of different sizes. The same type of saturation curve as in (A) was fitted to the data. Here, a model with <i>c</i> = 1 (i.e., hyperbolic saturation) was preferable as it provided an equally good fit while being simpler. The fitted curve approaches 0.25 as the number of mappings grows, indicating that about 25% of human <i>cis–</i>antisense pairs are conserved in mouse.</p></div>
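<p>To illustrate how an asymptote such as 0.25 can be read off a hyperbolic saturation curve, the sketch below fits <code>y = a * x / (b + x)</code> to sampled points, where <code>a</code> is the value the curve approaches as <code>x</code> grows. Equation 1 is not reproduced in this caption, so the functional form is an assumption for the <i>c</i> = 1 case only, and the double-reciprocal least-squares fit used here is a simple stand-in rather than the fitting procedure used in the study. The data points in the usage example are synthetic.</p>

```python
def fit_hyperbola(xs, ys):
    """Fit y = a * x / (b + x) via least squares on 1/y = 1/a + (b/a) * (1/x).

    Returns (a, b), where a is the asymptote approached as x grows.
    Assumes all xs and ys are positive.
    """
    u = [1.0 / x for x in xs]
    v = [1.0 / y for y in ys]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    slope = (sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
             / sum((ui - mean_u) ** 2 for ui in u))
    intercept = mean_v - slope * mean_u
    a = 1.0 / intercept
    b = slope * a
    return a, b


# Synthetic example: points generated from a curve with asymptote 0.25.
sample_sizes = [1000, 5000, 20000, 50000, 100000]
fractions = [0.25 * x / (20000 + x) for x in sample_sizes]
asymptote, half_saturation = fit_hyperbola(sample_sizes, fractions)
```

<p>With noisy data, the double-reciprocal transform gives disproportionate weight to small samples, so a direct nonlinear fit would usually be preferable in practice.</p>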
Landmark Sequence Composition of Bidirectional Promoters
<p>We defined the midpoint of a bidirectional promoter as the midpoint between the most 5′ TSS in each of the two divergently oriented TCs defining the bidirectional promoter. Sequences corresponding to the region spanned by the TCs were extracted from the genomic plus strand. All bidirectional promoter sequences were aligned at their midpoint and the logo created with WebLogo [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020047#pgen-0020047-b049" target="_blank">49</a>]. The logo displays the four nucleotides ranked by their frequency at each position, so that more common nucleotides appear above less common ones. The charts above the logo show the distribution of CAGE tag 5′-ends mapping to the plus strand (upper chart) and minus strand (lower chart) around bidirectional promoter midpoints. The CAGE tag distribution was computed as the sum of tag counts at each position over all bidirectional promoters. The peak of nearly 5,000 tags on the plus strand is due to the <i>Rps2</i> gene, which appears to be most highly expressed from a single TSS.</p>
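<p>The midpoint definition and the summed CAGE tag distribution can be sketched as below. The input layout is hypothetical: each promoter is a dictionary with a <code>"plus"</code> and a <code>"minus"</code> tag cluster, each mapping a genomic TSS position to its tag count. The sketch assumes that the most 5′ TSS is the smallest coordinate on the plus strand and the largest on the minus strand, and that midpoints are rounded down to an integer position.</p>

```python
from collections import Counter


def promoter_midpoint(plus_positions, minus_positions):
    """Midpoint between the most 5' TSS of the plus- and minus-strand TCs."""
    most_5prime_plus = min(plus_positions)    # 5' end is leftmost on the plus strand
    most_5prime_minus = max(minus_positions)  # 5' end is rightmost on the minus strand
    return (most_5prime_plus + most_5prime_minus) // 2


def summed_tag_profiles(promoters):
    """Sum tag counts per midpoint-relative position over all promoters.

    Returns (plus_profile, minus_profile), each a Counter keyed by the
    offset from the promoter midpoint.
    """
    plus_profile, minus_profile = Counter(), Counter()
    for promoter in promoters:
        mid = promoter_midpoint(promoter["plus"], promoter["minus"])
        for pos, count in promoter["plus"].items():
            plus_profile[pos - mid] += count
        for pos, count in promoter["minus"].items():
            minus_profile[pos - mid] += count
    return plus_profile, minus_profile
```

<p>Because the counts are summed rather than averaged, a single highly expressed promoter can dominate one position in the resulting profile, which is consistent with the <i>Rps2</i> peak described above.</p>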