269 research outputs found

    Establishing the baseline level of repetitive element expression in the human cortex

    Background: Although nearly half of the human genome is composed of repetitive sequences, the expression profile of these elements remains largely uncharacterized. Recently developed high-throughput sequencing technologies provide a powerful new set of tools to study repeat elements. Hence, we performed whole-transcriptome sequencing to investigate the expression of repetitive elements in human frontal cortex using postmortem tissue obtained from the Stanley Medical Research Institute. Results: We found that a significant number of reads from the human frontal cortex originate from repeat elements. We also noticed that Alu elements were expressed at levels higher than expected from random or background transcription. In contrast, L1 elements were expressed at lower than expected levels. Conclusions: Repetitive elements are expressed abundantly in the human brain. This expression pattern appears to be element-specific and cannot be explained by random or background transcription. These results demonstrate that our knowledge about repetitive elements is far from complete. Further characterization is required to determine the mechanism, the control, and the effects of repeat element expression.

    Exon deletions and intragenic insertions are not rare in ataxia with oculomotor apraxia 2

    Background: The autosomal recessively inherited ataxia with oculomotor apraxia 2 (AOA2) is a neurodegenerative disorder characterized by juvenile or adolescent age of onset, gait ataxia, cerebellar atrophy, axonal sensorimotor neuropathy, oculomotor apraxia, and elevated serum AFP levels. AOA2 is caused by mutations within the senataxin gene (SETX). The majority of known mutations are nonsense, missense, and splice site mutations, as well as small deletions and insertions. Methods: To detect mutations in patients showing a clinical phenotype consistent with AOA2, the coding region including splice sites of the SETX gene was sequenced and dosage analyses for all exons were performed on genomic DNA. The sequence of cDNA fragments of alternative transcripts isolated after RT-PCR was determined. Results: Sequence analyses of the SETX gene in four patients revealed a heterozygous nonsense mutation or a 4 bp deletion in three cases. In another patient, PCR amplification of exon 11 to 15 dropped out. Dosage analyses and breakpoint localisation yielded a 1.3 kb LINE1 insertion in exon 12 (patient P1) and a 6.1 kb deletion between intron 11 and intron 14 (patient P2) in addition to the heterozygous nonsense mutation R1606X. Patient P3 was compound heterozygous for a 4 bp deletion in exon 10 and a 20.7 kb deletion between intron 10 and 15. This deletion was present in a homozygous state in patient P4. Conclusion: Our findings indicate that gross mutations seem to be a frequent cause of AOA2 and reveal the importance of additional copy number analysis for routine diagnostics.

    Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome

    Transposable elements (TEs) have not been regarded simply as “junk DNA” for some time, given the continued discoveries of their multifunctional roles in eukaryotic genomes. Alu, a SINE family and one of the most important and abundant TEs still active in the human genome, has demonstrated indispensable regulatory functions at the sequence level, but its spatial roles are still unclear. Technologies based on 3C (chromosome conformation capture) have revealed the three-dimensional structure of chromatin and make it possible to study distal chromatin interactions in the genome. To find the role TEs play in distal regulation in the human genome, we compiled newly released Hi-C data, TE annotation, histone marker annotations, and genome-wide methylation data for correlation analysis. We found that the density of Alu elements showed a strong positive correlation with the level of chromatin interactions (hESC: r = 0.9, P < 2.2 × 10^−16; IMR90 fibroblasts: r = 0.94, P < 2.2 × 10^−16) and also a significant positive correlation with some remote functional DNA elements such as enhancers and promoters (Enhancer: hESC: r = 0.997, P = 2.3 × 10^−4; IMR90: r = 0.934, P = 2 × 10^−2; Promoter: hESC: r = 0.995, P = 3.8 × 10^−4; IMR90: r = 0.996, P = 3.2 × 10^−4). Further investigation involving GC content and methylation status showed that the GC content of Alu-covered sequences shared a similar pattern with that of the overall sequence, suggesting that Alu elements also function as providers of GC nucleotides and CpG sites. In all, our results suggest that Alu elements may act as an alternative parameter to evaluate Hi-C data, which is confirmed by the correlation analysis of Alu elements and histone markers. Moreover, the GC-rich Alu sequence can bring high GC content and methylation flexibility to the regions with more distal chromatin contact, regulating the transcription of tissue-specific genes.
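
The r values quoted in this abstract are correlation coefficients. As a minimal sketch, assuming they are Pearson coefficients computed over per-region values (the abstract does not say which coefficient or binning was used), the calculation looks like this. The function name and the toy numbers are illustrative assumptions, not data from the paper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-bin values, made up purely for illustration.
alu_density = [0.02, 0.05, 0.08, 0.11, 0.15]
interaction_level = [1.0, 2.1, 2.9, 4.2, 5.0]
r = pearson_r(alu_density, interaction_level)
```

With real data, `alu_density` would hold the Alu density per genomic region and `interaction_level` the matching Hi-C interaction measure for the same regions.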

    Alu pair exclusions in the human genome

    Background: The human genome contains approximately one million Alu elements which comprise more than 10% of human DNA by mass. Alu elements possess direction, and are distributed almost equally in positive and negative strand orientations throughout the genome. Previously, it has been shown that closely spaced Alu pairs in opposing orientation (inverted pairs) are found less frequently than Alu pairs having the same orientation (direct pairs). However, this imbalance has only been investigated for Alu pairs separated by 650 or fewer base pairs (bp) in a study conducted prior to the completion of the draft human genome sequence. Results: We performed a comprehensive analysis of all (> 800,000) full-length Alu elements in the human genome. This large sample size permits detection of small differences in the ratio between inverted and direct Alu pairs (I:D). We have discovered a significant depression in the full-length Alu pair I:D ratio that extends to repeat pairs separated by ≤ 350,000 bp. Within this imbalance bubble (those Alu pairs separated by ≤ 350,000 bp), direct pairs outnumber inverted pairs. Using PCR, we experimentally verified several examples of inverted Alu pair exclusions that were caused by deletions. Conclusions: Over 50 million full-length Alu pairs reside within the I:D imbalance bubble. Their collective impact may represent one source of Alu element-related human genomic instability that has not been previously characterized.
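
The I:D ratio described above can be illustrated with a short sketch. It assumes each element is given as a hypothetical `(chrom, start, end, strand)` tuple and that a pair counts when the gap between the two elements is at most `max_gap` bp. The function name, input layout, and gap definition are assumptions made for illustration, not the authors' pipeline.

```python
from collections import defaultdict


def count_alu_pairs(elements, max_gap):
    """Return (inverted, direct) counts of element pairs within max_gap bp.

    elements: iterable of (chrom, start, end, strand), strand is '+' or '-'.
    A pair is 'direct' if both strands match and 'inverted' otherwise.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, strand in elements:
        by_chrom[chrom].append((start, end, strand))

    inverted = direct = 0
    for items in by_chrom.values():
        items.sort()  # order by start coordinate
        for i in range(len(items)):
            end_i, strand_i = items[i][1], items[i][2]
            for j in range(i + 1, len(items)):
                start_j, strand_j = items[j][0], items[j][2]
                # Starts are sorted, so later elements are even further away.
                if start_j - end_i > max_gap:
                    break
                if strand_i == strand_j:
                    direct += 1
                else:
                    inverted += 1
    return inverted, direct


# Toy coordinates, made up for illustration.
toy_elements = [
    ("chr1", 100, 400, "+"),
    ("chr1", 500, 800, "+"),
    ("chr1", 900, 1200, "-"),
    ("chr1", 500000, 500300, "+"),
    ("chr2", 100, 400, "-"),
]
toy_inverted, toy_direct = count_alu_pairs(toy_elements, max_gap=1000)
```

The I:D ratio is then `toy_inverted / toy_direct`. On a genome-scale input the nested loop grows with the number of pairs inside the window, so a real analysis would likely bin pairs by separation distance instead of using a single cutoff.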

    RISCI - Repeat Induced Sequence Changes Identifier: a comprehensive, comparative genomics-based, in silico subtractive hybridization pipeline to identify repeat induced sequence changes in closely related genomes

    Background: The availability of multiple whole genome sequences has facilitated in silico identification of fixed and polymorphic transposable elements (TEs). Whereas polymorphic loci serve as markers for phylogenetic and forensic analysis, fixed species-specific transposon insertions, when compared to orthologous loci in other closely related species, may give insights into their evolutionary significance. Moreover, TE insertions are not isolated events and are frequently associated with subtle sequence changes concurrent with insertion or post insertion. These include duplication of the target site, 3' and 5' flank transduction, deletion of the target locus, 5' truncation or partial deletion and inversion of the transposon, and post-insertion changes such as inter- or intra-element recombination and disruption. Although such changes have been studied independently, no automated platform to identify differential transposon insertions and the associated array of sequence changes in genomes of the same or closely related species has been available to date. To this end, we have designed RISCI - 'Repeat Induced Sequence Changes Identifier' - a comprehensive, comparative genomics-based, in silico subtractive hybridization pipeline to identify differential transposon insertions and associated sequence changes using specific alignment signatures, which may then be examined for their downstream effects. Results: We showcase the utility of RISCI by comparing full-length and truncated L1HS and AluYa5 retrotransposons in the reference human genome with the chimpanzee genome and the alternate human assemblies (Celera and HuRef). Comparison of the reference human genome with alternate human assemblies using RISCI predicts 14 novel polymorphisms in full-length L1HS, 24 in truncated L1HS and 140 novel polymorphisms in AluYa5 insertions, besides several insertion and post-insertion changes. We present a comparison with two previous studies to show that RISCI predictions are broadly in agreement with earlier reports. We also demonstrate its versatility by comparing various strains of Mycobacterium tuberculosis for IS 6100 insertion polymorphism. Conclusions: RISCI combines comparative genomics with subtractive hybridization, inferring changes only when exclusive to one of the two genomes being compared. The pipeline is generic and may be applied to most transposons and to any two or more genomes sharing high sequence similarity. Such comparisons, when performed on a larger scale, may pull out a few critical events, which may have seeded the divergence between the two species under comparison.

    The association of Alu repeats with the generation of potential AU-rich elements (ARE) at 3' untranslated regions.

    BACKGROUND: A significant portion (about 8% in the human genome) of mammalian mRNA sequences contains AU (adenine and uracil) rich elements, or AREs, in their 3' untranslated regions (UTRs). These mRNA sequences are usually stable. However, an increasing number of observations have been made of unstable species, possibly depending on certain elements such as Alu repeats. ARE motifs are repeats of the tetramer AUUU followed by a single A at the end of the repeats, (AUUU)nA. The importance of AREs in biology is that they make certain mRNAs unstable. Proto-oncogenes such as c-fos, c-myc, and c-jun in humans are associated with AREs. Although it is known that an increased number of ARE motifs decreases the half-life of mRNA containing ARE repeats, the exact mechanism is as yet unknown. We analyzed the occurrences of AREs and Alu and propose a possible mechanism for how human mRNA could acquire and keep AREs at its 3' UTR originating from Alu repeats. RESULTS: Interspersed in the human genome, Alu repeats occupy 5% of the 3' UTR of mRNA sequences. Alu has poly-adenine (poly-A) regions at its end, which lead to poly-thymine (poly-T) regions at the end of its complementary Alu. It has been found that AREs are present at the poly-T regions. From the 3' UTR of the NCBI's reference mRNA sequence database, we found that nearly 40% (38.5%) of AREs (Class I) were associated with Alu sequences (Table 1) when one mismatch was allowed in the ARE sequences. Other ARE classes had statistically significant associations as well. This is far from a random occurrence given their limited quantity. For each ARE class, a random distribution was simulated 1,000 times, and it was shown that there is a special relationship between ARE patterns and the Alu repeats. CONCLUSION: AREs are sequence elements that mediate the stabilization or degradation of mRNA at the 3' untranslated regions. However, the mechanism and origins of AREs are unknown. We report that Alu is a source of AREs. We found that half of the longest AREs were derived from the poly-T regions of the complementary Alu.
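
The (AUUU)nA motif defined in this abstract maps directly onto a regular expression. The following is a minimal sketch that finds exact matches only. It does not implement the one-mismatch allowance or the ARE class definitions used in the study, and the function name and the `min_repeats` parameter are illustrative assumptions.

```python
import re


def find_are_motifs(sequence, min_repeats=2):
    """Return (start, motif) for each exact (AUUU)nA match with n >= min_repeats.

    DNA-style input is accepted: T is converted to U before matching.
    Matches are non-overlapping and each one is extended as far as possible.
    """
    rna = sequence.upper().replace("T", "U")
    pattern = re.compile(r"(?:AUUU){%d,}A" % min_repeats)
    return [(m.start(), m.group()) for m in pattern.finditer(rna)]


# Toy 3' UTR fragment, made up for illustration.
toy_utr = "GGCAUUUAUUUAUUUAGGC"
toy_hits = find_are_motifs(toy_utr)
```

Here `toy_hits` holds one match that starts at index 3 and spans three AUUU repeats plus the closing A. Passing `min_repeats=1` would also report the bare pentamer AUUUA.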

    Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

    Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed
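
The idea of flagging sequence by oligonucleotide abundance can be illustrated with a much simpler stand-in. The sketch below only counts exact k-mers and marks positions covered by frequent ones. It is not the P-clouds algorithm, which additionally groups related oligos into clouds and annotates probabilistically. The function names, `k`, and `min_count` are assumptions chosen for illustration.

```python
from collections import Counter


def high_abundance_kmers(sequence, k, min_count):
    """Return {kmer: count} for k-mers occurring at least min_count times."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: c for kmer, c in counts.items() if c >= min_count}


def repeat_fraction(sequence, k, min_count):
    """Fraction of positions covered by at least one high-abundance k-mer."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    frequent = high_abundance_kmers(seq, k, min_count)
    covered = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in frequent:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / len(seq)


# Toy sequence, made up for illustration: ACGT occurs three times.
toy_sequence = "ACGTACGTACGTTTGCA"
toy_frequent = high_abundance_kmers(toy_sequence, k=4, min_count=3)
toy_fraction = repeat_fraction(toy_sequence, k=4, min_count=3)
```

In this toy case 12 of the 17 positions are covered by the repeated k-mer. Exact k-mer counting misses diverged copies, which is the gap that grouping related oligos is meant to close.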

    Transposon Excision from an Atypical Site: A Mechanism of Evolution of Novel Transposable Elements

    The role of transposable elements in sculpting the genome is well appreciated but remains poorly understood. Some organisms, such as humans, do not have active transposons; however, transposable elements were presumably active in their ancestral genomes. Of specific interest is whether the DNA surrounding the sites of transposon excision becomes recombinogenic, thus bringing about homologous recombination. Previous studies in maize and Drosophila have provided conflicting evidence on whether transposon excision is correlated with homologous recombination. Here we take advantage of an atypical Dissociation (Ds) element, a maize transposon that can be mobilized by the Ac transposase gene in Arabidopsis thaliana, to address questions on the mechanism of Ds excision. This atypical Ds element contains an adjacent 598 base pair (bp) inverted repeat; the element was allowed to excise by the introduction of an unlinked Ac transposase source through mating. Footprints at the excision site suggest a micro-homology mediated non-homologous end joining reminiscent of V(D)J recombination involving the formation of intra-helix 3′ to 5′ trans-esterification as an intermediate, a mechanism consistent with previous observations in maize, Antirrhinum and in certain insects. The proposed mechanism suggests that the broken chromosome at the excision site should not allow recombinational interaction with the homologous chromosome, and that the linked inverted repeat should also be mobilizable. To test the first prediction, we measured recombination of flanking chromosomal arms selected for the excision of Ds. In congruence with the model, Ds excision did not influence crossover recombination. Furthermore, evidence for correlated movement of the adjacent inverted repeat sequence is presented; its origin and movement suggest a novel mechanism for the evolution of repeated elements. Taken together, these results suggest that the movement of transposable elements themselves may not directly influence linkage. The possibility remains, however, that novel repeated DNA sequences produced as a consequence of transposon movement influence crossover in subsequent generations.

    Effects of L1-ORF2 fragments on green fluorescent protein gene expression

    The retrotransposon known as long interspersed nuclear element-1 (L1) is 6 kb long, although most L1s in mammalian and other eukaryotic cells are truncated. L1 contains two open reading frames, ORF1 and ORF2, that code for an RNA-binding protein and a protein with endonuclease and reverse transcriptase activities, respectively. In this work, we examined the effects of full length L1-ORF2 and ORF2 fragments on green fluorescent protein gene (GFP) expression when inserted into the pEGFP-C1 vector downstream of GFP. All of the ORF2 fragments in sense orientation inhibited GFP expression more than when in antisense orientation, which suggests that small ORF2 fragments contribute to the distinct inhibitory effects of this ORF on gene expression. These results provide the first evidence that different 280-bp fragments have distinct effects on the termination of gene transcription, and that when inserted in the antisense direction, fragment 280-9 (the 3' end fragment of ORF2) induces premature termination of transcription that is consistent with the effect of ORF2