
    Polyadenylation Mediated by LINE-1

    Transposable elements (TEs) are sequences that change position within the genome and play an important role in genome expansion. TEs are grouped into two classes based on their transposition mechanism. Class 1 retrotransposons spread via target-primed reverse transcription (RNA to DNA) into different genomic locations. Long interspersed element 1 (L1) is a class 1 retrotransposon that moves autonomously: it encodes the protein machinery, with endonuclease and reverse transcriptase activity, needed to insert itself back into the genome. L1s were the focus of this study because they are implicated in creating alternate poly(A) sites in genes. We analyzed 778,128 isoforms produced from 12 samples of long-read RNA (PacBio HiFi) sequencing data to investigate whether L1s introduce polyadenylation sites. Isoforms were filtered based on L1 location within the isoforms’ 3’UTR, resulting in roughly 3,000 isoforms spread across 757 genes. L1 subfamilies have arisen throughout evolutionary history due to species-specific substitutions. The L1 subfamilies in the dataset are mostly mammalian-specific, while only 43 contain primate-specific L1s. The majority of the L1s studied were classified as L1M5 (329), L1ME4b (165), L1MB7 (105), and L1ME4c (105). These L1s contain canonical and noncanonical polyadenylation signals within their 3’UTRs. Alternatively polyadenylated mRNA variants, generated from the same gene, are likely bound by different combinations of trans-acting factors that can affect mRNA localization, translation, stability, and decay. Understanding the roles of L1s in alternative polyadenylation will shed light on the impact of TEs on the processing efficiency of gene expression.

    A mobile threat to genome stability: The impact of non-LTR retrotransposons upon the human genome

    It is now commonly agreed that the human genome is not the stable entity originally presumed. Deletions, duplications, inversions, and insertions are common, and contribute significantly to genomic structural variations (SVs). Their collective impact generates much of the inter-individual genomic diversity observed among humans. Not only do these variations change the structure of the genome; they may also have functional implications, e.g. altered gene expression. Some SVs have been identified as the cause of genetic disorders, including cancer predisposition. Cancer cells are notorious for their genomic instability, and often show genomic rearrangements at the microscopic and submicroscopic level to which transposable elements (TEs) contribute. Here, we review the role of TEs in genome instability, with particular focus on non-LTR retrotransposons. Currently, three non-LTR retrotransposon families - long interspersed element 1 (L1), SVA (short interspersed element (SINE-R), variable number of tandem repeats (VNTR), and Alu), and Alu (a SINE) elements - mobilize in the human genome, and cause genomic instability through both insertion- and post-insertion-based mutagenesis. Due to the abundance and high sequence identity of TEs, they frequently mislead the homologous recombination repair pathway into non-allelic homologous recombination, causing deletions, duplications, and inversions. While less comprehensively studied, non-LTR retrotransposon insertions and TE-mediated rearrangements are probably more common in cancer cells than in healthy tissue. This may be at least partially attributed to the commonly seen global hypomethylation as well as general epigenetic dysfunction of cancer cells. Where possible, we provide examples that impact cancer predisposition and/or development. © 2010 Elsevier Ltd

    LINEs and SINEs of primate evolution

    The primate order is a monophyletic group thought to have diverged from the Euarchonta more than 65 mya.1 Recent paleontological and molecular evolution studies place the last common ancestor of primates even earlier (≥ 85 mya).2 More than 300 extant primate species are recognized today,3, 4 clearly emphasizing their diversity and success. Our understanding of the evolution of primates and the composition of their genomes has been revolutionized within the last decade through the increasing availability and analyses of sequenced genomes. However, several aspects of primate evolution have yet to be resolved. DNA sequencing of a wider array of primate species now underway will provide an opportunity to investigate and expand on these questions in great detail. One of the most surprising findings of the human (Homo sapiens) genome project was the high content of repetitive sequences, in particular of mobile DNA.5 This finding has been replicated in all primate draft genome sequences available and analyzed to date.5-7 In fact, transposable elements (TEs) contribute about 50% of the genome size of humans,5 chimpanzees (Pan troglodytes),6 and rhesus macaques (Macaca mulatta).7 The proportion of TEs among the overall genome content is likely even higher due to the decay of older mobile elements beyond recognition, rearrangements of genomes over the course of evolution, and the challenge of sequencing and assembling repeat-rich regions of the genome.8, 9 Retrotransposons, in particular L1, long interspersed element 1 (LINE1), and Alu, a short interspersed element (SINE), are prominent in primate genomes, and have played a major role in genome evolution and architecture. The evolution and success of the primate-specific LINE and SINE subfamilies (L1 and Alu in particular), their application in phylogenetic studies, and their impact on the architecture of primate genomes will be the focus of this review.
In addition, we will briefly cover the emergence and impact of SVA (SINE-R/VNTR/Alu), a composite retrotransposon of relatively recent origin, and of other SINEs that are not common to all primates. © 2010 Wiley Periodicals, Inc

    Identification and characterization of novel polymorphic LINE-1 insertions through comparison of two human genome sequence assemblies

    Mobile elements represent a relatively new class of markers for the study of human evolution. Long interspersed elements (LINEs) belong to a group of retrotransposons comprising approximately 21% of the human genome. Young LINE-1 (L1) elements that have integrated recently into the human genome can be polymorphic for insertion presence/absence in different human populations at particular chromosomal locations. To identify putative novel L1 insertion polymorphisms, we computationally compared two draft assemblies of the whole human genome (Public and Celera Human Genome assemblies). We identified a total of 148 potential polymorphic L1 insertion loci, among which 73 were candidates for novel polymorphic loci. Based on additional analyses we selected 34 loci for further experimental studies. PCR-based assays and DNA sequence analysis were performed for these 34 loci in 80 unrelated individuals from four diverse human populations: African-American, Asian, Caucasian, and South American. All but two of the selected loci were confirmed as polymorphic in our human population panel. Approximately 47% of the analyzed loci integrated into other repetitive elements, most commonly older L1s. One of the insertions was accompanied by a BC200 sequence. Collectively, these mobile elements represent a valuable source of genomic polymorphism for the study of human population genetics. Our results also suggest that the exhaustive identification of L1 insertion polymorphisms is far from complete, and new whole genome sequences are valuable sources for finding novel retrotransposon insertion polymorphisms. © 2006 Elsevier B.V. All rights reserved

    Tangram: A comprehensive toolbox for mobile element insertion detection

    © 2014 Wu et al.; licensee BioMed Central Ltd. Background: Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. Although most of these elements are now fixed in the population, some MEs, including ALU, L1, SVA and HERV-K elements, are still actively duplicating. Mobile element insertions (MEIs) have been associated with human genetic disorders, including Crohn's disease, hemophilia, and various types of cancer, motivating the need for accurate MEI detection methods. To comprehensively identify and accurately characterize these variants in whole genome next-generation sequencing (NGS) data, a computationally efficient detection and genotyping method is required. Current computational tools are unable to call MEI polymorphisms with sufficiently high sensitivity and specificity, or call individual genotypes with sufficiently high accuracy. Results: Here we report Tangram, a computationally efficient MEI detection program that integrates read-pair (RP) and split-read (SR) mapping signals to detect MEI events. By utilizing SR mapping in its primary detection module, a feature unique to this software, Tangram is able to pinpoint MEI breakpoints with single-nucleotide precision. To understand the role of MEI events in disease, it is essential to produce accurate individual genotypes in clinical samples. Tangram is able to determine sample genotypes with very high accuracy. Using simulations and experimental datasets, we demonstrate that Tangram has superior sensitivity, specificity, breakpoint resolution and genotyping accuracy, when compared to other, recently developed MEI detection methods. Conclusions: Tangram serves as the primary MEI detection tool in the 1000 Genomes Project, and is implemented as a highly portable, memory-efficient, easy-to-use C++ computer program, built under an open-source development model.

    A computational reconstruction of Papio phylogeny using Alu insertion polymorphisms

    © 2018 The Author(s). Background: Since the completion of the human genome project, the diversity of genome sequencing data produced for non-human primates has increased exponentially. Papio baboons are well-established biological models for studying human biology and evolution. Despite substantial interest in the evolution of Papio, the systematics of these species has been widely debated, and the evolutionary history of Papio diversity is not fully understood. Alu elements are primate-specific transposable elements with a well-documented mutation/insertion mechanism and the capacity for resolving controversial phylogenetic relationships. In this study, we conducted a whole genome analysis of Alu insertion polymorphisms unique to the Papio lineage. To complete these analyses, we created a computational algorithm to identify novel Alu insertions in next-generation sequencing data. Results: We identified 187,379 Alu insertions present in the Papio lineage, yet absent from M. mulatta [Mmul8.0.1]. These elements were characterized using genomic data sequenced from a panel of twelve Papio baboons: two from each of the six extant Papio species. These data were used to construct a whole genome Alu-based phylogeny of Papio baboons. The resulting cladogram fully resolved relationships within Papio. Conclusions: These data represent the most comprehensive Alu-based phylogenetic reconstruction reported to date. In addition, this study produces the first fully resolved Alu-based phylogeny of Papio baboons.

    Sequence analysis and characterization of active human Alu subfamilies based on the 1000 Genomes pilot project

    © The Author(s) 2015. The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic young Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages.