6,707 research outputs found

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping, and shared synteny with closely related fish species to derive a chromosome-level assembly that spans ~90% of the genome, with a contig N50 over 1 Mb and a scaffold N50 over 25 Mb. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals from various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades, with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species and will serve as a new standard for fish genomics.
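For readers unfamiliar with the metric quoted in the abstract above: N50 is the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembly. The following is a minimal Python sketch of that definition only. The function name `n50` and the example lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up lengths for illustration (total = 300, half = 150).
example = [80, 70, 50, 40, 30, 20, 10]
print(n50(example))  # 70, because 80 + 70 reaches half of 300
```

A higher N50 means that more of the assembly sits in long, contiguous pieces, which is why the abstract reports it for both contigs and scaffolds.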

    Global Mapping of DNA Conformational Flexibility on Saccharomyces cerevisiae

    In this study we provide the first comprehensive map of DNA conformational flexibility across the complete Saccharomyces cerevisiae genome. Flexibility plays a key role in DNA supercoiling and DNA/protein binding, regulating DNA transcription, replication and repair. A specific interest in flexibility analysis concerns its relationship with human genome instability. Enrichment in flexible sequences has been detected in unstable regions of the human genome termed fragile sites, where the genes that map there carry frequent deletions and rearrangements in cancer. Flexible sequences have been suggested to determine the proneness of fragile genes to breakage; however, their actual role and properties remain elusive. Our in silico analysis, carried out genome-wide via the StabFlex algorithm, shows the conserved presence of highly flexible regions in the budding yeast genome as well as in the genomes of other Saccharomyces sensu stricto species. Flexible peaks in S. cerevisiae identify 175 ORFs and map to their 3’UTR, a region affecting mRNA translation, localization and stability. (TA)n repeats of different lengths shape the central structure of the peaks and co-localize with polyadenylation efficiency element (EE) signals. ORFs with flexible peaks share common features. Their transcripts are characterized by a decreased half-life, which is considered characteristic of genes involved in regulatory systems with high turnover; consistently, their functions affect biological processes such as cell cycle regulation or stress response. Our findings support the functional importance of flexibility peaks, suggesting that the flexible sequence may be derived from an expansion of the canonical TAYRTA polyadenylation efficiency element. The amplification of the flexible (TA)n repeat could be the outcome of an evolutionary neofunctionalization leading to differential 3’-end processing and expression regulation in genes with particular functions. Our study provides new support for the functional role of flexibility in genomes and a strategy for its characterization inside human fragile sites.

    The Evolution of LINE-1 in Vertebrates

    The abundance and diversity of the LINE-1 (L1) retrotransposon differ greatly among vertebrates. Mammalian genomes contain hundreds of thousands of L1s that have accumulated since the origin of mammals. A single group of very similar elements is active at a time in mammals; thus a single lineage of active families has evolved in this group. In contrast, non-mammalian genomes (fish, amphibians, reptiles) harbor a large diversity of concurrently transposing families, which are all represented by a very small number of recently inserted copies. Why the pattern of diversity and abundance of L1 is so different among vertebrates remains unknown. To address this issue, we performed a detailed analysis of the evolution of active L1 in 14 mammals and in 3 non-mammalian vertebrate model species. We examined the evolution of base composition and codon bias, the general structure, and the evolution of the different domains of L1 (5′UTR, ORF1, ORF2, 3′UTR). L1s differ substantially in length, base composition, and structure among vertebrates. The most variation is found in the 5′UTR, which is longer in amniotes, and in ORF1, which tends to evolve faster in mammals. The highly divergent L1 families of lizard, frog, and fish share species-specific features, suggesting that they are subjected to the same functional constraints imposed by their host. The relative conservation of the 5′UTR and ORF1 in non-mammalian vertebrates suggests that the repression of transposition by the host does not act in a sequence-specific manner and did not result in an arms race, as is observed in mammals.

    Identifying transposable element-derived enhancers in cancers

    Transposable elements (TEs) are repetitive DNA elements that have an autonomous capability of replicating and inserting themselves into new loci within genomes. TEs have often been thought to be part of the non-functional “junk DNA”, but advances in next-generation sequencing technology have revealed that TEs have rich biochemical functions: specific TEs bind a large fraction of transcription factors (TFs) and harbor marks of active chromatin in human genomes. However, there have been few genome-wide functional enhancer activity studies on the role of TEs in gene regulation, especially in the context of human cancers. In normal cellular homeostasis, TE activity is tightly controlled by epigenetic mechanisms. In contrast, the cell state in cancer genomes is often permissive for TE activation: genome instability and mutations, notably p53 inactivation, and nonmutational epigenetic reprogramming such as DNA hypomethylation are common characteristics of cancers. TE transcription, somatic retrotransposition and activation of cryptic promoter elements occur frequently in tumors. TE activation is highly heterogeneous between cancer types: for example, gastrointestinal tract cancers such as colorectal cancer show high somatic insertion activity, whereas retrotransposition is rare in hematolymphoid malignancies. Given the widespread TE activation in cancers, we posited that it may also be seen in the activity of cryptic TE enhancers, with specific active families of TEs characteristic of different cell types due to lineage-specific TF binding. We asked whether and to what extent TEs contribute to the enhancer landscape of cancers, how cancers differ in TE activity and transcription factor binding, and whether TE enhancers may have a role in tumorigenesis and the regulation of cancer-specific genes. To functionally study TEs, we utilized a high-resolution, unbiased, genome-wide massively parallel reporter assay (MPRA). We used colorectal and hepatocellular cancer cell lines to study the differences and similarities in TE activation, and combined the MPRA data with orthogonal epigenetic data to study the in vivo signatures of the TEs. We found that both cell lines show common and highly enriched TE subfamilies that were mostly specific for p53, as well as TEs that were unique to each cell line. By using in vitro methylated MPRA libraries, we found that CpG methylation has a relatively minor role in regulating the enhancer activity of some TE subfamilies in the reporter assay. By comparing the epigenetic context of the TEs, we found that colorectal cancer in particular has specific, highly active TE subfamilies with signatures of canonical active enhancers. We also used an in silico model to predict TE enhancer-to-gene contacts and found that these subfamilies regulated genes that were frequently overexpressed. Thus, we present the widespread functional activity of TE enhancers in cancers, providing evidence for further functional validation of TEs and their effects on transcriptional programs, especially the dysregulation of gene expression in cancer.

    Unique structure and positive selection promote the rapid divergence of Drosophila Y chromosomes

    Y chromosomes across diverse species convergently evolve a gene-poor, heterochromatic organization enriched for duplicated genes, LTR retrotransposons, and satellite DNA. Sexual antagonism and a loss of recombination play major roles in the degeneration of young Y chromosomes. However, the processes shaping the evolution of mature, already degenerated Y chromosomes are less well understood. Because Y chromosomes evolve rapidly, comparisons between closely related species are particularly useful. We generated de novo long-read assemblies complemented with cytological validation to reveal Y chromosome organization in three closely related species of the Drosophila simulans complex, which diverged only 250,000 years ago and share >98% sequence identity. We find these Y chromosomes are divergent in their organization and repetitive DNA composition and discover new Y-linked gene families whose evolution is driven by both positive selection and gene conversion. These Y chromosomes are also enriched for large deletions, suggesting that the repair of double-strand breaks on Y chromosomes may be biased toward microhomology-mediated end joining over canonical non-homologous end joining. We propose that this repair mechanism contributes to the convergent evolution of Y chromosome organization across organisms.

    Giant reverse transcriptase-encoding transposable elements at telomeres

    Author Posting. © The Author(s), 2017. This is the author's version of the work. It is posted here by permission of Oxford University Press for personal use, not for redistribution. The definitive version was published in Molecular Biology and Evolution 34 (2017): 2245–2257, doi:10.1093/molbev/msx159.
    Transposable elements are omnipresent in eukaryotic genomes and have a profound impact on chromosome structure, function and evolution. Their structural and functional diversity is thought to be reasonably well understood, especially in retroelements, which transpose via an RNA intermediate copied into cDNA by the element-encoded reverse transcriptase, and are characterized by a compact structure. Here we report a novel type of expandable eukaryotic retroelements, which we call Terminons. These elements can attach to G-rich telomeric repeat overhangs at the chromosome ends, in a process apparently facilitated by complementary C-rich repeats at the 3’-end of the RNA template immediately adjacent to a hammerhead ribozyme motif. Terminon units, which can exceed 40 kb in length, display an unusually complex and diverse structure, and can form very long chains, with host genes often captured between units. As the principal polymerizing component, Terminons contain Athena reverse transcriptases previously described in bdelloid rotifers and belonging to the enigmatic group of Penelope-like elements, but can additionally accumulate multiple co-oriented ORFs, including DEDDy 3’-exonucleases, GDSL esterases/lipases, GIY-YIG-like endonucleases, rolling-circle replication initiator (Rep) proteins, and putatively structural ORFs with coiled-coil motifs and transmembrane domains. The extraordinary length and complexity of Terminons and the high degree of inter-family variability in their ORF content challenge the current views on the structural organization of eukaryotic retroelements, and highlight their possible connections with the viral world and the implications for the elevated frequency of gene transfer.
    This work was supported by the National Institutes of Health (grant GM111917 to I.A.). 2018-05-3

    BMI1 mediated heterochromatin compaction represses G-quadruplex formation in Alzheimer's disease

    Alzheimer's disease (AD) is the most prominent dementia in the developed world. This neurodegenerative disease makes routine daily tasks increasingly difficult; it can also cause patients to forget words and become disoriented in time and space, and at advanced stages it leads to memory loss. AD is considered the next big public health challenge for most countries, with the number of cases expected to double within the next 20 years due to the aging of the population. This increase in the number of patients comes with an increased need for funding and for healthcare personnel to meet the demands and requirements of these patients. AD is divided into two separate entities: a well-defined and well-understood hereditary disease called familial Alzheimer's disease, which accounts for up to 5% of all AD cases, and a less well-defined one called sporadic Alzheimer's disease (sAD). The best-defined risk factor for sAD is age, but it was recently shown that the brains of sAD patients have a reduced level of BMI1 and that knockdown of BMI1 in human neurons or in mice triggers the hallmarks of this disease. While BMI1 was known to be important during development, we report here that it is crucial in adult cells for maintaining chromatin compaction and the silencing of repetitive sequences. Furthermore, these two functions of BMI1 prevent the DNA from acquiring a G-quadruplex (G4) conformation. This conformation can lead to genome instability, increased DNA damage, and altered gene expression. Most importantly, we showed that in cortical neurons, G4 structures can influence the alternative splicing of various genes, notably APP. These results shed new light on the origin of AD and on the importance of BMI1 and DNA secondary structure in this context.

    G-quadruplex RNA motifs influence gene expression in the malaria parasite Plasmodium falciparum.

    Funder: Hong Kong PhD Fellowship Scheme. Funder: Hong Kong Special Administrative Region Government.
    G-quadruplexes are non-helical secondary structures that can fold in vivo in both DNA and RNA. In human cells, they can influence replication, transcription and telomere maintenance in DNA, or translation, transcript processing and stability of RNA. We have previously shown that G-quadruplexes are detectable in the DNA of the malaria parasite Plasmodium falciparum, despite a very highly A/T-biased genome with unusually few guanine-rich sequences. Here, we show that RNA G-quadruplexes can also form in P. falciparum RNA, using rG4-seq for transcriptome-wide structure-specific RNA probing. Many of the motifs, detected here via the rG4seeker pipeline, have non-canonical forms and would not be predicted by standard in silico algorithms. However, in vitro biophysical assays verified the formation of non-canonical motifs. The G-quadruplexes in the P. falciparum transcriptome are frequently clustered in certain genes and associated with regions encoding low-complexity peptide repeats. They are overrepresented in particular classes of genes, notably those that encode PfEMP1 virulence factors, stress response genes and DNA binding proteins. In vitro translation experiments and in vivo measures of translation efficiency showed that G-quadruplexes can influence the translation of P. falciparum mRNAs. Thus, the G-quadruplex is a novel player in post-transcriptional regulation of gene expression in this major human pathogen.
    UK Medical Research Council [grants MR/K000535/1 and MR/L008823/1] to CJM. Shenzhen Basic Research Project [JCYJ20180507181642811], Research Grants Council of the Hong Kong SAR, China Projects [CityU 11100421, CityU 11101519, CityU 11100218, N_CityU110/17, CityU 21302317], Croucher Foundation [Project No. 9500030, 9509003], State Key Laboratory of Marine Pollution Director Discretionary Fund, City University of Hong Kong [projects 6000711, 7005503, 9667222, 9680261] to CKK. A generous donation from Mr. and Mrs. Sunny Yang, the University Grants Committee Area of Excellence Scheme (AoE/M-403/16), and the Innovation and Technology Commission, Hong Kong Special Administrative Region Government to the State Key Laboratory of Agrobiotechnology (CUHK) to TFC. EYCC was supported by the Hong Kong PhD Fellowship Scheme.
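The abstract above notes that many detected motifs are non-canonical and would be missed by standard in silico algorithms. As a rough illustration of what a canonical search looks like, the Python sketch below scans for the widely used pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. It is an assumption that this is the kind of predictor the authors have in mind. The function name and test sequences are hypothetical, and this is not the rG4seeker method.

```python
import re

# Canonical motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (RNA alphabet).
CANONICAL_G4 = re.compile(r"(?:G{3,}[ACGU]{1,7}){3}G{3,}")


def find_canonical_g4(sequence):
    """Return (start, end) spans of canonical G4 motifs in a sequence."""
    # Accept DNA-style input by converting T to U.
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.end()) for m in CANONICAL_G4.finditer(rna)]


# Four GGG tracts with single-nucleotide loops: matches the canonical pattern.
print(find_canonical_g4("AAGGGAGGGAGGGAGGGAA"))

# Tracts of only two guanines: a non-canonical candidate the pattern misses.
print(find_canonical_g4("AAGGAGGAGGAGGAA"))
```

The second call returns an empty list, which illustrates why sequence-pattern searches alone can miss motifs that experimental probing detects.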