28 research outputs found

    Participation of Multifunctional RNA in Replication, Recombination and Regulation of Endogenous Plant Pararetroviruses (EPRVs)

    Pararetroviruses, taxon Caulimoviridae, are typical of retroelements with reverse transcriptase and share a common origin with retroviruses and LTR retrotransposons, presumably dating back 1.6 billion years and illustrating the transition from an RNA to a DNA world. After transcription of the viral genome in the host nucleus, viral DNA synthesis occurs in the cytoplasm on the generated terminally redundant RNA, including inter- and intra-molecular recombination steps, rather than relying on nuclear DNA replication. RNA recombination events between an ancestral genomic retroelement and exogenous RNA viruses were seminal in pararetrovirus evolution, resulting in horizontal transmission and episomal replication. Instead of active integration, pararetroviruses use the host DNA repair machinery to prevail in genomes of angiosperms, gymnosperms and ferns. Pararetrovirus integration by illegitimate recombination, leading to Endogenous ParaRetroViruses (EPRVs), can happen if their sequences, instead of homologous host genomic sequences on the sister chromatid (during mitosis) or homologous chromosome (during meiosis), are used as template. Multiple layers of RNA interference regulate the episomal and chromosomal forms of the pararetrovirus. Pararetroviruses have evolved suppressors against this plant defense in the arms race during co-evolution, which can result in deregulation of plant genes. Small RNAs serve as signaling molecules for Transcriptional and Post-Transcriptional Gene Silencing (TGS, PTGS) pathways. Different populations of small RNAs comprising 21–24 nt and 18–30 nt in length have been reported for Citrus, Fritillaria, Musa, Petunia, Solanum and Beta. Recombination and RNA interference are driving forces for evolution and regulation of EPRVs.

    A sheep pangenome reveals the spectrum of structural variations and their effects on tail phenotypes

    Structural variations (SVs) are a major contributor to genetic diversity and phenotypic variation, but their prevalence and functions in domestic animals are largely unexplored. Here we generated high-quality genome assemblies for 15 individuals from genetically diverse sheep breeds using Pacific Biosciences (PacBio) high-fidelity sequencing, discovering 130.3 Mb of nonreference sequence, from which 588 genes were annotated. A total of 149,158 biallelic insertions/deletions, 6531 divergent alleles, and 14,707 multiallelic variations with precise breakpoints were discovered. The SV spectrum is characterized by an excess of derived insertions compared to deletions (94,422 vs. 33,571), suggesting recent active LINE expansions in sheep. Nearly half of the SVs display low to moderate linkage disequilibrium with surrounding single-nucleotide polymorphisms (SNPs), and most SVs cannot be tagged by SNP probes from the widely used ovine 50K SNP chip. We identified 865 population-stratified SVs, including 122 SVs possibly derived in the domestication process, among 690 individuals from sheep breeds worldwide. A novel 168-bp insertion in the 5' untranslated region (5' UTR) of HOXB13 is found at high frequency in long-tailed sheep. Further genome-wide association study and gene expression analyses suggest that this mutation is causative for the long-tail trait. In summary, we have developed a panel of high-quality de novo assemblies and present a catalog of structural variations in sheep. Our data capture abundant candidate functional variations that were previously unexplored and provide a fundamental resource for understanding trait biology in sheep.

    Exploiting novel germplasm


    Traits with ecological functions


    Repetitive DNA in eukaryotic genomes

    Repetitive DNA, sequence motifs repeated hundreds or thousands of times in the genome, makes up the major proportion of all the nuclear DNA in most eukaryotic genomes. However, the significance of repetitive DNA in the genome is not completely understood, and it has been considered to have both structural and functional roles, or perhaps even no essential role. High-throughput DNA sequencing reveals huge numbers of repetitive sequences. Most bioinformatic studies focus on low-copy DNA including genes, and hence the analyses collapse repeats in assemblies, presenting only one or a few copies, and often mask out and ignore them in both DNA and RNA read data. Chromosomal studies are proving vital to examine the distribution and evolution of sequences because of the challenges of analysing sequence data. Many questions remain open about the origin, evolutionary mode and functions that repetitive sequences might have in the genome. Some, the satellite DNAs, are present in long arrays of similar motifs at a small number of sites, while others, particularly the transposable elements (DNA transposons and retrotransposons), are dispersed over regions of the genome; in both cases, sequence motifs may be located at relatively specific chromosome domains such as centromeres or subtelomeric regions. Here, we overview a range of works involving detailed characterization of the nature of all types of repetitive sequences, in particular their organization, abundance, chromosome localization, variation in sequence within and between chromosomes, and, importantly, the investigation of their transcription or expression activity. Comparison of the nature and locations of sequences between more and less closely related species is providing extensive information about their evolution and amplification. Some repetitive sequences are extremely well conserved between species, while others are among the most variable, defining differences between even closely related species. These data suggest contrasting modes of evolution of repetitive DNA of different types, including selfish sequences that propagate themselves and may even be transferred horizontally between species rather than by descent, through to sequences that tend to amplify because of their sequence motifs, to those that have structural significance because of their bulk rather than precise sequence. Functional consequences of repeats include generation of variability by movement and insertion in the genome (giving useful genetic markers), the definition of centromeres, expression under stress conditions and regulation of gene expression via RNA moieties. Molecular cytogenetics and bioinformatic studies in a comparative context are now enabling understanding of the nature and behaviour of this major genomic component.

    Characterization and Diversity of Novel PIF/Harbinger DNA Transposons in Brassica Genomes

    Among DNA transposons, PIF/Harbinger is the most recently identified superfamily, characterized by 3 bp target site duplications (TSDs), flanking 14-45 bp terminal inverted repeats (TIRs) and a transposase with a DDD or DDE domain. Their autonomous elements contain two open reading frames, ORF1 and ORF2, encoding a superfamily-specific transposase and a DNA-binding domain. Harbinger DNA transposons have recently been identified in a few plants. In the present study, computational and molecular approaches were used to identify 8 Harbinger transposons, of which only 2 were complete with a putative transposase, while the remaining 6 lack a transposase and are considered defective or non-autonomous elements. They ranged in size from 0.5 to 4 kb, with 3 bp TSDs, 15-42 bp TIRs and internal AT-rich regions. PCR amplification of the Brassica Harbinger transposase revealed the diversity and ancient nature of these elements. The amplification polymorphism of some non-autonomous Harbingers showed a species-specific distribution. Phylogenetic analyses of the transposases clustered them into two clades (monocot and dicot) and five sub-clades. The Brassica, Arabidopsis and Malus transposases clustered into genus-specific sub-clades, although considerable homology among the transposases was observed. Multiple sequence alignment of Brassica and related transposases showed homology in five conserved blocks. The DD₃₅E triad and sequences showed similarity to the already known Pong-like or Arabidopsis ATISI12 Harbinger transposases, in contrast to other transposases having DD₄₇E or DD₄₈E motifs. The present study will be helpful in the characterization of Harbingers, their structural diversity in related genera and Harbinger-based molecular markers for variety/line identification.

    The repetitive DNA landscape in sheep

    Repetitive DNA sequences, representing the majority of most mammalian genomes, can be broadly divided into tandemly repeated or satellite sequences (mostly located in the heterochromatin) and transposable elements (TEs) dispersed over the genome. Some repetitive DNA sequences are highly conserved but other sequences show substantial diversification in copy number, sequence and organization between individuals, breeds, and related species. Here, we report the repetitive DNA landscape of sheep (Ovis aries) based on de novo analysis of >6 Gbp of sequence from each of five individuals. Major classes of repetitive DNA sequences were identified and quantified by network analysis (using the program RepeatExplorer), frequency analysis of short motifs (k-mers), and alignment to reference genome assemblies. The genomic organization of the major repetitive motifs was characterized by in situ hybridization to chromosomes. The well-known c. 816 bp long centromere-associated satellite SatI represented 4 to 6 % of the genome while SatII (c. 600 bp long) was 1 to 2 % of the genome. Notably, these satellites showed contrasting behaviour at meiotic prophase: SatI sequences cover a larger area, indicating a looser chromatin loop organization, while SatII sequences are tightly organized and are attached to the synaptonemal complex (SC) at a more distal position than SatI sequences at the end of SCs of acrocentric chromosomes. The repetitive sequence analysis identified other much less abundant satellite sequences and simple repeats, some with novel genomic distributions. Families of non-LTR retrotransposons including LINEs (L1 and RTE) and derived SINEs represented more than 25 % of the genome. Non-LTR families showed characteristic distributions on chromosomes with some showing greater abundance on metacentric autosomes or on sex chromosomes. Endogenous retrovirus classes grouped into clusters with some families showing centromeric and others more dispersed distributions. Rapidly evolving repetitive sequences allow us to study processes of chromosome or genome evolution and diversification in sheep, and more broadly across the Bovidae.