Search CORE

7,024 research outputs found

Genomic abundance is not predictive of tandem repeat localization in grass genomes.

Author: Bilinski Paul
Estep Matt C
Han Yonghua
Hufford Matthew B
Jiang Jiming
Lorant Anne
Ross-Ibarra Jeffrey
Zhang Pingdong
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Highly repetitive regions have historically posed a challenge when investigating sequence variation and content. High-throughput sequencing has enabled researchers to use whole-genome shotgun sequencing to estimate the abundance of repetitive sequence, and these methodologies have recently been applied to centromeres. Previous research has investigated variation in centromere repeats across eukaryotes, positing that the highest-abundance tandem repeat in a genome is often the centromeric repeat. To test this assumption, we used shotgun sequencing and a bioinformatic pipeline to identify common tandem repeats across a number of grass species. We find that de novo assembly and subsequent abundance ranking of repeats can successfully identify tandem repeats with homology to known tandem repeats. Fluorescent in situ hybridization shows that de novo assembly and ranking of repeats from non-model taxa identifies chromosome domains rich in tandem repeats both near pericentromeres and elsewhere in the genome.

Directory of Open Access Journals

eScholarship - University of California

Analysis of the giant genomes of Fritillaria (Liliaceae) indicates that a lack of DNA removal characterizes extreme expansions in genome size.

Author: Andrew R. Leitch
Ilia J. Leitch
Jaume Pellicer
Jiří Macas
Laura J. Kelly
Maddison DR
Madeleine Berger
Martin A. Lysak
Michael F. Fay
Pavel Neumann
Peter D. Day
Petr Novák
R Core Team
Richard A. Nichols
Rix EM
Simon Renny‐Byfield
Swofford DL
The International Brachypodium Initiative
Thompson W
Publication venue: 'Wiley'
Publication date: 01/01/2015

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. Plants exhibit an extraordinary range of genome sizes, varying by > 2000-fold between the smallest and largest recorded values. In the absence of polyploidy, changes in the amount of repetitive DNA (transposable elements and tandem repeats) are primarily responsible for genome size differences between species. However, there is ongoing debate regarding the relative importance of amplification of repetitive DNA versus its deletion in governing genome size. Using data from 454 sequencing, we analysed the most repetitive fraction of some of the largest known genomes for diploid plant species, from members of Fritillaria. We revealed that genomic expansion has not resulted from the recent massive amplification of just a handful of repeat families, as shown in species with smaller genomes. Instead, the bulk of these immense genomes is composed of highly heterogeneous, relatively low-abundance repeat-derived DNA, supporting a scenario where amplified repeats continually accumulate due to infrequent DNA removal. Our results indicate that a lack of deletion and low turnover of repetitive DNA are major contributors to the evolution of extremely large genomes and show that their size cannot simply be accounted for by the activity of a small number of high-abundance repeat families. This work was supported by the Natural Environment Research Council (grant no. NE/G01724/1), the Czech Science Foundation (grant no. P501/12/G090), the AVCR (grant no. RVO:60077344) and a Beatriu de Pinos postdoctoral fellowship to J.P. (grant no. 2011-A-00292; Catalan Government-E.U. 7th F.P.).

Crossref

Shared Research Repository

PubMed Central

Queen Mary Research Online

Rothamsted Repository

Low coverage sequencing for repetitive DNA analysis in Passiflora edulis Sims: Citogenomic characterization of transposable elements and satellite DNA

Author: Almeida Costa Eduardo
Carvalho Cayres Pamponét Vanessa
Corrêa Ronan Xavier
Ferreira de Melo Cláusio Antônio
Gomes de Oliveira Sarah
Magalhães Souza Margarete
Micheli Fabienne
Santos Silva Gonçalo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Background: The cytogenomic study of repetitive regions is fundamental for understanding morphofunctional mechanisms and genome evolution. Passiflora edulis is a species of relevant agronomic value; in this work, its genome was sequenced by next-generation sequencing and analyzed bioinformatically with the RepeatExplorer pipeline. The clusters allowed the identification and characterization of repetitive elements (predominant contributors to most plant genomes). The aim of this study was to identify, characterize and map the repetitive DNA of P. edulis, providing important cytogenomic markers, especially sequences associated with the centromere. Results: Three clusters of satellite DNAs (69, 118 and 207) and seven clusters of Long Terminal Repeat (LTR) retrotransposons of the superfamilies Ty1/Copia and Ty3/Gypsy and families Angela, Athila, Chromovirus and Maximus-Sire (6, 11, 36, 43, 86, 94 and 135) were characterized and analyzed. The chromosome mapping of satellite DNAs showed two hybridization sites co-located in the 5S rDNA region (PeSat_1), subterminal hybridizations (PeSat_3) and hybridization in four sites, co-located in the 45S rDNA region (PeSat_2). Most of the retroelement hybridizations showed signals scattered across the chromosomes, differing in abundance, and only cluster 6 marked pericentromeric regions. No satellite DNA or retroelement associated with the centromere was observed. Conclusion: P. edulis has a highly repetitive genome, with a predominance of Ty3/Gypsy LTR retrotransposons. The satellite DNAs and LTR retrotransposons characterized are promising markers for investigating the evolutionary patterns and genetic distinction of species and hybrids of Passiflora.

Directory of Open Access Journals

Agritrop

Next Generation Sequencing-Based Analysis of Repetitive DNA in the Model Dioecious Plant Silene latifolia

Author: Andrea Koblížková
Boris Vyskot
Eduard Kejnovský
Jiří Macas
Pavel Neumann
Petr Novák
Zhanjiang Liu
Publication venue: Public Library of Science

BACKGROUND: Silene latifolia is a dioecious plant with well-distinguished X and Y chromosomes that is used as a model to study sex determination and sex chromosome evolution in plants. However, efficient utilization of this species has been hampered by the lack of large-scale sequencing resources and detailed analysis of its genome composition, especially with respect to repetitive DNA, which makes up the majority of the genome. METHODOLOGY/PRINCIPAL FINDINGS: We performed low-pass 454 sequencing followed by similarity-based clustering of 454 reads in order to identify and characterize sequences of all major groups of S. latifolia repeats. Illumina sequencing data from male and female genomes were also generated and employed to quantify the genomic proportions of individual repeat families. The majority of identified repeats belonged to LTR-retrotransposons, constituting about 50% of genomic DNA, with Ty3/gypsy elements being more frequent than Ty1/copia. While there were differences between the male and female genome in the abundance of several repeat families, their overall repeat composition was highly similar. Specific localization patterns on sex chromosomes were found for several satellite repeats using in situ hybridization with probes based on k-mer frequency analysis of Illumina sequencing data. CONCLUSIONS/SIGNIFICANCE: This study provides comprehensive information about the sequence composition and abundance of repeats representing over 60% of the S. latifolia genome. The results revealed generally low divergence in repeat composition between the sex chromosomes, which is consistent with their relatively recent origin. In addition, the study generated various data resources that are available for future exploration of the S. latifolia genome.

Crossref

Directory of Open Access Journals

PubMed Central

In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae

Author: A Fleischmann
A Navrátilová
A Zuccolo
Andrea Koblížková
Andreas Houben
B Piegu
C Llorens
CA Thomas
D Aird
E Kejnovský
F Otto
F Ronquist
G García
G Pertea
H Schaefer
H Weiss-Schneeweiss
HJT Pagan
Ilia J. Leitch
Iva Fuková
J Doležel
J Greilhuber
J Ištvánek
J Macas
J Pellicer
JA Ågren
Jana Čížková
Jaroslav Doležel
Jaume Pellicer
Jiří Macas
JL Bennetzen
JPM Camacho
JS Hawkins
KM Devos
KR Oliver
Laura J. Kelly
LD Ingham
LJ Kelly
M El Baidouri
M Kidwell
M Lynch
M Nouzová
M Piednoël
MA Lysák
MC Estep
MI Tenaillon
P Neumann
P Novák
P Smýkal
P Trávníček
Pavel Neumann
Petr Novák
RB Flavell
RJ Britten
S Klemme
S Linquist
S Lockton
S Renny-Byfield
SF Altschul
T Hall
T Wicker
TP Michael
TR Gregory
V Hemleben
V Steinbauerová
Z Cai
Z Gong
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 25/11/2015

The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

Public Library of Science (PLOS)

Crossref

Shared Research Repository

Directory of Open Access Journals

PubMed Central

Queen Mary Research Online

FigShare

A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data

Author: Brom Timothy H.
Brown C. Titus
Howe Adina
Pyrkosz Alexis B.
Zhang Qingpeng
Publication date: 21/05/2012

Deep shotgun sequencing and analysis of genomes, transcriptomes, amplified single-cell genomes, and metagenomes has enabled investigation of a wide range of organisms and ecosystems. However, sampling variation in short-read data sets and high sequencing error rates of modern sequencers present many new computational challenges in data interpretation. These challenges have led to the development of new classes of mapping tools and de novo assemblers. These algorithms are challenged by the continued improvement in sequencing throughput. We here describe digital normalization, a single-pass computational algorithm that systematizes coverage in shotgun sequencing data sets, thereby decreasing sampling variation, discarding redundant data, and removing the majority of errors. Digital normalization substantially reduces the size of shotgun data sets and decreases the memory and time requirements for de novo sequence assembly, all without significantly impacting content of the generated contigs. We apply digital normalization to the assembly of microbial genomic data, amplified single-cell genomic data, and transcriptomic data. Our implementation is freely available for use and modification.
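The single-pass idea in this abstract can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes that a read's coverage is estimated as the median count of its k-mers among the reads kept so far, and it uses an exact in-memory counter where a production tool would more likely use a memory-efficient approximate counting structure. The function names and the default values of `k` and `cutoff` are hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalization(reads, k=20, cutoff=20):
    """Keep a read only if its estimated coverage is still below `cutoff`.

    Assumption for this sketch: coverage of a read is estimated as the
    median count of its k-mers among the reads kept so far.
    """
    counts = Counter()  # exact k-mer counts (illustrative, not memory-efficient)
    kept = []
    for read in reads:  # single pass over the data
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: skipped in this sketch
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads contribute counts
    return kept
```

Calling `digital_normalization(reads, k=20, cutoff=20)` on a list of read strings returns the retained subset in input order. Reads from regions that are already well covered are dropped, while reads from low-coverage regions are kept.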

arXiv.org e-Print Archive

CiteSeerX

The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome

Author: Barghini E
Cattonaro F
Cavallini A
Cossu RM
Giordani T
Morgante M
Natali L
Pindo M
Scalabrin S
Velasco R
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2014

Analyzing genome structure in different species offers insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
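The figures quoted in this abstract can be related with simple arithmetic. The Python sketch below takes a "genome equivalent" to mean total sequenced bases divided by haploid genome size, which is the usual sense of the term. It treats the 1.4 Gb genome size and the ∼31% tandem-repeat share as exact inputs. The function and constant names are hypothetical and are not taken from the paper.

```python
OLIVE_GENOME_BP = 1.4e9  # haploid genome size stated in the abstract


def bases_from_equivalents(genome_equivalents, genome_size_bp):
    """Total sequenced bases implied by a given number of genome equivalents."""
    return genome_equivalents * genome_size_bp


def fraction_to_bases(fraction, genome_size_bp):
    """Amount of DNA (in bp) corresponding to a genomic proportion."""
    return fraction * genome_size_bp


illumina_bp = bases_from_equivalents(2.03, OLIVE_GENOME_BP)
reads_454_bp = bases_from_equivalents(2.3, OLIVE_GENOME_BP)
tandem_bp = fraction_to_bases(0.31, OLIVE_GENOME_BP)

print(f"Illumina: {illumina_bp / 1e9:.2f} Gb")
print(f"454: {reads_454_bp / 1e9:.2f} Gb")
print(f"Tandem repeats: {tandem_bp / 1e6:.0f} Mb")
```

Under these assumptions the two read sets correspond to roughly 2.8 Gb and 3.2 Gb of sequence, and the tandem-repeat fraction to a little over 430 Mb of the haploid genome.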

LJMU Research Online (Liverpool John Moores University)

Archivio istituzionale della ricerca - Università degli Studi di Udine

PubMed Central

Stretching the Rules: Monocentric Chromosomes with Multiple Centromere Domains

Author: A Kawabe
AE Hall
AF Dernburg
Alice Navrátilová
Andrea Koblížková
AW Higgins
BA Sullivan
BE Black
Beth A. Sullivan
DA Pepper
E Schroeder-Reiter
Elizabeth Schroeder-Reiter
Eva Chocholová
Gerhard Wanner
HR Lee
HS Malik
IC Moraes
J Macas
J Monen
Jiří Macas
JM Jiang
K Nagaki
K Prufer
M Plohl
M Sanei
MD Bennett
OJ Marshall
P Binarova
P Neumann
P Novák
Pavel Neumann
PB Talbert
Petr Novák
PW Barlow
R ten Hoopen
RJ McFarlane
RK Dawe
S Heckmann
SA Ribeiro
T Haizel
Veronika Steinbauerová
W Zhang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/06/2012

The centromere is a functional chromosome domain that is essential for faithful chromosome segregation during cell division and that can be reliably identified by the presence of the centromere-specific histone H3 variant CenH3. In monocentric chromosomes, the centromere is characterized by a single CenH3-containing region within a morphologically distinct primary constriction. This region usually spans up to a few Mbp composed mainly of centromere-specific satellite DNA common to all chromosomes of a given species. In holocentric chromosomes, there is no primary constriction; the centromere is composed of many CenH3 loci distributed along the entire length of a chromosome. Using correlative fluorescence light microscopy and high-resolution electron microscopy, we show that pea (Pisum sativum) chromosomes exhibit remarkably long primary constrictions that contain 3-5 explicit CenH3-containing regions, a novelty in centromere organization. In addition, we estimate that the size of the chromosome segment delimited by two outermost domains varies between 69 Mbp and 107 Mbp, several factors larger than any known centromere length. These domains are almost entirely composed of repetitive DNA sequences belonging to 13 distinct families of satellite DNA and one family of centromeric retrotransposons, all of which are unevenly distributed among pea chromosomes. We present the centromeres of Pisum as novel "meta-polycentric" functional domains. Our results demonstrate that the organization and DNA composition of functional centromere domains can be far more complex than previously thought, do not require single repetitive elements, and do not require single centromere domains in order to segregate properly. Based on these findings, we propose Pisum as a useful model for investigation of centromere architecture and the still poorly understood role of repetitive DNA in centromere evolution, determination, and function.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

Open Access LMU

PubMed Central

FigShare

The Utility of Graph Clustering of 5S Ribosomal DNA Homoeologs in Plant Allopolyploids, Homoploid Hybrids, and Cryptic Introgressants

Author: Aïnouche Malika
Borowska-Żuchowska Natalia
Garcia Sònia
Kovarik Ales
Kuderova Alena
Wendel Jonathan F.
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2020

Introduction: Ribosomal DNA (rDNA) loci have been widely used for identification of allopolyploids and hybrids, although few of these studies employed high-throughput sequencing data. Here we use graph clustering implemented in the RepeatExplorer (RE) pipeline to analyze homoeologous 5S rDNA arrays at the genomic level, searching for evidence of a hybridogenic origin of species. Data were obtained from more than 80 plant species, including several well-defined allopolyploids and homoploid hybrids of different evolutionary ages and from widely dispersed taxonomic groups. Results: (i) Diploids show simple circular-shaped graphs of their 5S rDNA clusters. In contrast, most allopolyploids and other interspecific hybrids exhibit more complex graphs composed of two or more interconnected loops representing intergenic spacers (IGS). (ii) There was a relationship between graph complexity and locus numbers. (iii) The sequences and lengths of the 5S rDNA units reconstituted in silico from k-mers were congruent with those experimentally determined. (iv) Three-genomic comparative cluster analysis of reads from allopolyploids and progenitor diploids allowed identification of homoeologous 5S rRNA gene families even in relatively ancient (c. 1 Myr) Gossypium and Brachypodium allopolyploids which already exhibit uniparental partial loss of rDNA repeats. (v) Finally, species harboring introgressed genomes exhibit exceptionally complex graph structures. Conclusion: We found that the cluster graph shapes and graph parameters (k-mer coverage scores and connected component index) reflect well the organization and intragenomic homogeneity of 5S rDNA repeats. We propose that the analysis of 5S rDNA cluster graphs computed by the RE pipeline, together with cytogenetic analysis, might be a reliable approach for determining the parentage of hybrid or allopolyploid plant species and may also be useful for detecting historical introgression events.

HAL-INSU

Digital.CSIC

Repozytorium Uniwersytetu Śląskiego RE-BUŚ

HAL-Rennes 1

Reconstructing an ancestral genotype of two hexachlorocyclohexane-degrading Sphingobium species using metagenomic sequence data.

Author: Gilbert Jack A
Khurana Jitendra P
Khurana Paramjit
Kumar Roshan
Lal Rup
Lax Simon
Negi Vivek
Sangwan Naseer
Verma Helianthous
Publication venue: eScholarship, University of California
Publication date: 01/02/2014

Over the last 60 years, the use of hexachlorocyclohexane (HCH) as a pesticide has resulted in the production of >4 million tons of HCH waste, which has been dumped in open sinks across the globe. Here, the combination of the genomes of two genetic subspecies (Sphingobium japonicum UT26 and Sphingobium indicum B90A; isolated from two discrete geographical locations, Japan and India, respectively) capable of degrading HCH, with metagenomic data from an HCH dumpsite (∼450 mg HCH per g soil), enabled the reconstruction and validation of the last-common ancestor (LCA) genotype. Mapping the LCA genotype (3128 genes) to the subspecies genomes demonstrated that >20% of the genes in each subspecies were absent in the LCA. This includes two enzymes from the 'upper' HCH degradation pathway, suggesting that the ancestor was unable to degrade HCH isomers, but descendants acquired lin genes by transposon-mediated lateral gene transfer. In addition, anthranilate and homogentisate degradation traits were found to be strain (selectively retained only by UT26) and environment (absent in the LCA and subspecies, but prevalent in the metagenome) specific, respectively. One draft secondary chromosome, two near complete plasmids and eight complete lin transposons were assembled from the metagenomic DNA. Collectively, these results reinforce the elastic nature of the genus Sphingobium, and describe the evolutionary acquisition mechanism of a xenobiotic degradation phenotype in response to environmental pollution. This also demonstrates for the first time the use of metagenomic data in ancestral genotype reconstruction, highlighting its potential to provide significant insight into the development of such phenotypes.

eScholarship - University of California