16 research outputs found

    Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%–5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

    MiIP: The Monomer Identification and Isolation Program

    Repetitive elements within genomic DNA are both functionally and evolutionarily informative. Discovering these sequences ab initio is computationally challenging, compounded by the fact that selection on these repeats is often relaxed; thus sequence identity between repetitive elements can vary significantly. Here we present a new application, the Monomer Identification and Isolation Program (MiIP), which provides functionality both to search for a particular repeat and to discover repetitive elements within a larger genomic sequence. To compare MiIP’s performance with other repeat detection tools, analysis was conducted for synthetic sequences as well as several α21-II clones and HC21 BAC sequences. The primary benefit of MiIP is that it is a single tool capable of both searching for known monomeric sequences and discovering repeats ab initio, at the search sensitivity the user requires. Furthermore, the report functionality facilitates subsequent phylogenetic analysis.

    DNA Repeats Detection Using a Dedicated Dot-Plot Analysis


    Conserved DNA motifs, including the CENP-B box-like, are involved in satellite DNA array rearrangements

    Satellite DNAs (satDNAs), despite rapid evolution that continuously remodels the genomic landscape, occupy functionally essential centromeric regions. Difficult to explore because of their repetitive nature and divergence, satDNAs remain hardly accessible frontiers of eukaryotic genomes, and knowledge concerning their functional significance is rather limited. In this work, we provide a comprehensive analysis of six satDNAs in the library of the recently separated root-knot nematodes Meloidogyne chitwoodi and M. fallax. We disclosed two different conserved regions common to the analyzed satDNAs. One appeared to be highly similar to the CENP-B box of human alpha satDNA, which emerged, in sequence alignment, as a conserved segment common to six divergent satDNAs shared by closely related genomes. The observed results emphasize it as the most prominent example of a CENP-B box-like motif outside mammals. The proposed feature of the CENP-B box-like motif is to act as a promoter in the hypothesized cut-and-paste transposition-related mechanism. This observation could represent a novel role of the CENP-B box, in addition to its known function in centromere protein binding. We propose that the second conserved sequence motif detected in the explored satDNAs is involved in illegitimate recombination. In parallel to alpha satDNAs, we found an organization of satDNA arrays in nematodes comparable to that found in humans and other primates, in the form of simple and complex higher order repeats (HORs). In contrast to human satDNA organization, characterized by phylogenetically distinct HOR and monomeric forms, the organizational patterns observed in nematodes are consistent with frequent and continuous shuffling of sequences between HORs and monomeric arrays. Our results suggest a role for conserved domains in mechanisms that cause rapid shuffling of sequences among divergent satDNAs, on the level of short-segment tracts. In the context of satDNA evolution, our finding provides, for the first time, an experimentally verified link between conserved domains and satDNA rearrangement events.

    The Evolutionary Origin of Man Can Be Traced in the Layers of Defunct Ancestral Alpha Satellites Flanking the Active Centromeres of Human Chromosomes

    Alpha satellite domains that currently function as centromeres of human chromosomes are flanked by layers of older alpha satellite, thought to contain dead centromeres of primate progenitors, which lost their function and the ability to homogenize satellite repeats upon appearance of a new centromere. Using cladistic analysis of alpha satellite monomers, we elucidated complete layer patterns on chromosomes 8, 17, and X and related them to each other and to primate alpha satellites. We show that discrete and chronologically ordered alpha satellite layers are partially symmetrical around an active centromere and their succession is partially shared in non-homologous chromosomes. The layer structure forms a visual representation of the human evolutionary lineage with layers corresponding to ancestors of living primates and to entirely fossil taxa. Surprisingly, phylogenetic comparisons suggest that alpha satellite arrays went through periods of unusual hypermutability after they became “dead” centromeres. The layer structure supports a model of centromere evolution where new variants of a satellite repeat expanded periodically in the genome by rounds of inter-chromosomal transfer/amplification. Each wave of expansion covered all or many chromosomes and corresponded to a new primate taxon. Complete elucidation of the alpha satellite phylogenetic record would give a unique opportunity to number and locate the positions of major extinct taxa in relation to human ancestors shared with extant primates. If applicable to other satellites in non-primate taxa, analysis of centromeric layers could become an invaluable tool for phylogenetic studies.

    Organization and Molecular Evolution of CENP-A–Associated Satellite DNA Families in a Basal Primate Genome

    Centromeric regions in many complex eukaryotic species contain highly repetitive satellite DNAs. Despite the diversity of centromeric DNA sequences among species, the functional centromeres in all species studied to date are marked by CENP-A, a centromere-specific histone H3 variant. Although it is well established that families of multimeric higher-order alpha satellite are conserved at the centromeres of human and great ape chromosomes and that diverged monomeric alpha satellite is found in Old and New World monkey genomes, little is known about the organization, function, and evolution of centromeric sequences in more distant primates, including lemurs. Aye-Aye (Daubentonia madagascariensis) is a basal primate and is located at a key position in the evolutionary tree to study centromeric satellite transitions in primate genomes. Using the approach of chromatin immunoprecipitation with antibodies directed to CENP-A, we have identified two satellite families, Daubentonia madagascariensis Aye-Aye 1 (DMA1) and Daubentonia madagascariensis Aye-Aye 2 (DMA2), related to each other but unrelated in sequence to alpha satellite or any other previously described primate or mammalian satellite DNA families. Here, we describe the initial genomic and phylogenetic organization of DMA1 and DMA2 and present evidence of higher-order repeats in Aye-Aye centromeric domains, providing an opportunity to study the emergence of chromosome-specific modes of satellite DNA evolution in primate genomes.

    Tandemly repeated DNA families in the mouse genome

    Background: Functional and morphological studies of tandem DNA repeats, which make up a high portion of most genomes, are limited mostly by the incomplete characterization of these genome elements. We report here a genome-wide analysis of the large tandem repeats (TR) found in the mouse genome assemblies. Results: Using a bioinformatics approach, we identified large TR with array size of more than 3 kb in two mouse whole genome shotgun (WGS) assemblies. Large TR were classified based on sequence similarity, chromosome position, monomer length, array variability, and GC content; we identified four superfamilies, eight families, and 62 subfamilies, including 60 not previously described. 1) The superfamily of centromeric minor satellite is only found in the unassembled part of the reference genome. 2) The pericentromeric major satellite is the most abundant superfamily and reveals high order repeat structure. 3) The transposable-element-related superfamily contains two families. 4) The superfamily of heterogeneous tandem repeats includes four families. One family is found only in the WGS, while two families represent tandem repeats with either single or multi locus location. Despite multi locus location, TRPC-21A-MM is placed into a separate family due to its abundance, strictly pericentromeric location, and resemblance to big human satellites. To confirm our data, we next performed in situ hybridization with three repeats from distinct families. The TRPC-21A-MM probe hybridized to chromosomes 3 and 17, the multi locus TR-22A-MM probe hybridized to ten chromosomes, and the single locus TR-54B-MM probe hybridized with the long loops that emerge from chromosome ends. In addition to those predicted in silico, several extra chromosomes were positive for TR by in situ analysis, potentially indicating inaccurate genome assembly of the heterochromatic genome regions. Conclusions: Chromosome-specific TR had been predicted for mouse, but no reliable cytogenetic probes were available before. We report a new analysis that identified in silico and confirmed in situ the 3/17 chromosome-specific probe TRPC-21-MM. Thus, the new classification has proven to be a useful tool for continued genome study, while annotated TR can be a valuable source of cytogenetic probes for chromosome recognition.