
    Amplification and adaptation of centromeric repeats in polyploid switchgrass species.

    Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats from a single satellite repeat family. Why centromeres are dominated by a single satellite repeat and how the satellite repeats originate and evolve are among the most intriguing and long-standing questions in centromere biology. We identified eight satellite repeats in the centromeres of tetraploid switchgrass (Panicum virgatum). Seven repeats showed characteristics associated with classical centromeric repeats with monomeric lengths ranging from 166 to 187 bp. Interestingly, these repeats share an 80-bp DNA motif. We demonstrate that this 80-bp motif may dictate translational and rotational phasing of the centromeric repeats with the cenH3 nucleosomes. The sequence of the last centromeric repeat, Pv156, is identical to the 5S ribosomal RNA genes. We demonstrate that a 5S ribosomal RNA gene array was recruited to be the functional centromere for one of the switchgrass chromosomes. Our findings reveal that certain types of satellite repeats, which are associated with unique sequence features and are composed of monomers in mono-nucleosomal length, are favorable for centromeres. Centromeric repeats may undergo dynamic amplification and adaptation before the centromeres in the same species become dominated by the best adapted satellite repeat

    A role for non-B DNA forming sequences in mediating microlesions causing human inherited disease

    Missense/nonsense mutations and micro-deletions/micro-insertions of <21bp together represent ~76% of all mutations causing human inherited disease. Previous studies have shown that their occurrence is influenced by sequences capable of non-B DNA formation (direct, inverted and mirror repeats; G-quartets). We found that a greater than expected proportion (~21%) of both micro-deletions and micro-insertions occur within direct repeats and are explicable by slipped misalignment. A novel mutational mechanism, non-B DNA triplex formation followed by DNA repair, is proposed to explain ~5% of micro-deletions and micro-insertions at mirror repeats. Further, G-quadruplex-forming sequences, direct and inverted repeats appear to play a prominent role in mediating missense mutations, whereas only direct and inverted repeats mediate nonsense mutations. We suggest a mutational mechanism involving slipped strand mispairing, slipped structure formation and DNA repair, to explain ~15% of missense and ~12% of nonsense mutations leading to the formation of perfect direct repeats from imperfect repeats, or to the extension of existing direct repeats. Similar proportions of missense and nonsense mutations were explicable by the mechanism of hairpin loop formation and DNA repair leading to the formation of perfect inverted repeats from imperfect repeats. The proposed mechanisms provide new insights into mutagenesis underlying pathogenic micro-lesions

    Genomic abundance is not predictive of tandem repeat localization in grass genomes.

    Highly repetitive regions have historically posed a challenge when investigating sequence variation and content. High-throughput sequencing has enabled researchers to use whole-genome shotgun sequencing to estimate the abundance of repetitive sequence, and these methodologies have been recently applied to centromeres. Previous research has investigated variation in centromere repeats across eukaryotes, positing that the highest abundance tandem repeat in a genome is often the centromeric repeat. To test this assumption, we used shotgun sequencing and a bioinformatic pipeline to identify common tandem repeats across a number of grass species. We find that de novo assembly and subsequent abundance ranking of repeats can successfully identify tandem repeats with homology to known tandem repeats. Fluorescent in-situ hybridization shows that de novo assembly and ranking of repeats from non-model taxa identifies chromosome domains rich in tandem repeats both near pericentromeres and elsewhere in the genome

    Inverted and mirror repeats in model nucleotide sequences

    We analytically and numerically study the probabilistic properties of inverted and mirror repeats in model sequences of nucleic acids. We consider both perfect and non-perfect repeats, i.e. repeats with mismatches and gaps. The considered sequence models are independent identically distributed (i.i.d.) sequences, Markov processes and long range sequences. We show that the number of repeats in correlated sequences is significantly larger than in i.i.d. sequences and that this discrepancy increases exponentially with the repeat length for long range sequences. Comment: 12 pages, 6 figures
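    The two repeat types named in this abstract can be made concrete with a small counting sketch. It is only an illustration under simplifying assumptions (perfect repeats, no spacer between the arms, a uniform i.i.d. sequence); the function names `count_repeats` and `iid_sequence` are hypothetical and are not taken from the paper.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(s):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(s))


def count_repeats(seq, arm, kind="mirror"):
    """Count perfect repeats with arms of length `arm` and no spacer.

    kind="mirror":   the right arm is the left arm read backwards.
    kind="inverted": the right arm is the reverse complement of the left arm.
    """
    if kind not in ("mirror", "inverted"):
        raise ValueError("kind must be 'mirror' or 'inverted'")
    count = 0
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        right = seq[i + arm:i + 2 * arm]
        target = left[::-1] if kind == "mirror" else reverse_complement(left)
        if right == target:
            count += 1
    return count


def iid_sequence(n, rng):
    """Draw a uniform i.i.d. sequence over A, C, G, T (assumed model)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(0)
    seq = iid_sequence(100_000, rng)
    for k in (3, 4, 5):
        expected = (len(seq) - 2 * k + 1) / 4 ** k
        print(k, count_repeats(seq, k, "mirror"),
              count_repeats(seq, k, "inverted"), round(expected, 1))
```

    For a uniform i.i.d. sequence of length n, each window matches with probability 4^(-k), so the expected count for arm length k is (n - 2k + 1) / 4^k. Correlated models such as the Markov and long range sequences in the abstract would need a different generator.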

    Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips

    This paper discusses real-time alignment of audio signals of music performance to the corresponding score (a.k.a. score following) which can handle tempo changes, errors and arbitrary repeats and/or skips (repeats/skips) in performances. This type of score following is particularly useful in automatic accompaniment for practices and rehearsals, where errors and repeats/skips are often made. Simple extensions of the algorithms previously proposed in the literature are not applicable in these situations for scores of practical length due to the problem of large computational complexity. To cope with this problem, we present two hidden Markov models of monophonic performance with errors and arbitrary repeats/skips, and derive efficient score-following algorithms with an assumption that the prior probability distributions of score positions before and after repeats/skips are independent from each other. We confirmed real-time operation of the algorithms with music scores of practical length (around 10000 notes) on a modern laptop and their tracking ability to the input performance within 0.7 s on average after repeats/skips in clarinet performance data. Further improvements and extension for polyphonic signals are also discussed. Comment: 12 pages, 8 figures, version accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing
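    One way an independence assumption of this kind can reduce cost is sketched below. This is not the authors' algorithm; it is a generic HMM forward step in which the probability of jumping from score position i to position j is assumed to factor as `leave[i] * arrive[j]`, so the jump term is computed once per step instead of once per pair of positions. All names (`forward_step`, `forward_step_dense`, `leave`, `arrive`, `p_stay`, `p_next`) are hypothetical.

```python
def forward_step(alpha, emission, p_stay, p_next, leave, arrive):
    """One normalised forward update in O(N) time.

    alpha    : current belief over score positions
    emission : likelihood of the new observation at each position
    p_stay, p_next : local transition probabilities (stay / advance by one)
    leave[i] : assumed probability of starting a repeat/skip at position i
    arrive[j]: assumed probability of landing at position j after a jump
    """
    n = len(alpha)
    # Total probability mass that jumps; shared by every target position.
    jump_mass = sum(alpha[i] * leave[i] for i in range(n))
    new = []
    for j in range(n):
        local = alpha[j] * (1 - leave[j]) * p_stay
        if j > 0:
            local += alpha[j - 1] * (1 - leave[j - 1]) * p_next
        new.append(emission[j] * (local + jump_mass * arrive[j]))
    norm = sum(new)
    return [x / norm for x in new]


def forward_step_dense(alpha, emission, p_stay, p_next, leave, arrive):
    """Same update written with the full N x N transition matrix, O(N^2)."""
    n = len(alpha)
    new = []
    for j in range(n):
        total = 0.0
        for i in range(n):
            local = p_stay if j == i else p_next if j == i + 1 else 0.0
            total += alpha[i] * ((1 - leave[i]) * local + leave[i] * arrive[j])
        new.append(emission[j] * total)
    norm = sum(new)
    return [x / norm for x in new]
```

    Both functions return the same belief vector; the dense version is included only to check the factorised one. With a score of around 10000 notes, as mentioned in the abstract, the difference between N and N^2 operations per observation is what makes a shortcut of this kind attractive.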

    Comparative Analysis of Tandem Repeats from Hundreds of Species Reveals Unique Insights into Centromere Evolution

    Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. The assumption that the most abundant tandem repeat is the centromere DNA was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and in length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond ~50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution, including the appearance of higher order repeat structures in which several polymorphic monomers make up a larger repeating unit. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animals and plants. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes
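    As a rough illustration of what finding a tandem repeat monomer length involves, the sketch below scores each candidate period by the fraction of bases that match the base one period downstream. It is a toy heuristic, not the bioinformatic method used in the paper, and the name `best_period` is hypothetical.

```python
def best_period(seq, max_period):
    """Return (period, match_fraction) for the best-scoring period.

    For each candidate period p, the score is the fraction of positions i
    with seq[i] == seq[i + p]. Ties keep the smallest period, so a perfect
    array of a 7-bp monomer reports 7 rather than 14 or 21.
    """
    best_p, best_frac = None, 0.0
    for p in range(1, max_period + 1):
        if len(seq) <= p:
            break
        matches = sum(1 for i in range(len(seq) - p) if seq[i] == seq[i + p])
        frac = matches / (len(seq) - p)
        if frac > best_frac:
            best_p, best_frac = p, frac
    return best_p, best_frac
```

    For example, `best_period("ACGGTCA" * 10, 20)` reports a period of 7 with a match fraction of 1.0. Real centromeric arrays contain diverged monomers and higher order repeat structures, so the match fraction there would be below 1 and a practical pipeline would need alignment-based methods.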

    Nucleotide repeats in mitochondrial genome determine human lifespan

    Direct nucleotide repeats can facilitate deletions of segments of the mitochondrial genome [1], leading to a wide range of neuromuscular disorders [1,2] as well as aging [2,3] in humans. We hypothesized that the number of direct perfect repeats in human mitochondrial genomes influences longevity through the formation of harmful mtDNA deletions in somatic cells. The analysis of the complete mitochondrial genomes of 762 unrelated Japanese individuals [4-6] reveals a negative correlation between the abundance of direct perfect repeats and the expected longevity. This association is largely due to the disruption of the common repeat (8470,13447) by a point mutation, 8473C, which occurred at the origin of the D4a haplogroup, characterized by extreme longevity in Japan [7]. Our results provide the first evidence for a correlation between the number of nucleotide repeats and lifespan at the intraspecific level
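    The idea that a single point mutation can remove a direct perfect repeat is easy to show on a toy sequence. The sketch below is an assumption-laden illustration: the sequence is invented, the repeat length `k` is arbitrary, and `direct_repeat_pairs` is a hypothetical helper, not the counting procedure used in the study.

```python
from collections import defaultdict


def direct_repeat_pairs(seq, k):
    """List non-overlapping pairs of identical k-mers (direct perfect repeats).

    Returns tuples (first_start, second_start, kmer) using 0-based positions.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    pairs = []
    for kmer, starts in positions.items():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                if starts[b] - starts[a] >= k:  # skip overlapping copies
                    pairs.append((starts[a], starts[b], kmer))
    return sorted(pairs)


def point_mutation(seq, pos, base):
    """Return a copy of seq with the base at 0-based position pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


if __name__ == "__main__":
    # Invented toy sequence: the same 9-bp motif twice, separated by GGG.
    toy = "ACGTACCTT" + "GGG" + "ACGTACCTT"
    print(direct_repeat_pairs(toy, 9))
    print(direct_repeat_pairs(point_mutation(toy, 3, "C"), 9))
```

    In the toy sequence the 9-bp repeat is found once before the mutation and not at all afterwards, which mirrors, in miniature, the disruption of a repeat by a single substitution described in the abstract.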

    Coplanar Repeats by Energy Minimization

    This paper proposes an automated method to detect, group and rectify arbitrarily-arranged coplanar repeated elements via energy minimization. The proposed energy functional combines several features that model how planes with coplanar repeats are projected into images and captures global interactions between different coplanar repeat groups and scene planes. An inference framework based on a recent variant of α-expansion is described and fast convergence is demonstrated. We compare the proposed method to two widely-used geometric multi-model fitting methods using a new dataset of annotated images containing multiple scene planes with coplanar repeats in varied arrangements. The evaluation shows a significant improvement in the accuracy of rectifications computed from coplanar repeats detected with the proposed method versus those detected with the baseline methods. Comment: 14 pages with supplemental materials attached

    Evolution of genes and repeats in the Nimrod superfamily

    The recently identified Nimrod superfamily is characterized by the presence of a special type of EGF repeat, the NIM repeat, located right after a typical CCXGY/W amino acid motif. On the basis of structural features, nimrod genes can be divided into three types. The proteins encoded by Draper-type genes have an EMI domain at the N-terminal part and only one copy of the NIM motif, followed by a variable number of EGF-like repeats. The products of Nimrod B-type and Nimrod C-type genes (including the eater gene) have different kinds of N-terminal domains, and lack EGF-like repeats but contain a variable number of NIM repeats. Draper and Nimrod C-type (but not Nimrod B-type) proteins carry a transmembrane domain. Several members of the superfamily were claimed to function as receptors in phagocytosis and/or binding of bacteria, which indicates an important role in cellular immunity and the elimination of apoptotic cells. In this paper, the evolution of the Nimrod superfamily is studied with various methods at the level of genes and repeats. A hypothesis is presented in which the NIM repeat, along with the EMI domain, emerged by structural reorganizations at the end of an EGF-like repeat chain, suggesting a mechanism for the formation of novel types of repeats. The analyses revealed diverse evolutionary patterns in the sequences containing multiple NIM repeats. Although the repeats in the Nimrod B and Nimrod C proteins show characteristics of independent evolution, many internal NIM repeats in Eater sequences seem to have undergone concerted evolution. An analysis of the nimrod genes has been performed using phylogenetic and other methods and an evolutionary scenario of the origin and diversification of the Nimrod superfamily is proposed. Our study presents an intriguing example of how the evolution of multigene families may contribute to the complexity of the innate immune response