61 research outputs found

    The Impact of Recombination on Nucleotide Substitutions in the Human Genome

    Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in humans, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.

    Identification of three new Alu Yb subfamilies by source tracking of recently integrated Alu Yb elements

    Molecular Evolution of Numt, a Recent Transfer and Tandem Amplification of Mitochondrial DNA into the Nuclear Genome of the Domestic Cat (Felis catus)

    Mitochondrial DNA (mtDNA) molecules are functional cytoplasmic chromosomes, tracing origins to a symbiotic infection of eukaryotic cells by bacterial progenitors. As prescribed by the Serial Endosymbiosis Theory, symbionts have gradually transferred to the nuclear genome the genes that enable functional interaction. In this dissertation, a 7.9 kb transposition of a typically 17.0 kb mitochondrial genome to a specific chromosomal position in the domestic cat is reported. The integrated mtDNA has amplified about 38-76 times and now occurs as a macrosatellite-like tandem repeat with multiple length alleles, resolved by pulsed-field gel electrophoresis (PFGE), segregating in cat populations. To examine the tempo and mode of evolution between different organelles, characterization of the complete 7946 bp nuclear mitochondrial DNA monomer, Numt, and cytoplasmic mtDNA (17,009 bp) sequences reveals about 95% similarity, which supports recent divergence within 1.8-2.0 MYA and the radiation of four modern species in genus Felis. The motif (ACACACGT) appears imperfectly repeated at the deletion junction of the control region (CR) and is a likely target for recombination. Simple repeats are also implicated in indel generation. Most substitutions between the cat homologues are attributable to accelerated cytoplasmic mtDNA evolution, yet maintain a uniform rate of synonymous substitutions between different mitochondrial genes. Results of ribonuclease protection assays on cellular RNA verify the lack of Numt-specific transcription and the appraisal of Numt as a molecular fossil. Despite an elevated number of transversions and no increase in dA/dT content over cytoplasmic mtDNA, Numt resembles archetypal pseudogene evolution. To place the felid data in the context of functional mitochondrial genomes, pairwise similarity comparisons of all 37 mtDNA coding genes and the CR among eight complete mitochondrial genomes of five placental mammals were performed. In carnivores, the ND4L and ATPase 6 genes exhibit higher sequence conservation, while cyt B shows accelerated divergence. Lastly, the occurrence of Numt-like loci in other exotic felids deviates from current phylogenetic predictions. To confirm homology with the F. catus Numt locus, a series of experiments was conducted to isolate chromosomal sequences directly flanking Numt-like loci. These observations provide an empirical glimpse of historic genomic events that may parallel the accommodation of organelles in eukaryotes.

    HUMAN GENOME VARIATIONS AND EVOLUTION WITH A FOCUS ON THE ANALYSIS OF TRANSPOSABLE ELEMENTS

    Genome sequence varies in numerous ways among individuals although the gross architecture is fixed for all humans. Retrotransposons create one of the most abundant structural variants in the human genome and are divided into many families, with certain members in some families, e.g., L1, Alu, SVA, and HERV-K, remaining active for transposition. Along with other types of genomic variants, retrotransposon-derived variants contribute to the whole spectrum of genome variants in humans. With the advancement of sequencing techniques, many human genomes are being sequenced at the individual level, fueling the comparative research on these variants among individuals. In this thesis, the evolution and functional impact of structural variations is examined, primarily focusing on retrotransposons in the context of human evolution. The thesis comprises three different studies that are presented in three data chapters. First, the recent evolution of all human-specific AluYb members, representing the second most active subfamily of Alus, was tracked to identify their source/master copy using a novel approach. All human-specific AluYb elements from the reference genome were extracted and aligned with one another to construct clusters of similar copies, and each cluster was analyzed to generate the evolutionary relationship between the members of the cluster. The approach resulted in identification of one major driver copy of all human-specific Yb8 and the source copy of the Yb9 lineage. Three new subfamilies within the AluYb family (Yb8a1, Yb10 and Yb11) were also identified, with Yb11 being the youngest and most polymorphic. Second, an attempt to construct a relation between transposable elements (TEs) and tandem repeats (TRs) was made at a genome-wide scale for the first time. Upon sequence comparison, positional cross-checking and other relevant analyses, it was observed that over 20% of all TRs are derived from TEs. This result established the first connection between these two types of repetitive elements, and extends our appreciation for the impact of TEs on genomes. Furthermore, only 6% of these TE-derived TRs follow the already postulated initiation and expansion mechanisms, suggesting that the others are likely to follow a yet-unidentified mechanism. Third, by taking a combination of multiple computational approaches involving all types of genetic variations published so far including transposable elements, the first whole genome sequence of the most recent common ancestor of all modern human populations that diverged into different populations around 125,000-100,000 years ago was constructed. The study shows that the current reference genome sequence is 8.89 million base pairs larger than our common ancestor’s genome, contributed by a whole spectrum of genetic mechanisms. The use of this ancestral reference genome to facilitate the analysis of personal genomes was demonstrated using an example genome and more insightful recent evolutionary analyses involving the Neanderthal genome. The three data chapters presented in this thesis conclude that tandem repeats and transposable elements are not two entirely distinct elements, as over 20% of TRs are actually derived from TEs. Certain subfamilies of TEs themselves are still evolving with the generation of newer subfamilies. The evolutionary analyses of all TEs along with other genomic variants helped to construct the genome sequence of the most recent common ancestor to all modern human populations, which provides a better alternative to the human reference genome and can be a useful resource for the study of personal genomics, population genetics, and human and primate evolution.

    Population History of the Dniester-Carpathians: evidence from Alu insertion and Y-chromosome polymorphisms

    The Dniester-Carpathian region has attracted much attention from historians, linguists, and anthropologists, but remains insufficiently studied genetically. We have analyzed a set of autosomal polymorphic loci and Y-chromosome markers in six autochthonous Dniester-Carpathian population groups: 2 Moldavian, 1 Romanian, 1 Ukrainian and 2 Gagauz populations. To gain insight into the population history of the region, the data obtained in this study were compared with corresponding data for other populations of Western Eurasia. The analysis of 12 Alu human-specific polymorphisms in 513 individuals from the Dniester-Carpathian region showed a high degree of homogeneity among Dniester-Carpathian as well as southeastern European populations. The observed homogeneity suggests either a common ancestry of all southeastern European populations or a strong gene flow between them. Nevertheless, tree reconstruction and principal component analyses allow the distinction between Balkan-Carpathian (Macedonians, Romanians, Moldavians, Ukrainians and Gagauzes) and Eastern Mediterranean (Turks, Greeks and Albanians) population groups. These results are consistent with those from classical and other DNA markers and are compatible with archaeological and paleoanthropological data. Haplotypes constructed from Y-chromosome markers were used to trace the paternal origin of the Dniester-Carpathian populations. A set of 32 binary and 7 STR Y-chromosome polymorphisms was genotyped in 322 Dniester-Carpathian Y-chromosomes. On this basis, 21 stable haplogroups and 171 combination binary marker/STR haplotypes were identified. The haplogroups E3b1, G, J1, J2, I1b, R1a1, and R1b3, most common in the Dniester-Carpathian region, are also common in European and Near Eastern populations. Ukrainians and southeastern Moldavians show a high proportion of eastern European lineages, while Romanians and northern Moldavians demonstrate a high proportion of western Balkan lineages. The Gagauzes harbor a conspicuous proportion of lineages of Near Eastern origin, comparable to that in Balkan populations. In general, the Dniester-Carpathian populations demonstrate the closest affinities to the neighboring southeastern and eastern European populations. The expansion times were estimated for 4 haplogroups (E3b1, I1b, R1a1, and R1b3) from associated STR diversity. The presence in the studied area of genetic components of different age indicates successive waves of migration from diverse source areas of Western Eurasia. Neither of the genetic systems used in this study revealed any correspondence between genetic and linguistic patterns in the Dniester-Carpathian region or in Southeastern Europe, a fact which suggests either that the ethnic differentiation in these regions was indeed very recent or that the linguistic and other social barriers were not strong enough to prevent genetic flow between populations. In particular, Gagauzes, a Turkic-speaking population, show closer affinities not to other Turkic peoples, but to their geographical neighbors.

    Evolutionary genomics: statistical and computational methods

    This open access book addresses the challenge of analyzing and understanding the evolutionary dynamics of complex biological systems at the genomic level, and elaborates on some promising strategies that would bring us closer to uncovering the vital relationships between genotype and phenotype. After a few educational primers, the book continues with sections on sequence homology and alignment, phylogenetic methods to study genome evolution, methodologies for evaluating selective pressures on genomic sequences as well as genomic evolution in light of protein domain architecture and transposable elements, population genomics and other omics, and discussions of current bottlenecks in handling and analyzing genomic data. Written for the highly successful Methods in Molecular Biology series, chapters include the kind of detail and expert implementation advice that lead to the best results. Authoritative and comprehensive, Evolutionary Genomics: Statistical and Computational Methods, Second Edition aims to serve both novices in biology with strong statistics and computational skills, and molecular biologists with a good grasp of standard mathematical concepts, in moving this important field of study forward.

    Investigation of distal repetitive sequences in the genus Allium

    The telomere is a DNA/protein structure required to maintain the ends of linear chromosomes. Usually the DNA component comprises a highly conserved tandemly repeated minisatellite sequence. In most plants the minisatellite sequence is typically present in several hundred copies at each chromosome end, and is extended primarily by telomerase, which adds telomere repeats to the 3’ end. In the plant genus Allium, which contains around 700 species, there is an absence of typical telomeric DNA repeats. It is of great interest to determine what sequence or sequences have replaced the ancestral repeats and how they are lengthened. A range of molecular cloning methods were used to isolate candidate telomere sequences from the genomes of two diverged species, Allium cernuum and Allium cepa. I analyse several putative telomere sequences, isolated in this work and by others, but no proven candidate sequence has emerged. Nevertheless, one of those sequences, 35S ribosomal DNA (rDNA) encoding 35S rRNA, proved to have a structure not previously described for plants. I show that some units have a Ty1/copia retrotransposon fragment in the intergenic spacer region. Sequence analysis indicates that there was a single insertion followed by amplification, probably involving homogenisation mechanisms. Furthermore, I show high levels of rDNA length heterogeneity and rDNA unit divergence both within species and across the genus, respectively.