23 research outputs found

    Parallel Genetic Ensemble Feature Selection

    Heterogeneous Genomic Molecular Clocks in Primates

    Copyright: © 2006 Kim et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. DOI: 10.1371/journal.pgen.0020163
    Using data from primates, we show that molecular clocks in sites that have been part of a CpG dinucleotide in the recent past (CpG sites) and in non-CpG sites are of markedly different nature, reflecting differences in their molecular origins. Notably, single nucleotide substitutions at non-CpG sites show clear generation-time dependency, indicating that most of these substitutions occur by errors during DNA replication. On the other hand, substitutions at CpG sites occur relatively constantly over time, as expected from their primary origin due to methylation. Therefore, molecular clocks are heterogeneous even within a genome. Furthermore, we propose that varying frequencies of CpG dinucleotides in different genomic regions may have contributed significantly to conflicting earlier results on the rate constancy of the mammalian molecular clock. Our conclusion that different regions of genomes follow different molecular clocks should be considered when inferring divergence times using molecular data and in phylogenetic analysis.

    Mutations of Different Molecular Origins Exhibit Contrasting Patterns of Regional Substitution Rate Variation

    Transitions at CpG dinucleotides, referred to as “CpG substitutions”, are a major mutational input into vertebrate genomes and a leading cause of human genetic disease. The prevalence of CpG substitutions is due to their mutational origin, which is dependent on DNA methylation. In comparison, other single nucleotide substitutions (for example, those occurring at GpC dinucleotides) mainly arise from errors during DNA replication. Here we analyzed high-quality BAC-based data from human, chimpanzee, and baboon to investigate regional variation of CpG substitution rates.

    Evolutionary impacts of DNA methylation on vertebrate genomes

    DNA methylation is an epigenetic modification in which a methyl group is covalently added to the DNA. In vertebrate genomes methylation occurs almost exclusively at cytosines immediately followed by a guanine (CpG dinucleotides). Two important aspects of DNA methylation have inspired several recent scientific investigations including those in this dissertation. First, methylated cytosines are hotspots of point mutation due to a methylation-dependent mutation mechanism, which has caused a deficiency of CpGs in vertebrate genomes. Second, DNA methylation in promoters is linked with transcriptional silencing of the associated genes. This dissertation presents the results of four studies in which I investigated the impacts of DNA methylation on the neutral and functional evolution of vertebrate genomes. The results of the first two studies demonstrate that DNA methylation has profound impacts on both inter- and intra-genomic neutral substitution rate variation. The third and fourth studies demonstrate that DNA methylation has played critical roles in shaping the evolution of vertebrate promoters and gene regulation.
    Ph.D.

    Functional Relevance of CpG Island Length for Regulation of Gene Expression

    CpG islands mark CpG-enriched regions in otherwise CpG-depleted vertebrate genomes. While the regulatory importance of CpG islands is widely accepted, it is little appreciated that CpG islands vary greatly in length. For example, CpG islands in the human genome vary ∼30-fold in length. Here we report findings suggesting that the lengths of CpG islands have functional consequences. Specifically, we show that promoters associated with long CpG islands (long-CGI promoters) are distinct from other promoters. First, long-CGI promoters are uniquely associated with genes of intermediate expression breadth. Notably, intermediate expression breadth requires the most complex mode of gene regulation, from the standpoint of information content. Second, long-CGI promoters encode more RNA polymerase II (Polr2a) binding sites than other promoters. Third, Polr2a binding occurs in a more tissue-specific manner in long-CGI promoters than in other CGI promoters. Moreover, long-CGI promoters contain the largest numbers of experimentally characterized transcription start sites compared to other promoters, and the types of transcription start sites in them are biased toward tissue-specific patterns of gene expression. Finally, long-CGI promoters are preferentially associated with genes involved in development and regulation. Together, these findings indicate that functionally relevant variations of CpG islands exist. By investigating the consequences of certain CpG island traits, we can gain additional insights into the mechanism and evolution of the regulatory complexity of gene expression.

    Primate phylogenomics: developing numerous nuclear non-coding, non-repetitive markers for ecological and phylogenetic applications and analysis of evolutionary rate variation

    © 2009 Peng et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/10/247 DOI: 10.1186/1471-2164-10-247
    Background: Genetic analyses are often limited by the availability of appropriate molecular markers. Markers from neutrally evolving genomic regions may be particularly useful for inferring evolutionary histories because they escape the constraints of natural selection. For the majority of taxa, however, obtaining such markers is challenging. Advances in genomics have the potential to alleviate the shortage of neutral markers. Here we present a method to develop numerous markers from putatively neutral regions of primate genomes.
    Results: We began with the available whole genome sequences of human, chimpanzee and macaque. Using computational methods, we identified a total of 280 potential amplicons from putatively neutral, non-coding, non-repetitive regions of these genomes. Subsequently we amplified, using experimental methods, many of these amplicons from diverse primate taxa, including a ringtailed lemur, which is distantly related to the species that served as genomic resources. Using a subset of 10 markers, we demonstrate the utility of the developed markers in phylogenetic and evolutionary rate analyses. In particular, we uncovered substantial evolutionary rate variation among lineages, some of which has not previously been reported.
    Conclusion: We successfully developed numerous markers from putatively neutral regions of primate genomes using a strategy combining computational and experimental methods. Applying these markers to phylogenetic and evolutionary rate variation analyses exemplifies their utility. Diverse ecological and evolutionary analyses will benefit from these markers. Importantly, methods similar to those presented here can be applied to other taxa in the near future.

    DNA methylation is widespread and associated with differential gene expression in castes of the honeybee, Apis mellifera

    The recent, unexpected discovery of a functional DNA methylation system in the genome of the social bee Apis mellifera underscores the potential importance of DNA methylation in invertebrates. The extent of genomic DNA methylation and its role in A. mellifera remain unknown, however. Here we show that genes in A. mellifera can be divided into two distinct classes, one with low-CpG dinucleotide content and the other with high-CpG dinucleotide content. This dichotomy is explained by the gradual depletion of CpG dinucleotides, a well-known consequence of DNA methylation. The loss of CpG dinucleotides associated with DNA methylation also may explain the unusual mutational patterns seen in A. mellifera that lead to AT-rich regions of the genome. A detailed investigation of this dichotomy implicates DNA methylation in A. mellifera development. High-CpG genes, which are predicted to be hypomethylated in germlines, are enriched with functions associated with developmental processes, whereas low-CpG genes, predicted to be hypermethylated in germlines, are enriched with functions associated with basic biological processes. Furthermore, genes more highly expressed in one caste than another are overrepresented among high-CpG genes. Our results highlight the potential significance of epigenetic modifications, such as DNA methylation, in developmental processes in social insects. In particular, the pervasiveness of DNA methylation in the genome of A. mellifera provides fertile ground for future studies of phenotypic plasticity and genomic imprinting.
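
    The abstract above does not say how CpG content was quantified. One commonly used measure of CpG depletion is the observed/expected CpG ratio, and the sketch below assumes that measure purely for illustration. The function name `cpg_obs_exp` and the handling of sequences lacking C or G are assumptions of this sketch, not details taken from the listed work.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio of a DNA sequence (illustrative sketch).

    Observed: frequency of the dinucleotide CG among all adjacent pairs.
    Expected: product of the single-base frequencies of C and G.
    A ratio well below 1 suggests CpG depletion.
    """
    s = seq.upper()
    n = len(s)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    c_freq = s.count("C") / n
    g_freq = s.count("G") / n
    if c_freq == 0 or g_freq == 0:
        # Assumption of this sketch: report 0.0 when no CpG is possible.
        return 0.0

    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cpg_freq = s.count("CG") / (n - 1)
    return cpg_freq / (c_freq * g_freq)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real gene data.
    for name, seq in [("toy_high", "ACGTCGCGAT"), ("toy_low", "ATTGCATTCA")]:
        print(name, round(cpg_obs_exp(seq), 3))
```

    Under this measure, sorting genes by their ratio and inspecting the distribution is one way a low-CpG class and a high-CpG class could become visible; the thresholds separating them would have to come from the data.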