51 research outputs found

    The Influence of Recombination on Human Genetic Diversity

    In humans, the rate of recombination, as measured on the megabase scale, is positively associated with the level of genetic variation, as measured at the genic scale. Despite considerable debate, it is not clear whether these factors are causally linked or, if they are, whether this is driven by the repeated action of adaptive evolution or molecular processes such as double-strand break formation and mismatch repair. We introduce three innovations to the analysis of recombination and diversity: fine-scale genetic maps estimated from genotype experiments that identify recombination hotspots at the kilobase scale, analysis of an entire human chromosome, and the use of wavelet techniques to identify correlations acting at different scales. We show that recombination influences genetic diversity only at the level of recombination hotspots. Hotspots are also associated with local increases in GC content and the relative frequency of GC-increasing mutations but have no effect on substitution rates. Broad-scale association between recombination and diversity is explained through covariance of both factors with base composition. To our knowledge, these results are the first evidence of a direct and local influence of recombination hotspots on genetic variation and the fate of individual mutations. However, that hotspots have no influence on substitution rates suggests that they are too ephemeral on an evolutionary time scale to have a strong influence on broader scale patterns of base composition and long-term molecular evolution.

    A genome-wide study of preferential amplification/hybridization in microarray-based pooled DNA experiments

    Microarray-based pooled DNA methods overcome the cost bottleneck of simultaneously genotyping more than 100 000 markers for numerous study individuals. The success of such methods relies on the proper adjustment of preferential amplification/hybridization to ensure accurate and reliable allele frequency estimation. We performed a hybridization-based genome-wide single nucleotide polymorphism (SNP) genotyping analysis to dissect preferential amplification/hybridization. The majority of SNPs had less than 2-fold signal amplification or suppression, and lognormal distributions adequately modeled preferential amplification/hybridization across the human genome. Comparative analyses suggested that the distributions of preferential amplification/hybridization differed by genotype and by GC content. Patterns among different ethnic populations were similar; nevertheless, there were striking differences for a small proportion of SNPs, and a slight ethnic heterogeneity was observed. To allow appropriate adjustment free of charge, databases of preferential amplification/hybridization for African Americans, Caucasians and Asians were constructed based on the Affymetrix GeneChip Human Mapping 100 K Set. The robustness of allele frequency estimation using this database was validated by a pooled DNA experiment. This study provides a genome-wide investigation of preferential amplification/hybridization and suggests guidance for the reliable use of the database. Our results constitute an objective foundation for theoretical development of preferential amplification/hybridization and provide important information for future pooled DNA analyses.
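The abstract above does not state how the adjustment is applied. As a rough illustration only, the sketch below uses one common form of correction from the pooled DNA literature: a per-SNP coefficient k, taken as the mean ratio of allele A to allele B signal in heterozygous individuals, rescales the allele B signal measured in the pool. The function names and all signal values are hypothetical and are not taken from the study or its database.

```python
def preferential_amplification_coefficient(het_signal_a, het_signal_b):
    """Estimate k as the mean A/B signal ratio across heterozygous individuals.

    In a heterozygote both alleles are present in equal amounts, so any
    departure of the ratio from 1 reflects unequal amplification/hybridization.
    """
    ratios = [a / b for a, b in zip(het_signal_a, het_signal_b)]
    return sum(ratios) / len(ratios)


def corrected_allele_frequency(pool_signal_a, pool_signal_b, k):
    """Frequency of allele A in the pool after rescaling the B signal by k."""
    return pool_signal_a / (pool_signal_a + k * pool_signal_b)


# Hypothetical signals for one SNP (illustrative numbers, not real data).
het_a = [1200.0, 1500.0, 1350.0]
het_b = [800.0, 1000.0, 900.0]
k = preferential_amplification_coefficient(het_a, het_b)

pool_a, pool_b = 600.0, 400.0
raw = pool_a / (pool_a + pool_b)
adjusted = corrected_allele_frequency(pool_a, pool_b, k)
print(f"k = {k:.2f}, raw = {raw:.2f}, adjusted = {adjusted:.2f}")
```

With these made-up values k is 1.5, inside the less-than-2-fold range the abstract reports for most SNPs, and the unadjusted estimate of 0.60 drops to 0.50 once the stronger allele A signal is accounted for.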

    The Diploid Genome Sequence of an Individual Human

    Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, it involves 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
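The event counts in the abstract above can be checked against its summary figures with a short calculation. The sketch below tallies only the five variant classes the abstract enumerates; segmental duplications and copy number variation regions are described as "numerous" without a count, so they are left out, which is an assumption of this example. The dictionary keys and function name are illustrative.

```python
# Variant counts exactly as reported in the abstract.
variant_counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}


def non_snp_share(counts):
    """Return (total events, non-SNP events, non-SNP fraction of events)."""
    total = sum(counts.values())
    non_snp = total - counts["snp"]
    return total, non_snp, non_snp / total


total, non_snp, share = non_snp_share(variant_counts)
print(f"total events: {total:,}")
print(f"non-SNP events: {non_snp:,} ({share:.0%})")
```

The enumerated classes sum to 4,118,889 events, consistent with "more than 4.1 million DNA variants", and the 905,488 non-SNP events make up about 22% of that total, matching the stated share.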

    The complex genetics of multiple sclerosis: pitfalls and prospects

    The genetics of complex disease is entering a new and exciting era. The exponentially growing knowledge and technological capabilities emerging from the human genome project have finally reached the point where relevant genes can be readily and affordably identified. As a result, the last 12 months have seen a virtual explosion in new knowledge, with reports of unequivocal association to relevant genes appearing almost weekly. The impact of these new discoveries in neuroscience is incalculable at this stage but potentially revolutionary. In this review, an attempt is made to illuminate some of the mysteries surrounding complex genetics. Although focused almost exclusively on multiple sclerosis, all the points made are essentially generic and apply equally well, with relatively minor addenda, to any other complex trait, neurological or otherwise.

    Bayesian semiparametric meta-analysis for genetic association studies

    We present a Bayesian semiparametric model for the meta-analysis of candidate gene studies with a binary outcome. Such studies often report results from association tests for different, possibly study-specific and non-overlapping genetic markers in the same genetic region. Meta-analyses of the results at each marker in isolation are seldom appropriate, as they ignore the correlation that may exist between markers due to linkage disequilibrium (LD) and cannot assess the relative importance of variants at each marker. Also, such marker-wise meta-analyses are restricted to only those studies that have typed the marker in question, with a potential loss of power. A better strategy is one that incorporates information about the LD between markers so that any combined estimate of the effect of each variant is corrected for the effect of other variants, as in multiple regression. Here we develop a Bayesian semiparametric model that models the observed genotype group frequencies conditional on case/control status and uses pairwise LD measurements between markers as prior information to make posterior inference on adjusted effects. The approach allows borrowing of strength across studies and across markers. The analysis is based on a mixture of Dirichlet processes model as the underlying semiparametric model. Full posterior inference is performed through Markov chain Monte Carlo algorithms. The approach is demonstrated on simulated and real data.