    The Influence of Recombination on Human Genetic Diversity

    In humans, the rate of recombination, as measured on the megabase scale, is positively associated with the level of genetic variation, as measured at the genic scale. Despite considerable debate, it is not clear whether these factors are causally linked or, if they are, whether this is driven by the repeated action of adaptive evolution or molecular processes such as double-strand break formation and mismatch repair. We introduce three innovations to the analysis of recombination and diversity: fine-scale genetic maps estimated from genotype experiments that identify recombination hotspots at the kilobase scale, analysis of an entire human chromosome, and the use of wavelet techniques to identify correlations acting at different scales. We show that recombination influences genetic diversity only at the level of recombination hotspots. Hotspots are also associated with local increases in GC content and the relative frequency of GC-increasing mutations but have no effect on substitution rates. Broad-scale association between recombination and diversity is explained through covariance of both factors with base composition. To our knowledge, these results are the first evidence of a direct and local influence of recombination hotspots on genetic variation and the fate of individual mutations. However, that hotspots have no influence on substitution rates suggests that they are too ephemeral on an evolutionary time scale to have a strong influence on broader-scale patterns of base composition and long-term molecular evolution.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Evaluation of IRX Genes and Conserved Noncoding Elements in a Region on 5p13.3 Linked to Families with Familial Idiopathic Scoliosis and Kyphosis

    Because of genetic heterogeneity present in idiopathic scoliosis, we previously defined clinical subsets (a priori) from a sample of families with idiopathic scoliosis to find genes involved with spinal curvature. Previous genome-wide linkage analysis of seven families with at least two individuals with kyphoscoliosis found linkage (P-value = 0.002) in a 3.5-Mb region on 5p13.3 containing only three known genes, IRX1, IRX2, and IRX4. In this study, the exons of IRX1, IRX2, and IRX4, the conserved noncoding elements in the region, and the exons of a nonprotein coding RNA, LOC285577, were sequenced. No functional sequence variants were identified. An intrafamilial test of association found several associated noncoding single nucleotide variants. The strongest association was with rs12517904 (P = 0.00004), located 6.5 kb downstream from IRX1. In one family, the genotypes of nine variants differed from the reference allele in all individuals with kyphoscoliosis, and two of three individuals with scoliosis, but did not differ from the reference allele in all other genotyped individuals. One of these variants, rs117273909, was located in a conserved noncoding region that functions as an enhancer in mice. To test whether the variant allele at rs117273909 had an effect on enhancer activity, zebrafish transgenesis was performed with overlapping fragments of 198 and 687 bp containing either the wild type or the variant allele. Our data suggest that this region acts as a regulatory element; however, its size and target gene(s) need to be identified to determine its role in idiopathic scoliosis.

    Marginal Significance (−log₁₀ p-value as Determined by t-Test) of the Wavelet Coefficients from Four Annotations as Predictors of the Coefficients of the Decomposition of Human-Chimpanzee Divergence

    Red boxes highlight significant positive linear relationships, and blue boxes, negative. The intensity of the colour is proportional to the degree of significance. (A) Smoothed coefficients. (B) Detail coefficients.

    Wavelet Transformation of Genome Annotations

    (A) To illustrate the purpose of wavelet transformation, we show the original traces and continuous wavelet transformations using the derivative of Gaussian wavelet basis for gene content and divergence over a 2-Mb stretch of Chromosome 20. Colours indicate the magnitude (blue = low, red = high, white = zero) of the wavelet coefficients at each scale and location, with each level being normalised to have equal variance. (B) Analysis of the correlation between the smoothed and detailed coefficients at each scale (see Text S2). The height of each bar is the value of the correlation coefficient and the boxes are the contributions from broader scales (top is the broadest scale), with colour intensity related to the magnitude of the effect (blue is negative, red is positive) and size proportional to the fraction of variance explained by a given level. The correlation between divergence and constraint in the original signal (−0.0823) can be decomposed into positive contributions from correlations between detail coefficients at broad scales and negative contributions from correlations between detail coefficients at fine scales.
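
The decomposition of a correlation into per-scale contributions described in (B) can be illustrated with a minimal sketch. It assumes an orthonormal discrete Haar transform (not the continuous derivative-of-Gaussian transform shown in panel A, and not necessarily the authors' exact procedure) and a signal whose length is a power of two. The function names are hypothetical.

```python
import math


def haar_details(signal):
    """Return Haar detail coefficients per scale, finest scale first.

    Assumption: len(signal) is a power of two. The transform is
    orthonormal, so the squared detail coefficients sum to the
    mean-centred sum of squares of the signal.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    smooth = list(signal)
    details = []
    root2 = math.sqrt(2.0)
    while len(smooth) > 1:
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        # Detail = scaled difference of neighbours; smooth = scaled sum.
        details.append([(a - b) / root2 for a, b in pairs])
        smooth = [(a + b) / root2 for a, b in pairs]
    return details


def correlation_by_scale(x, y):
    """Split the Pearson correlation of x and y into one term per scale.

    The terms sum to the ordinary Pearson correlation. Raises
    ZeroDivisionError if either signal is constant.
    """
    dx, dy = haar_details(x), haar_details(y)
    sxx = sum(c * c for level in dx for c in level)
    syy = sum(c * c for level in dy for c in level)
    norm = math.sqrt(sxx * syy)
    return [sum(a * b for a, b in zip(lx, ly)) / norm
            for lx, ly in zip(dx, dy)]
```

Because the per-scale terms add up to the overall correlation, a weak correlation in the original signal can hide opposing contributions at fine and broad scales, which is the pattern the caption reports for divergence and constraint.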

    Quantile-Quantile Plots Showing the Difference in Allele Frequency Spectrum for AT→GC Mutations and GC→AT Mutations in Regions of Low and High Recombination

    If the two types of mutation were to have the same allele frequency distribution, we would expect to see a straight line. In both cases, AT→GC mutations are typically at higher frequencies than GC→AT mutations; however, the effect is more pronounced in regions of high recombination [(A), low recombination; (B), high recombination]. A quantification of the difference can be found in the text and supporting material.
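
As a rough illustration of how the points of such a quantile-quantile plot might be computed, the sketch below pairs matching quantiles of two samples of allele frequencies. The function name and the choice of interpolation method are assumptions, not details taken from the paper.

```python
import statistics


def qq_points(sample_a, sample_b, n_quantiles=10):
    """Pair matching quantiles of two samples for a Q-Q plot.

    Returns n_quantiles - 1 (a, b) pairs. If both samples follow the
    same distribution, the pairs lie close to the line a == b.
    """
    qa = statistics.quantiles(sample_a, n=n_quantiles, method="inclusive")
    qb = statistics.quantiles(sample_b, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))
```

Points that fall consistently on one side of the diagonal indicate that one class of mutation segregates at systematically higher frequencies than the other.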

    Power Spectra and Pairwise Correlations of Detail Wavelet Coefficients

    Diagonal plots show the power spectrum of the wavelet decomposition of each factor on the long (red) and short (blue) arms of Chromosome 20. Off-diagonal plots show the rank correlation coefficient between pairs of detail wavelet coefficients at each scale on the long (top right) and short (bottom left) arms. Red crosses indicate significant correlations (p-value < 0.01; Kendall's rank correlation). Scale is shown in kilobases.
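
The two quantities plotted here can be sketched as follows, taking lists of detail coefficients (one list per scale) as input. This is a simplified illustration: it uses Kendall's tau-a, which makes no correction for ties, and it does not compute the p-values used for the red crosses. Which tau variant the authors used is not stated in the caption, and the function names are hypothetical.

```python
def power_spectrum(details):
    """Fraction of the total variance carried by each scale.

    `details` is a list of lists of detail coefficients, one per scale.
    """
    energy = [sum(c * c for c in level) for level in details]
    total = sum(energy)
    return [e / total for e in energy]


def kendall_tau_a(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Quadratic-time implementation, adequate for short coefficient lists.
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)
```

Applying `kendall_tau_a` to the coefficients of two annotations at the same scale gives one off-diagonal value per scale.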

    Marginal Significance (−log₁₀ p-value as Determined by t-Test) of the Wavelet Coefficients from Four Annotations as Predictors of the Coefficients of the Decomposition of Ascertainment Panel Diversity

    Red boxes highlight significant positive linear relationships, and blue boxes, negative. The intensity of the colour is proportional to the degree of significance. (A) Smoothed coefficients. (B) Detail coefficients. Also shown is the adjusted r², which can be interpreted as the proportion of the variance in the signal explained by the linear model.
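
For reference, the adjusted r² mentioned above is conventionally obtained from the ordinary r² by penalising the number of predictors. The sketch below uses the standard textbook formula; the function name is hypothetical and nothing here is taken from the paper's own code.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted r-squared for a linear model with an intercept.

    adjusted = 1 - (1 - r^2) * (n - 1) / (n - p - 1)
    """
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / dof
```

With four annotations as predictors, `n_predictors` would be 4 and `n_obs` the number of wavelet coefficients at the scale being modelled.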