
    Population genetic parameters of the <i>CFTR</i> Met470Val locus.

    <p>(A) Haplotype blocks +/− 500 kb around the Met470Val locus in HapMap CEU (phase II) samples. The arrow indicates the location of Met470Val; the blue vertical line shows the ancestral Met470 allele and the red vertical line shows the derived Val470 allele. A continuous block of the same color represents the haplotypes shared between individuals. Haplotypes on the Met470 background are shorter and more variable than those on the Val470 background. (B) Decay of extended haplotype homozygosity (EHH) around the Met470Val locus in the same data as in (A). The blue plot represents the decay of haplotypes on the ancestral (Met) allele background; the red plot represents the decay of haplotypes on the derived (Val) allele background. The Y-axis shows the EHH, defined as the probability that two randomly chosen chromosomes are homozygous at all SNPs for the entire interval from the core SNP to distance <i>x</i> <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1000974#pgen.1000974-Sabeti2" target="_blank">[37]</a>. EHH drops below 0.5 at approximately 300 kb from Met470Val on haplotypes carrying the Val470 allele, compared to <20 kb on haplotypes carrying the Met470 allele. The iHS corresponds to the natural logarithm of the ratio of areas under the ancestral and derived allele EHH curves, standardized to be independent of the allele frequencies. A negative iHS implies that the haplotypes on the derived allele background are longer than those on the ancestral background <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1000974#pgen.1000974-Voight1" target="_blank">[23]</a>. (C–E) Genome-wide distributions of (C) Fst values, (D) iHS scores and (E) Fst and iHS scores for SNPs in HapMap phase II data. Black lines (and the filled circle in E) show the location of the Met470Val SNP in each distribution. Proportions of SNPs with more extreme values are shown on the plots as empirical <i>P</i>-values.</p>
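The EHH and iHS definitions in the caption above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy haplotypes, and the string encoding of alleles (`"0"` ancestral, `"1"` derived) are assumptions made for illustration. The sketch computes only the unstandardized log ratio; the frequency standardization mentioned in the caption, and the EHH cutoff that practical implementations apply when integrating, are both omitted.

```python
from collections import Counter
from math import comb, log


def ehh(haplotypes, core, allele, end):
    """EHH among chromosomes carrying `allele` at SNP index `core`.

    Returns the probability that two randomly chosen carriers are
    identical at every SNP between `core` and `end` (inclusive).
    """
    lo, hi = min(core, end), max(core, end)
    carriers = [h[lo:hi + 1] for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return 0.0
    counts = Counter(carriers)
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


def integrated_ehh(positions, values):
    """Area under an EHH curve by the trapezoid rule."""
    return sum(
        (values[i] + values[i + 1]) / 2 * (positions[i + 1] - positions[i])
        for i in range(len(positions) - 1)
    )


def unstandardized_ihs(haplotypes, positions, core, ancestral="0", derived="1"):
    """ln(area under ancestral EHH curve / area under derived EHH curve).

    Assumes both areas are positive; no frequency standardization is done.
    """
    areas = {}
    for allele in (ancestral, derived):
        curve = [ehh(haplotypes, core, allele, end) for end in range(len(positions))]
        areas[allele] = integrated_ehh(positions, curve)
    return log(areas[ancestral] / areas[derived])


# Hypothetical toy data: four SNPs, core SNP at index 0, positions in kb.
toy_haplotypes = ["1111", "1111", "1111", "0101", "0010", "0111"]
toy_positions = [0, 10, 20, 30]
score = unstandardized_ihs(toy_haplotypes, toy_positions, core=0)
```

In this toy data the derived-allele haplotypes are all identical while the ancestral ones diverge quickly, so `score` is negative, matching the caption's reading of a negative iHS.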

    Met470Val genotypes and fertility in Hutterite men.

    <p>(A) Cumulative plot of the number of years from marriage (y-axis) to each birth (x-axis) by genotype. Black horizontal lines show the means and whiskers show standard errors. (B) Box plot and distribution of the number of births (y-axis) among men by genotype (x-axis) for men who have been married at least 11.5 years (the mean number of years from marriage to last birth for the men in this sample). (C) Survival curves showing the proportion of men reaching the 6<sup>th</sup> birth (equal to the mean and median number of births among men in this sample) (y-axis) by 6 to 20 years of marriage (x-axis) by genotype. The distributions of total length of marriage are similar for men in both genotype groups (not shown).</p>

    Results of association tests with birth rate in Hutterite men.

    1<p>All three genotypes were tested individually.</p>2<p>People carrying Met/Val and Val/Val genotypes were combined and tested against Met/Met homozygotes.</p>

    Geographic distribution of the Met470Val polymorphism in HGDP samples.

    <p>The relative frequencies of each allele are shown as blue (ancestral Met470 allele) and orange (derived Val470 allele) pie slices.</p>

    <i>CFTR</i> Met470Val genotypes and birth rate in Hutterite men.

    <p>The residuals of birth rate, corrected for relatedness, are shown on the y-axis, and the number of Val470 alleles at the <i>CFTR</i> Met470Val locus is shown on the x-axis. Sample sizes for each genotype group are shown under the x-axis. Each grey circle corresponds to an individual subject. Red horizontal lines show the means and black whiskers show the standard errors. (A) Met470Val genotype (Model 1), grouped by 0 (Met/Met), 1 (Met/Val) or 2 (Val/Val) copies of the Val allele; (B) Met470Val genotype (Model 2), with Val/Val and Met/Val men combined.</p>

    Illumina TruSeq Synthetic Long-Reads Empower <i>De Novo</i> Assembly and Resolve Complex, Highly-Repetitive Transposable Elements

    <div><p>High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the <i>de novo</i> assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or complex genomic arrangements. While TEs strongly affect genome function and evolution, most current <i>de novo</i> assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly-parallel library preparation and local assembly of short read data and which achieve lengths of 1.5–18.5 Kbp with an extremely low error rate (0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism <i>Drosophila melanogaster</i> (reference genome strain <i>y; cn, bw, sp</i>) achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long-reads, offer a powerful approach to improve <i>de novo</i> assemblies of whole genomes.</p></div>
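The abstract above reports assembly contiguity as an N50 contig size. As a minimal sketch, assuming the common definition (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half the total assembly length), N50 could be computed as follows. The function name and the example contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the summed length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
example_n50 = n50([2, 3, 4, 5, 6])
```

N50 depends only on the length distribution, so it says nothing about correctness; this is why the abstract pairs it with coverage of the reference and the fraction of TEs recovered with perfect identity.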

    Characteristics of TruSeq synthetic long-reads.

    <p><b>A</b>: Read length distribution. <b>B, C, & D</b>: Position-dependent profiles of <b>B</b>: mismatches, <b>C</b>: insertions, and <b>D</b>: deletions compared to the reference genome. Error rates presented in these figures represent all differences with the reference genome, and can be due to errors in the reads, mapping errors, errors in the reference genome, or accurate sequencing of residual polymorphism.</p>