139 research outputs found

    Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

    High-coverage whole-genome sequencing provides near-complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower-coverage sequencing or SNP array genotyping coupled with statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF <5%), when low-coverage sequence reads are added to dense genome-wide SNP arrays; the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome-wide association studies and supports development of improved methods for genotype calling.
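
    For readers unfamiliar with the two accuracy measures quoted above, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from call counts. It assumes the standard textbook definitions; the abstract does not state exactly how the authors defined them. The counts are hypothetical, chosen only so that the arithmetic reproduces the quoted percentages, and are not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variant genotypes that were called as variant."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of real non-variant genotypes that were called as non-variant."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not taken from the paper).
tp, fn = 890, 110    # variant genotypes: detected vs. missed
tn, fp = 9960, 40    # non-variant genotypes: correct vs. miscalled

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

    With these illustrative counts, 11% of variant genotypes are missed while only 0.4% of non-variant genotypes are miscalled, which is the kind of asymmetry the quoted figures describe.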

    Genetic and Computational Identification of a Conserved Bacterial Metabolic Module

    We have experimentally and computationally defined a set of genes that form a conserved metabolic module in the α-proteobacterium Caulobacter crescentus and used this module to illustrate a schema for the propagation of pathway-level annotation across bacterial genera. Applying comprehensive forward and reverse genetic methods and genome-wide transcriptional analysis, we (1) confirmed the presence of genes involved in catabolism of the abundant environmental sugar myo-inositol, (2) defined an operon encoding an ABC-family myo-inositol transmembrane transporter, and (3) identified a novel myo-inositol regulator protein and cis-acting regulatory motif that control expression of genes in this metabolic module. Despite being encoded from non-contiguous loci on the C. crescentus chromosome, these myo-inositol catabolic enzymes and transporter proteins form a tightly linked functional group in a computationally inferred network of protein associations. Primary sequence comparison was not sufficient to confidently extend annotation of all components of this novel metabolic module to related bacterial genera. Consequently, we implemented the Graemlin multiple-network alignment algorithm to generate cross-species predictions of genes involved in myo-inositol transport and catabolism in other α-proteobacteria. Although the chromosomal organization of genes in this functional module varied between species, the upstream regions of genes in this aligned network were enriched for the same palindromic cis-regulatory motif identified experimentally in C. crescentus. Transposon disruption of the operon encoding the computationally predicted ABC myo-inositol transporter of Sinorhizobium meliloti abolished growth on myo-inositol as the sole carbon source, confirming our cross-genera functional prediction. 
    Thus, we have defined regulatory, transport, and catabolic genes and a cis-acting regulatory sequence that form a conserved module required for myo-inositol metabolism in select α-proteobacteria. Moreover, this study describes a forward validation of gene-network alignment, and illustrates a strategy for reliably transferring pathway-level annotation across bacterial species.

    Burden of Rare Sarcomere Gene Variants in the Framingham and Jackson Heart Study Cohorts

    Rare sarcomere protein variants cause dominant hypertrophic and dilated cardiomyopathies. To evaluate whether allelic variants in eight sarcomere genes are associated with cardiac morphology and function in the community, we sequenced 3,600 individuals from the Framingham Heart Study (FHS) and Jackson Heart Study (JHS) cohorts. Overall, 11.2% of individuals had one or more rare nonsynonymous sarcomere variants. The prevalence of likely pathogenic sarcomere variants was 0.6%, twice the previous estimates; however, only four of the 22 individuals carrying such variants had clinical manifestations of hypertrophic cardiomyopathy. Rare sarcomere variants were associated with an increased risk for adverse cardiovascular events (hazard ratio: 2.3) in the FHS cohort, suggesting that cardiovascular risk assessment in the general population can benefit from rare variant analysis.

    Targeted 'Next-Generation' sequencing in anophthalmia and microphthalmia patients confirms SOX2, OTX2 and FOXE3 mutations

    Background: Anophthalmia/microphthalmia (A/M) is caused by mutations in several different transcription factors, but mutations in each causative gene are relatively rare, emphasizing the need for a testing approach that screens multiple genes simultaneously. We used next-generation sequencing to screen 15 A/M patients for mutations in 9 pathogenic genes to evaluate this technology for screening in A/M. Methods: We used a pooled sequencing design, together with custom single nucleotide polymorphism (SNP) calling software. We verified predicted sequence alterations using Sanger sequencing. Results: We verified three mutations: c.542delC in SOX2, resulting in p.Pro181Argfs*22, p.Glu105X in OTX2 and p.Cys240X in FOXE3. We found several novel sequence alterations and SNPs that were likely to be non-pathogenic: p.Glu42Lys in CRYBA4, p.Val201Met in FOXE3 and p.Asp291Asn in VSX2. Our analysis methodology gave one false positive result comprising a mutation in PAX6 (c.1268A>T, predicting p.X423LeuextX*15) that was not verified by Sanger sequencing. We also failed to detect one 20 base pair (bp) deletion and one 3 bp duplication in SOX2. Conclusions: Our results demonstrated the power of next-generation sequencing with pooled sample groups for the rapid screening of candidate genes for A/M as we were correctly able to identify disease-causing mutations. However, next-generation sequencing was less useful for small, intragenic deletions and duplications. We did not find mutations in 10/15 patients and conclude that there is a need for further gene discovery in A/M.