
    Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies

    <div><p>Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase prediction accuracy. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in distinguishing pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that around 5% of an individual's rare nsSNVs are pathogenic and that an individual carries at least ∼22 pathogenic derived alleles, which if made homozygous by consanguineous marriages may lead to recessive diseases.</p></div>

    Theoretical inbreeding coefficient (<i>F</i>) and corresponding number of homozygous pathogenic variants in the children of various relationships, given that on average each individual carries 22 pathogenic derived alleles.


    The numbers and proportion of nsSNVs removed by hard-filtering and functional prediction by the logit model in 3 Mendelian-disease patients with in-house exome sequencing data.

    <p>a. Related cases with autosomal dominant spinocerebellar ataxia.</p><p>b. Case with neonatal-onset Crohn's disease.</p><p>c. nsSNVs for which prediction is unavailable due to missing scores.</p>

    The relationship between prior and posterior probabilities of a rare nsSNV being pathogenic, given the prediction scores from SIFT, PolyPhen2, and MutationTaster.

    <p>The white dashed lines indicate the estimated range of the prior (5%). We assume that there is no difference in prediction scores from the three methods for the same variant. The <i>α</i>, <i>β</i><sub>SIFT</sub>, <i>β</i><sub>PolyPhen2</sub> and <i>β</i><sub>MutationTaster</sub> in a selected sample evaluated in the ExoVar dataset are used in the calculation of posteriors (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003143#pgen.1003143.e002" target="_blank">Eq. 2</a> and <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003143#pgen.1003143.e003" target="_blank">3</a> in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003143#s4" target="_blank">Materials and Methods</a>) and take the values of −3.53, 1.64, 1.48, and 2.47 respectively. The prior and posterior are equivalent to the quantity <i>P</i><sub>disease</sub> in an individual genome in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003143#pgen.1003143.e003" target="_blank">Eq. 3</a> and P(<i>Y</i> = 1|<i>X</i>) in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003143#pgen.1003143.e002" target="_blank">Eq. 2</a> respectively.</p>
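
    The prior-to-posterior relationship described in this caption can be illustrated with a minimal Python sketch. Eq. 2 and Eq. 3 are not reproduced on this page, so the sketch assumes an ordinary logistic model with the standard intercept offset for a changed prior; the paper's exact formulation may differ. The function name `posterior_pathogenic`, the `sample_fraction` parameter (the proportion of pathogenic variants in the training sample, with a placeholder default of 0.5), and the convention that scores lie in [0, 1] with higher values meaning more damaging are all assumptions made for illustration. Only the four coefficient values come from the caption.

```python
import math

# Coefficients quoted in the caption: alpha, then the beta for each method.
ALPHA = -3.53
BETAS = {"SIFT": 1.64, "PolyPhen2": 1.48, "MutationTaster": 2.47}


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def posterior_pathogenic(scores, prior, sample_fraction=0.5):
    """Posterior probability that a rare nsSNV is pathogenic.

    scores: dict mapping method name to a score (assumed scale: 0 to 1,
        higher = more damaging; this scale is an assumption).
    prior: assumed proportion of pathogenic variants among rare nsSNVs
        in an individual genome (the caption's estimate is about 5%).
    sample_fraction: hypothetical proportion of pathogenic variants in
        the training sample; 0.5 is a placeholder, not a reported value.
    """
    # Linear predictor of the logistic model fitted on the training sample.
    linear = ALPHA + sum(BETAS[m] * scores[m] for m in BETAS)
    # Standard intercept offset: swap the training-sample odds for the prior odds.
    linear += logit(prior) - logit(sample_fraction)
    return 1.0 / (1.0 + math.exp(-linear))


# The caption assumes all three methods give the same score for a variant.
same_score = {m: 0.9 for m in BETAS}
for prior in (0.01, 0.05, 0.20):
    print(prior, round(posterior_pathogenic(same_score, prior), 3))
```

    Under these assumptions the posterior rises with both the prior and the shared prediction score, which is the qualitative pattern the figure describes.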