The percentage of <i>de novo</i> mutations occurring in the most intolerant quartile (25<sup>th</sup> percentile) of genes across the severe ID, autism, epileptic encephalopathy, and control sibling groups, for the different variant effect types.
<p>LGD = Likely Gene Disrupting (including nonsense, coding indels, and splice acceptor/donor site mutations). *Across the CCDS of genes with an RVIS score, the 25<sup>th</sup> percentile most intolerant genes occupy 38% of the total coding sequence. P-values reflect binomial exact tests in which the probability of success is set to 0.38 rather than 0.25, accounting for the gene sizes of the 25% most intolerant genes.</p>
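The binomial exact test described above can be sketched as follows. This is a minimal illustration, assuming a one-sided (enrichment) alternative, which the caption does not specify; the function name and the counts are hypothetical and are not taken from the study.

```python
from math import comb

def binom_upper_tail(k, n, p0=0.38):
    """P(X >= k) for X ~ Binomial(n, p0): a one-sided exact test.

    p0 = 0.38 is the null probability quoted in the caption (the share of
    coding sequence occupied by the 25% most intolerant genes).
    """
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

# Hypothetical counts for illustration only: 30 of 50 de novo mutations
# fall in the most intolerant quartile of genes.
p_value = binom_upper_tail(30, 50)
print(f"one-sided p = {p_value:.4g}")
```

Using 0.38 instead of 0.25 as the null probability matters because larger genes present a larger mutational target, so a naive 25% expectation would overstate enrichment.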
[A] Cumulative percentage plots for the residual variation intolerance scores among six OMIM lists. [B] ROC curves of the residual variation intolerance scores' capacity to predict the corresponding OMIM list.
ROC curves of the residual variation intolerance scores' capacity to predict the corresponding independent gene-list.
A regression plot illustrating the regression of Y on X.
<p>The plot is annotated for the 2% extremes: red = 2% most intolerant, blue = 2% most tolerant. Five outlier genes with >140 common functional variant sites (y-axis) are not shown.</p>
(A) Scatterplot of RVIS-sum (RVIS-CHGV + ncRVIS) and RVIS-diff (RVIS-CHGV–ncRVIS) scores. Each dot represents a gene. The grey dots represent the background genome-wide distribution. The red dots highlight the 82 OMIM haploinsufficiency genes with reported causal de novo mutations. A higher (positive Y-axis value) RVIS-diff score indicates genes where we might have a greater expectation of gene dosage aberrations being important compared with protein structure aberrations. A lower RVIS-sum (X-axis value) highlights genes that are increasingly intolerant in both their noncoding and protein-coding sequence. (B) A cumulative percentage plot for the RVIS-sum percentile accommodating the 82 OMIM haploinsufficiency genes.
<p>At any given point on the X-axis (RVIS-sum percentile) we can determine what percentage of the 82 OMIM haploinsufficiency genes are accounted for.</p>
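Reading a value off such a cumulative percentage plot amounts to counting how many listed genes fall at or below a percentile threshold. A minimal sketch, with a hypothetical function name and made-up percentiles rather than the study's data:

```python
def cumulative_percentage(gene_percentiles, threshold):
    """Percentage of listed genes whose score percentile is <= threshold."""
    hits = sum(1 for p in gene_percentiles if p <= threshold)
    return 100.0 * hits / len(gene_percentiles)

# Made-up RVIS-sum percentiles for five genes (illustration only).
toy_percentiles = [2.0, 5.5, 12.0, 30.0, 75.0]
share_at_25 = cumulative_percentage(toy_percentiles, 25.0)
print(share_at_25)
```

A curve that rises steeply at low percentiles indicates that the gene list is concentrated among the most intolerant genes.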
Receiver operating characteristic (ROC) curves to measure the ability of RVIS-CHGV, ncRVIS, pcGERP, ncGERP, ncCADD, ncGWAVA scores and two joint models to discriminate genes reported among ClinGen's dosage sensitivity map from the rest of the human genome.
<p>Here, for a given score, all assessable genes were used. To obtain the presented levels of significance, we used a logistic regression model to regress the presence or absence of a gene among the ClinGen dosage sensitivity map list on each of the genic scores.</p>
A regression plot that shows the regression of noncoding polymorphisms (Y) on an estimate of the noncoding sequence mutability (X) (S1 Data).
<p>Each dot represents the position of a gene in the regression plot and the corresponding regression line is provided. Annotations are made for the 5% extremes: red = 5% most intolerant, blue = 5% most tolerant.</p>
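The idea behind the regression plot can be sketched as follows: fit Y on X by ordinary least squares, score each gene by its residual, and flag the lowest and highest 5%. This is a simplified sketch, not the authors' pipeline. It scales residuals by their overall standard deviation rather than studentizing them, and the function names and toy data are hypothetical.

```python
from statistics import mean, pstdev

def ols_residual_scores(x, y):
    """Residuals of a least-squares fit of y on x, scaled to unit SD."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sd = pstdev(resid)
    return [r / sd for r in resid]

def flag_extremes(scores, frac=0.05):
    """Label the lowest `frac` of scores 'intolerant', the highest 'tolerant'."""
    n = len(scores)
    k = max(1, int(n * frac))
    order = sorted(range(n), key=lambda i: scores[i])
    labels = ["middle"] * n
    for i in order[:k]:
        labels[i] = "intolerant"  # fewer polymorphisms than the fit predicts
    for i in order[-k:]:
        labels[i] = "tolerant"    # more polymorphisms than the fit predicts
    return labels

# Toy data: 20 genes on the line y = 2x, with one gene pushed below the
# line (index 4) and one pushed above it (index 14).
x = list(range(1, 21))
y = [2 * xi for xi in x]
y[4] -= 8
y[14] += 8
labels = flag_extremes(ols_residual_scores(x, y))
```

A gene far below the regression line carries less common variation than its mutability predicts, which is why negative residuals are read as intolerance.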
(A) Distribution of ncRVIS scores for the 1,235 loss-of-function deficient genes (left) compared to the 1,762 loss-of-function control genes (right). Median 37.95% vs. 58.09%; Mann-Whitney U test, p = 6.6 × 10<sup>-34</sup>. (B) Receiver operating characteristic (ROC) curves measuring the ability of RVIS, ncRVIS, pcGERP and ncGERP to discriminate between loss-of-function deficient and loss-of-function control genes.
<p>To obtain the presented levels of significance in <b>(B)</b>, we used a logistic regression model to regress loss-of-function deficient or control gene status for the combined 2,997 genes on each of the four genic scores.</p>
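The area under a ROC curve for a single score can be computed directly from pairwise comparisons, which is the same quantity that underlies the Mann-Whitney U statistic. The sketch below is illustrative only: the function name and the toy percentiles are hypothetical, and it does not reproduce the logistic regression used for the significance levels.

```python
def roc_auc(scores, labels):
    """AUC as P(random positive scores higher than random negative).

    Ties count as one half. Runs in O(n_pos * n_neg), which is fine for a
    sketch but slow for genome-wide use.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy score percentiles: lower means more intolerant, so negate them to
# make "higher = more likely loss-of-function deficient".
percentiles = [10, 20, 35, 60, 40, 70, 80, 90]
is_deficient = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc([-p for p in percentiles], is_deficient)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination, and values closer to 1 mean that the score ranks the deficient genes ahead of the control genes more consistently.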
Overlaid histograms of ncGERP (blue) and pcGERP (red).
<p>These data show that the two scores form very different genome-wide distributions (medians: ncGERP -0.02 versus pcGERP 2.64). Moreover, pcGERP has a slightly platykurtic, left-skewed distribution (γ<sub>2</sub> = -0.10, γ<sub>1</sub> = -0.66), whereas ncGERP has a more leptokurtic, right-skewed distribution (γ<sub>2</sub> = 0.97, γ<sub>1</sub> = 0.96).</p>
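The shape statistics quoted above can be computed from central moments, reading the subscript-1 statistic as skewness and the subscript-2 statistic as excess kurtosis. The sketch assumes population (biased) moment estimators; the caption does not say which estimator was used, and the function name is hypothetical.

```python
from statistics import mean

def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    g1 = m3 / m2 ** 1.5      # < 0 left-skewed, > 0 right-skewed
    g2 = m4 / m2 ** 2 - 3.0  # < 0 platykurtic, > 0 leptokurtic
    return g1, g2

# A symmetric, flat toy sample: zero skewness, negative excess kurtosis.
g1, g2 = skew_and_excess_kurtosis([1, 2, 3, 4, 5])
```

Subtracting 3 makes a normal distribution the zero point, so negative values indicate lighter tails than normal and positive values heavier tails.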
Comparing protein-coding and noncoding genic intolerance scores.
<p>To enable a matched comparison, the estimates in this table are based on a set of 14,567 CCDS genes with assessable scores across RVIS-CHGV, ncRVIS and ncGERP formulations. Both RVIS-CHGV and ncRVIS are based on the same population of 690 whole-genome sequenced samples from the CHGV.</p><p><sup>a</sup>HI = Haploinsufficiency. To obtain the presented levels of significance, we used a logistic regression model to regress the presence or absence of a gene within the corresponding gene list on each of the genic scores.</p><p>Joint Model: The AUC of a combined logistic regression model that uses all three features. Correlation plots for the pairs of scores are available in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005492#pgen.1005492.s001" target="_blank">S1 Fig</a>.</p>