13 research outputs found
The genetic parameters of ASD.
<p>(A) The relationship between the number of ASD risk genes () and the average relative risk (). stands for the total number of genes in the human genome, and for the fold enrichment of the <i>de novo</i> LoF mutations in probands vs. siblings (about 2 in our data). (B) The expected number of multi-hit genes () in families, as a function of the number of ASD risk genes (). The observed is 5, and we define the plausible range of as the values corresponding to to 6. The model assumes the relative risks of ASD risk genes follow a gamma distribution with the scale parameter . The variance of the relative risk () across genes equals ( is the average of of all ASD risk genes), which limits the range of plausible values for the model. The estimated value of the average is approximately 20. (C) For each gene, we compute the empirical allele frequency () of LoFs as the number of LoF variants divided by the sample size. The histogram of the LoF frequencies of all genes is shown. Also shown are the estimated distributions of under the null (red, solid line) and the alternative (blue, dashed line) models, respectively.</p>
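<p>As a rough illustration of the relationship in panel (A), the sketch below treats the observed fold enrichment as a weighted average over risk and non-risk genes. This is an assumption for illustration, not necessarily the authors' exact formula: it supposes every gene has the same mutation rate, so that enrichment = (risk-gene fraction) × (average relative risk) + (non-risk fraction) × 1. The function and parameter names are hypothetical, since the caption's symbols did not survive extraction.</p>

```python
def average_relative_risk(num_risk_genes, total_genes, fold_enrichment):
    """Solve enrichment = f * rr + (1 - f) for rr, where f = num_risk_genes / total_genes.

    Assumes (for illustration only) equal mutation rates across all genes.
    """
    if num_risk_genes <= 0 or total_genes <= 0:
        raise ValueError("gene counts must be positive")
    # Rearranged: rr = 1 + (enrichment - 1) * total_genes / num_risk_genes
    return 1.0 + (fold_enrichment - 1.0) * total_genes / num_risk_genes


if __name__ == "__main__":
    # 1000 risk genes out of 18,000 and a fold enrichment of about 2
    # are the figures quoted in the captions of this listing.
    print(average_relative_risk(1000, 18000, 2.0))
```

<p>With 1000 risk genes out of 18,000 and a fold enrichment of 2, this simplified calculation gives an average relative risk of 19, in line with the caption's estimate of approximately 20.</p>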
A probabilistic model for a family trio with an affected child.
<p>Genotype probabilities are computed as the marginal probability of parental genotypes times the conditional probability of the child, given the parents. The parameters and represent the mutation rate, and the population frequency of the genotype, respectively. Phenotype probabilities for the child, given genotype, are a function of , the penetrance of the genotype, and the relative risk of the mutation . Rate is the (approximate) rate of observing counts , and from the latter 3 types of trios, respectively.</p>
Bayesian hierarchical model of TADA.
<p>A fraction of the genes are associated with the phenotype under investigation and follow model , and the remainder follow model . The prior distribution of gene-specific parameters, relative risk () and allele frequency (), can vary under the competing models, or . Priors are specified by the hyperparameters, and , respectively, which are estimated from the data. Counts of events for the <i>i</i>-th gene follow a Poisson distribution, parameterized by and under , and under .</p>
The power per gene of competing tests.
<p>The results of three tests are shown: novo (red), meta (blue), and TADA (purple). Results are shown for various values of , and with type I error fixed at 0.001. Parameter values are chosen to cover plausible parameter values according to our model estimation: (A) ; (B) ; and (C) .</p>
Top predicted ASD risk genes from the TADA analysis of combined ASD data (<i>de novo</i>, inherited and case-control).
<p>The column shows the <i>p</i>-values using the <i>De Novo Test</i> from the <i>de novo</i> LoF mutations alone. The column shows the <i>p</i>-values from the TADA test using all LoF data. The column shows the <i>p</i>-values from the TADA test using both LoF and Mis3 data. The star symbols mark the double-hit genes that were reported in earlier publications. C1orf95 also has a <i>q</i>-value &lt; 0.2; however, this signal is based entirely on 11 identical Mis3 variants in cases and 0 in controls. This allele is common in African populations <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003671#pgen.1003671-Tennessen1" target="_blank">[40]</a>. While the AASC sample is of European ancestry, a portion of it, largely from Portugal, carries some sub-Saharan alleles <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003671#pgen.1003671-Liu1" target="_blank">[7]</a>. Thus, this signal is likely due to population substructure. Similarly, the 3 LoF variants seen in S100G are copies of a splice variant that is common in African populations, so this result should be viewed with caution.</p>
Application of TADA to the genetic data of ASD.
<p>(A) <i>De novo</i> LoF and “probably damaging” missense mutations are enriched in ASD probands (red) compared with unaffected siblings (blue), based on a comparison including all trio and quad families. The other types of missense mutations are not enriched. To make the numbers comparable, the number of mutations in siblings is scaled by a constant multiplier (214/124) so that the number of silent mutations is equal in probands and in siblings. The annotations of missense mutations are based on PolyPhen. (B) Q-Q plot (log scale) of the values for all genes in the ASD dataset based on a combined analysis of LoF and severe missense mutations.</p>
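<p>The scaling step in panel (A) can be sketched as follows. Only the multiplier 214/124 comes from the caption; reading 214 and 124 as the silent-mutation counts in probands and siblings is an assumption, and the function name is hypothetical.</p>

```python
# Multiplier quoted in the caption; interpreting the two numbers as silent
# mutation counts (probands / siblings) is an assumption for illustration.
SCALE_NUMERATOR = 214
SCALE_DENOMINATOR = 124


def scale_sibling_count(raw_sibling_count):
    """Rescale a sibling mutation count so it is comparable with proband counts."""
    return raw_sibling_count * SCALE_NUMERATOR / SCALE_DENOMINATOR


if __name__ == "__main__":
    # By construction, a sibling count of 124 is mapped onto 214,
    # so the silent-mutation totals match after scaling.
    print(scale_sibling_count(124))
```

<p>The same multiplier would then be applied to every sibling mutation class before comparing it with the corresponding proband count.</p>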
Properties of the Multiplicity Test.
<p>(A) The probability that a risk gene has two or more <i>de novo</i> LoF mutations in families (i.e., the power) depends on the mutation rate . Power per gene of the Multiplicity Test as a function of is shown for 4 mutation rates, which were chosen based on percentiles (25th, 50th, 75th, 90th) of the distribution of obtained from the full gene set. (B) The expected number of risk genes discovered by the Multiplicity Test at (red, solid) or 3 (blue, dashed) as a function of the sample size . The barplot shows the FDR at . The simulation assumes 1000 disease genes out of 18,000, each with relative risk ; these parameters were estimated in the section on Genetic Architecture of ASD.</p>
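<p>A minimal sketch of the per-gene power in panel (A), assuming the number of <i>de novo</i> LoF mutations in a risk gene is Poisson distributed with rate 2 × (number of families) × (mutation rate) × (relative risk). The rate formula, the factor of 2 for the two parental chromosomes, and all names below are assumptions made for illustration; the caption itself does not give the formula.</p>

```python
import math


def multiplicity_power(n_families, mutation_rate, relative_risk, min_hits=2):
    """P(at least `min_hits` de novo LoF mutations in one risk gene).

    Assumes counts ~ Poisson(lam) with lam = 2 * N * mu * gamma
    (an illustrative assumption, not taken from the caption).
    """
    lam = 2.0 * n_families * mutation_rate * relative_risk
    # Probability of observing fewer than `min_hits` events.
    below = sum(
        math.exp(-lam) * lam ** x / math.factorial(x) for x in range(min_hits)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical inputs: 1000 families, per-gene LoF mutation rate 2e-6,
    # relative risk 20.
    for threshold in (2, 3):
        print(threshold, multiplicity_power(1000, 2e-6, 20.0, min_hits=threshold))
```

<p>Under this sketch, power rises with the mutation rate, the relative risk and the number of families, and falls when the threshold is raised from 2 to 3 hits.</p>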
No Evidence for Association of Autism with Rare Heterozygous Point Mutations in Contactin-Associated Protein-Like 2 (<i>CNTNAP2</i>), or in Other Contactin-Associated Proteins or Contactins
<div><p>Contactins and Contactin-Associated Proteins, and Contactin-Associated Protein-Like 2 (<i>CNTNAP2</i>) in particular, have been widely cited as autism risk genes based on findings from homozygosity mapping, molecular cytogenetics, copy number variation analyses, and both common and rare single nucleotide association studies. However, data specifically with regard to the contribution of heterozygous single nucleotide variants (SNVs) have been inconsistent. In an effort to clarify the role of rare point mutations in <i>CNTNAP2</i> and related gene families, we have conducted targeted next-generation sequencing and evaluated existing sequence data in cohorts totaling 2704 cases and 2747 controls. We find no evidence for statistically significant association of rare heterozygous mutations in any of the <i>CNTN</i> or <i>CNTNAP</i> genes, including <i>CNTNAP2</i>, placing marked limits on the scale of their plausible contribution to risk.</p></div>
Overall rare<sup>*</sup> variant mutation burden: all genes.
<p>*Rare variants were defined as follows: seen only in either cases or controls exclusively, missense, nonsense, splice site, or start or stop codon disruptions with a frequency of less than 2% in this data, and less than 1% in all populations in the Exome Variant Server [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004852#pgen.1004852.ref038" target="_blank">38</a>] and SeattleSNP [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004852#pgen.1004852.ref039" target="_blank">39</a>] databases.</p>
Rates of singleton<sup>*</sup> mutations: all genes.
<p>*Singleton mutations met the following criteria: seen only once in either cases or controls exclusively, missense, nonsense, splice site, or start or stop codon disruptions with a frequency of less than 2% in this data, and less than 1% in all populations in the Exome Variant Server [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004852#pgen.1004852.ref038" target="_blank">38</a>] and SeattleSNP [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004852#pgen.1004852.ref039" target="_blank">39</a>] databases.</p>