Coding sequence (CDS) lengths of genes with <i>de novo</i> variants.
<p>(A) “All genes” denotes all translated human genes; “Siblings” denotes genes with <i>de novo</i> mutations in non-autistic siblings of ASD cases, as published by O'Roak <i>et al.</i> and Sanders <i>et al.</i> Even the genes mutated in the healthy siblings are significantly longer than all coding genes (Mann–Whitney U test, P<2×10<sup>−16</sup>). The box plots depict the values between the 1<sup>st</sup> and 3<sup>rd</sup> quartiles of a distribution; the 2<sup>nd</sup> quartile (thick band) is the median. (B) Mutational burden correlates strongly with coding sequence length in the Exome Variant Server (Spearman's ρ = 0.710, P<2×10<sup>−16</sup>; <a href="http://evs.gs.washington.edu/EVS" target="_blank">http://evs.gs.washington.edu/EVS</a>). All nonsynonymous mutations were considered across all human chromosomes. (C) The median CDS length of a gene's connections correlates with its own CDS length (Spearman's ρ = 0.508, P<2×10<sup>−16</sup>). We considered the strongest 100,000 links from the integrated phenotypic-linkage network.</p>
Processing and comparison of functional genomics data.
<p>(A) Terms in a phenotype ontology have an information content (IC) that is inversely proportional to the number of genes annotated with them. The semantic similarity between any two terms equals the IC of their closest common ancestor term(s). (B) Gene–gene linkages derived from a data type are assessed and rescored according to the semantic similarity of the linked genes' mouse phenotype annotations. (C) The similarity in human phenotype annotations from the HPO serves as a benchmark against which all the data types can be compared, revealing their relative accuracy and coverage.</p>
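The term-similarity idea in panel (A) can be illustrated with a minimal sketch. The ontology, the gene annotations, and all function names below are hypothetical. The caption does not give the exact IC formula, so the sketch assumes the common choice IC = log(total genes / genes annotated with the term or its descendants), and it reads "closest common ancestor" as the shared ancestor with the highest IC.

```python
import math

# Hypothetical toy ontology (child term -> parent terms); not from the source.
PARENTS = {
    "root": [],
    "nervous": ["root"],
    "behavior": ["nervous"],
    "seizures": ["nervous"],
    "skeletal": ["root"],
}

# Hypothetical direct gene annotations per term.
ANNOTATIONS = {
    "nervous": {"G7"},
    "behavior": {"G1", "G2"},
    "seizures": {"G2", "G3"},
    "skeletal": {"G4", "G5", "G6"},
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def genes_under(term):
    """Genes annotated with the term or with any of its descendants."""
    genes = set()
    for annotated_term, annotated_genes in ANNOTATIONS.items():
        if term in ancestors(annotated_term):
            genes |= annotated_genes
    return genes


TOTAL_GENES = len(genes_under("root"))


def information_content(term):
    """Assumed IC definition: rarer terms (fewer genes) score higher."""
    return math.log(TOTAL_GENES / len(genes_under(term)))


def term_similarity(term_a, term_b):
    """IC of the most informative ancestor shared by both terms."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


# Two nervous-system terms share the ancestor "nervous" (4 of 7 genes).
print(term_similarity("behavior", "seizures"))  # log(7/4), about 0.56
# Terms that only share the root have similarity 0.
print(term_similarity("behavior", "skeletal"))
```

In this toy setting, two terms that meet only at the root score zero, while terms that meet at a specific, sparsely annotated ancestor score higher, which is the property that rescoring in panel (B) relies on.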
Clustering of genes hit by <i>de novo</i> nonsynonymous substitutions.
<p>(A) We have examined the network properties of whole sets of genes with nonsynonymous mutations implicated by recent exome-sequencing studies in autism (ASD), severe intellectual disability (ID), epilepsy or schizophrenia (S). We calculated the sum of link weights among genes from a set and compared this sum to that calculated for randomized gene sets in order to assess the degree of functional clustering. (B and C) The implicated genes are significantly more strongly interconnected with each other by means of functional genomics data than random gene sets of the same size, but controlling for coding sequence (CDS) length considerably affects the p-values. The genes mutated in the same disease cluster most significantly in the integrated phenotypic-linkage network, while genes mutated in healthy controls do not cluster.</p>
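The randomization test in panel (A) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the `weights` dictionary keyed by gene pairs, and the optional `bin_of` mapping are all hypothetical. The caption does not say how CDS length was controlled for; the sketch assumes one simple option, drawing each random gene from the same CDS-length bin as the gene it replaces.

```python
import random
from itertools import combinations


def link_weight_sum(genes, weights):
    """Sum the weights of all links whose two endpoints are both in `genes`."""
    return sum(
        weights.get(frozenset(pair), 0.0)
        for pair in combinations(sorted(set(genes)), 2)
    )


def clustering_p_value(gene_set, all_genes, weights,
                       n_random=1000, bin_of=None, seed=0):
    """Empirical p-value for the link-weight sum of `gene_set`.

    `bin_of` (hypothetical) maps each gene to a CDS-length bin; when given,
    random sets contain the same number of genes per bin as `gene_set`.
    """
    rng = random.Random(seed)
    observed = link_weight_sum(gene_set, weights)

    # Group the background genes by bin (a single bin when bin_of is None).
    pools = {}
    for gene in all_genes:
        key = bin_of[gene] if bin_of else None
        pools.setdefault(key, []).append(gene)

    # Count how many genes must be drawn from each bin.
    needed = {}
    for gene in gene_set:
        key = bin_of[gene] if bin_of else None
        needed[key] = needed.get(key, 0) + 1

    hits = 0
    for _ in range(n_random):
        sample = []
        for key, count in needed.items():
            sample.extend(rng.sample(pools[key], count))
        if link_weight_sum(sample, weights) >= observed:
            hits += 1

    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)


# Hypothetical toy network: four fully linked genes among 50 background genes.
all_genes = [f"G{i}" for i in range(50)]
clustered = all_genes[:4]
weights = {frozenset(pair): 1.0 for pair in combinations(clustered, 2)}

print(clustering_p_value(clustered, all_genes, weights))
```

Passing a `bin_of` mapping shrinks the pool from which each random gene is drawn, so length-matched random sets resemble the observed set more closely; that is one way the p-values can shift once CDS length is taken into account.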