27 research outputs found

    Phenotype ontologies.

    No full text
    <p>Phenotype ontologies (an excerpt from the Human Phenotype Ontology is shown here) consist of thousands of terms describing phenotypes arranged in a hierarchical system of subclasses and superclasses. The structure of an ontology enables annotation propagation whereby more specific phenotypic terms are also described by more general parent terms, and thus all ancestral terms. The terms are related to one another by subclass (“is a”) relations, such that the ontology can be represented as a so-called directed acyclic graph. The terms themselves do not describe any specific disease. Instead, annotations to terms are used to state that a certain disease is characterised by a certain phenotypic feature.</p>
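The annotation propagation described in this caption can be illustrated with a minimal sketch. The term names, the disease name, and the helper functions `ancestors` and `propagate` are hypothetical and are not taken from the Human Phenotype Ontology or from any published code; the sketch only assumes that the ontology is stored as a mapping from each term to its "is a" parents.

```python
# Hypothetical toy ontology: each term maps to its direct "is a" parents.
# These names are illustrative only and do not reproduce real HPO terms.
PARENTS = {
    "specific_heart_defect": ["abnormal_heart"],
    "abnormal_heart": ["abnormal_cardiovascular_system"],
    "abnormal_cardiovascular_system": ["phenotypic_abnormality"],
    "phenotypic_abnormality": [],
}


def ancestors(term, parents):
    """Return every ancestor of `term` in the DAG, excluding the term itself."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate(annotations, parents):
    """Extend each disease's direct annotations with all ancestral terms."""
    propagated = {}
    for disease, terms in annotations.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        propagated[disease] = full
    return propagated


# A single direct annotation implies annotation to every more general term.
direct = {"disease_x": {"specific_heart_defect"}}
result = propagate(direct, PARENTS)
```

Because the traversal tracks visited terms, a term reachable through several parents is added only once, which matters in a directed acyclic graph where paths to the root can converge.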

    Predicting human genotype-phenotype relations from functional genomics data.

    No full text
    <p>The mouse phenotypes associated with the orthologues of human genes are a better predictor of genes that share human phenotypes than other popular gene annotations of the same genes, such as GO or KEGG. As both GO and KEGG include information derived from multiple sources, including annotations from the mouse, the success of the mouse phenotypes is likely due both to the genetic relevance of the mouse models and the fact that human and mouse phenotypic annotations both describe abnormalities (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004268#pgen-1004268-g001" target="_blank">Figure 1C</a>). Resnik's <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004268#pgen.1004268-Resnik1" target="_blank">[78]</a> measure, together with the GraSM approach <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004268#pgen.1004268-Couto1" target="_blank">[79]</a>, was used to calculate the similarity of terms organised in these hierarchical ontologies, defining the semantic similarity between any two terms as the average information content of their disjunct common-ancestor terms. Gene pairs were ordered by their semantic similarity scores based on either the human KEGG pathway annotations (pink circles), human GO biological process (grey circles), or MPO annotations to genes (blue circles). For each of KEGG, GO, and MPO annotations, gene pairs were ordered in decreasing annotation similarity and grouped into bins of 2,000, and then the median semantic similarity score between gene pairs' Human Phenotype Ontology annotations was calculated. The dashed line marks the degree of similarity expected from pairs of random genes.</p>
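The ordering-and-binning step in this caption can be sketched roughly as below. The function name `binned_median`, the tuple layout, and the toy values are assumptions made for illustration; only the procedure (sort by annotation similarity in decreasing order, group into fixed-size bins, take the median HPO similarity per bin) and the default bin size of 2,000 come from the caption.

```python
from statistics import median


def binned_median(pairs, bin_size=2000):
    """Median benchmark similarity per bin of annotation-ranked gene pairs.

    `pairs` is a list of (annotation_similarity, hpo_similarity) tuples,
    one tuple per gene pair. The final bin may hold fewer than `bin_size`
    pairs when the total is not a multiple of the bin size.
    """
    # Rank gene pairs by decreasing annotation similarity (KEGG, GO or MPO).
    ordered = sorted(pairs, key=lambda pair: pair[0], reverse=True)
    medians = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        medians.append(median([hpo for _, hpo in chunk]))
    return medians


# Toy data with a bin size of 2 so the arithmetic is easy to follow.
toy_pairs = [(0.5, 2.0), (0.9, 3.0), (0.1, 0.0), (0.8, 1.0)]
toy_medians = binned_median(toy_pairs, bin_size=2)
```

With the toy data, the two highest-ranked pairs have HPO similarities 3.0 and 1.0 (median 2.0) and the remaining two have 2.0 and 0.0 (median 1.0).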

    Coding sequence (CDS) lengths of genes with <i>de novo</i> variants.

    No full text
    <p>(A) ‘All genes’ denotes all translated human genes, ‘Siblings’ denotes genes with <i>de novo</i> mutations in non-autistic siblings of ASD cases published by O'Roak <i>et al.</i> and Sanders <i>et al.</i> Even the genes mutated in the healthy siblings are significantly longer than all coding genes (Mann–Whitney U test, P<2×10<sup>−16</sup>). The box plots depict the values between the 1<sup>st</sup> and 3<sup>rd</sup> quartile of a distribution; the 2<sup>nd</sup> quartile (thick band) represents the median. (B) Mutational burden strongly correlates with coding sequence length in the Exome Variant Server (Spearman's ρ = 0.710, P<2×10<sup>−16</sup>; <a href="http://evs.gs.washington.edu/EVS" target="_blank">http://evs.gs.washington.edu/EVS</a>). All nonsynonymous mutations were considered across all human chromosomes. (C) The median CDS length of a gene's connections correlates with its CDS length (Spearman's ρ = 0.508, P<2×10<sup>−16</sup>). We considered the strongest 100,000 links from the integrated phenotypic-linkage network.</p>

    Clustering of genes hit by <i>de novo</i> nonsynonymous substitutions.

    No full text
    <p>(A) We have examined the network properties of whole sets of genes with nonsynonymous mutations implicated by recent exome-sequencing studies in autism (ASD), severe intellectual disability (ID), epilepsy or schizophrenia (S). We calculated the sum of link weights among genes from a set and compared this sum to that calculated for randomized gene sets in order to assess the degree of functional clustering. (B and C) The implicated genes are significantly more strongly interconnected with each other by means of functional genomics data than random gene sets of the same size, but controlling for coding sequence (CDS) length considerably affects the p-values. The genes mutated in the same disease cluster most significantly in the integrated phenotypic-linkage network, while genes mutated in healthy controls do not cluster.</p>
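The comparison of a gene set's summed link weights against randomized gene sets can be sketched as a simple permutation test. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `frozenset`-keyed weight dictionary, the toy network, and the add-one p-value estimate are all hypothetical, and the sketch samples random sets of the same size only, without the CDS-length control mentioned in the caption.

```python
import random
from itertools import combinations


def link_weight_sum(genes, weights):
    """Sum the weights of all links whose two endpoints both lie in `genes`."""
    return sum(
        weights.get(frozenset(pair), 0.0)
        for pair in combinations(sorted(genes), 2)
    )


def empirical_p(gene_set, all_genes, weights, n_random=1000, seed=0):
    """Fraction of random same-size gene sets at least as interconnected.

    Uses the add-one estimate (hits + 1) / (n_random + 1), an assumption of
    this sketch, so that the returned p-value is never exactly zero.
    """
    rng = random.Random(seed)
    observed = link_weight_sum(gene_set, weights)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, len(gene_set))
        if link_weight_sum(sample, weights) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)


# Hypothetical toy network: three tightly linked genes among ten.
all_genes = [f"g{i}" for i in range(10)]
weights = {
    frozenset(("g0", "g1")): 1.0,
    frozenset(("g0", "g2")): 1.0,
    frozenset(("g1", "g2")): 1.0,
}
observed_sum = link_weight_sum(["g0", "g1", "g2"], weights)
p_value = empirical_p(["g0", "g1", "g2"], all_genes, weights)
```

Controlling for CDS length would mean drawing the random sets from genes matched to the observed set on that property rather than from all genes uniformly.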

    Sets of known and candidate T2D-risk genes and their functional associations within the T2D-PLN Community 5.

    No full text
    <p>Each named dot represents a known (panel <b>A</b>) or candidate (panels <b>B-G</b>) T2D-risk gene. Panel <b>A</b>: 13 monogenic and syndromic diabetes genes (Mono). Panel <b>B</b>: 71 genes residing within 72 T2D-risk GWAS intervals. Panels <b>C-G</b>: 40, 42, 51, 45 and 168 genes impacted by T2D-risk PT-variants in the African-American, Hispanic, East-Asian and South-Asian samples, and then in all samples except the [non-significant] European sample, respectively. The conserved juxtapositions of the subnetworks contributed by each of the sets of known and candidate genes (panels <b>A-F</b>) to the combined network (panel <b>H</b>) are shown in <b><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005816#pcbi.1005816.s004" target="_blank">S4 Fig</a></b>. The colour of the link connecting two genes indicates the strongest information source supporting the functional association. (TIF 6.7 Mb).</p>

    Additional functional annotations of known and candidate T2D-risk genes.

    No full text
    <p><b>(A)</b> The 10 tissues in which the average expression of genes within at least one T2D-risk set was found to be significantly high. Gene expression in 53 tissues was examined using recently released data from the GTEx project [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005816#pcbi.1005816.ref026" target="_blank">26</a>] and transcriptomic profiles in pancreatic islets of 11 individuals [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005816#pcbi.1005816.ref027" target="_blank">27</a>]. The dotted line represents the significance threshold (FDR < 0.05). Results for all 53 tissues are shown in <b><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005816#pcbi.1005816.s006" target="_blank">S6 Fig</a></b>. <b>(B)</b> Phenotypes enriched following the disruption of the unique mouse orthologues of Monogenic and Syndromic Candidate T2D-risk genes and their corresponding enrichments amongst T2D-risk candidate gene sets. The dotted line represents the significance threshold (FDR < 0.05). Representative phenotypes are shown; all phenotypes are shown in <b><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005816#pcbi.1005816.s007" target="_blank">S7 Fig</a></b>. (TIF 560 kb).</p>

    Processing and comparison of functional genomics data.

    No full text
    <p>(A) Terms in a phenotype ontology have an information content (IC) which is inversely related to the number of genes annotated with them. The semantic similarity between any two terms equals the IC of their closest common ancestor term(s). (B) Gene–gene linkages derived from a data type are assessed and rescored according to the semantic similarity of the linked genes' mouse phenotype annotations. (C) The similarity in human phenotype annotations from the HPO is a benchmark on which all the data types can be compared, revealing their relative accuracy and coverage.</p>
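Panel (A) of this caption can be made concrete with a small sketch of information content and term similarity. Several choices here are assumptions rather than details given in the caption: IC is computed as the natural log of (total genes / genes annotated to the term after propagation), and the "closest" common ancestor is taken to be the most informative one, as in the standard Resnik measure. The toy ontology, gene names, and function names are hypothetical.

```python
import math

# Hypothetical toy ontology (term -> direct parents) and gene annotations.
PARENTS = {"root": [], "a": ["root"], "b": ["root"], "a1": ["a"], "a2": ["a"]}
GENE_TERMS = {
    "gene1": {"a1"},
    "gene2": {"a2"},
    "gene3": {"b"},
    "gene4": {"a1"},
}


def ancestors_inclusive(term, parents):
    """Return `term` together with all of its ancestors."""
    seen = {term}
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def information_content(parents, gene_terms):
    """IC(t) = log(N / n_t), where n_t counts genes annotated to t or below."""
    total = len(gene_terms)
    counts = {}
    for terms in gene_terms.values():
        covered = set()
        for term in terms:
            covered |= ancestors_inclusive(term, parents)
        for term in covered:
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(total / n) for term, n in counts.items()}


def resnik(term1, term2, parents, ic):
    """IC of the most informative ancestor shared by the two terms."""
    common = ancestors_inclusive(term1, parents) & ancestors_inclusive(term2, parents)
    return max((ic.get(term, 0.0) for term in common), default=0.0)


ic = information_content(PARENTS, GENE_TERMS)
sim_siblings = resnik("a1", "a2", PARENTS, ic)  # shared ancestor "a"
sim_distant = resnik("a1", "b", PARENTS, ic)    # only the root is shared
```

In the toy data, three of four genes fall under "a", so the two sibling terms share an IC of log(4/3), while terms that meet only at the root score zero because every gene is annotated to the root.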

    Synergistic interaction in <i>Drosophila</i> between <i>Dlg</i> and <i>Pak</i>, the orthologues of ASD-candidate genes from a <i>de novo</i> loss CNV 11079_chr3_197208363.

    No full text
    <p><b>A.</b> The locus of the CNV with mapped <i>Drosophila</i> orthologues (candidates, green; controls, red). <b>B.</b> Representative pictures of NMJs from <i>dlg</i>/+ (using <i>dlg</i><sup><i>1</i></sup>), <i>pak</i>/+ (using <i>pak</i><sup><i>6</i></sup>), and <i>dlg/pak</i> 3<sup>rd</sup> instar larvae; scale bar = 20 μm. <b>C.</b> Synaptic alterations were characterised by NMJ bouton number. Individual heterozygous mutants of candidate gene orthologues <i>dlg</i> and <i>pak</i> (<i>dlg/+</i> and <i>pak/+)</i> gave no significant change in NMJ morphology over <i>w</i><sup><i>1118</i></sup> controls. However, <i>dlg</i>/<i>pak</i> transheterozygotes have reduced bouton numbers (n>20, Kruskal-Wallis test, ** P<0.01). <b>D.</b> Non-candidate gene controls <i>fsn</i> (using <i>Fsn</i><sup><i>KG08128</i></sup>) and CG5359 (using <i>CG5359</i><sup><i>e03976</i></sup>), selected from genes found within the CNV, gave no significant NMJ phenotype singly or when crossed to form transheterozygotes with <i>dlg</i> or <i>pak</i>. <b>E.</b> and <b>F.</b> Circadian rhythm analysis of candidate genes. All negative controls (<b>F</b>) and single mutants displayed normal light/dark differences in sleeping patterns. However, transheterozygote <i>dlg</i>/<i>pak</i> flies lost the dark bias, and displayed no significant difference between light/dark sleeping patterns (<b>t</b>).</p>

    Clustering analyses of different T2D-risk candidate gene sets within Community 5.

    No full text
    <p><b>(A)</b> Clustering analyses between different gene sets impacted by PT-variants within Community 5. In each of four cohorts (South Asian, East Asian, African-American and Hispanic), we considered the 300 most strongly associated genes, based on SKAT-O test p-values, that belonged to Community 5. <b>(B)</b> Clustering analyses of gene sets associated with different T2D genetic risk factors: (1) monogenic and syndromic diabetes genes (Mono); (2) genes harbored by 72 reference T2D GWAS intervals (GWAS); (3) genes impacted by PT-variants in four different population cohorts (genes considered in A but taken together). Empirical p-values were obtained by comparing the sum of link weights between genes within the two gene sets combined to that of randomly sampled gene sets from Community 5 matched in number, coding length and gene connectivity (degree). Four-point stars denote significant functional clustering between the respective variant sets (FDR-corrected). Six-point stars denote individual variant sets that were found to be enriched with Community 5 members, with two stars denoting FDR-corrected significance and one star denoting nominal significance. (TIF 523 kb).</p>

    Type 2 diabetes phenotypic linkage network (T2D-PLN) construction.

    No full text
    <p>The T2D-PLN was constructed by evaluating the abilities of different data types to predict the similarity in the T2D-relevant phenotypes following the determined disruption of pairs of genes in the mouse [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005816#pcbi.1005816.ref013" target="_blank">13</a>]. Different data types provide information of characteristic accuracy over different sets of genes. For each of the data types, we ordered the gene pairs by their scores, divided them into bins of 1,000 pairs and calculated the median cumulative semantic similarity between gene pairs’ mouse knockout phenotypes (Y-axis) at different levels of coverage (X-axis). The pink curve represents the final T2D-PLN. The Y-axis gives the semantic similarity of the phenotypes from the pairwise mouse model comparisons, while the X-axis gives the number of gene-gene links covered. (TIF 1.3 Mb).</p>