
    Human Monogenic Disease Genes Have Frequently Functionally Redundant Paralogs

    <div><p>Mendelian disorders are often caused by mutations in genes that are not lethal but induce functional distortions leading to disease. Here we study the extent to which gene duplicates might compensate for genes causing monogenic diseases. We provide evidence for pervasive functional redundancy of human monogenic disease genes (MDs) by duplicates by showing that 1) genes involved in human genetic disorders are enriched in duplicates and 2) duplicated disease genes tend to have higher functional similarities with their closest paralogs than duplicated non-disease genes of similar age. We propose that functional compensation by duplication of genes masks the phenotypic effects of deleterious mutations and reduces the probability of purging the defective genes from the human population; this functional compensation could be further enhanced by stronger purifying selection between disease genes and their duplicates, as well as their orthologous counterparts, compared to non-disease genes. However, due to the intrinsic expression stochasticity among individuals, the deleterious mutations could still manifest as genetic diseases in some subpopulations where the duplicate copies are expressed at low abundances. Consequently, the defective genes are linked to genetic disorders while they continue propagating within the population. Our results provide insight into the molecular basis underlying the spreading of duplicated disease genes.</p></div>

    Evidence for pervasive functional redundancy in duplicated disease genes based on Gene Ontology annotations.

    <p>Compared with duplicated non-disease genes (ND) of similar duplication age (represented by branch length, see Methods), monogenic disease genes (MD) tend to have A) higher functional similarity according to Gene Ontology annotations with their most recent duplications (MRDs; p-value = 7.77×10<sup>−5</sup>, Hypergeometric Distribution test); B) the same is also true when duplication ages are omitted (Wilcoxon Rank Sum Test).</p>

    Enrichment of MDs in old SSDs and distinct characteristics of the old SSDs as compared with the young ones.

    <p>Statistics using data from Singh et al. <i>P</i>-values and ORs (odds ratios) are calculated using Fisher's Exact Test (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003758#pcbi.1003758.s001" target="_blank">Dataset S1</a> for the R code). A. MD genes are enriched in old duplicates; left: percentage of old SSDs in MDs, right: percentage of old SSDs in all genes. B. Recessive MD genes are enriched in old SSDs; left: percentage of old SSDs in recessive MDs, right: percentage of old SSDs in all MDs. C. Essential genes are depleted in young SSDs; left: percentage of young SSDs that are essential, right: percentage of young SSDs in tested genes. D. Essential genes are enriched in old SSDs; left: percentage of old SSDs that are essential, right: percentage of old SSDs in tested genes.</p>

    Duplicated genes are enriched in monogenic disease genes.

    <p>A) Percentages of duplicates in monogenic disease genes (MD) and non-disease genes (ND). B) Percentages of monogenic disease genes as a function of the number of duplicates in human; 0 indicates that genes are singletons. Here duplicates were defined using TreeFam. The P-value shown in panel A was calculated using Fisher's Exact Test; level of significance: *** <0.001, ** <0.01, * <0.05. Numbers shown within the bars are gene counts (subset/total).</p>

    Evidence for functional redundancy in duplicated disease genes.

    <p>Compared with duplicated non-disease genes (ND) of similar duplication age (represented by branch length, see Methods), monogenic disease genes (MD) tend to have A) a higher co-expression coefficient (p-value = 1.69×10<sup>−3</sup>, Hypergeometric Distribution test) and C) higher sequence similarity (p-value = 1.66×10<sup>−3</sup>, Hypergeometric Distribution test). Results in A) can be repeated using another set of gene expression data (<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#pcbi.1003073.s003" target="_blank">Figure S3</a>). P-values shown in the boxplots (B and D) were calculated using the two-sample Wilcoxon Rank Sum Test; see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#s4" target="_blank">Materials and Methods</a> for more details regarding the statistical tests. Numbers shown next to the boxplots are the numbers of valid samples (after removing samples with missing values).</p>

    A model for the effect of functional compensation on the propagation of duplicated disease genes in the human population.

    <p>This model is based on two previous experimental studies. The first showed that genes with identical promoters could have very different expression abundances in individual <i>E. coli</i> cells <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#pcbi.1003073-Elowitz1" target="_blank">[33]</a>. The second showed that different <i>C. elegans</i> individuals carrying the defective gene could display varying phenotypes, ranging from wild type to stalled development during embryogenesis, depending on the expression abundance of a duplicate gene <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#pcbi.1003073-Burga1" target="_blank">[34]</a>. We therefore propose that in cases where a duplicate (A1_human) exists (panel A), the functional impairment caused by mutations in a disease gene (A2_human) could be compensated; however, due to intrinsic expression stochasticity of the duplicate copy, some individuals would appear normal while others show reduced fitness (panel B). Consequently, gene A2 is linked to genetic disorders while the deleterious mutations it carries continue to spread instead of being removed from the human population. On the other hand, if a disease gene (B_human; panel C) is a singleton without any paralogs, its mutations would be more likely to be purged from the population (panel D), since compensation by non-duplicates via genetic interactions is relatively rare <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#pcbi.1003073-Dean1" target="_blank">[16]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#pcbi.1003073-Li1" target="_blank">[17]</a>.</p>

    Higher purifying selections on duplicated disease genes.

    <p>Compared with non-disease genes (NDs), disease genes tend to have lower dN values with their mouse (A) and Macaca (B) one-to-one orthologs. Furthermore, compared with disease singletons (singlet genes or singletons refer to those that do not share significant protein sequence similarities with other human genes), duplicated disease genes tend to have lower dN values with their mouse (C) and Macaca (D) orthologs. The higher selective constraints on duplicated disease genes can also be seen within the human genome; for example, compared with duplicated non-disease genes (ND) of similar duplication age, disease genes tend to have lower dN values with their closest paralogs within human (E; p-value = 4×10<sup>−7</sup>, Hypergeometric Distribution test). However, the same is not true when age is omitted (F), highlighting the importance of dividing gene pairs according to their duplication age. P-values shown in the boxplots (A∼D and F) were calculated using the two-sample Wilcoxon Rank Sum Test. A similar plot showing no outliers is also available in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003073#pcbi.1003073.s006" target="_blank">Figure S6</a>.</p>

    iBrick: A New Standard for Iterative Assembly of Biological Parts with Homing Endonucleases

    <div><p>The BioBricks standard has made the construction of DNA modules easier, quicker and cheaper. So far, over 100 BioBricks assembly schemes have been developed and many of them, including the original standard of BBF RFC 10, are now widely used. However, the restriction endonucleases employed by these standards usually recognize short DNA sequences that are widespread among natural DNA sequences, and these recognition sites must be removed before part construction, so the present standards are inconvenient for large DNA parts (<i>e.g</i>., more than a couple of kilobases in length). Here, we introduce a new standard, namely iBrick, which uses two homing endonucleases, I-SceI and PI-PspI. Because both enzymes recognize long DNA sequences (>18 bp), their sites are extremely rare in natural DNA sources, thus providing additional convenience, especially in handling large DNA fragments. Using the iBrick standard, the carotenoid biosynthetic cluster (>4 kb) was successfully assembled and the actinorhodin biosynthetic cluster (>20 kb) was easily cloned and heterologously expressed. In addition, a corresponding nomenclature system has been established for the iBrick standard.</p></div>
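    The rarity argument in the abstract above can be illustrated with a rough back-of-the-envelope sketch. It assumes a uniform random-sequence model (each base independent, each with probability 1/4), a single fixed recognition sequence, and one strand only; none of these assumptions comes from the paper, and real genomes (including GC-rich ones) will deviate from them. The 6 bp site length is an illustrative stand-in for a typical short restriction site, the 18 bp length is the lower bound quoted in the abstract, and the 20,000 bp fragment mirrors the ">20 kb" cluster size. The function name `expected_sites` is hypothetical.

```python
def expected_sites(site_length: int, sequence_length: int) -> float:
    """Expected number of exact matches to one fixed site on one strand.

    Assumes independent, uniformly distributed bases (an idealisation).
    """
    # Number of start positions at which a site of this length can begin.
    positions = max(sequence_length - site_length + 1, 0)
    # Probability that a given position matches the site exactly.
    match_probability = 0.25 ** site_length
    return positions * match_probability


fragment_length = 20_000  # bp, illustrative value based on the ">20 kb" cluster

short_site = expected_sites(6, fragment_length)   # typical short restriction site (assumed)
long_site = expected_sites(18, fragment_length)   # lower bound for the homing endonucleases

print(f"6 bp site:  {short_site:.3g} expected occurrences")
print(f"18 bp site: {long_site:.3g} expected occurrences")
```

    Under this simplified model, a short site is expected to occur several times in a fragment of that size, whereas an 18 bp site is expected far less than once, which is consistent with the abstract's claim that such sites rarely need to be removed from large parts.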

    iBrick elements used in this study.


    Ligation efficiency of iBrick assembly.

    <p>* Clones on each plate were divided into 4 equal sections, only one of which was counted; the total number was then estimated as four times that count.</p>