5 research outputs found

    Intrinsic Structural Disorder Confers Cellular Viability on Oncogenic Fusion Proteins

    Chromosomal translocations, which often generate chimeric proteins by fusing segments of two distinct genes, represent the single major genetic aberration leading to cancer. We suggest that the unifying theme of these events is a high level of intrinsic structural disorder, enabling fusion proteins to evade cellular surveillance mechanisms that eliminate misfolded proteins. Predictions in 406 translocation-related human proteins show that they are significantly enriched in disorder (43.3% vs. 20.7% in all human proteins), have fewer Pfam domains, and have translocation breakpoints that tend to avoid domain splitting. The vicinity of the breakpoint is significantly more disordered than the rest of these already highly disordered fusion proteins. In the unlikely event that a fusion splits a domain, it usually spares much of the domain or splits at locations where the newly exposed hydrophobic surface area approximates that of an intact domain. The mechanisms of action of fusion proteins suggest that in most cases their structural disorder is also essential to the acquired oncogenic function, enabling the long-range structural communication of remote binding and/or catalytic elements. In this respect, there are three major mechanisms that contribute to generating an oncogenic signal: (i) a phosphorylation site and a tyrosine-kinase domain are fused, and structural disorder of the intervening region enables intramolecular phosphorylation (e.g., BCR-ABL); (ii) a dimerisation domain fuses with a tyrosine-kinase domain, and disorder enables the two subunits within the homodimer to engage in permanent intermolecular phosphorylations (e.g., TFG-ALK); (iii) the fusion of a DNA-binding element to a transactivator domain results in an aberrant transcription factor that causes severe misregulation of transcription (e.g., EWS-ATF). Our findings also suggest novel strategies of intervention against the ensuing neoplastic transformations.

    Identifying potential cancer driver genes by genomic data integration

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes has become an important goal of cancer genome analysis and of investigations into the mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network by combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Comparative analysis found three predicted driver genes (CDKN2A, AKT1, RNF139) to be common to these three cancers.

    Characterization of ecologically diverse viruses infecting co-occurring strains of cosmopolitan hyperhalophilic bacteroidetes

    Hypersaline environments close to saturation harbor the highest density of virus-like particles reported for aquatic systems as well as low microbial diversity. Thus, they offer unique settings for studying virus-host interactions in nature. However, no viruses have so far been isolated that infect the two most abundant inhabitants of these systems (that is, the euryarchaeon Haloquadratum walsbyi and the bacteroidetes Salinibacter ruber). Here, using three different co-occurring strains, we have isolated eight viruses infecting the ubiquitous S. ruber that constitute three new genera (named 'Holosalinivirus', 'Kryptosalinivirus' and 'Kairosalinivirus') according to their genomic traits, different host range, virus-host interaction capabilities and abundances in natural systems worldwide. Furthermore, to get a more complete view of S. ruber virus assemblages in nature, a microcosm experiment was set up with a mixture of S. ruber strains challenged with a brine virus concentrate, and changes in viral populations were monitored by viral metagenomics. Only viruses closely related to kairosalinivirus (strictly lytic and wide host range) were enriched, despite their low initial abundance in the natural sample. Metagenomic analyses of the mesocosms allowed the complete recovery of kairosalinivirus genomes using an ad hoc assembly strategy, as common viral metagenomic assembly tools failed despite the abundance of these viruses, which underlines the limitations of current approaches. The increase of this type of virus was accompanied by an increase in the diversity of the group, as shown by contig recruitment. These results are consistent with a scenario in which host range, not only virus and host abundances, is a key factor in determining virus fate in nature.
    This research was supported by the Spanish Ministry of Economy projects CLG2015_66686-C3-1 (to JA) and CLG2015_66686-C3-3 (to RRM), which were also supported with European Regional Development Fund (FEDER) funds.

    Integrating human omics data to prioritize candidate genes

    Background: The identification of genes involved in human complex diseases remains a great challenge in computational systems biology. Although methods have been developed that combine disease phenotypic similarities with a protein-protein interaction network to prioritize candidate genes, other valuable omics data sources have been largely overlooked in these methods. Methods: With this understanding, we proposed a method called BRIDGE to prioritize candidate genes by integrating disease phenotypic similarities with omics data such as protein-protein interactions, gene sequence similarities, gene expression patterns, gene ontology annotations, and gene pathway memberships. BRIDGE utilizes a multiple regression model with a lasso penalty to automatically weight different data sources and is capable of discovering genes associated with diseases whose genetic bases are completely unknown. Results: We conducted large-scale cross-validation experiments and demonstrated that more than 60% of known disease genes can be ranked first by BRIDGE in simulated linkage intervals, suggesting the superior performance of this method. We further performed two comprehensive case studies by applying BRIDGE to predict novel genes and transcriptional networks involved in obesity and type II diabetes. Conclusion: The proposed method provides an effective and scalable way of integrating multi-omics data to infer disease genes. Further applications of BRIDGE will help to identify novel disease genes and the underlying mechanisms of human diseases.