52 research outputs found

    Integration of Random Forest Classifiers and Deep Convolutional Neural Networks for Classification and Biomolecular Modeling of Cancer Driver Mutations

    Development of machine learning solutions for prediction of the functional and clinical significance of cancer driver genes and mutations is paramount in modern biomedical research and has gained significant momentum in the past decade. In this work, we integrate different machine learning approaches, including tree-based methods, random forest and gradient boosted tree (GBT) classifiers, along with deep convolutional neural networks (CNN) for prediction of cancer driver mutations in genomic datasets. The feasibility of CNN in using raw nucleotide sequences for classification of cancer driver mutations was initially explored by employing label encoding, one hot encoding, and embedding to preprocess the DNA information. These classifiers were benchmarked against their tree-based alternatives in order to evaluate the performance on a relative scale. We then integrated DNA-based scores generated by CNN with various categories of conservational, evolutionary and functional features into a generalized random forest classifier. The results of this study have demonstrated that CNN can learn high level features from genomic information that are complementary to the ensemble-based predictors often employed for classification of cancer mutations. By combining the deep learning-generated score with only two main ensemble-based functional features, we can achieve superior performance of various machine learning classifiers. Our findings have also suggested that the synergy of nucleotide-based deep learning scores and integrated metrics derived from protein sequence conservation scores can allow for robust classification of cancer driver mutations with a limited number of highly informative features. Machine learning predictions are leveraged in molecular simulations, protein stability, and network-based analysis of cancer mutations in the protein kinase genes to obtain insights about molecular signatures of driver mutations and enhance the interpretability of cancer-specific classification models.
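The abstract above names label encoding and one hot encoding as two ways of preprocessing raw nucleotide sequences for a CNN. The following is a minimal sketch of what those two encodings could look like. The alphabet order `ACGT` and the function names `label_encode` and `one_hot_encode` are assumptions made for illustration and are not taken from the paper, which does not specify its preprocessing code.

```python
# Illustrative sketch only: the alphabet order and the function names are
# assumptions, not the authors' implementation.
NUCLEOTIDES = "ACGT"


def label_encode(seq):
    """Map each nucleotide to an integer index (A=0, C=1, G=2, T=3)."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    # Bases outside ACGT (for example 'N') raise KeyError in this sketch.
    return [index[base] for base in seq.upper()]


def one_hot_encode(seq):
    """Map each nucleotide to a 4-element indicator vector."""
    vectors = []
    for label in label_encode(seq):
        row = [0] * len(NUCLEOTIDES)
        row[label] = 1
        vectors.append(row)
    return vectors


if __name__ == "__main__":
    print(label_encode("GATTACA"))
    print(one_hot_encode("GAT"))
```

Label encoding yields one integer per position, whereas one hot encoding yields a position-by-4 matrix that avoids implying an ordering among the bases.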

    Computer Simulations and Network-Based Profiling of Binding and Allosteric Interactions of SARS-CoV-2 Spike Variant Complexes and the Host Receptor: Dissecting the Mechanistic Effects of the Delta and Omicron Mutations

    In this study, we combine all-atom MD simulations and comprehensive mutational scanning of S-RBD complexes with the angiotensin-converting enzyme 2 (ACE2) host receptor in the native form as well as the S-RBD Delta and Omicron variants to (a) examine the differences in the dynamic signatures of the S-RBD complexes and (b) identify the critical binding hotspots and sensitivity of the mutational positions. We also examined the differences in allosteric interactions and communications in the S-RBD complexes for the Delta and Omicron variants. Through the perturbation-based scanning of the allosteric propensities of the SARS-CoV-2 S-RBD residues and dynamics-based network centrality and community analyses, we characterize the global mediating centers in the complexes and the nature of local stabilizing communities. We show that a constellation of mutational sites (G496S, Q498R, N501Y and Y505H) corresponds to key binding energy hotspots and also contributes decisively to the key interfacial communities that mediate allosteric communications between S-RBD and ACE2. These Omicron mutations are responsible for both favorable local binding interactions and long-range allosteric interactions, providing key functional centers that mediate the high transmissibility of the virus. At the same time, our results show that other mutational sites could provide a “flexible shield” surrounding the stable community network, thereby allowing the Omicron virus to modulate immune evasion at different epitopes, while protecting the integrity of binding and allosteric interactions in the RBD–ACE2 complexes. This study suggests that the SARS-CoV-2 S protein may exploit the plasticity of the RBD to generate escape mutants, while engaging a small group of functional hotspots to mediate efficient local binding interactions and long-range allosteric communications with ACE2.

    Computational Analysis of Protein Stability and Allosteric Interaction Networks in Distinct Conformational Forms of the SARS-CoV-2 Spike D614G Mutant: Reconciling Functional Mechanisms through Allosteric Model of Spike Regulation

    In this study, we used an integrative computational approach to examine molecular mechanisms underlying functional effects of the D614G mutation by exploring atomistic modeling of the SARS-CoV-2 spike proteins as allosteric regulatory machines. We combined coarse-grained simulations, protein stability and dynamic fluctuation communication analysis with network-based community analysis to examine structures of the native and mutant SARS-CoV-2 spike proteins in different functional states. Through distance fluctuations communication analysis, we probed stability and allosteric communication propensities of protein residues in the native and mutant SARS-CoV-2 spike proteins, providing evidence that the D614G mutation can enhance long-range signaling of the allosteric spike engine. By combining functional dynamics analysis and ensemble-based alanine scanning of the SARS-CoV-2 spike proteins, we found that the D614G mutation can improve stability of the spike protein in both closed and open forms, while shifting thermodynamic preferences towards the open mutant form. Our results revealed that the D614G mutation can promote an increased number of stable communities and allosteric hub centers in the open form by reorganizing and enhancing the stability of the S1-S2 inter-domain interactions and restricting mobility of the S1 regions. This study provides an atomistic view of allosteric communications in the SARS-CoV-2 spike proteins, suggesting that the D614G mutation can exert its primary effect through allosterically induced changes in stability and communications in the residue interaction networks.

    Allosteric Regulation at the Crossroads of New Technologies: Multiscale Modeling, Networks, and Machine Learning

    Allosteric regulation is a common mechanism employed by complex biomolecular systems for regulation of activity and adaptability in the cellular environment, serving as an effective molecular tool for cellular communication. As an intrinsic but elusive property, allostery is a ubiquitous phenomenon where binding or disturbing of a distal site in a protein can functionally control its activity and is considered the “second secret of life.” The fundamental biological importance and complexity of these processes require a multi-faceted platform of synergistically integrated approaches for prediction and characterization of allosteric functional states, atomistic reconstruction of allosteric regulatory mechanisms and discovery of allosteric modulators. The unifying theme and overarching goal of allosteric regulation studies in recent years have been integration between emerging experimental and computational approaches and technologies to advance quantitative characterization of allosteric mechanisms in proteins. Despite significant advances, the quantitative characterization and reliable prediction of functional allosteric states, interactions, and mechanisms continue to present highly challenging problems in the field. In this review, we discuss simulation-based multiscale approaches, experiment-informed Markovian models, and network modeling of allostery and information-theoretical approaches that can describe the thermodynamics and hierarchy of allosteric states and the molecular basis of allosteric mechanisms. The wealth of structural and functional information along with diversity and complexity of allosteric mechanisms in therapeutically important protein families have provided a well-suited platform for development of data-driven research strategies. Data-centric integration of chemistry, biology and computer science using artificial intelligence technologies has gained significant momentum and is at the forefront of many cross-disciplinary efforts. We discuss new developments in the machine learning field and the emergence of deep learning and deep reinforcement learning applications in modeling of molecular mechanisms and allosteric proteins. The experiment-guided integrated approaches empowered by recent advances in multiscale modeling, network science, and machine learning can lead to more reliable prediction of allosteric regulatory mechanisms and discovery of allosteric modulators for therapeutically important protein targets.

    Interpretable Machine Learning Models for Molecular Design of Tyrosine Kinase Inhibitors Using Variational Autoencoders and Perturbation-Based Approach of Chemical Space Exploration

    In the current study, we introduce an integrative machine learning strategy for the autonomous molecular design of protein kinase inhibitors using variational autoencoders and a novel cluster-based perturbation approach for exploration of the chemical latent space. The proposed strategy combines autoencoder-based embedding of small molecules with a cluster-based perturbation approach for efficient navigation of the latent space and a feature-based kinase inhibition likelihood classifier that guides optimization of the molecular properties and targeted molecular design. In the proposed generative approach, molecules sharing similar structures tend to cluster in the latent space, and interpolating between two molecules in the latent space enables smooth changes in the molecular structures and properties. The results demonstrated that the proposed strategy can efficiently explore the latent space of small molecules and kinase inhibitors along interpretable directions to guide the generation of novel family-specific kinase molecules that display significant scaffold diversity and optimal biochemical properties. Through assessment of the latent-based and chemical feature-based binary and multiclass classifiers, we developed a robust probabilistic evaluator of kinase inhibition likelihood that is specifically tailored to guide the molecular design of novel SRC kinase molecules. The generated molecules originating from LCK and ABL1 kinase inhibitors yielded ~40% of novel and valid SRC kinase compounds with high kinase inhibition likelihood probability values (p > 0.75) and high similarity (Tanimoto coefficient > 0.6) to the known SRC inhibitors. By combining the molecular perturbation design with the kinase inhibition likelihood analysis and similarity assessments, we showed that the proposed molecular design strategy can produce novel valid molecules and transform known inhibitors of different kinase families into potential chemical probes of the SRC kinase with excellent physicochemical profiles and high similarity to the known SRC kinase drugs. The results of our study suggest that task-specific manipulation of a biased latent space may be an important direction for more effective task-oriented and target-specific autonomous chemical design models.
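The abstract above reports two acceptance thresholds for generated molecules: an inhibition likelihood p > 0.75 and a Tanimoto coefficient > 0.6 against known SRC inhibitors. As a rough sketch of how such a filter might be computed, the example below represents each molecular fingerprint as a set of on-bit indices. That representation, the toy fingerprints, and the names `tanimoto` and `passes_filter` are assumptions for illustration, since the abstract does not state which fingerprint or software was used.

```python
# Illustrative sketch only: fingerprints are modeled as sets of "on" bit
# indices, which is an assumption and not the authors' stated method.
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: |A intersect B| / |A union B| for two bit sets."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def passes_filter(likelihood, similarity, p_min=0.75, t_min=0.6):
    """Apply the two thresholds quoted in the abstract (strict inequalities)."""
    return likelihood > p_min and similarity > t_min


if __name__ == "__main__":
    generated = {1, 2, 3, 4, 5}   # hypothetical fingerprint of a generated molecule
    reference = {2, 3, 4, 5, 6}   # hypothetical fingerprint of a known inhibitor
    sim = tanimoto(generated, reference)  # 4 shared bits out of 6 in the union
    print(sim, passes_filter(0.8, sim))
```

In practice the likelihood value would come from the trained classifier described in the abstract; here it is supplied by hand.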

    Integrating Conformational Dynamics and Perturbation-Based Network Modeling for Mutational Profiling of Binding and Allostery in the SARS-CoV-2 Spike Variant Complexes with Antibodies: Balancing Local and Global Determinants of Mutational Escape Mechanisms

    In this study, we combined all-atom MD simulations, the ensemble-based mutational scanning of protein stability and binding, and perturbation-based network profiling of allosteric interactions in the SARS-CoV-2 spike complexes with a panel of cross-reactive and ultra-potent single antibodies (B1-182.1 and A23-58.1) as well as antibody combinations (A19-61.1/B1-182.1 and A19-46.1/B1-182.1). Using this approach, we quantify the local and global effects of mutations in the complexes, identify protein stability centers, characterize binding energy hotspots, and predict the allosteric control points of long-range interactions and communications. Conformational dynamics and distance fluctuation analysis revealed the antibody-specific signatures of protein stability and flexibility of the spike complexes that can affect the pattern of mutational escape. A network-based perturbation approach for mutational profiling of allosteric residue potentials revealed how antibody binding can modulate allosteric interactions and identified allosteric control points that can form vulnerable sites for mutational escape. The results show that the protein stability and binding energetics of the SARS-CoV-2 spike complexes with the panel of ultrapotent antibodies are tolerant to the effect of Omicron mutations, which may be related to their neutralization efficiency. By employing an integrated analysis of conformational dynamics, binding energetics, and allosteric interactions, we found that the antibodies that neutralize the Omicron spike variant mediate the dominant binding energy hotspots in the conserved stability centers and allosteric control points in which mutations may be restricted by the requirements of the protein folding stability and binding to the host receptor. This study suggested a mechanism in which the patterns of escape mutants for the ultrapotent antibodies may not be solely determined by the binding interaction changes but are associated with the balance and tradeoffs of multiple local and global factors, including protein stability, binding affinity, and long-range interactions.

    Probing Mechanisms of Binding and Allostery in the SARS-CoV-2 Spike Omicron Variant Complexes with the Host Receptor: Revealing Functional Roles of the Binding Hotspots in Mediating Epistatic Effects and Communication with Allosteric Pockets

    In this study, we performed all-atom MD simulations of RBD–ACE2 complexes for BA.1, BA.1.1, BA.2, and BA.3 Omicron subvariants, and conducted a systematic mutational scanning of the RBD–ACE2 binding interfaces and analysis of electrostatic effects. The binding free energy computations of the Omicron RBD–ACE2 complexes and comprehensive examination of the electrostatic interactions quantify the driving forces of binding and provide new insights into energetic mechanisms underlying evolutionary differences between Omicron variants. A systematic mutational scanning of the RBD residues determines the protein stability centers and binding energy hotspots in the Omicron RBD–ACE2 complexes. By employing the ensemble-based global network analysis, we propose a community-based topological model of the Omicron RBD interactions that characterized functional roles of the Omicron mutational sites in mediating non-additive epistatic effects of mutations. Our findings suggest that non-additive contributions to the binding affinity may be mediated by R493, Y498, and Y501 sites and are greater for the Omicron BA.1.1 and BA.2 complexes that display the strongest ACE2 binding affinity among the Omicron subvariants. A network-centric adaptation model of the reversed allosteric communication is unveiled in this study, which established a robust connection between allosteric network hotspots and potential allosteric binding pockets. Using this approach, we demonstrated that mediating centers of long-range interactions could anchor the experimentally validated allosteric binding pockets. Through an array of complementary approaches and proposed models, this comprehensive and multi-faceted computational study revealed and quantified multiple functional roles of the key Omicron mutational sites R493, R498, and Y501 acting as binding energy hotspots, drivers of electrostatic interactions as well as mediators of epistatic effects and long-range communications with the allosteric pockets.

    Comparative Perturbation-Based Modeling of the SARS-CoV-2 Spike Protein Binding with Host Receptor and Neutralizing Antibodies: Structurally Adaptable Allosteric Communication Hotspots Define Spike Sites Targeted by Global Circulating Mutations

    In this study, we used an integrative computational approach to examine molecular mechanisms and determine functional signatures underlying the role of functional residues in the SARS-CoV-2 spike protein that are targeted by novel mutational variants and antibody-escaping mutations. Atomistic simulations and functional dynamics analysis are combined with alanine scanning and mutational sensitivity profiling of the SARS-CoV-2 spike protein complexes with the ACE2 host receptor and the REGN-COV2 antibody cocktail (REG10987+REG10933). Using alanine scanning and mutational sensitivity analysis, we have shown that K417, E484, and N501 residues correspond to key interacting centers with a significant degree of structural and energetic plasticity that allow mutants in these positions to afford improved binding affinity with ACE2. Through perturbation-based network modeling and community analysis of the SARS-CoV-2 spike protein complexes with ACE2, we demonstrate that E406, N439, K417, and N501 residues serve as effector centers of allosteric interactions and anchor major intermolecular communities that mediate long-range communication in the complexes. The results provide support for a model according to which mutational variants and antibody-escaping mutations constrained by the requirements for host receptor binding and preservation of stability may preferentially select structurally plastic and energetically adaptable allosteric centers to differentially modulate collective motions and allosteric interactions in the complexes with the ACE2 enzyme and REGN-COV2 antibody combination. This study suggests that the SARS-CoV-2 spike protein may function as a versatile and functionally adaptable allosteric machine that exploits the plasticity of allosteric regulatory centers to fine-tune response to antibody binding without compromising the activity of the spike protein.

    Landscape-Based Mutational Sensitivity Cartography and Network Community Analysis of the SARS-CoV-2 Spike Protein Structures: Quantifying Functional Effects of the Circulating D614G Variant

    We developed and applied a computational approach to simulate functional effects of the global circulating mutation D614G of the SARS-CoV-2 spike protein. All-atom molecular dynamics simulations are combined with deep mutational scanning and analysis of the residue interaction networks to investigate conformational landscapes and energetics of the SARS-CoV-2 spike proteins in different functional states of the D614G mutant. The results of conformational dynamics and analysis of collective motions demonstrated that the D614 site plays a key regulatory role in governing functional transitions between open and closed states. Using mutational scanning and sensitivity analysis of protein residues, we identified the stability hotspots in the SARS-CoV-2 spike structures of the mutant trimers. The results suggest that the D614G mutation can induce increased stability of the open form, acting as a driver of conformational changes, which may result in increased exposure to the host receptor and promote infectivity of the virus. The network community analysis of the SARS-CoV-2 spike proteins showed that the D614G mutation can enhance long-range couplings between domains and strengthen the interdomain interactions in the open form, supporting the reduced shedding mechanism. This study provides a landscape-based perspective and atomistic view of the allosteric interactions and stability hotspots in the SARS-CoV-2 spike proteins, offering useful insight into the molecular mechanisms underpinning functional effects of the global circulating mutations.

    Atomistic Simulations and In Silico Mutational Profiling of Protein Stability and Binding in the SARS-CoV-2 Spike Protein Complexes with Nanobodies: Molecular Determinants of Mutational Escape Mechanisms

    Structure-functional studies have recently revealed a spectrum of diverse high-affinity nanobodies with efficient neutralizing capacity against SARS-CoV-2 virus and resilience against mutational escape. In this study, we combine atomistic simulations with the ensemble-based mutational profiling of binding for the SARS-CoV-2 S-RBD complexes with a wide range of nanobodies to identify dynamic and binding affinity fingerprints and characterize the energetic determinants of nanobody-escaping mutations. Using an in silico mutational profiling approach for probing the protein stability and binding, we examine dynamics and energetics of the SARS-CoV-2 complexes with single nanobodies Nb6 and Nb20, VHH E, a pair combination VHH E + U, a biparatopic nanobody VHH VE, and a combination of the CC12.3 antibody and VHH V/W nanobodies. This study characterizes the binding energy hotspots in the SARS-CoV-2 protein and complexes with nanobodies, providing a quantitative analysis of the effects of circulating variants and escaping mutations on binding that is consistent with a broad range of biochemical experiments. The results suggest that mutational escape may be controlled through structurally adaptable binding hotspots in the receptor-accessible binding epitope that are dynamically coupled to the stability centers in the distant binding epitope targeted by VHH U/V/W nanobodies. This study offers a plausible mechanism in which, through cooperative dynamic changes, nanobody combinations and biparatopic nanobodies can elicit an increased binding affinity response and yield resilience to common escape mutants.