    Refining interaction search through signed iterative Random Forests

    Advances in supervised learning have enabled accurate prediction in biological systems governed by complex interactions among biomolecules. However, state-of-the-art predictive algorithms are typically black boxes, learning statistical interactions that are difficult to translate into testable hypotheses. The iterative Random Forest (iRF) algorithm took a step towards bridging this gap by providing a computationally tractable procedure to identify the stable, high-order feature interactions that drive the predictive accuracy of Random Forests (RF). Here we refine the interactions identified by iRF to explicitly map responses as a function of interacting features. Our method, signed iRF (s-iRF), describes subsets of rules that frequently occur on RF decision paths. We refer to these rule subsets as signed interactions. Signed interactions not only share the same set of interacting features but also exhibit similar thresholding behavior, and thus describe a consistent functional relationship between interacting features and responses. We describe stable and predictive importance metrics (SPIMs) to rank signed interactions. For each SPIM, we define null importance metrics that characterize its expected behavior under known structure. We evaluate our proposed approach in biologically inspired simulations and two case studies: predicting enhancer activity and spatial gene expression patterns. In the case of enhancer activity, s-iRF recovers one of the few experimentally validated high-order interactions and suggests novel enhancer elements where this interaction may be active. In the case of spatial gene expression patterns, s-iRF recovers all 11 reported links in the gap gene network. By refining the process of interaction recovery, our approach has the potential to guide mechanistic inquiry into systems whose scale and complexity are beyond human comprehension.
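The idea of a signed interaction can be illustrated with a small sketch. It assumes a hypothetical toy representation in which each decision path is a list of (feature, threshold, went_right) splits, labels a feature "+" when the path takes the x > threshold branch and "-" otherwise, and counts how often pairs of signed features co-occur on a path. The feature names, thresholds, and the exhaustive pair counting are illustrative assumptions, not the authors' search procedure or importance metrics.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy decision paths: (feature, threshold, went_right) per split.
paths = [
    [("Kr", 0.5, True), ("Gt", 0.2, True), ("Hb", 0.7, False)],
    [("Kr", 0.6, True), ("Gt", 0.3, True)],
    [("Kr", 0.4, False), ("Gt", 0.25, True)],
    [("Kr", 0.55, True), ("Gt", 0.2, True), ("Hb", 0.6, True)],
]

def signed_features(path):
    """Reduce a path to its set of signed features ('+': x > t, '-': x <= t)."""
    return frozenset(
        (feat, "+" if went_right else "-") for feat, _, went_right in path
    )

def signed_interaction_prevalence(paths, order=2):
    """Fraction of paths on which each signed feature subset of size `order` occurs."""
    counts = Counter()
    for path in paths:
        signed = sorted(signed_features(path))
        for combo in combinations(signed, order):
            # Skip subsets that use one feature with both signs.
            if len({feat for feat, _ in combo}) == order:
                counts[combo] += 1
    return {combo: n / len(paths) for combo, n in counts.items()}

prevalence = signed_interaction_prevalence(paths)
top = max(prevalence, key=prevalence.get)
print(top, prevalence[top])
```

In this toy data the pair (Gt+, Kr+) appears on three of the four paths, while (Gt+, Kr-) appears on only one, so the two are counted as distinct signed interactions even though they involve the same features.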

    Evolutionary Multi-Objective Design of SARS-CoV-2 Protease Inhibitor Candidates

    Computational drug design based on artificial intelligence is an emerging research area. At the time of writing this paper, the world suffers from an outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus replication is via protease inhibition. We propose an evolutionary multi-objective algorithm (EMOA) to design potential protease inhibitors for SARS-CoV-2's main protease. Based on the SELFIES representation, the EMOA maximizes the binding of candidate ligands to the protein using the docking tool QuickVina 2, while at the same time taking into account further objectives such as drug-likeness or the fulfillment of filter constraints. The experimental part analyzes the evolutionary process and discusses the inhibitor candidates.
    Comment: 15 pages, 7 figures, submitted to PPSN 202
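Trading off binding against drug-likeness rests on Pareto dominance, which can be sketched generically. The example below assumes two objectives that are both minimized (a docking score, where lower means stronger predicted binding, and one minus a drug-likeness score); the ligand names and numbers are invented for illustration, and the filter is a plain non-dominated selection rather than the selection scheme used in the paper.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(population):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, objectives in population.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in population.items()
            if other_name != name
        )
    ]

# Invented values: (docking score, 1 - drug-likeness), both minimized.
candidates = {
    "ligand_a": (-8.1, 0.40),
    "ligand_b": (-7.5, 0.20),
    "ligand_c": (-7.0, 0.45),
    "ligand_d": (-8.1, 0.55),
}

front = non_dominated(candidates)
print(front)
```

Here ligand_c and ligand_d are both dominated by ligand_a, while ligand_a and ligand_b each win on one objective and therefore both remain on the front.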

    Measuring the energy landscape roughness and the transition state location of biomolecules using single molecule mechanical unfolding experiments

    Single molecule mechanical unfolding experiments are beginning to provide profiles of the complex energy landscape of biomolecules. In order to obtain reliable estimates of the energy landscape characteristics, it is necessary to combine the experimental measurements with sound theoretical models and simulations. Here, we show how, by using temperature as a variable in mechanical unfolding of biomolecules in laser optical tweezers or AFM experiments, the roughness of the energy landscape can be measured without making any assumptions about the underlying reaction coordinate. The efficacy of the formalism is illustrated by reviewing experimental results that have directly measured roughness in a protein-protein complex. The roughness model can also be used to interpret experiments on forced unfolding of proteins in which temperature is varied. Estimates of other aspects of the energy landscape, such as free energy barriers or the transition state (TS) locations, could depend on the precise model used to analyze the experimental data. We illustrate the inherent difficulties in obtaining the transition state location from loading rate or force-dependent unfolding rates. Because the transition state moves as the force or the loading rate is varied, it is in general difficult to invert the experimental data unless the curvature at the top of the one-dimensional free energy profile is large, i.e., the barrier is sharp. The TS location is independent of force only for brittle or hard biomolecules, whereas it changes considerably if the molecule is soft or plastic. We also comment on the usefulness of the extension of the molecule as a surrogate reaction coordinate, especially in the context of force-quench refolding of proteins and RNA.
    Comment: 44 pages, 7 figures

    Onsager-Machlup action-based path sampling and its combination with replica exchange for diffusive and multiple pathways

    For sampling multiple pathways in a rugged energy landscape, we propose a novel action-based path sampling method using the Onsager-Machlup action functional. Inspired by the Fourier-path integral simulation of a quantum mechanical system, a path in Cartesian space is transformed into one in Fourier space, and an overdamped Langevin equation is derived for the Fourier components to achieve a canonical ensemble of the path at a finite temperature. To avoid "path trapping" around an initially guessed path, the path sampling method is further combined with a powerful sampling technique, the replica exchange method. The principle and algorithm of our method are numerically demonstrated for a model two-dimensional system with a bifurcated potential landscape. The results are compared with those of conventional transition path sampling and the equilibrium theory, and the error due to path discretization is also discussed.
    Comment: 20 pages, 5 figures, submitted to J. Chem. Phy
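To make the action functional concrete, the following is a minimal sketch of a discretized Onsager-Machlup action for overdamped Langevin dynamics, evaluated directly on a Cartesian path. It assumes the common form S = sum over steps of (dt / 4D) * |(x_{i+1} - x_i)/dt + (D/kT) * grad V(x_i)|^2 and omits the Laplacian correction term; the function names, the free-particle test potential, and the parameter values are illustrative assumptions, and the Fourier-space sampling and replica exchange described in the abstract are not implemented.

```python
def om_action(path, grad_v, dt, diffusion=1.0, kT=1.0):
    """Discretized Onsager-Machlup action of a path (list of coordinate tuples).

    Assumes overdamped dynamics dx/dt = -(D/kT) grad V + noise and drops
    the Laplacian correction term.
    """
    action = 0.0
    for x0, x1 in zip(path, path[1:]):
        gradient = grad_v(x0)
        for a, b, g in zip(x0, x1, gradient):
            residual = (b - a) / dt + (diffusion / kT) * g
            action += residual ** 2 * dt / (4.0 * diffusion)
    return action

def free_particle_gradient(x):
    """Gradient of a flat potential (V = 0) in two dimensions."""
    return (0.0, 0.0)

n_steps, dt = 10, 0.1

# Straight path from (0, 0) to (1, 0) over total time n_steps * dt = 1.
straight = [(i / n_steps, 0.0) for i in range(n_steps + 1)]

# Same endpoints, but with a detour through (0.5, 0.5).
detour = [(i / n_steps, 0.5 - abs(i / n_steps - 0.5)) for i in range(n_steps + 1)]

s_straight = om_action(straight, free_particle_gradient, dt)
s_detour = om_action(detour, free_particle_gradient, dt)
print(s_straight, s_detour)
```

For the flat potential the straight path gives L^2 / (4 D T) = 0.25, and the detour costs more action, which is the sense in which the action ranks competing pathways between fixed endpoints.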