2,423 research outputs found

    Adaptive local learning in sampling based motion planning for protein folding

    BACKGROUND: Simulating protein folding motions is an important problem in computational biology. Motion planning algorithms, such as Probabilistic Roadmap Methods, have been successful in modeling the folding landscape. Probabilistic Roadmap Methods and their variants contain several phases (i.e., sampling, connection, and path extraction). Most of the time is spent in the connection phase, and selecting which variant to employ is a difficult task. Global machine learning has been applied to the connection phase but is inefficient in situations with varying topology, such as those typical of folding landscapes. RESULTS: We develop a local learning algorithm that exploits the past performance of methods within the neighborhood of the current connection attempts as a basis for learning. It is sensitive not only to different types of landscapes but also to differing regions in the landscape itself, removing the need to explicitly partition the landscape. We perform experiments on 23 proteins of varying secondary structure makeup with 52–114 residues. We compare the success rate when using our methods and other methods. We demonstrate a clear need for learning (i.e., only learning methods were able to validate against all available experimental data) and show that local learning is superior to global learning, producing, in many cases, significantly higher quality results than the other methods. CONCLUSIONS: We present an algorithm that uses local learning to select appropriate connection methods in the context of roadmap construction for protein folding. Our method removes the burden of deciding which method to use, leverages the strengths of the individual input methods, and is extensible to other future connection methods.
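
The idea of choosing a connection method from the past performance of methods in the local neighborhood can be illustrated with a small Python sketch. This is not the authors' implementation: the function name `choose_connection_method`, the `(position, method, succeeded)` history format, the k-nearest neighborhood, the smoothed success rate, and the `epsilon` exploration step are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def choose_connection_method(node, history, methods, k=5, epsilon=0.1, rng=random):
    """Pick a connection method from the record of the k nearest past attempts.

    history: list of (position, method_name, succeeded) tuples (hypothetical format).
    """
    # Explore occasionally so that no method is starved of data.
    if not history or rng.random() < epsilon:
        return rng.choice(methods)

    # The "neighborhood" here is simply the k past attempts closest to this node.
    neighbors = sorted(history, key=lambda rec: math.dist(node, rec[0]))[:k]

    wins = defaultdict(int)
    tries = defaultdict(int)
    for _, method, succeeded in neighbors:
        tries[method] += 1
        wins[method] += int(succeeded)

    # Laplace-smoothed local success rate; a method with no local record scores 0.5.
    def score(method):
        return (wins[method] + 1) / (tries[method] + 2)

    return max(methods, key=score)
```

Because the neighborhood is recomputed for every node, the preferred method can differ from region to region without any explicit partition of the landscape.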

    REinforcement learning based Adaptive samPling: REAPing Rewards by Exploring Protein Conformational Landscapes

    One of the key limitations of Molecular Dynamics simulations is the computational intractability of sampling protein conformational landscapes associated with either large system size or long timescales. To overcome this bottleneck, we present the REinforcement learning based Adaptive samPling (REAP) algorithm that aims to efficiently sample conformational space by learning the relative importance of each reaction coordinate as it samples the landscape. To achieve this, the algorithm uses concepts from the field of reinforcement learning, a subset of machine learning, which rewards sampling along important degrees of freedom and disregards others that do not facilitate exploration or exploitation. We demonstrate the effectiveness of REAP by comparing the sampling to long continuous MD simulations and least-counts adaptive sampling on two model landscapes (L-shaped and circular), and realistic systems such as alanine dipeptide and Src kinase. In all four systems, the REAP algorithm consistently demonstrates its ability to explore conformational space faster than the other two methods when comparing the expected values of the landscape discovered for a given amount of time. The key advantage of REAP is on-the-fly estimation of the importance of collective variables, which makes it particularly useful for systems with limited structural information
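
A rough Python sketch of the reward idea described above: candidate states are scored by how far they lie from the bulk of the sampled data along each collective variable, with a per-variable weight expressing its importance. The exact reward and the weight-learning step are not given in the abstract, so the standardized-distance formula, the function names, and the fixed `weights` argument are assumptions; the learning of weights is deliberately left out.

```python
from statistics import mean, pstdev


def reward_scores(candidates, all_points, weights):
    """Score candidates by weighted, standardized distance from the data mean.

    candidates, all_points: lists of tuples of collective-variable values.
    weights: one non-negative weight per collective variable (assumed given).
    """
    dims = range(len(weights))
    mu = [mean(p[i] for p in all_points) for i in dims]
    # Fall back to 1.0 when a variable has zero spread, so division is defined.
    sigma = [pstdev([p[i] for p in all_points]) or 1.0 for i in dims]
    return [
        sum(weights[i] * abs(c[i] - mu[i]) / sigma[i] for i in dims)
        for c in candidates
    ]


def pick_restart_states(candidates, all_points, weights, n=1):
    """Return the n candidates with the highest reward."""
    scores = reward_scores(candidates, all_points, weights)
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [c for _, c in ranked[:n]]
```

Changing the weights changes which states are chosen to seed the next round of simulations, which is the lever an adaptive scheme of this kind would adjust as sampling proceeds.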

    A scalable method for parallelizing sampling-based motion planning algorithms

    This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400-core Linux cluster and on a 153,216-core Cray XE6 petascale machine.
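
The subdivide, plan, and connect structure described above can be sketched in a few lines of Python. This is only a structural illustration under strong assumptions: a 2D unit square with no obstacles, a non-overlapping grid of regions, a distance threshold instead of a real local planner, and a thread pool standing in for the distributed machines used in the paper. All function names and parameters are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def build_regional_roadmap(cell, n_samples, radius, seed):
    """Sample one grid cell and connect samples no farther apart than `radius`."""
    (x0, y0), size = cell
    rng = random.Random(seed)
    nodes = [(x0 + rng.random() * size, y0 + rng.random() * size)
             for _ in range(n_samples)]
    edges = [(a, b) for a, b in combinations(nodes, 2) if math.dist(a, b) <= radius]
    return nodes, edges


def build_global_roadmap(grid=2, n_samples=20, radius=0.3):
    size = 1.0 / grid
    cells = {(i, j): ((i * size, j * size), size)
             for i in range(grid) for j in range(grid)}
    # Regions are independent, so each regional roadmap can go to its own worker.
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(build_regional_roadmap, cell, n_samples,
                                    radius, key[0] * grid + key[1])
                   for key, cell in cells.items()}
        regional = {key: fut.result() for key, fut in futures.items()}

    nodes = [n for ns, _ in regional.values() for n in ns]
    edges = [e for _, es in regional.values() for e in es]
    # Connection attempts between regions are restricted to adjacent cells;
    # here each adjacent pair is bridged by its single closest pair of nodes.
    for (i, j) in cells:
        for di, dj in ((1, 0), (0, 1)):
            other = (i + di, j + dj)
            if other in cells:
                bridge = min(((p, q) for p in regional[(i, j)][0]
                              for q in regional[other][0]),
                             key=lambda pq: math.dist(*pq))
                edges.append(bridge)
    return nodes, edges
```

Because nearest-neighbor work is confined to one region or one pair of adjacent regions, no worker ever needs to search the full set of samples, which is the locality argument the abstract makes.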

    Improved Sampling Based Motion Planning Through Local Learning

    Every motion made by a moving object is either planned implicitly, e.g., human natural movement from one point to another, or explicitly, e.g., pre-planned information about where a robot should move in a room to effectively avoid colliding with obstacles. Motion planning is a well-studied concept in robotics and it involves moving an object from a start to a goal configuration. Motion planning arises in many application domains such as robotics, computer animation (digital actors), intelligent CAD (virtual prototyping and training) and even computational biology (protein folding and drug design). Interestingly, a single class of planners, sampling-based planners, has proven effective in all these domains. Probabilistic Roadmap Methods (PRMs) are one type of sampling-based planner that samples robot configurations (nodes) and connects them via viable local paths (edges) to form a roadmap containing representative feasible trajectories. The roadmap is then queried to find solution paths between start and goal configurations. Different PRM strategies perform differently given different input parameters, e.g., workspace environments and robot definitions. Motion planning, however, is computationally hard: it requires geometric path planning, which has been shown to be PSPACE-hard, and it involves complex representational issues for robots with known physical, geometric and temporal constraints, as well as challenging mapping and representation requirements for the workspace environment. Many important environments, e.g., houses, factories and airports, are heterogeneous, i.e., contain free, cluttered and narrow spaces. Heterogeneous environments, however, introduce a new set of problems for motion planning and PRM strategies because there is no ideal method suitable for all regions in the environment. In this work we introduce a technique that can adapt and apply PRM methods suitable for local regions in an environment. 
The basic strategy is to first identify a local region of the environment suitable for the current action based on identified neighbors. Next, based on the past performance of methods in this region, we adapt and pick a method to use at this time. This selection and adaptation is done by applying machine learning. By creating local regions dynamically, we remove the need to explicitly partition the environment, as was done in previous methods. Explicit partitioning is difficult, slows down performance, and still leaves the difficult problem of determining which strategy to use in each partition. Our method removes these overheads. We show benefits of this approach in both planning robot motions and in protein folding simulations. We perform experiments on robots in simulation with different degrees of freedom and varying levels of heterogeneity in the environment and show an improvement in performance when our local learning method is applied. Protein folding simulations were performed on 23 proteins, and we note an improvement in the quality of pathways produced, with comparable performance in terms of the time needed to build the roadmap.
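
For readers unfamiliar with the basic PRM loop summarized above (sample configurations, connect them by local paths, then query the roadmap), here is a minimal Python sketch. It assumes a point robot in the unit square, circular obstacles given as `(center, radius)` pairs, straight-line local paths checked by discretization, and a breadth-first query; these choices and all names are illustrative and are not taken from the thesis.

```python
import math
import random
from collections import deque


def segment_is_free(a, b, obstacles, step=0.01):
    """Check a straight local path by testing points along it against the obstacles."""
    n = max(1, int(math.dist(a, b) / step))
    for t in range(n + 1):
        p = (a[0] + (b[0] - a[0]) * t / n, a[1] + (b[1] - a[1]) * t / n)
        if any(math.dist(p, center) <= r for center, r in obstacles):
            return False
    return True


def build_prm(n_nodes, k, obstacles, rng):
    """Sample collision-free configurations and try to link each to its k nearest."""
    nodes = []
    while len(nodes) < n_nodes:
        q = (rng.random(), rng.random())
        if all(math.dist(q, center) > r for center, r in obstacles):
            nodes.append(q)
    graph = {q: set() for q in nodes}
    for q in nodes:
        # Index 0 of the sorted list is q itself, so skip it.
        for other in sorted(nodes, key=lambda o: math.dist(q, o))[1:k + 1]:
            if segment_is_free(q, other, obstacles):
                graph[q].add(other)
                graph[other].add(q)
    return graph


def query(graph, start, goal):
    """Breadth-first search over roadmap nodes; returns a node path or None."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        q = frontier.popleft()
        if q == goal:
            path = []
            while q is not None:
                path.append(q)
                q = parent[q]
            return path[::-1]
        for nxt in graph[q]:
            if nxt not in parent:
                parent[nxt] = q
                frontier.append(nxt)
    return None
```

In this sketch `start` and `goal` must already be roadmap nodes. The connection step, with its fixed `k` and straight-line local planner, is the part an adaptive approach would vary from region to region.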

    High Performance Computing Techniques to Better Understand Protein Conformational Space

    This thesis presents an amalgamation of high performance computing techniques to gain better insight into protein molecular dynamics. Key aspects of protein function and dynamics can be learned from their conformational space. Datasets that represent the complex nuances of a protein molecule are high dimensional. Efficient dimensionality reduction becomes indispensable for the analysis of such large datasets. Dimensionality reduction forms a substantial portion of this work, and its application has been explored for other datasets as well. It begins with the parallelization of a known non-linear feature reduction algorithm called Isomap. The code for the algorithm was re-written in C with portions of it parallelized using OpenMP. Next, a novel data instance reduction method was devised which evaluates the information content offered by each data point, which ultimately helps truncate the dataset to far fewer data points to evaluate. Once a framework has been established to reduce the number of variables representing a dataset, the work is extended to explore algebraic topology techniques to extract meaningful information from these datasets. This step is the one that helps in sampling the conformations of interest of a protein molecule. The method employs the notion of hierarchical clustering to identify classes within a molecule; thereafter, algebraic topology is used to analyze these classes. Finally, the work is concluded by presenting an approach to the open problem of protein folding. A Monte-Carlo based tree search algorithm is put forth to simulate the pathway that a certain protein conformation undertakes to reach another conformation. The dissertation, in its entirety, offers solutions to a few problems that hinder progress on the vast problem of understanding protein dynamics. The motion of a protein molecule is guided by changes in its energy profile. 
In the process, the molecule gradually slips from one energy class to another. Structurally, this switch is transient, spanning milliseconds or less, and is therefore difficult to capture through wet-laboratory work alone.

    Intelligent Motion Planning and Analysis with Probabilistic Roadmap Methods for the Study of Complex and High-Dimensional Motions

    At first glance, robots and proteins have little in common. Robots are commonly thought of as tools that perform tasks such as vacuuming the floor, while proteins play essential roles in many biochemical processes. However, the functionality of both robots and proteins is highly dependent on their motions. In order to study motions in these two divergent domains, the same underlying algorithmic framework can be applied. This method is derived from probabilistic roadmap methods (PRMs) originally developed for robotic motion planning. It builds a graph, or roadmap, where configurations are represented as vertices and transitions between configurations are edges. The contribution of this work is a set of intelligent methods applied to PRMs. These methods facilitate both the modeling and analysis of motions, and have enabled the study of complex and high-dimensional problems in both robotic and molecular domains. In order to efficiently study biologically relevant molecular folding behaviors we have developed new techniques based on Monte Carlo solution, master equation calculation, and non-linear dimensionality reduction to run simulations and analysis on the roadmap. The first method, Map-based master equation calculation (MME), extracts global properties of the folding landscape such as global folding rates. On the other hand, another method, Map-based Monte Carlo solution (MMC), can be used to extract microscopic features of the folding process. Also, the application of dimensionality reduction returns a lower-dimensional representation that still retains the principal features while facilitating both modeling and analysis of motion landscapes. A key contribution of our methods is the flexibility to study larger and more complex structures, e.g., 372 residue Alpha-1 antitrypsin and 200 nucleotide ColE1 RNAII. We also applied intelligent roadmap-based techniques to the area of robotic motion. 
These methods take advantage of unsupervised learning methods at all stages of the planning process and produce solutions in complex spaces with little cost and less manual intervention compared to other adaptive methods. Our results show that our methods have low overhead and that they outperform two existing adaptive methods in all complex cases studied.

    Motion planning for geometric models in data visualization

    Projects: Interactive geometric models for the simulation of natural phenomena (LH11006); Advanced graphics and computer systems (SGS-2016-013). Finding a path is an important task in many research areas and a common problem solved in a wide range of applications. New path-finding problems keep appearing and complex ones persist, such as real-time planning of paths for huge crowds in dynamic environments, where both the properties by which the cost of a path is evaluated and the topology of paths may change. The task of finding a path can be divided into path planning and motion planning, the latter of which implicitly accounts for collisions with the surroundings in the environment. Within the first group, this thesis focuses on path planning on graphs for crowds. The main idea is to group members of the crowd by their common initial and target positions and then plan the path for one representative member of each group. These representative members can be navigated by classic approaches, and the rest of the group follows them. If the crowd can be divided into a few groups this way, the proposed approach substantially reduces computational and memory demands in dynamic environments. In the second area, motion planning, we deal with a different problem. The task is to navigate a ligand through the protein or into the protein, which turns out to be challenging because it must be solved in 3D with collision detection.
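
The grouping idea for crowds, planning once per shared pair of initial and target positions and letting the rest of the group reuse that path, can be sketched in Python as follows. The unweighted adjacency-list graph, the breadth-first planner, and the names `shortest_path` and `plan_for_crowd` are assumptions for illustration; the thesis does not specify them in this abstract.

```python
from collections import defaultdict, deque


def shortest_path(graph, start, goal):
    """Breadth-first search on an unweighted graph given as {node: [neighbors]}."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None


def plan_for_crowd(graph, agents):
    """agents: {agent_id: (start, target)}. Plans once per distinct (start, target)."""
    groups = defaultdict(list)
    for agent_id, endpoints in agents.items():
        groups[endpoints].append(agent_id)

    plans = {}
    for (start, target), members in groups.items():
        path = shortest_path(graph, start, target)  # planned for one representative
        for agent_id in members:
            plans[agent_id] = path  # the rest of the group reuses that path
    return plans, len(groups)
```

The number of planner calls equals the number of groups rather than the number of agents, which is where the savings come from when a large crowd shares only a few start and target pairs.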