9 research outputs found

    CRANKITE: a fast polypeptide backbone conformation sampler

    Background: CRANKITE is a suite of programs for simulating backbone conformations of polypeptides and proteins. The core of the suite is an efficient Metropolis Monte Carlo sampler of backbone conformations in continuous three-dimensional space in atomic detail. Methods: In contrast to other programs that rely on local Metropolis moves in the space of dihedral angles, our sampler uses local crankshaft rotations of rigid peptide bonds in Cartesian space. Results: The sampler allows fast simulation and analysis of secondary structure formation and conformational changes for proteins of average length
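As a rough illustration of the kind of move described above, the sketch below rotates the atoms lying between two fixed anchor atoms about the axis through those anchors, then applies the standard Metropolis test. It is a minimal sketch under stated assumptions, not CRANKITE's implementation: the function names, the flat list-of-coordinates chain, and the externally supplied energy difference are all hypothetical.

```python
import math
import random

def rotate_about_axis(p, a, b, angle):
    """Rotate point p by `angle` radians about the line through a and b
    (Rodrigues' rotation formula)."""
    axis = [b[i] - a[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]                 # unit rotation axis
    v = [p[i] - a[i] for i in range(3)]          # p relative to anchor a
    kxv = [k[1] * v[2] - k[2] * v[1],
           k[2] * v[0] - k[0] * v[2],
           k[0] * v[1] - k[1] * v[0]]            # k x v
    kdv = sum(k[i] * v[i] for i in range(3))     # k . v
    c, s = math.cos(angle), math.sin(angle)
    return [a[i] + v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c)
            for i in range(3)]

def crankshaft_move(chain, i, j, angle):
    """Return a copy of `chain` in which the atoms strictly between the
    anchors i and j are rotated about the i-j axis. The anchors stay fixed,
    so all distances to them are preserved."""
    new_chain = [list(p) for p in chain]
    for m in range(i + 1, j):
        new_chain[m] = rotate_about_axis(chain[m], chain[i], chain[j], angle)
    return new_chain

def metropolis_accept(delta_e, kT, rng=random):
    """Standard Metropolis criterion: always accept downhill moves,
    accept uphill moves with probability exp(-delta_e / kT)."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / kT)
```

Because the anchors do not move, such a rotation changes the chain only locally, which is what makes it a local move in Cartesian space.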

    Path planning on manifolds using randomized higher-dimensional continuation

    Despite the significant advances in path planning methods, problems involving highly constrained spaces are still challenging. In particular, in many situations the configuration space is a non-parametrizable variety implicitly defined by constraints, which complicates the successful generalization of sampling-based path planners. In this paper, we present a new path planning algorithm specially tailored for highly constrained systems. It builds on recently developed tools for higher-dimensional continuation, which provide numerical procedures to describe an implicitly defined variety using a set of local charts. We propose to extend these methods to obtain an efficient path planner on varieties that handles highly constrained problems. The advantage of this planner is that it operates directly in the configuration space rather than in the higher-dimensional ambient space, as most existing methods do. Postprint (author's final draft).

    Path planning with loop closure constraints using an atlas-based RRT

    In many relevant path planning problems, loop closure constraints reduce the configuration space to a manifold embedded in the higher-dimensional joint ambient space. Whereas much progress has been made in solving path planning problems in the presence of obstacles, only a few works consider loop closure constraints. In this paper we present the AtlasRRT algorithm, a planner specially tailored for such constrained systems that builds on recently developed tools for higher-dimensional continuation. These tools provide procedures to define charts that locally parametrize manifolds and to coordinate them to form an atlas. AtlasRRT simultaneously builds an atlas and a Rapidly-Exploring Random Tree (RRT), using the atlas to sample relevant configurations for the RRT, and the RRT to devise directions of expansion for the atlas. The new planner is advantageous because samples obtained from the atlas allow a more efficient extension of the RRT than state-of-the-art approaches, where samples are generated in the joint ambient space. Peer reviewed. Postprint (author's final draft).
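To make the idea of sampling through a chart concrete, the toy sketch below parametrizes the unit sphere (a 2-D manifold in 3-D ambient space) around a point by its tangent plane and maps tangent coordinates back onto the sphere. It is only an illustration under stated assumptions: the sphere, the closed-form projection, and all function names are hypothetical choices made here, and a general implementation would typically replace the closed form with a numerical projection onto the constraint.

```python
import math
import random

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def tangent_basis(xc):
    """Two orthonormal vectors spanning the tangent plane of the unit
    sphere at the unit vector xc."""
    # Helper axis chosen so that it is never parallel to xc.
    helper = [1.0, 0.0, 0.0] if abs(xc[0]) < 0.9 else [0.0, 1.0, 0.0]
    u1 = cross(xc, helper)
    n = math.sqrt(sum(c * c for c in u1))
    u1 = [c / n for c in u1]
    u2 = cross(xc, u1)          # already unit length: xc and u1 are orthonormal
    return u1, u2

def chart_to_manifold(xc, basis, u):
    """Map chart coordinates u = (u1, u2), with |u| < 1, onto the sphere by
    moving from the tangent plane along the normal at xc."""
    r2 = u[0] ** 2 + u[1] ** 2
    if r2 >= 1.0:
        raise ValueError("chart coordinates outside the chart's validity area")
    t = math.sqrt(1.0 - r2)     # normal component that restores |x| = 1
    return [t * xc[i] + u[0] * basis[0][i] + u[1] * basis[1][i]
            for i in range(3)]

def sample_from_chart(xc, basis, radius, rng):
    """Draw chart coordinates uniformly from a disc (radius < 1) by
    rejection sampling and return the corresponding point on the sphere."""
    while True:
        u = (rng.uniform(-radius, radius), rng.uniform(-radius, radius))
        if u[0] ** 2 + u[1] ** 2 <= radius ** 2:
            return chart_to_manifold(xc, basis, u)
```

Every sample produced this way already satisfies the constraint, whereas a sample drawn in the ambient space would almost never lie on the manifold.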

    Protein Loop Prediction by Fragment Assembly

    If the primary sequence of a protein is known, what is its three-dimensional structure? This is one of the most challenging problems in molecular biology and has many applications in proteomics. During the last three decades, this issue has been extensively researched. Techniques such as the protein folding approach have been shown to be promising in predicting the core areas of proteins: α-helices and β-strands. However, loops, which contain no regular secondary structure elements, remain the most difficult regions to predict. The protein loop prediction problem is to predict the spatial structure of a loop given the primary sequence of a protein and the spatial structures of all the other regions. There are two major approaches to loop prediction: ab initio folding and database searching. Loop prediction accuracy is unsatisfactory because of the hypervariable nature of loops. The key contribution of this thesis is a novel fragment assembly algorithm that uses branch-and-cut to tackle the loop prediction problem. We present various pruning rules to reduce the search space and to speed up the finding of good loop candidates. The algorithm has the advantages of the database-search approach and ensures that the predicted loops are physically reasonable. The algorithm also benefits from ab initio folding, since it enumerates all the possible loops in the discrete approximation of the conformation space. We implemented the proposed algorithm as a protein loop prediction tool named LoopLocker. A test set from CASP6, the worldwide protein structure prediction competition, was used to evaluate the performance of LoopLocker. Experimental results showed that LoopLocker is capable of predicting loops of 4, 8, 11-12, and 13-15 residues with average RMSD errors of 0.452, 1.410, 1.741, and 1.895 Å, respectively. In the PDB, more than 90% of loops are shorter than 15 residues. We conclude that our fragment assembly algorithm is successful in tackling the loop prediction problem
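For readers unfamiliar with the RMSD figures quoted above, the sketch below computes a plain coordinate root-mean-square deviation between two equally long lists of 3-D points. It is a minimal sketch under assumptions: the abstract does not say which atoms are compared or whether structures are superimposed first, so this version simply compares coordinates as given, and the function name is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) points, with no superposition applied."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((a[i] - b[i]) ** 2
                for a, b in zip(coords_a, coords_b)
                for i in range(3))
    return math.sqrt(total / len(coords_a))
```

For example, shifting every point of a structure by 1 Å along one axis gives an RMSD of exactly 1 Å.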

    Distance-based formulations for the position analysis of kinematic chains

    This thesis addresses the kinematic analysis of mechanisms, in particular the position analysis of kinematic chains, or linkages, that is, mechanisms with rigid bodies (links) interconnected by kinematic pairs (joints). This problem, of a completely geometrical nature, consists in finding the feasible assembly modes that a kinematic chain can adopt. An assembly mode is a possible relative transformation between the links of a kinematic chain. When an assignment of positions and orientations is made for all links with respect to a given reference frame, an assembly mode is called a configuration. The methods reported in the literature for solving the position analysis of kinematic chains can be classified as graphical, analytical, or numerical. The graphical approaches are mostly geometrical and designed to solve particular problems. The analytical and numerical methods deal, in general, with kinematic chains of any topology and translate the original geometric problem into a system of kinematic equations that defines the location of each link based, mainly, on independent loop equations. In the analytical approaches, the system of kinematic equations is reduced to a polynomial, known as the characteristic polynomial of the linkage, using different elimination methods, e.g., Gröbner bases or resultant techniques. In the numerical approaches, the system of kinematic equations is solved using, for instance, polynomial continuation or interval-based procedures. In any case, the use of independent loop equations to solve the position analysis of kinematic chains, almost a standard in the kinematics of mechanisms, has seldom been questioned, even though the resulting system of kinematic equations becomes quite involved even for simple linkages. Moreover, stating the position analysis of kinematic chains directly in terms of poses, with or without independent loop equations, introduces two major disadvantages: arbitrary reference frames have to be included, and all formulas involve translations and rotations simultaneously. This thesis departs from this standard approach: instead of directly computing Cartesian locations, it expresses the original position problem as a system of distance-based constraints that are then solved using analytical and numerical procedures adapted to their particularities. To develop the basics and theory of the proposed approach, this thesis focuses on the study of the most fundamental planar kinematic chains, namely Baranov trusses, Assur kinematic chains, and pin-jointed Grübler kinematic chains. The results obtained show that the newly developed techniques are promising tools for the position analysis of kinematic chains and related problems. For example, using these techniques, the characteristic polynomials of most of the cataloged Baranov trusses can be obtained without relying on variable eliminations or trigonometric substitutions, using no tools other than elementary algebra. This outcome is in clear contrast with the complex variable eliminations required when independent loop equations are used to tackle the problem. The impact of this result is actually greater, because it is shown that the characteristic polynomial of a Baranov truss, derived using the proposed distance-based techniques, contains all the necessary and sufficient information for solving the position analysis of all the Assur kinematic chains that result from replacing some of its revolute joints by slider joints. Thus, it is concluded that the polynomials of all fully-parallel planar robots can be derived directly from that of the widely known 3-RPR robot. In addition to these results, this thesis also presents an efficient procedure, based on distance and oriented area constraints and on geometrical arguments, to trace coupler curves of pin-jointed Grübler kinematic chains. All these techniques and results together are contributions to the theoretical kinematics of mechanisms, robot kinematics, and distance plane geometry. Barcelona 13
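The simplest instance of a position problem stated purely in terms of distances is locating a point in the plane from its distances to two known points, which yields two mirror-image assembly modes. The sketch below works that case with elementary algebra only. It is an illustrative assumption made here, not the thesis's formulation, and the function name is hypothetical.

```python
import math

def bilaterate(p1, p2, d1, d2):
    """Return the planar points lying at distance d1 from p1 and d2 from p2.
    The result holds two mirror-image solutions (coincident when the
    triangle is degenerate) or is empty when the distances are incompatible."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        raise ValueError("p1 and p2 must be distinct")
    # Signed distance from p1, along p1 -> p2, to the foot of the altitude.
    a = (d1 ** 2 - d2 ** 2 + d ** 2) / (2.0 * d)
    h2 = d1 ** 2 - a ** 2        # squared height above the p1-p2 line
    if h2 < 0.0:
        return []                # the three lengths cannot form a triangle
    h = math.sqrt(h2)
    fx, fy = p1[0] + a * dx / d, p1[1] + a * dy / d   # foot of the altitude
    nx, ny = -dy / d, dx / d                          # unit normal to p1 -> p2
    return [(fx + h * nx, fy + h * ny), (fx - h * nx, fy - h * ny)]
```

No reference frame beyond the two given points and no trigonometric substitution is needed, which is the flavor of simplification the abstract attributes to distance-based constraints.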

    De Novo Protein Structure Modeling from CryoEM Data Through a Dynamic Programming Algorithm in the Secondary Structure Topology Graph

    Proteins are the molecules that carry out vital functions, and they make up more than half of the dry weight of every cell. In nature, a protein folds into a unique and energetically favorable three-dimensional (3-D) structure that is critical to its biological function. In contrast to other methods for protein structure determination, electron cryomicroscopy (CryoEM) can produce volumetric maps of proteins that are poorly soluble, large, and hard to crystallize. Furthermore, it studies proteins in their native environment. Unfortunately, the volumetric maps produced by current CryoEM techniques are at medium resolution (~5 to 10 Å), at which it is hard to determine the atomic structure of the protein. However, the resolution of the volumetric maps is improving steadily, and recent work has obtained atomic models at higher resolutions (~3 Å). De novo protein modeling is the process of building the structure of a protein from its CryoEM volumetric map. The medium-resolution volumetric maps generated by CryoEM therefore pose a new challenge. At medium resolution, the location and orientation of secondary structure elements (SSEs) can be identified visually and computationally. However, the order and direction of the SSEs detected from the CryoEM volumetric map (called the protein topology) are not visible. To determine the protein structure, the topology of the SSEs has to be worked out before the backbone can be built. Consequently, the topology problem has become a bottleneck for protein modeling using CryoEM. In this dissertation, we focus on establishing an effective computational framework to derive the atomic structure of a protein from medium-resolution CryoEM volumetric maps. This framework includes a topology graph component, which ranks the topologies of the SSEs effectively, and a model building component. To generate a small subset of candidate topologies, the problem is translated into a layered graph representation. We developed a dynamic programming algorithm (TopoDP) for the new representation to overcome the problem of the large search space. Our approach shows improved accuracy, speed, and memory use compared with existing methods. Generating such a set was, however, infeasible using a brute-force method. Therefore, the topology graph component reduces the topological space using the geometrical features of the secondary structures through a constrained K-shortest paths method in our layered graph. The model building component involves bending helices and constructing loops using the skeleton of the volumetric map. The forward-backward CCD is applied to bend the helices and model the loops.
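To show why a layered graph lends itself to dynamic programming, the toy sketch below finds the cheapest path that visits one node per layer, keeping only the best partial path into each node instead of enumerating every combination. It is not TopoDP: the node labels, the cost function, and the absence of any constraints on which nodes may be combined are all simplifying assumptions made here.

```python
def best_layered_path(layers, cost):
    """Cheapest path that picks one node from each layer in order.

    layers: list of lists of hashable node labels.
    cost:   function (u, v) -> edge cost between nodes of consecutive layers.
    Returns a (total_cost, path) tuple.
    """
    # best[node] = (cost of the cheapest path ending at node, that path)
    best = {node: (0.0, [node]) for node in layers[0]}
    for layer in layers[1:]:
        nxt = {}
        for v in layer:
            # Extend the cheapest partial path from any node in the previous layer.
            total, path = min(
                ((best[u][0] + cost(u, v), best[u][1]) for u in best),
                key=lambda t: t[0],
            )
            nxt[v] = (total, path + [v])
        best = nxt
    return min(best.values(), key=lambda t: t[0])
```

With n nodes per layer and m layers, this examines on the order of m·n² edges, whereas brute-force enumeration would examine n^m paths.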

    Estudio de la evolución estructural en familias de proteínas y su aplicación al refinado de modelos obtenidos por homología

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Ciencias, Departamento de Biología Molecular. Date of defense: 20 October 2006. Structural refinement of protein models remains a particularly challenging problem in protein structure prediction. Most attempts to refine comparative models lead to degradation rather than improvement in model quality, so most current comparative modelling procedures omit the refinement step. However, it has been shown that even in the absence of alignment errors and using optimal templates, template-only methods have intrinsic limitations, suggesting that other methodologies must be developed if accuracy is ultimately to be improved. These difficulties are thought to originate from the delicate balance of forces in the native state, which current force fields cannot yet fully reproduce, and from the requirement to sample a large number of alternative tightly packed conformations in the search for the global minimum. Here we address this second issue. We present a new algorithm for multiple structural alignment, MAMMOTH-mult, that detects structurally conserved regions in protein families. Applying principal component and normal mode analysis to these regions allows the characterization of the most important deformations that structures undergo during evolution and of those due to their own topologies. We find that each family of homologous proteins has a characteristic pattern of structural evolution related to its own structure topology rather than to sequence details. We use this information to help solve the sampling problem. We show that this problem can be essentially solved at the backbone level by defining a small sampling subspace, of 50 dimensions at most, consisting of a combination of evolutionarily favoured directions, defined by the principal components of structural variation within a family of homologous proteins, and the topological vibrational directions derived from normal mode analyses. Most protein cores can be represented in this combined space to within 1 Å RMSD of their correct positions. We also show that Replica Exchange Monte Carlo optimizations in this subspace are very efficient at finding the global minimum neighbourhood in realistic conditions of roughness of the energy landscape. Applications of this methodology are finally discussed
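The step that distinguishes replica exchange from ordinary Monte Carlo is the swap between replicas running at different temperatures. The sketch below gives the standard acceptance probability for such a swap. It is a minimal sketch of the textbook criterion, not code from the thesis, and the function names and argument conventions (inverse temperatures `beta_i`, `beta_j`) are assumptions made here.

```python
import math
import random

def swap_probability(e_i, e_j, beta_i, beta_j):
    """Probability of exchanging the configurations of two replicas with
    energies e_i, e_j and inverse temperatures beta_i, beta_j:
    min(1, exp((beta_i - beta_j) * (e_i - e_j)))."""
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def attempt_swap(e_i, e_j, beta_i, beta_j, rng=random):
    """Return True if the swap is accepted."""
    return rng.random() < swap_probability(e_i, e_j, beta_i, beta_j)
```

A swap that hands the lower-energy configuration to the colder replica is always accepted, which is how low-energy states found at high temperature migrate down to the temperature of interest.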

    Protein structure prediction: improving and automating knowledge-based approaches

    This work presents a computational approach to improve the automatic prediction of protein structures from sequence. Its main focus was twofold. First, an automated method for guiding the modeling process was developed. This was tested and found to be state of the art in the CASP4 structure prediction contest in 2000. The second focus was the development of a novel divide-and-conquer algorithm for modeling flexible loops in proteins. The implementation of the search procedure and the subsequent ranking are presented. The results are again compared with state-of-the-art methods