14 research outputs found

    Reconstructing protein structure from solvent exposure using tabu search

    BACKGROUND: A new, promising solvent exposure measure, called half-sphere exposure (HSE), has recently been proposed. Here, we study the reconstruction of a protein's Cα trace solely from structure-derived HSE information. This problem is relevant for de novo structure prediction using predicted HSE measures. For comparison, we also consider the well-established contact number (CN) measure. We define energy functions based on the HSE or CN vectors and minimize them using two conformational search heuristics: Monte Carlo simulation (MCS) and tabu search (TS). While MCS has been the dominant conformational search heuristic in the literature, TS has been applied only a few times. To discretize the conformational space, we use lattice models of varying complexity. RESULTS: The proposed TS heuristic with a novel tabu definition generally performs better than MCS for this problem. Our experiments show that, at least for small proteins (up to 35 amino acids), it is possible to reconstruct the protein backbone solely from the HSE or CN information. In general, the HSE measure leads to better models than the CN measure, as judged by the RMSD and the angle correlation with the native structure. The angle correlation, a measure of structural similarity, evaluates whether equivalent residues in two structures have the same general orientation. Our results indicate that the HSE measure is potentially very useful for representing solvent exposure in protein structure prediction, design and simulation.
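
To make the search strategy concrete, the following is a minimal illustrative sketch (not the paper's implementation) of tabu search on a 2D square-lattice Cα chain: the energy is a least-squares fit of the model's contact-number vector to a target vector, the move set consists of pivot rotations, and the tabu rule forbids recently applied moves. All function names and parameter values are assumptions chosen for brevity.

import random

# Illustrative sketch: reconstruct a 2D square-lattice self-avoiding C-alpha trace whose
# contact-number (CN) vector matches a target vector, by tabu search over pivot moves.
# Energy = sum_i (CN_model(i) - CN_target(i))^2.

ROTATIONS = [lambda x, y: (y, -x), lambda x, y: (-y, x), lambda x, y: (-x, -y)]  # +90, -90, 180 degrees

def contact_numbers(coords, cutoff2=2):
    """CN of residue i = number of residues (excluding chain neighbours) within sqrt(cutoff2)."""
    cn = []
    for i, (xi, yi) in enumerate(coords):
        c = 0
        for j, (xj, yj) in enumerate(coords):
            if abs(i - j) > 1 and (xi - xj) ** 2 + (yi - yj) ** 2 <= cutoff2:
                c += 1
        cn.append(c)
    return cn

def energy(coords, target_cn):
    return sum((a - b) ** 2 for a, b in zip(contact_numbers(coords), target_cn))

def pivot(coords, k, rot):
    """Rotate the chain tail coords[k+1:] around residue k; return None if self-intersecting."""
    px, py = coords[k]
    new = list(coords[:k + 1])
    for x, y in coords[k + 1:]:
        dx, dy = rot(x - px, y - py)
        new.append((px + dx, py + dy))
    return new if len(set(new)) == len(new) else None

def tabu_search(target_cn, n_iter=5000, tenure=20):
    n = len(target_cn)
    coords = [(i, 0) for i in range(n)]            # extended starting conformation
    best, best_e = coords, energy(coords, target_cn)
    tabu = {}                                      # move -> iteration until which it is tabu
    for it in range(n_iter):
        candidates = []
        for k in range(1, n - 1):
            for r, rot in enumerate(ROTATIONS):
                if tabu.get((k, r), -1) >= it:
                    continue                       # skip tabu moves (no aspiration criterion here)
                trial = pivot(coords, k, rot)
                if trial is not None:
                    candidates.append((energy(trial, target_cn), (k, r), trial))
        if not candidates:
            break
        e, move, coords = min(candidates)          # best admissible neighbour, even if worse
        tabu[move] = it + tenure                   # forbid repeating this move for `tenure` steps
        if e < best_e:
            best, best_e = coords, e
    return best, best_e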

    Soft Computing Techniques for the Protein Folding Problem on High Performance Computing Architectures

    The protein folding problem has been extensively studied over the last fifty years. Understanding the dynamics of a protein's global shape and its influence on biological function can help us discover new and more effective drugs for diseases of pharmacological relevance. Different computational approaches have been developed to predict the three-dimensional arrangement of a protein's atoms from its sequence. However, the computational complexity of this problem makes the search for new models, novel algorithmic strategies and hardware platforms that provide solutions in a reasonable time frame mandatory. In this review we present past and current trends in protein folding simulation from both perspectives, hardware and software. Of particular interest to us are the use of inexact solutions to this computationally hard problem and the hardware platforms that have been used to run this kind of Soft Computing technique. This work is jointly supported by the Fundación Séneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants 15290/PI/2010 and 18946/JLI/13, by the Spanish MEC and European Commission FEDER under grants TEC2012-37945-C02-02 and TIN2012-31345, and by the Nils Coordinated Mobility programme under grant 012-ABEL-CM-2014A, in part financed by the European Regional Development Fund (ERDF). We also thank NVIDIA for hardware donation within the UCAM GPU educational and research centers.

    Algorithms for Protein Structure Prediction


    Lower-energy conformers search of TPP-1 polypeptide via hybrid particle swarm optimization and genetic algorithm

    Low-energy conformation search for biological macromolecules remains a challenge in biochemical experiments and theoretical studies. Efficient approaches to minimizing the energy of peptide structures are critically needed for researchers studying peptide-protein interactions or designing peptide drugs. In this study, we aim to develop a heuristic-based algorithm to efficiently minimize a promising PD-L1-inhibiting polypeptide, TPP-1, and to build its low-energy conformer pool to advance subsequent structure optimization and molecular docking studies. We find that, using backbone dihedral angles as the decision variables, both particle swarm optimization (PSO) and the genetic algorithm (GA) outperform other existing heuristic approaches in optimizing the structure of Met-enkephalin, a benchmark pentapeptide for evaluating the efficiency of conformation optimizers. Using the established algorithm pipeline, hybridizing PSO and GA minimized the TPP-1 structure efficiently, and a low-energy pool was built at an acceptable computational cost (a couple of days on a single laptop). Remarkably, the efficiency of the hybrid PSO-GA is hundreds of times higher than that of conventional molecular dynamics simulations running under the force field. The stereochemical quality of the minimized structures was validated using Ramachandran plots. In summary, the hybrid PSO-GA minimizes the TPP-1 structure efficiently and yields a low-energy conformer pool within a reasonably short time. Overall, our approach can be extended to biochemical research to speed up peptide conformation determination and hence facilitate peptide-involved drug development.
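
As an illustration of the hybrid scheme described above (a sketch only, not the authors' pipeline), the following Python fragment alternates PSO velocity updates with GA crossover and mutation over a vector of backbone dihedral angles. The energy function is a placeholder standing in for a force-field evaluation of the peptide conformation, and all parameter values are assumptions.

import math
import random

def energy(angles):
    # Placeholder objective; in practice this would evaluate the peptide conformation
    # defined by the dihedral angles with a molecular-mechanics force field.
    return sum(1.0 - math.cos(a) + 0.1 * math.cos(3 * a) for a in angles)

def pso_ga(dim, n_particles=40, n_iter=200, w=0.7, c1=1.5, c2=1.5,
           crossover_rate=0.3, mutation_rate=0.05):
    pos = [[random.uniform(-math.pi, math.pi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_e = [energy(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_e[i])
    gbest, gbest_e = pbest[g][:], pbest_e[g]

    for _ in range(n_iter):
        # PSO step: move each particle towards its personal and the global best.
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * random.random() * (pbest[i][d] - pos[i][d])
                             + c2 * random.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = max(-math.pi, min(math.pi, pos[i][d] + vel[i][d]))

        # GA step: recombine pairs of personal bests and mutate; keep the child if it improves.
        for i in range(0, n_particles - 1, 2):
            if random.random() < crossover_rate:
                cut = random.randrange(1, dim) if dim > 1 else 0
                child = pbest[i][:cut] + pbest[i + 1][cut:]
                child = [a + random.gauss(0, 0.3) if random.random() < mutation_rate else a
                         for a in child]
                if energy(child) < energy(pos[i]):
                    pos[i] = child

        # Update personal and global bests.
        for i in range(n_particles):
            e = energy(pos[i])
            if e < pbest_e[i]:
                pbest[i], pbest_e[i] = pos[i][:], e
                if e < gbest_e:
                    gbest, gbest_e = pos[i][:], e
    return gbest, gbest_e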

    Optimization of Bio-Inspired Algorithms on Heterogeneous CPU-GPU Systems

    The scientific challenges of the 21st century require the processing and analysis of enormous amounts of information in what is known as the Big Data era. Future advances in different sectors of society such as medicine, engineering or efficient energy production, to mention only a few examples, depend on the continuous growth of the computational power of modern computers. However, this computational growth, traditionally guided by the well-known "Moore's Law", has been compromised in recent decades, mainly due to the physical limitations of silicon. Computer architects have developed numerous contributions (multicore, manycore, heterogeneity, dark silicon, etc.) to try to mitigate this computational slowdown, relegating to the background other factors fundamental to problem solving, such as programmability, reliability and precision. Software development, however, has followed the opposite path, where ease of programming through abstraction models, automated code debugging to avoid undesired effects, and fast deployment to production are key to the economic viability and efficiency of the digital business sector. This approach often compromises the performance of the applications themselves, a consequence that is entirely unacceptable in a scientific context. The starting hypothesis of this doctoral thesis is to reduce the distance between the hardware and software fields in order to contribute to solving the scientific challenges of the 21st century. Hardware development is marked by the consolidation of processors oriented towards massive data parallelism, mainly GPUs (Graphics Processing Units) and vector processors, which are combined to build heterogeneous processors or computers (HSA). Specifically, we focus on the use of GPUs to accelerate scientific applications. GPUs have established themselves as one of the platforms with the greatest potential for implementing algorithms that simulate complex scientific problems. Since their inception, the trajectory and history of graphics cards have been shaped by the world of video games, reaching very high levels of popularity as ever more realism was achieved in that area. An important milestone occurred in 2006, when NVIDIA (the leading manufacturer of graphics cards) carved out a place in high-performance computing and research with the development of CUDA (Compute Unified Device Architecture). This architecture makes it possible to use the GPU for the development of scientific applications in a versatile way. Despite the importance of the GPU, significant improvement can be obtained by using it together with the CPU, which leads us to introduce heterogeneous systems, as the title of this work indicates. It is in heterogeneous CPU-GPU environments where performance reaches its maximum, since not only do GPUs support researchers' scientific computing, but it is in a heterogeneous system combining different types of processors where the highest performance can be achieved. In such an environment the processors do not compete with one another; on the contrary, each architecture specializes in the part where it can best exploit its capabilities.
The highest performance is reached in heterogeneous clusters, where multiple nodes are interconnected and may differ not only in their CPU-GPU architectures but also in the computational capabilities within those architectures. With this type of scenario in mind, new challenges arise in making the chosen software run as efficiently as possible and obtain the best possible results. These new platforms require a redesign of the software in order to make the most of the available computational resources. Existing algorithms must therefore be redesigned and optimized so that contributions in this field remain relevant, and we must find algorithms that, by their very nature, are candidates for optimal execution on such high-performance platforms. At this point we find a family of algorithms known as bio-inspired algorithms, which use collective intelligence as the core of their problem solving. Precisely this collective intelligence makes them perfect candidates for implementation on these platforms under the new paradigm of parallel computing, since solutions can be built from individuals that, through some form of communication, are able to jointly construct a common solution. This thesis focuses especially on one of these bio-inspired algorithms, which falls under the term metaheuristics within the Soft Computing paradigm: Ant Colony Optimization (ACO). The algorithm is contextualized, studied and analysed. Its most critical parts are identified and redesigned in search of optimization and parallelization, while maintaining or improving the quality of its solutions. The possible alternatives are then implemented and tested on various high-performance platforms. The knowledge acquired in the preceding theoretical and practical study is applied to real cases; more specifically, its application to protein folding is shown. All of this analysis is carried over to a concrete case. In this work, we bring together new high-performance hardware platforms with the software redesign and implementation of a bio-inspired algorithm applied to a scientific problem of great complexity, namely protein folding. When implementing a solution to a real problem, a preliminary study is needed to allow an in-depth understanding of the problem, since any newcomer to the field will encounter new terminology and issues; in this case, amino acids, molecules and simulation models that are unfamiliar to readers without a biomedical background.
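
The ACO scheme discussed in the abstract can be illustrated with a minimal, CPU-only toy example (this is not the thesis implementation, and the GPU parallelization is omitted): each ant grows a self-avoiding walk on a 2D HP lattice, choosing relative moves guided by per-position pheromone values that are reinforced by the best folds found so far.

import random

# Toy ACO for 2D HP-lattice folding (illustrative only). Each ant builds a self-avoiding
# walk by choosing relative moves (Left, Forward, Right); pheromone biases the choices and
# is reinforced by the best fold, scored as the number of non-bonded H-H contacts.

MOVES = {"L": lambda d: (-d[1], d[0]), "F": lambda d: d, "R": lambda d: (d[1], -d[0])}

def build_fold(sequence, pheromone):
    """One ant: grow a self-avoiding walk, biased by pheromone[i][move]."""
    coords, d, moves = [(0, 0), (1, 0)], (1, 0), []
    for i in range(2, len(sequence)):
        options = []
        for m, turn in MOVES.items():
            nd = turn(d)
            nxt = (coords[-1][0] + nd[0], coords[-1][1] + nd[1])
            if nxt not in coords:
                options.append((m, nd, nxt, pheromone[i][m]))
        if not options:
            return None, None                       # dead end; discard this ant
        r, acc = random.random() * sum(o[3] for o in options), 0.0
        for m, nd, nxt, tau in options:
            acc += tau
            if r <= acc:
                coords.append(nxt)
                d, last_move = nd, (i, m)
                moves.append(last_move)
                break
    return coords, moves

def hh_contacts(sequence, coords):
    """Score = number of non-bonded H-H lattice contacts (higher is better)."""
    return sum(1 for i in range(len(coords)) for j in range(i + 2, len(coords))
               if sequence[i] == "H" and sequence[j] == "H"
               and abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1]) == 1)

def aco_fold(sequence, n_ants=50, n_iter=100, evaporation=0.1):
    pheromone = [{m: 1.0 for m in MOVES} for _ in range(len(sequence))]
    best_coords, best_moves, best_score = None, None, -1
    for _ in range(n_iter):
        for _ in range(n_ants):
            coords, moves = build_fold(sequence, pheromone)
            if coords is None:
                continue
            score = hh_contacts(sequence, coords)
            if score > best_score:
                best_coords, best_moves, best_score = coords, moves, score
        for row in pheromone:                       # evaporation
            for m in row:
                row[m] *= (1.0 - evaporation)
        if best_moves is not None:
            for i, m in best_moves:                 # reinforce the best fold found so far
                pheromone[i][m] += best_score
    return best_coords, best_score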

    Novel Strategies for Model-Building of G Protein-Coupled Receptors

    The G protein-coupled receptors still constitute the most densely populated protein family encompassing numerous disease-relevant drug targets. Consequently, medicinal chemistry is expected to pursue targets from this protein family, for which hits need to be generated and subsequently optimized towards viable clinical candidates for a variety of therapeutic areas. For the purpose of rationalizing structure-activity relationships within such optimization programs, structural information derived from the ligand's as well as the macromolecule's perspective is essential. While it is relatively straightforward to define pharmacophore hypotheses based on comparative modelling of structurally and biologically characterized low-molecular-weight ligands, a deeper understanding of the underlying molecular recognition event remains challenging, since the amount of experimentally derived structural data available on GPCRs is extremely scarce compared to, e.g., soluble enzymes. In this context, the protein modelling methodologies introduced, developed, optimized, and applied in this thesis provide structural models that can assist in the development of structural hypotheses on ligand-receptor complexes. As such, they provide a valuable structural framework not only for a more detailed insight into ligand-GPCR interaction, but also for guiding the design process towards next-generation compounds which should display enhanced affinity. The model-building procedure developed in this thesis follows a hierarchical approach, sequentially generating a 1D topology, followed by a 2D topology that is finally converted into a 3D topology. The determination of a 1D topology is based on a compartmentalization of the linear amino acid sequence of a GPCR of interest into extracellular, intracellular, and transmembrane sequence stretches. Chapter 3 of this study elaborates on the strengths and weaknesses of applying automated prediction tools for identifying the transmembrane sequence domains. From a derived 1D topology, an in-plane projection structure for the seven transmembrane helices can be obtained with the aid of calculated vectorial property moments, yielding the 2D topology. Thorough bioinformatics studies revealed that only a consensus approach, based on a conceptual combination of different methods employing a carefully chosen selection of parameter sets, gave reliable results, emphasizing the danger of fully automating a GPCR modelling procedure. Chapter 4 describes a procedure to further expand the 2D topological findings into 3D space, exemplified on the human CCK-B receptor protein. This particular GPCR was chosen as the receptor of interest, since an enormous experimentally derived and structurally relevant data set was available. Within the computational refinement procedure of constructed GPCR models, major emphasis was laid on the explicit treatment of a non-isotropic solvent environment during molecular mechanics calculations (i.e. energy minimization and molecular dynamics simulations). The majority of simulations was therefore carried out in a tri-phasic solvent box accounting for a central lipid environment, flanked by two aqueous compartments, mimicking the extracellular and cytoplasmic space. Chapter 5 introduces the reference compound set, comprising low-molecular-weight compounds modulating CCK receptors, that was used to validate the generated models of the receptor protein.
Chapter 6 describes how the generated model of the CCK-B receptor was subjected to intensive docking studies employing the compound series introduced in chapter 5. It turned out that, by applying the DRAGHOME methodology, viable structural hypotheses on putative receptor-ligand complexes could be generated. Based on the methodology pursued in this thesis, a detailed model of the receptor binding site could be devised that accounts, in a qualitative manner, for known structure-activity relationships as well as for results obtained by site-directed mutagenesis studies. The overall study presented in this thesis primarily aims to deliver a feasibility study on generating model structures of GPCRs by a conceptual combination of tailor-made bioinformatics techniques with the toolbox of protein modelling, exemplified on the human CCK-B receptor. The generated structures should be envisioned as models only, not necessarily providing a detailed image of reality. However, consistent models, when verified and refined against experimental data, deliver an extremely useful structural platform on which different scientific disciplines such as medicinal chemistry, molecular biology, and biophysics can effectively communicate.

    The development of sialidase inhibitors using structure-based drug design

    The sialidases/neuraminidases are a family of enzymes whose function is important in the pathogenicity of bacteria and the virulence of influenza. Relenza and Tamiflu are two drugs that were developed using structure-based drug design (SBDD) and computer-aided drug design (CADD). These drugs target the active site of the influenza A and B neuraminidases (GH-34 family). Sialidases in the GH-33 family could represent novel drug targets for the treatment of bacterial or parasitic infection. SBDD was employed to develop chemical tools for two GH-33 sialidases, NanB and TcTS. NanB is a potential drug target in S. pneumoniae. The chemical tool developed for NanB follows on from work within the Taylor and Westwood research groups, in which a molecule of CHES and a glycerol were found serendipitously bound within a water channel at an allosteric site. Using this information as a basis for SBDD, an allosteric inhibitor of NanB, Optactin, was developed. Within this work, the synthesis of this inhibitor was achieved and optimised. Optactin was then modified to improve potency. This proceeded through an amide analogue and the addition of an arene, resulting in a mid-micromolar inhibitor (IC₅₀: 55.4 ± 2.5 µM). The addition of polar substituents improved potency further, resulting in a low-micromolar inhibitor of NanB, Optactamide (IC₅₀: 3.0 ± 1.7 µM). Application of this tool in vitro demonstrated that NanB and NanA have a role in the invasion of lung epithelial cells by S. pneumoniae. TcTS is a potential drug target for the treatment of Chagas disease. A CADD approach using a fragment library was unsuccessful at identifying an allosteric inhibitor of TcTS, despite its structural similarity with NanB. Retargeting the CADD approach towards the active site was successful in identifying an inhibitor of TcTS and a fragment useful for further development. This work lays the groundwork for the development of a chemical tool targeting TcTS.