628 research outputs found

    TMCrys: predict propensity of success for transmembrane protein crystallization

    Motivation: Transmembrane proteins (TMPs) are crucial to the life of cells. Because of their special properties, their structures are hard to determine: TMPs make up only 2% of the PDB, although they are predicted to account for up to 25% of the human proteome. Crystallization prediction methods were developed to aid target selection for structure determination; however, there is a need for a TMP-specific service. Results: Here, we present TMCrys, a crystallization prediction method that surpasses existing prediction methods in performance thanks to its specialization for TMPs. We expect TMCrys to improve target selection of TMPs.

    QCSPScore: a new scoring function for driving protein-ligand docking with quantitative chemical shifts perturbations

    Information about the structure of the biological target can improve the optimization of potential drugs. In this work I have developed a procedure that uses the quantitative chemical shift perturbations (CSP) observed in the protein in NMR experiments to drive protein-ligand docking. The approach is based on a hybrid scoring function (QCSPScore) that combines traditional DrugScore potentials, which describe the interaction between protein and ligand, with Kendall's rank correlation coefficient, which evaluates docking poses in terms of their agreement with experimental CSP. The CSP for a specific ligand pose are predicted efficiently with an empirical model that takes only ring current effects into account. QCSPScore has been implemented in the AutoDock software package. Compared with previous methods, this approach shows that using the rank correlation coefficient is more robust to outliers in the predicted CSP. In addition, the prediction of native-like complex geometries improves because the CSP are used during the docking process itself, and not only as a post-filter for generated docking poses. Because the experimental information is used quantitatively, the CSP are guaranteed to contribute effectively to orienting the ligand in the binding pocket. The first step in the development of QCSPScore was the analysis of 70 protein-ligand complexes for which reference CSP were computed. The docking success rate increased from 71% without CSP to 100% when CSP were included at the highest weighting. The global optimization on the combined docking energy hypersurface is therefore successful. In a second step, QCSPScore was used to re-dock three test cases for which experimental reference CSP were available. Without CSP, i.e. using conventional DrugScore potentials only, none of the three test cases could be re-docked successfully. Including CSP with the same high weighting factor as above led to successful re-docking in all three cases. For two of the three complexes, native-like solutions were produced only if CSP were considered. Conformational changes in the binding pocket of up to 2 Å RMSD did not affect the success of the docking. I am convinced that QCSPScore will be particularly interesting for protein-ligand complexes for which predicting native-like geometries has so far been difficult. These are in particular cases in which the shape of the binding pocket does not provide sufficient steric restraints, such as flat protein-protein interfaces and the virtual screening of small chemical fragments.
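    The hybrid scoring idea in the QCSPScore abstract above can be illustrated with a minimal sketch. The abstract does not give the functional form or the weighting scheme, so the linear combination, the `weight` parameter, and the function names `kendall_tau` and `hybrid_score` are assumptions made here for illustration only; the interaction energy is a plain number standing in for a DrugScore value, and this is not the AutoDock implementation.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of the product tells whether the pair is ordered the
        # same way in both sequences; ties contribute nothing.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def hybrid_score(interaction_energy, predicted_csp, experimental_csp, weight=1.0):
    """Hypothetical hybrid score (lower is better).

    Assumed form: the interaction energy of a pose is lowered by a
    weighted reward for rank agreement between predicted and
    experimental CSP. The real QCSPScore may combine the terms differently.
    """
    tau = kendall_tau(predicted_csp, experimental_csp)
    return interaction_energy - weight * tau


# A single extreme predicted value does not change the rank agreement,
# which is the robustness to outliers the abstract refers to.
print(kendall_tau([1, 2, 3, 100], [1, 2, 3, 4]))  # 1.0
```

    Because only the ordering of the CSP values enters the score, a pose is rewarded for reproducing which residues are perturbed most, even when the empirical model misjudges the magnitudes.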

    Probabilistic Protein Design, Comparative Modeling, and the Structure of a Multidomain P53 Oligomer Bound to DNA

    Proteins are the main functional components of all cellular processes, and most of them fold into unique three-dimensional shapes guided by their amino-acid sequence. Discovering the structure of a protein, or protein complexes, can provide important clues about how they perform their function. However, the chemical, physical or architectural properties of many proteins impede traditional approaches to structure determination. Two such proteins, the tumor suppressor p53 and the cholesterol processing enzyme endothelial lipase, are prime examples of problematic proteins that defy structural investigation via crystallographic methods. Therefore, new techniques must be developed to gain valuable structural insights, such as: computationally assisted protein design strategies, more efficient crystal screening, or a combination of both. We applied a statistical computationally assisted design strategy to stabilize a p53 variant consisting of two independently folding domains. The re-engineered variant retained normal DNA-binding activities, and allowed us to experimentally determine the first structure of a physiologically active multi-domain p53 tetramer bound to a full-length DNA response element. We then demonstrated how computational methodology can be used to gain functional detail of proteins in the absence of experimentally determined structures. By creating comparative models of endothelial lipase, we discovered structural features that describe function and regulation, and gained a better understanding of the mechanisms conferring substrate specificity. Additionally, traditional methods for protein structure determination, such as X-ray crystallography, require relatively large amounts of purified sample in order to screen a sufficient variety of conditions. To improve this process, we developed a novel method for protein crystal screening using a microfluidics platform. 
We show how it is possible to use smaller quantities of protein to screen larger varieties of conditions, in turn increasing the probability of success in obtaining crystals. Furthermore, in contrast to current crystallographic approaches, all steps from screening to crystal growth to data collection were performed within the same reaction chamber, without any manipulation of the crystal, dramatically increasing the efficiency of both time and sample required to realize the structure. Collectively, these results demonstrate how advances in computational and experimental approaches can provide structural detail for proteins in circumstances where traditional methodology fails

    CryoProtect: A Web Server for Classifying Antifreeze Proteins from Nonantifreeze Proteins


    Automated Detection of Anomalous Patterns in Validation Scores for Protein X-Ray Structure Models

    Structural bioinformatics is a subdomain of data mining focused on identifying structural patterns relevant to functional attributes in repositories of biological macromolecular structure models. This research focused on structures determined via x-ray crystallography and deposited in the Protein Data Bank (PDB). Protein structures deposited in the PDB are products of experimental processes, and only approximately model physical reality. Structural biologists address accuracy and precision concerns via community-enforced consensus standards of accepted practice for proper building, refinement, and validation of models. Validation scores are quantitative partial indicators of the likelihood that a model contains serious systematic errors. The PDB recently convened a panel of experts, which placed renewed emphasis on troubling anomalies among deposited structure models. This study set out to detect such anomalies. I hypothesized that community consensus standards would be evident in patterns of validation scores, and deviations from those standards would appear as unusual combinations of validation scores. Validation attributes were extracted from PDB entry headers and multiple software tools (e.g., WhatCheck, SFCheck, and MolProbity). Independent component analysis (ICA) was used for attribute transformation to increase contrast between inliers and outliers. Unusual patterns were sought in regions of locally low density in the space of validation score profiles, using a novel standardization of Local Outlier Factor (LOF) scores. Validation score profiles associated with the most extreme outlier scores were demonstrably anomalous according to domain theory. Among these were documented fabrications, possible annotation errors, and complications in the underlying experimental data. Analysis of deep inliers revealed promising support for the hypothesized link between consensus standard practices and common validation score values. 
Unfortunately, with numerical anomaly detection methods that operate simultaneously on numerous continuous-valued attributes, it is often quite difficult to know why a case gets a particular outlier score. Therefore, I hypothesized that IF-THEN rules could be used to post-process outlier scores to make them comprehensible and explainable. Inductive rule extraction was performed using RIPPER. Results were mixed, but they represent a promising proof of concept. The methods explored are general and applicable beyond this problem. Indeed, they could be used to detect structural anomalies using physical attributes
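    The density-based outlier search described in the abstract above can be sketched with a plain Local Outlier Factor (LOF) computation. This is only the textbook LOF on toy two-dimensional points; the ICA transformation and the novel standardization of LOF scores mentioned in the abstract are not reproduced, and the function name `local_outlier_factor` and the sample points are assumptions for illustration.

```python
import math


def local_outlier_factor(points, k=2):
    """Textbook LOF for a small list of coordinate tuples.

    Scores near 1 indicate a point whose local density matches that of
    its neighbours; scores well above 1 indicate a locally sparse point.
    Assumes fewer than k + 1 exact duplicates of any point (otherwise a
    reachability sum of zero would cause a division by zero).
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])               # k nearest neighbours of i
        k_distance.append(dist[i][order[k - 1]])  # distance to the k-th one

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Four points on a unit square plus one distant point (toy data).
scores = local_outlier_factor([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)], k=2)
print(scores)
```

    In this toy set the four square corners have equal local density, while the distant point sits in a region of locally low density and receives the largest score.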

    Biological Systems Workbook: Data modelling and simulations at molecular level

    Experiments and theoretical simulations across the different fields of biology now produce huge quantities of data, and the results are often stored in biological databases that grow rapidly every year. There is therefore increasing research interest in applying mathematical and physical models that can produce reliable predictions and explanations to understand and rationalize that information. These investigations help to answer biological questions and advance the solution of problems faced by our society. In this Biological Systems Workbook, we aim to introduce the basic building blocks that allow life to take place, from the 3D structural point of view. We start by learning how to look at the 3D structure of molecules, studying small organic molecules used as drugs, and we learn some methods that help us generate models of these structures. We then move to more complex natural organic molecules such as lipids and carbohydrates, learning how to estimate and reproduce their dynamics. Later, we review the structure of more complex macromolecules such as proteins and DNA. Throughout, we refer to different computational tools and databases that help us search, analyze and model the molecular systems studied in this course.