5,463 research outputs found

    Exploring the potential of 3D Zernike descriptors and SVM for protein–protein interface prediction

    Background: The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. Results: In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). Conclusions: The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class.

    Knowledge-based energy functions for computational studies of proteins

    This chapter discusses the theoretical framework and methods for developing knowledge-based potential functions essential for protein structure prediction, protein-protein interaction, and protein sequence design. We discuss in some detail the Miyazawa-Jernigan contact statistical potential, distance-dependent statistical potentials, as well as geometric statistical potentials. We also describe a geometric model for developing both linear and non-linear potential functions by optimization. Applications of knowledge-based potential functions in protein-decoy discrimination, in protein-protein interactions, and in protein design are then described. Several issues of knowledge-based potential functions are finally discussed. Comment: 57 pages, 6 figures. To be published in a book by Springe

    Computational Approaches to Understanding the Structure, Dynamics, Functions, and Mechanisms of Various Bacterial Proteins

    The 3D structure of a protein can be fundamentally useful for understanding protein function. In the absence of an experimentally determined structure, the most common way to obtain protein structures is to use homology modeling, or the mapping of the target sequence onto a closely related homolog with an available structure. However, despite recent efforts in structural biology, the 3D structures of many proteins remain unknown. Recent advances in genomic and metagenomic sequencing coupled with coevolution analysis and protein structure prediction have allowed for highly accurate models of proteins that were previously considered intractable to model due to the lack of suitable templates. Structural models obtained from homology modeling, coevolution-based modeling, or crystallography can then be used with other computational tools such as small molecule docking or molecular dynamics (MD) simulations to help understand protein function, dynamics, and mechanism. Here coevolution-based modeling was used to build a structural model of the HgcAB complex involved in mercury methylation (Chapter I). Based on the model it was proposed that conserved cysteines in HgcB are involved in shuttling mercury, methylmercury, or both. MD simulations and docking to a homology model of E. coli inosine monophosphate dehydrogenase (IMPDH) provided insights into how a single amino acid mutation could relieve inhibition by altering protein structure and dynamics (Chapter II). Coevolution-based structure prediction was also combined with docking and experimental activity data to generate machine learning models that predict enzyme substrate scope for a series of bacterial nitrilases (Chapter III). Machine learning was also used to identify physicochemical properties that describe outer membrane permeability and efflux in E. coli and P. aeruginosa, and new efflux pump inhibitors for the E. coli AcrAB-TolC efflux pump were identified using existing physicochemical guidelines in combination with small molecule docking to a homology model of AcrA (Chapter IV). Lastly, quantum mechanical/molecular mechanical simulations were used to study the mechanism of a key proton transfer step in Toho-1 beta-lactamase using experimentally determined structures of both the apo and cefotaxime-bound forms. These simulations revealed that substrate binding promotes catalysis by enhancing the favorability of this initial proton transfer step (Chapter V).

    The detection of meningococcal disease through identification of antimicrobial peptides using an in silico model creation

    Philosophiae Doctor - PhD. Neisseria meningitidis (the meningococcus), the causative agent of meningococcal disease (MD), was identified in 1887 and, despite effective antibiotics and partially effective vaccines, Neisseria meningitidis (N. meningitidis) is the leading cause worldwide of meningitis and rapidly fatal sepsis, usually in otherwise healthy individuals. Over 500 000 meningococcal cases occur every year. These numbers have made bacterial meningitis a top ten infectious cause of death worldwide. MD primarily affects children under 5 years of age, although in epidemic outbreaks there is a shift in disease to older children, adolescents and adults. MD is also associated with marked morbidity including limb loss, hearing loss, cognitive dysfunction, visual impairment, educational difficulties, developmental delays, motor nerve deficits, seizure disorders and behavioural problems. Antimicrobial peptides (AMPs) are molecules that provide protection against environmental pathogens, acting against a large number of microorganisms, including bacteria, fungi, yeasts and viruses. AMP production is a major component of innate immunity against infection. The chemical properties of AMPs allow them to insert into the anionic cell wall and phospholipid membranes of microorganisms or to bind to the bacteria, making them easily detectable for diagnostic purposes. AMPs can be exploited for the generation of novel antibiotics, as biomarkers in the diagnosis of inflammatory conditions, for the manipulation of the inflammatory process, wound healing, autoimmunity and in the combat of tumour cells. Due to the severity of meningitis, early detection and identification of the strain of N. meningitidis is vital. Rapid and accurate diagnosis is essential for optimal management of patients. A major problem for MD is its diagnostic difficulty, and experts conclude that with early intervention the patient's prognosis will be much improved. It is becoming increasingly difficult to confirm the diagnosis of meningococcal infection by conventional methods. Although polymerase chain reaction (PCR) has the potential advantage of providing more rapid confirmation of the presence of the bacterium than culturing, it is still time-consuming as well as costly. Introduction of AMPs to bind to N. meningitidis receptors could provide a less costly and less time-consuming solution to the current diagnostic problems. World Health Organization (WHO) meningococcal meningitis program activities encourage laboratory strengthening to ensure prompt and accurate diagnosis to rapidly confirm the presence of MD. This study aimed to identify a list of putative AMPs showing antibacterial activity against N. meningitidis to be used as ligands against receptors uniquely expressed by the bacterium, and for the identified AMPs to be used in a Lateral Flow Device (LFD) for the rapid and accurate diagnosis of MD.

    Concepts to Interfere with Protein-Protein Complex Formations: Data Analysis, Structural Evidence and Strategies for Finding Small Molecule Modulators

    (1) Analyzing protein-protein interactions at the atomic level is critical for our understanding of the principles governing the interactions involved in protein-protein recognition. For this purpose descriptors explaining the nature of different protein-protein complexes are desirable. In this work, we introduce Epic Protein Interface Classification (EPIC) as a framework handling the preparation, processing, and analysis of protein-protein complexes for classification with machine learning algorithms. We applied four different machine learning algorithms: Support Vector Machines (SVM), C4.5 Decision Trees, K Nearest Neighbors (KNN), and Naïve Bayes (NB), in combination with three feature selection methods, Filter (Relief F), Wrapper, and Genetic Algorithms (GA), to extract discriminating features from the protein-protein complexes. To compare protein-protein complexes to each other, we represented the physicochemical characteristics of their interfaces in four different ways, using two different atomic contact vectors (ACVs), DrugScore pair potential vectors (DPV) and SFCscore descriptor vectors (SDV). We classified two different datasets: (A) 172 protein-protein complexes comprising 96 monomers, forming contacts enforced by the crystallographic packing environment (crystal contacts), and 76 biologically functional homodimer complexes; (B) 345 protein-protein complexes containing 147 permanent complexes and 198 transient complexes. We were able to classify up to 94.8% of the packing enforced/functional and up to 93.6% of the permanent/transient complexes correctly. Furthermore, we were able to extract relevant features from the different protein-protein complexes and introduce an approach for scoring the importance of the extracted features.
    (2) Since protein-protein interactions play a pivotal role in communication on the molecular level in virtually every biological system and process, the search for and design of modulators of such interactions is of utmost interest. In recent years many inhibitors for specific protein-protein interactions have been developed; however, in only a few cases are small, druglike molecules able to interfere with the complex formation of proteins. On the other hand, there are several small molecules known to modulate protein-protein interactions by stabilizing an already assembled complex. To achieve this, a ligand binds to a pocket located rim-exposed at the interface of the interacting proteins; one example is the phytotoxin Fusicoccin, which stabilizes the interaction of plant H+-ATPase and 14-3-3 protein by nearly a factor of 100. To suggest alternative leads, we performed a virtual screening campaign to discover new molecules putatively stabilizing this complex. Furthermore, we screened a dataset of 198 transient recognition protein-protein complexes for cavities located rim-exposed at their interfaces. We provide evidence for high similarity between such rim-exposed cavities and the usual ligand-accommodating active sites of enzymes. This analysis suggests that rim-exposed cavities at protein-protein interfaces are druggable targets. Therefore, the principle of stabilizing protein-protein interactions seems to be a promising alternative to the competitive inhibition of such interactions by small molecules.
    (3) AffinDB is a database of affinity data for structurally resolved protein-ligand complexes from the PDB. It is freely accessible at http://www.agklebe.de/affinity. Affinity data are collected from the scientific literature, both from primary sources describing the original experimental work of affinity determination and from secondary references which report affinity values determined by others. AffinDB currently contains over 730 affinity entries covering more than 450 different protein-ligand complexes. Besides the affinity value, PDB summary information and additional data are provided, including the experimental conditions of the affinity measurement (if available in the corresponding reference); 2D drawing, SMILES code, and molecular weight of the ligand; links to other databases; and bibliographic information. AffinDB can be queried by PDB code or by any combination of affinity range, temperature and pH-value of the measurement, ligand molecular weight, and publication data (author, journal, year). Search results can be saved as tabular reports in text files. The database is intended to be a valuable resource for researchers interested in biomolecular recognition and the development of tools for correlating structural data with affinities, as needed, for example, in structure-based drug design.