91 research outputs found

    Partitioned Least Squares

    In this paper we propose a variant of the linear least squares model that allows practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result. The output allows practitioners to assess the importance of each group and of each variable within its group. We formally show that the new formulation is not convex and provide two alternative methods to deal with the problem: a non-exact method based on an alternating least squares approach, and an exact method based on a reformulation of the problem into an exponential number of sub-problems whose minimum is guaranteed to be the optimal solution. We formally show the correctness of the exact method and compare the two solutions, showing that the exact solution provides better results in a fraction of the time required by the alternating least squares solution (assuming that the number of partitions is small). For the sake of completeness, we also provide an alternative branch and bound algorithm that can be used in place of the exact method when the number of partitions is too large, and a proof of NP-completeness of the optimization problem introduced in this paper.

    Ligand binding site superposition and comparison based on Atomic Property Fields: identification of distant homologues, convergent evolution and PDB-wide clustering of binding sites

    A new binding site comparison algorithm using optimal superposition of continuous pharmacophoric property distributions is reported. The method demonstrates high sensitivity in discovering both distantly homologous and convergent binding sites. Good superposition quality is also observed on multiple examples. Using the new approach, a measure of site similarity is derived and applied to the clustering of ligand binding pockets in the PDB.

    The Interaction Properties of the Human Rab GTPase Family – A Comparative Analysis Reveals Determinants of Molecular Binding Selectivity

    Rab GTPases constitute the largest subfamily of the Ras protein superfamily. Rab proteins regulate organelle biogenesis and transport, and display distinct binding preferences for effector and activator proteins, many of which have not yet been elucidated. The underlying molecular recognition motifs, binding partner preferences and selectivities are not well understood. Comparative analysis of the amino acid sequences and the three-dimensional electrostatic and hydrophobic molecular interaction fields of 62 human Rab proteins revealed a wide range of binding properties with large differences between some Rab proteins. This analysis assists the functional annotation of Rab proteins 12, 14, 26, 37 and 41 and provides an explanation for the shared function of Rab3 and 27. Rab7a and 7b have very different electrostatic potentials, indicating that they may bind to different effector proteins and thus exert different functions. The subfamily V Rab GTPases, which are associated with endosomes, differ subtly in the interaction properties of their switch regions, and this may explain exchange factor specificity and exchange kinetics. We have analysed conservation of sequence and of molecular interaction fields to cluster and annotate the human Rab proteins. The analysis of three-dimensional molecular interaction fields provides detailed insight that is not available from a sequence-based approach alone. Based on our results, we predict novel functions for some Rab proteins and provide insights into their divergent functions and the determinants of their binding partner selectivity.

    Optimal assignment methods for ligand-based virtual screening

    Background: Ligand-based virtual screening experiments are an important task in the early drug discovery stage. An ambitious aim in each experiment is to disclose active structures based on new scaffolds. To perform this "scaffold hopping" for individual problems and targets, a plethora of similarity methods based on diverse techniques have been published in recent years. The optimal assignment approach on molecular graphs, a successful method in the field of quantitative structure-activity relationships, has not so far been tested as a ligand-based virtual screening method. Results: We evaluated two already published and two new optimal assignment methods on various data sets. To emphasize the "scaffold-hopping" ability, we used the information from chemotype clustering analyses in our evaluation metrics. Comparisons with literature results show an improved early recognition performance and comparable results over the complete data set. A new method based on two different assignment steps shows increased "scaffold-hopping" behavior together with good early recognition performance. Conclusion: The presented methods show a good combination of chemotype discovery and enrichment of active structures. Additionally, the optimal assignment on molecular graphs has the advantage that the mappings can be inspected and interpreted, allowing precise modifications of internal parameters of the similarity measure for specific targets. All methods have low computation times, which makes them applicable to screening large data sets.
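
As a rough illustration of the general idea behind an optimal assignment on molecular graphs, the sketch below pairs each atom of a smaller molecule with a distinct atom of a larger one so that the summed pairwise similarity is maximal. It is not the method evaluated in the paper: the function name `optimal_assignment`, the similarity values in `SIM`, and the brute-force search are all assumptions made for this example. A practical implementation would more likely use a polynomial-time solver such as the Hungarian algorithm.

```python
from itertools import permutations


def optimal_assignment(sim):
    """Return (best_score, mapping) for a small similarity matrix.

    sim[i][j] is the similarity between atom i of the smaller molecule
    and atom j of the larger one. mapping[i] is the column assigned to
    row i. Brute force over all injective mappings, so this is only
    suitable for tiny illustrative inputs.
    """
    n_rows = len(sim)
    n_cols = len(sim[0])
    if n_rows > n_cols:
        raise ValueError("expects no more rows than columns")

    best_score = float("-inf")
    best_map = None
    # Each permutation picks n_rows distinct columns, one per row.
    for cols in permutations(range(n_cols), n_rows):
        score = sum(sim[i][j] for i, j in enumerate(cols))
        if score > best_score:
            best_score, best_map = score, cols
    return best_score, best_map


# Hypothetical atom-pair similarities (2 atoms versus 3 atoms).
SIM = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.4],
]

score, mapping = optimal_assignment(SIM)
print(mapping, round(score, 3))
```

Because the result is an explicit atom-to-atom mapping, it can be inspected directly, which is the interpretability advantage the abstract points to.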

    The Cysteine-Rich Protein Thimet Oligopeptidase as a Model of the Structural Requirements for S-glutathiolation and Oxidative Oligomerization

    Thimet oligopeptidase (EP24.15) is a cysteine-rich metallopeptidase containing fifteen Cys residues and no intra-protein disulfide bonds. Previous work on this enzyme revealed that the oxidative oligomerization of EP24.15 is triggered by S-glutathiolation at physiological GSSG levels (10–50 µM) via a mechanism based on thiol-disulfide exchange. In the present work, our aim was to identify the EP24.15 Cys residues that are prone to S-glutathiolation, to determine which structural features in the cysteinyl bulk are responsible for the formation of mixed disulfides through the reaction with GSSG and, in this particular case, to identify the Cys residues within EP24.15 that favor either S-glutathiolation or inter-protein thiol-disulfide exchange. These studies were conducted by in silico structural analyses and simulations as well as site-specific mutation. S-glutathiolation was determined by mass spectrometric analyses and western blotting with an anti-glutathione antibody. The results indicated that the stabilization of a thiolate sulfhydryl and the solvent accessibility of the cysteines are necessary for S-thiolation. The Solvent Access Surface analysis of the Cys residues prone to glutathione modification showed that the S-glutathiolated Cys residues are located inside pockets where the sulfur atom comes into contact with the solvent and that the positively charged amino acids are directed toward these Cys residues. The simulation of a covalent glutathione docking onto the same Cys residues allowed for perfect glutathione posing. A mutation of the Arg residue 263 that forms a salt bridge to the Cys residue 175 significantly decreased the overall S-glutathiolation and oligomerization of EP24.15. The present results show for the first time the structural requirements for protein S-glutathiolation by GSSG and are consistent with our previous hypothesis that EP24.15 oligomerization is dependent on the electron transfer from specific protonated Cys residues of one molecule to previously S-glutathionylated Cys residues of another one.

    Novel Inhibitor Design for Hemagglutinin against H1N1 Influenza Virus by Core Hopping Method

    The worldwide spread of H1N1 avian influenza and the increasing reports of its resistance to current drugs have made the development of new anti-influenza drugs a high priority. Owing to its unique function in helping viruses bind to the cell surface, a key step before they penetrate the infected cell, hemagglutinin (HA) has become one of the main targets for drug design against influenza virus. To develop potent HA inhibitors, the ZINC fragment database was searched for the optimal compound using the core hopping technique. As a result, the Neo6 compound was obtained. Subsequent molecular docking studies and molecular dynamics simulations showed that Neo6 not only assumes a more favorable conformation in the binding pocket of HA but also has a stronger binding interaction with its receptor. Accordingly, Neo6 may become a promising candidate for developing new and more powerful drugs for treating influenza. At the very least, the findings reported here may provide useful insights to stimulate new strategies in this area.

    Deciphering the Arginine-Binding Preferences at the Substrate-Binding Groove of Ser/Thr Kinases by Computational Surface Mapping

    Protein kinases are key signaling enzymes that catalyze the transfer of γ-phosphate from an ATP molecule to a phospho-accepting residue in the substrate. Unraveling the molecular features that govern the preference of kinases for particular residues flanking the phosphoacceptor is important for understanding kinase specificities toward their substrates and for designing substrate-like peptidic inhibitors. We applied ANCHORSmap, a new fragment-based computational approach for mapping amino acid side chains on protein surfaces, to predict and characterize the preference of kinases toward Arginine binding. We focus on positions P−2 and P−5, commonly occupied by Arginine (Arg) in substrates of basophilic Ser/Thr kinases. The method accurately identified all the P−2/P−5 Arg binding sites previously determined by X-ray crystallography and produced Arg preferences that corresponded to those experimentally found by peptide arrays. The predicted Arg-binding positions and their associated pockets were analyzed in terms of shape, physicochemical properties, amino acid composition, and in-silico mutagenesis, providing structural rationalization for previously unexplained trends in kinase preferences toward Arg moieties. This methodology sheds light on several kinases that were described in the literature as having non-trivial preferences for Arg, and reveals some surprising departures from the prevailing views regarding residues that determine kinase specificity toward Arg. In particular, we found that the preference for a P−5 Arg is not necessarily governed by the 170/230 acidic pair, as was previously assumed, but by several different pairs of acidic residues, selected from positions 133, 169, and 230 (PKA numbering). The acidic residue at position 230 serves as a pivotal element in recognizing Arg from both the P−2 and P−5 positions.

    Bound Water at Protein-Protein Interfaces: Partners, Roles and Hydrophobic Bubbles as a Conserved Motif

    Background: There is great interest in understanding and exploiting protein-protein associations as new routes for treating human disease. However, these associations are difficult to structurally characterize or model, although the number of X-ray structures for protein-protein complexes is expanding. One feature of these complexes that has received little attention is the role of water molecules in the interfacial region. Methodology: A data set of 4741 water molecules abstracted from 179 high-resolution (≤ 2.30 Å) X-ray crystal structures of protein-protein complexes was analyzed with a suite of modeling tools based on the HINT forcefield and hydrogen-bonding geometry. A metric termed Relevance was used to classify the general roles of the water molecules. Results: The water molecules were found to be involved in: a) (bridging) interactions with both proteins (21%), b) favorable interactions with only one protein (53%), and c) no interactions with either protein (26%). This trend is shown to be independent of the crystallographic resolution. Interactions with residue backbones are consistent for all classes and account for 21.5% of all interactions. Interactions with polar residues are significantly more common for the first group and interactions with non-polar residues dominate the last group. Waters interacting with both proteins stabilize on average the proteins' interaction (−0.46 kcal mol⁻¹), but the overall average contribution of a single water to the protein-protein interaction energy is unfavorable (+0.03 kcal mol⁻¹). Analysis of the waters without favorable interactions with either protein suggests that this is a conserved phenomenon: 42% of these waters have SASA ≤ 10 Å² and are thus largely buried, and 69% of these are within predominantly hydrophobic environments or “hydrophobic bubbles”. Such water molecules may have an important biological purpose in mediating protein-protein interactions.