
1,620 research outputs found

Structural descriptor database: a new tool for sequence-based functional site prediction

Author: Bernardes Juliana S
Fernandez Jorge H
Vasconcelos Ana Tereza R
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

A structural classification of protein-protein interactions for detection of convergently evolved motifs and for prediction of protein binding sites on sequence level

Author: Henschel Andreas
Publication venue: Technische Universität Dresden
Publication date: 17/10/2008

BACKGROUND: A long-standing challenge in the post-genomic era of Bioinformatics is the prediction of protein-protein interactions, and ultimately the prediction of protein functions. The problem is intrinsically harder when only amino acid sequences are available, but a solution is more universally applicable. So far, the problem of uncovering protein-protein interactions has been addressed in a variety of ways, both experimentally and computationally.

MOTIVATION: The central problem is: How can protein complexes with solved three-dimensional structure be utilized to identify and classify protein binding sites, and how can knowledge be inferred from this classification such that protein interactions can be predicted for proteins without solved structure? The underlying hypothesis is that protein binding sites are often restricted to a small number of residues, which additionally are often well conserved in order to maintain an interaction. Therefore, the signal-to-noise ratio in binding sites is expected to be higher than in other parts of the surface. This enables binding site detection in unknown proteins when homology-based annotation transfer fails.

APPROACH: The problem is addressed by first investigating how geometrical aspects of domain-domain associations can lead to a rigorous structural classification of the multitude of protein interface types. The interface types are explored with respect to two aspects: First, how do interface types with one-sided homology reveal convergently evolved motifs? Second, how can sequential descriptors for local structural features be derived from the interface type classification? Then, the use of sequential representations for binding sites in order to predict protein interactions is investigated. The underlying algorithms are based on machine learning techniques, in particular Hidden Markov Models.

RESULTS: This work includes a novel approach to a comprehensive geometrical classification of domain interfaces. Alternative structural domain associations are found for 40% of all family-family interactions. Evaluation of the classification algorithm on a hand-curated set of interfaces yielded a precision of 83% and a recall of 95%. For the first time, a systematic screen of convergently evolved motifs in 102,000 protein-protein interactions with structural information is derived. With respect to this dataset, all cases related to viral mimicry of human interface bindings are identified. Finally, a library of 740 motif descriptors for binding site recognition, encoded as Hidden Markov Models, is generated and cross-validated. Tests for the significance of motifs are provided. The usefulness of descriptors for protein-ligand binding sites is demonstrated for the case of "ATP-binding", where a precision of 89% is achieved, thus outperforming comparable motifs from PROSITE. In particular, a novel descriptor for a P-loop variant has been used to identify ATP-binding sites in 60 protein sequences that have not been annotated before by existing motif databases.
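
For readers unfamiliar with the two evaluation figures quoted in this abstract, the following minimal Python sketch shows how precision and recall are computed from classification counts. The counts used here are invented purely so that the arithmetic lands near the reported 83% and 95%; they are not taken from the thesis, and the function name `precision_recall` is hypothetical.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented counts (an assumption for illustration, not data from the thesis):
# 95 interfaces classified correctly, 19 wrongly accepted, 5 missed.
p, r = precision_recall(tp=95, fp=19, fn=5)
print(f"precision={p:.2f} recall={r:.2f}")
```

High recall with lower precision, as in the abstract, means the classifier misses few true interfaces but accepts some false ones.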

Technische Universität Dresden: Qucosa

PocketMatch: A new algorithm to compare binding sites in protein structures

Author: A Argyrou
A Stark
AG Murzin
AT Laurie
B Huang
CA Orengo
F Glaser
G Dodson
G Ramachandraiah
GJ Kleywegt
HM Berman
JA Barker
JH Choi
Kalidas Yeturu
L Holm
M Jambon
Nagasuma Chandra
ND Gold
ND Gold
R Russell
R Wang
RA Laskowski
RJ Morris
S Schmitt
T Prasad
Y Kalidas
Publication date: 01/01/2008

Background: Recognizing similarities and deriving relationships among protein molecules is a fundamental requirement in present-day biology. Similarities can be present at various levels, which can be detected through comparison of protein sequences or their structural folds. In some cases, similarities that are obscure at these levels could be present merely in the substructures at their binding sites. Inferring functional similarities between protein molecules by comparing their binding sites is still largely exploratory and not as yet a routine protocol. One of the main reasons for this is the limited choice of appropriate analytical tools that can compare binding sites with high sensitivity. To benefit from the enormous amount of structural data that is being rapidly accumulated, it is essential to have high-throughput tools that enable large-scale binding site comparison.

Results: Here we present a new algorithm, PocketMatch, for comparison of binding sites in a frame-invariant manner. Each binding site is represented by 90 lists of sorted distances capturing the shape and chemical nature of the site. The sorted arrays are then aligned using an incremental alignment method and scored to obtain PMScores for pairs of sites. A comprehensive sensitivity analysis and an extensive validation of the algorithm have been carried out. Perturbation studies, in which the geometry of a given site was retained but the residue types were changed randomly, indicated that chance similarities were virtually non-existent. Our analysis also demonstrates that shape information alone is insufficient to discriminate between diverse binding sites unless combined with the chemical nature of the amino acids.

Conclusions: A new algorithm has been developed to compare binding sites in an accurate, efficient and high-throughput manner. Though the representation used is conceptually simplistic, we demonstrate that, along with the new alignment strategy used, it is sufficient to enable binding site comparison with high sensitivity. Novel methodology has also been presented for validating the algorithm for accuracy and sensitivity with respect to the geometry and chemical nature of the site. The method is also fast and takes about 1/250th of a second for one comparison on a single processor. A parallel version on BlueGene has also been implemented.
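
The idea of comparing binding sites through sorted distance lists can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published PocketMatch implementation: the two-pointer walk, the 0.5 tolerance, the normalisation by the larger site, the list labels, and the names `count_matches` and `similarity` are all hypothetical choices made for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical label for one distance list, e.g. a pair of residue groups.
Key = Tuple[str, str]


def count_matches(a: List[float], b: List[float], tol: float = 0.5) -> int:
    """Walk two ascending lists with two pointers and count pairs of
    values that differ by at most `tol`. Each value is used at most once."""
    i = j = matches = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tol:
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # a[i] is too small to match anything left in b
        else:
            j += 1  # b[j] is too small to match anything left in a
    return matches


def similarity(site1: Dict[Key, List[float]],
               site2: Dict[Key, List[float]],
               tol: float = 0.5) -> float:
    """Fraction of distances matched across lists that share a label,
    normalised by the larger site (an assumed scoring choice)."""
    matched = 0
    for key in site1.keys() & site2.keys():
        matched += count_matches(sorted(site1[key]), sorted(site2[key]), tol)
    n1 = sum(len(v) for v in site1.values())
    n2 = sum(len(v) for v in site2.values())
    denom = max(n1, n2)
    return matched / denom if denom else 0.0


# Toy sites with made-up distances and only two list labels.
site_a = {("polar", "polar"): [3.8, 5.1, 7.4], ("polar", "apolar"): [4.2, 6.0]}
site_b = {("polar", "polar"): [3.9, 5.5, 7.2, 9.0], ("polar", "apolar"): [4.3]}
print(similarity(site_a, site_b))
```

Because only sorted distances are compared, the result does not depend on how either site is oriented in space, which is what makes such a comparison frame invariant. Keeping separate lists per chemical label means that two sites with the same shape but different residue types share fewer matches.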

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Nature Precedings

PocketMatch: A new algorithm to compare binding sites in protein structures

Author: Yeturu Kalidas
Nagasuma Chandra
Publication date: 01/01/2008

Crossref

Nature Precedings

Kinome-wide interaction modelling using alignment-based and alignment-independent approaches for kinase description and linear and non-linear data analysis techniques

Author: A Golbraikh
A Kamb
A Linusson
A Navia-Vázquez
B Ustün
CC Chang
D Aha
E Freyhult
G Cruciani
G Manning
G Scapin
H Daub
H Drucker
HM Berman
I Dubchak
IH Witten
J Trygg
Jarl ES Wikberg
JD Griffin
JE Wikberg
JE Wikberg
K Illergård
KC Chou
KC Chou
LH Alifrangis
M Bhasin
M Bhasin
M Bhasin
M Lapinsh
M Reczko
M Sandberg
M Van Heel
MA Fabian
MA Larkin
Maris Lapins
MS Cohen
MW Karaman
NP Shah
O Devos
P Bamborough
P Geladi
QB Gao
RJ Quinlan
S Hua
S Madhusudan
S Wold
S Wold
S Wold
SD Peterson
T Lundstedt
TA Carter
V Vapnik
ZR Li
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Protein kinases play crucial roles in cell growth, differentiation, and apoptosis. Abnormal function of protein kinases can lead to many serious diseases, such as cancer. Kinase inhibitors have potential for treatment of these diseases. However, current inhibitors interact with a broad variety of kinases and interfere with multiple vital cellular processes, which causes toxic effects. Bioinformatics approaches that can predict inhibitor-kinase interactions from the chemical properties of the inhibitors and the kinase macromolecules might aid in the design of more selective therapeutic agents that show better efficacy and lower toxicity.

Results: We applied proteochemometric modelling to correlate the properties of 317 wild-type and mutated kinases and 38 inhibitors (12,046 inhibitor-kinase combinations) to the respective combination's interaction dissociation constant (Kd). We compared six approaches for description of protein kinases and several linear and non-linear correlation methods. The best performing models encoded kinase sequences with amino acid physico-chemical z-scale descriptors and used support vector machines or partial least-squares projections to latent structures for the correlations. Modelling performance was estimated by double cross-validation. The best models showed high predictive ability: the squared correlation coefficient for new kinase-inhibitor pairs ranged P2 = 0.67-0.73; for new kinases it ranged P2kin = 0.65-0.70. Models could also separate interacting from non-interacting inhibitor-kinase pairs with high sensitivity and specificity, with areas under the ROC curves ranging AUC = 0.92-0.93. We also investigated the relationship between the number of protein kinases in the dataset and the modelling results. Using only 10% of all data, a valid model was still obtained, with P2 = 0.47, P2kin = 0.42 and AUC = 0.83.

Conclusions: Our results strongly support the applicability of proteochemometrics for kinome-wide interaction modelling. Proteochemometrics might be used to speed up identification and optimization of protein kinase targeted and multi-targeted inhibitors.
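
The abstract states that performance was estimated by double cross-validation. A minimal Python sketch of the general technique (often called nested cross-validation) follows: an inner loop selects a model setting using only the outer training data, and the outer loop scores that choice on data never used for selection. The fold counts, the strided splitting, and the names `k_folds`, `double_cv` and `evaluate` are assumptions made for illustration; the paper's actual protocol may differ.

```python
from typing import Callable, List, Sequence, Tuple


def k_folds(indices: Sequence[int], k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices into k (train, test) pairs by striding."""
    folds = [list(indices[i::k]) for i in range(k)]
    pairs = []
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        pairs.append((train, folds[i]))
    return pairs


def double_cv(n: int,
              candidates: Sequence[float],
              evaluate: Callable[[float, List[int], List[int]], float],
              outer_k: int = 5,
              inner_k: int = 4) -> List[float]:
    """Nested cross-validation over n samples.

    evaluate(param, train_idx, test_idx) is a caller-supplied (hypothetical)
    function that fits a model with `param` on train_idx and returns a
    score on test_idx, where higher is better.
    """
    outer_scores = []
    for outer_train, outer_test in k_folds(range(n), outer_k):
        # Inner loop: pick the setting using outer_train only.
        def inner_score(param: float) -> float:
            splits = k_folds(outer_train, inner_k)
            return sum(evaluate(param, tr, va) for tr, va in splits) / len(splits)

        best = max(candidates, key=inner_score)
        # Outer loop: score the selected setting on the untouched fold.
        outer_scores.append(evaluate(best, outer_train, outer_test))
    return outer_scores
```

To estimate performance for new kinases rather than new kinase-inhibitor pairs, the same scheme would be applied with folds formed over kinases, so that every pair involving a held-out kinase leaves the training set together.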

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

QSAR model development for early stage screening of monoclonal antibody therapeutics to facilitate rapid developability

Author: Kizhedath Arathi
Publication venue: Newcastle University
Publication date: 01/01/2019

PhD Thesis

Monoclonal antibodies (mAbs) and related therapeutics are highly desirable from a biopharmaceutical perspective as they are highly target specific and well tolerated within the human system. Nevertheless, several mAbs have been discontinued or withdrawn based either on their inability to demonstrate efficacy and/or due to adverse effects. With nearly 80% of drugs failing in clinical development, mainly due to lack of efficacy and safety, there arises an urgent need for better understanding of biological activity, affinity, pharmacology, toxicity, immunogenicity etc., thus leading to early prediction of success/failure. In this study a hybrid modelling framework was developed that enabled early stage screening of mAbs. The applicability of the experimental methods was first tested on chemical compounds to assess the assay quality, following which they were used to assess potential off-target adverse effects of mAbs. Furthermore, hypersensitivity reactions were assessed using Skimune™, a non-artificial human skin explant based assay for safety and efficacy assessment of novel compounds and drugs, developed by Alcyomics Ltd. The suitability of Skimune™ for assessing the immune-related adverse effects of aggregated mAbs was studied, where aggregation was induced using a heat stress protocol. The aggregates were characterised by protein analysis techniques such as analytical ultra-centrifugation, following which the immunogenicity was tested using the Skimune™ assay. Numerical features (descriptors) of mAbs were identified and generated using ProtDCal, EMBOSS Pepstat software as well as amino acid scales for different. Five independent and novel X block datasets consisting of these descriptors were generated based on the physicochemical, electronic, thermodynamic, electronic and topological properties of amino acids: Domain, Window, Substructure, Single Amino Acid, and Running Sum.

This study describes the development of a hybrid QSAR based model with a structured workflow and clear evaluation metrics, with several optimisation steps, that could be beneficial for broader and more generic PLS modelling. Based on the results and observations from this study, it was demonstrated that incremental improvement via selection of datasets and variables helps in further optimisation of these hybrid models. Furthermore, using hypersensitivity and cross reactivity as responses and physicochemical characteristics of mAbs as descriptors, the QSAR models generated for different applicability domains allow for rapid early stage screening and developability. These models were validated with an external test set comprising proprietary compounds from industrial partners, thus paving the way for enhanced developability that tackles manufacturing failures as well as attrition rates.

European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie actions grant agreemen

Newcastle University eTheses

Pattern recognition methods for the prediction of chemical structures of fungal secondary metabolites

Author: Gore Sagar
Publication date: 01/01/2020

Non-Ribosomal Peptide Synthetases (NRPS) are megasynthetases that are predominantly found in bacteria and fungi. They produce small peptides that serve numerous biological functions and crucial ecological roles. Adenylation (A) domains of NRPSs catalyze ATP-dependent activation of substrates harboring a carboxy terminus. A-domain substrates include not only natural amino acids (D and L forms) but also non-proteinogenic amino acids. As the substrate repertoire is large and specificity rules for fungi are not well established, it is difficult to predict substrates for fungal A-domains. In bacteria, ten amino acid residues were established as the NRPS code, which determines the specificity of A-domains. To study relationships between fungal A-domains and their specificity, a cluster analysis of NRPS code residues was performed. NRPS code residues were encoded by physicochemical properties essential for binding small molecules, and these residues were clustered. Cluster analysis showed similar NRPS codes between bacteria and fungi for substrates such as α-amino adipic acid and tryptophan. Fungal NRPS codes for substrates such as tyrosine and proline did not cluster together with those of bacteria, which indicates an independent evolution of substrate specificity in fungi. This emphasizes the need for the development of a fungus-specific prediction tool. Currently available A-domain substrate specificity prediction tools accurately identify substrates for bacteria but fail to provide correct predictions for fungi. A novel approach for fungal A-domain substrate specificity prediction is presented here. A Neural Network based A-domain substrate specificity classifier (NNassc) was developed using Keras with a TensorFlow backend. NNassc was trained solely using fungal NRPS codes and combines physicochemical and structural features for specificity predictions. Internal and external validation datasets of experimentally verified NRPS codes were used to assess the performance of NNassc.

Digitale Bibliothek Thüringen

Comparison of Protein Active Site Structures for Functional Annotation of Proteins and Drug Design

Author: Aloy
Altschul
Baker
Bryant
Cammer
Cammer
Christen
Denessiouk
Denessiouk
Ekins
Eswaramoorthy
Feng
Fetrow
Fetrow
Funk
Gibrat
Goldman
Goonesekere
Greaves
Guo
Gutteridge
Gutteridge
Hendlich
Henikoff
Henikoff
Holm
Johnson
Kabsch
Kinoshita
Kitson
Kleywegt
Ko
Kolker
Kubinyi
La
Labesse
Lander
Laskowski
Lim
Mendez
Mirny
Naumann
Overington
Pazos
Pembroke
Pharkya
Powers
Ringe
Schmitt
Schomburg
Schwieters
Shin
Stuart
Thompson
Tian
Traxler
Venter
Venter
Wang
Whisstock
Xu
Xu
Yakunin
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 21/07/2006

Rapid and accurate functional assignment of novel proteins is increasing in importance, given the completion of numerous genome sequencing projects and the vastly expanding list of unannotated proteins. Traditionally, global primary-sequence and structure comparisons have been used to determine putative function. These approaches, however, do not emphasize similarities in active site configurations that are fundamental to a protein’s activity and highly conserved relative to the global and more variable structural features. The Comparison of Protein Active Site Structures (CPASS) database and software enable the comparison of experimentally identified ligand-binding sites to infer biological function and aid in drug discovery. The CPASS database comprises the ligand-defined active sites identified in the Protein Data Bank, and the CPASS program compares these ligand-defined active sites to determine sequence and structural similarity without maintaining sequence connectivity. CPASS will compare any set of ligand-defined protein active sites, irrespective of the identity of the bound ligand.

Crossref

DigitalCommons@University of Nebraska