32 research outputs found

Analysis of X-ray Structures of Matrix Metalloproteinases via Chaotic Map Clustering

Author: Bellotti Roberto
Carotti Angelo
De Carlo Francesco
Gargano Gianfranco
Giangreco Ilenia
Nicolotti Orazio
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Matrix metalloproteinases (MMPs) are well-known biological targets implicated in tumour progression, homeostatic regulation, innate immunity, impaired delivery of pro-apoptotic ligands, and the release and cleavage of cell-surface receptors. With this in mind, understanding the close relationships among diverse MMPs could provide a solid basis for designing new selective MMP inhibitors more quickly. In this regard, uncovering the underlying molecular reasons for similarity among MMPs is a key challenge. Results: We describe a pairwise variant of the non-parametric chaotic map clustering (CMC) algorithm and its application to 104 X-ray MMP structures. In this analysis, electrostatic potentials are computed and used as input for the CMC algorithm. Differences between proteins were shown to reflect genuine variation in their electrostatic potentials. The analysis was also extended to the protein primary structures and the molecular shapes of the MMP co-crystallised ligands. Conclusions: The CMC algorithm was shown to be a valuable tool in knowledge acquisition and transfer from MMP structures. Based on the variation of electrostatic potentials, CMC was successful in analysing the MMP target family landscape and different subsites. The first investigation resulted in a rational interpretation of both domain organization and substrate specificity classifications. The second made it possible to distinguish the MMP classes, demonstrating the high specificity of the S1' pocket, and to detect both point mutations of ionisable residues and different side-chain conformations that likely account for induced-fit phenomena. In addition, CMC performed comparably to the popular UPGMA (Unweighted Pair Group Method with Arithmetic mean) method, which is currently a standard clustering approach in bioinformatics. CMC and UPGMA gave closely comparable outcomes, but CMC often produced more informative and more easily interpretable dendrograms. Finally, CMC was successfully applied to standard pairwise analysis (i.e., the Smith-Waterman algorithm) of protein sequences and was used to explain the complementarity between the molecular shapes of the co-crystallised ligand molecules and the accessible MMP void volumes.
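For readers unfamiliar with the UPGMA baseline named in the abstract above, the following is a minimal sketch of that method only (not of the CMC algorithm, whose details the abstract does not give). The function name `upgma`, the labels, and the toy distance matrix are all illustrative assumptions; real input would be a pairwise dissimilarity matrix such as one derived from electrostatic potentials.

```python
def upgma(labels, dist):
    """Build a dendrogram as nested tuples (left, right, height) using UPGMA.

    labels: list of item names (hypothetical here).
    dist:   symmetric matrix of pairwise distances, dist[i][j].
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        merged = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # Size-weighted average: equivalent to the mean over all member pairs.
            merged[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(merged)
        clusters[next_id] = ((tree_a, tree_b, dab / 2), size_a + size_b)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


# Toy example with made-up distances: A and B are close, C is distant.
tree = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
print(tree)
```

Each merge height is half the inter-cluster distance, which is the usual UPGMA convention for an ultrametric tree.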

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Bari

Insights into the Complex Formed by Matrix Metalloproteinase-2 and Alloxan Inhibitors: Molecular Dynamics Simulations and Free Energy Calculations

Author: Angela Stefanachi
Angelo Carotti
Antonio Laghezza
Eugene A. Permyakov
Francesco Leonetti
Gianluca Lattanzi
Ilenia Giangreco
Marco Catto
Orazio Nicolotti
Publication venue: Public Library of Science
Publication date: 01/01/2011

Matrix metalloproteinases (MMPs) are well-known biological targets implicated in tumour progression, homeostatic regulation, innate immunity, impaired delivery of pro-apoptotic ligands, and the release and cleavage of cell-surface receptors. Hence, the development of potent and selective inhibitors targeting these enzymes continues to be eagerly sought. In this paper, a number of alloxan-based compounds, initially designed to target other therapeutically relevant enzymes, were rationally modified and successfully repurposed to inhibit MMP-2 (also named gelatinase A) in the nanomolar range. Importantly, the alloxan core makes its debut as a zinc-binding group, since it ensures a stable tetrahedral coordination of the catalytic zinc ion in concert with the three histidines of the HExxHxxGxxH metzincin signature motif, further stabilized by a hydrogen bond with the glutamate residue belonging to the same motif. Decorating the alloxan core with a biphenyl privileged structure made it possible to sample the deep S1′ specificity pocket of MMP-2 and to relate the high affinity towards this enzyme to the possibility of forming a hydrogen bond network with the backbone of Leu116 and Asn147 and the side chains of Tyr144, Thr145 and Arg149 at the bottom of the pocket. The effect of even slight structural changes on the interaction at the S1′ subsite of MMP-2, as well as the nature and strength of the binding, is elucidated via molecular dynamics simulations and free energy calculations. Among the compounds presented here, the highest affinity (pIC50 = 7.06) is found for BAM, a compound that also shows selectivity (>20) for MMP-2 over MMP-9, the other member of the gelatinases.
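As a reading aid for the pIC50 value quoted in the abstract above, here is a small sketch of the standard conversion pIC50 = -log10(IC50 in mol/L). The helper name `pic50_to_ic50_nm` is an illustrative assumption; only the pIC50 of 7.06 comes from the abstract.

```python
def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert pIC50 (negative log10 of IC50 in mol/L) to IC50 in nanomolar."""
    ic50_molar = 10 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


ic50_bam_nm = pic50_to_ic50_nm(7.06)
print(f"IC50 = {ic50_bam_nm:.1f} nM")
```

This gives an IC50 of roughly 87 nM, consistent with the "nanomolar range" described in the abstract.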

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Bari

Defining Kawasaki disease and pediatric inflammatory multisystem syndrome-temporally associated to SARS-CoV-2 infection during SARS-CoV-2 epidemic in Italy: results from a national, multicenter survey

Author: 00165 Rome
Achille Marino: Department of Pediatrics
Adele Civino: U. O. C. Pediatria P. O. Vito Fazzi Lecce
Agostina Marolda: Pediatrics and Neonatology Department
Alessandra Lazzerotti: Clinica Pediatrica
Alessandra Manerba: Child Cardiology
Alessandria Italy
Alice Dell’Anna: U. O. C. Pediatria P. O. Vito Fazzi Lecce
Alma Olivieri: Dipartimento della donna
Ancona Italy
Andrea Campana: Bambino Gesù Children’s Hospital
Angela Mauro: Department of Paediatrics
Angela Miniaci: Clinica Pediatrica
AO SS Antonio e Biagio e C. Arrigo
AORN “Sant’Anna e San Sebastiano”- Caserta
AOU Meyer
ASL Romagna
ASST Bergamo-EST
ASST FBF Sacco
ASST Grande Ospedale Metropolitano Niguarda
ASST Grande Ospedale Metropolitano Niguarda
ASST Lodi
ASST Monza
ASST Ovest Milanese
ASST Spedali Civili di Brescia and University of Brescia
ASST Spedali Civili di Brescia e Università degli Studi di Brescia
Azienda Ospedaliero-Universitaria di Bologna
Barbara Teruzzi: Maternal and Child Health
Bari Italy
Bergamo Italy
Bergamo Italy
Bianca Lattanzi: SOD Pediatria
Bologna Italy
Bologna Italy
Bracaglia C
Brescia Italy
Brescia Italy
Cagliari Italy
Caorsi R
Cattalini M
Children’s Hospital V Buzzi
Cimaz R
Claudia Santagati: Dipartimento di Pediatria
Clinica Pediatrica
Clotilde Alizzi: Department of Health Promotion Sciences Maternal and Infantile Care
Consolaro A
del bambino e di chirurgia generale e specialistica
Della Paolera S
Dellepiane RM
Department of Pediatrics
Department of Woman’s and Child’s Health
Desio Hospital
Desio Italy
Dipartimento della Salute della Donna e del Bambin
Division of Paediatrics
Division of Paediatrics
Domenico Sperlì: UOC di Pediatria S. O. “Annunziata” - A. O. di Cosenza
Donato Rigante: Department of Pediatrics
Eleonora Dei Rossi: University of Trieste
Elpidio Tierno: UOC di Pediatria
Emanuela Del Giudice: Department of Maternal Infantile and Urological Sciences
Emergency Department
Enrico Felici: Pediatric and Pediatric Emergency Unit
Fabi M
Florence
Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico
Fondazione IRCCS Policlinico San Matteo
Fondazione MBBM - onlus c/o Ospedale San Gerardo
Francesca Biscaro: UOC Pediatria
Francesca Minoia: Fondazione IRCCS Cà Granda
Francesca Ricci
Francesco Licciardi: Department of Pediatrics and Public Health
Giangreco M
Gianluca Vergine
Giorgia Martini
Giovanni Conti: Nefrologia e Reumatologia Pediatrica con Dialisi Azienda Ospedaliero-Universitario “G. Martino”
Giovanni Filocamo: Fondazione IRCCS Cà Granda
Grazia Bossi: UOC Pediatria
Gubbio-Gualdo Tadino Italy
Guido Pennoni: Dipartimento Materno-Infantile
Hospital Papa Giovanni XXIII
Hospital Papa Giovanni XXIII
Ilenia Floretta: Pediatria
Internal Medicine and Medical Specialities “G. D’Alessandro”
IRCCS S. Orsola-Malpighi Hospital
IRCCS S. Orsola-Malpighi Hospital
Italy
Italy. Angelo Mazza: Paediatric Department
Italy. Maria Vincenza Mastrolia: Pediatric Rheumatology Unit
La Torre F
Laura Martelli: Paediatric Department
Lodi
Lucia Augusta Baselli: Pediatric Intermediate Care Unit
Macerata
Magenta Milan
Maggio MC
Maia De Luca: Bambino Gesù Children’s Hospital
Manzoni Lecco
Marcello Lanari: Department of Pediatrics
Marche-Nord
Marchesi A
Maria Loreta Foschini: SC Pediatria
Martina Soliani: Pediatria ASST Cremona Italy
Matilde Rossi: UOC di Pediatria e Neonatologia
Maurizio Carone: UO Malattie Infettive
Meini A
Meneghel A
Milan Italy
Milan Italy
Milano
Milano
Milano Italy
Milano Italy
Montin D
Monza Italy
Naples
Napoli
Ospedale A
Ospedale Ca’ Foncello
Ospedale di Macerata
Ospedale di Rovigo
Ospedale Infermi
Ospedale Maggiore Policlinico
Ospedale Maggiore Policlinico
Ospedale Pediatrico ‘Giovanni XXIII’
Ospedale Santa Chiara
Ospedali Riuniti
Padua Italy
Palermo Italy
Patrizia Barone: Unità Operativa Complessa di Broncopneumologia Pediatrica AOU “Policlinico - Vittorio Emanuele Via Santa Sofia 78 Catania
Pavia Italy
Pediatria
Piazza S. Onofrio n. 4
Piero Valentini
PO SAN MICHELE AOBrotzu
Polo Pontini
Ravelli A
Reumatologia
Rimini Italy.
Rome Italy
Rome Italy
Rome Italy
Rome Italy
Rome Italy
Rossana Pignataro: UOC Pediatria e Neonatologia
Rovigo
Santobono-Pausilipon Children’s Hospital
Sapienza University of Rome
Sara Stucchi: Maternal and Child Health
Savina Mannarino: Division of Cardiology
Seriate Bergamo
Silvia Sonego: University of Trieste
Simonini G
Sottile R
Taddio A
Maria Concetta Alberelli: UOC Pediatria
Tatiana Utytatnikova: Dipartimento Materno-Infantile
The Children Hospital
Trento Italy
Treviso
Trieste Italy
Trieste Italy
Turin Italy
Università Cattolica Sacro Cuore
Università Cattolica Sacro Cuore
University of Bologna
University of Florence
University of Padova
University of Palermo
University of Turin
Università della Campania
Università Milano Bicocca
UOC Pediatria Rimini
Verdoni L
Veronica Bennato: U. O. Pediatria
Villani A
Zuccotti G
Zunica F
“G. Fornaroli” Hospital
“L Vanvitelli
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Background: There is mounting evidence for the existence of a Pediatric Inflammatory Multisystem Syndrome temporally associated with SARS-CoV-2 infection (PIMS-TS), which shares similarities with Kawasaki Disease (KD). The main aims of the study were to better characterize the clinical features and the treatment response of PIMS-TS and to explore its relationship with KD, determining whether KD and PIMS are two distinct entities. Methods: The Rheumatology Study Group of the Italian Pediatric Society launched a survey to enroll patients diagnosed with KD (Kawasaki Disease Group - KDG) or KD-like (Kawacovid Group - KCG) disease between February 1st 2020 and May 31st 2020. Demographic, clinical and laboratory data, treatment information, and patient outcomes were collected in an online anonymized database (REDCap®). The relationship between clinical presentation and SARS-CoV-2 infection was also taken into account. Moreover, clinical characteristics of KDG during the SARS-CoV-2 epidemic (KDG-CoV2) were compared to those of Kawasaki Disease patients (KDG-Historical) seen in three different Italian tertiary pediatric hospitals (Institute for Maternal and Child Health, IRCCS "Burlo Garofolo", Trieste; AOU Meyer, Florence; IRCCS Istituto Giannina Gaslini, Genoa) from January 1st 2000 to December 31st 2019. The chi-square test or Fisher's exact test and the non-parametric Wilcoxon-Mann-Whitney test were used to study differences between two groups. Results: One hundred forty-nine cases were enrolled (96 KDG and 53 KCG). KCG children were significantly older and presented more frequently with gastrointestinal and respiratory involvement. Cardiac involvement was more common in KCG, with 60.4% of patients having myocarditis. 37.8% of KCG patients presented with hypotension/non-cardiogenic shock. Coronary artery abnormalities (CAA) were more common in the KDG. The risk of ICU admission was higher in KCG. Lymphopenia, higher CRP levels, and elevated ferritin and troponin-T characterized KCG. KDG more frequently received immunoglobulins (IVIG) and acetylsalicylic acid (ASA) (81.3% vs 66%; p = 0.04 and 71.9% vs 43.4%; p = 0.001 respectively), whereas KCG more often received glucocorticoids (56.6% vs 14.6%; p < 0.0001). The SARS-CoV-2 assay was more often positive in KCG than in KDG (75.5% vs 20%; p < 0.0001). Short-term follow-up data showed minor complications. Comparing KDG with a KD-Historical Italian cohort (598 patients), no statistical difference was found in terms of clinical manifestations and laboratory data. Conclusion: Our study suggests that SARS-CoV-2 infection might determine two distinct inflammatory diseases in children: KD and PIMS-TS. Older age at onset and clinical peculiarities such as the occurrence of myocarditis characterize this multi-inflammatory syndrome. Our patients had an optimal response to treatments and a good outcome, with few complications and no deaths.
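To illustrate the kind of two-group comparison the abstract describes, the sketch below runs a Pearson chi-square test on a 2×2 table for IVIG use. The counts are an assumption: they are back-calculated from the reported percentages and group sizes (81.3% of 96 is about 78; 66% of 53 is about 35), and the abstract does not say whether this particular comparison used the chi-square or Fisher's exact test. The function name `chi_square_2x2` is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed counts, reconstructed from the reported percentages (not from the paper's tables).
ivig_kdg, n_kdg = 78, 96  # 78/96 = 81.3%
ivig_kcg, n_kcg = 35, 53  # 35/53 = 66.0%

chi2, p_value = chi_square_2x2(ivig_kdg, n_kdg - ivig_kdg, ivig_kcg, n_kcg - ivig_kcg)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumed counts the p-value falls just below 0.05, in line with the reported p = 0.04 for the IVIG comparison.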

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"

Pharmacophore Binding Motifs for Nicotinamide Adenine Dinucleotide Analogues Across Multiple Protein Families: A Detailed Contact-Based Analysis of the Interaction between Proteins and NAD(P) Cofactors

Author: Ilenia Giangreco (342729)
Martin J. Packer (1926952)

We have analyzed the protein-binding pharmacophore of NAD and its close analogues in all protein–ligand structures available in the RCSB database as of February 2012; this analysis has then been used to assess the novelty of structures emerging after that date. We show that proteins have evolved diverse pharmacophore motifs for binding the adenine moiety, fewer, but still diverse, motifs for nicotinamide, and a very limited set of motifs for binding the pyrophosphate linker. Our exhaustive analysis includes a pharmacophore contact analysis for over 1900 protein–ligand structures containing NAD analogues; we have benchmarked this set of contacts against nearly 27 000 protein–ligand structures to demonstrate that the diversity of interactions seen with NAD is very similar to that seen for all other ligands. Hence, variation in binding motifs for NAD is not distinct from that observed for other ligands and they show significant variation across protein families

FigShare

Applying atomistic neural networks to bias conformer ensembles towards bioactive-like conformations.

Author: Baillif Benoit
Bender Andreas
Cole Jason
Giangreco Ilenia
McCabe Patrick
Publication venue: J Cheminform
Publication date: 22/12/2023

Acknowledgements: The authors would like to thank the University of Cambridge and the Cambridge Crystallographic Data Centre for funding this work.
Funder: Cambridge Crystallographic Data Centre; doi: http://dx.doi.org/10.13039/501100022029
Funder: University of Cambridge; doi: http://dx.doi.org/10.13039/501100000735

Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among the set of low-energy conformers generated. However, there is currently no general method to prioritise these likely target-bound conformations in the ensemble. In this work, we trained atomistic neural networks (AtNNs) on 3D information of generated conformers of a curated subset of PDBbind ligands to predict the ARMSD to their closest bioactive conformation, and evaluated the early enrichment of bioactive-like conformations when ranking conformers by AtNN prediction. AtNN ranking was compared with bioactivity-unaware baselines such as ascending Sage force field energy ranking, and a slower bioactivity-based baseline ranking by ascending Torsion Fingerprint Deviation to the Maximum Common Substructure to the most similar molecule in the training set (TFD2SimRefMCS). On test sets from random ligand splits of PDBbind, ranking conformers using ComENet, the AtNN encoding the most 3D information, leads to early enrichment of bioactive-like conformations with a median BEDROC of 0.29 ± 0.02, outperforming the best bioactivity-unaware Sage energy ranking baseline (median BEDROC of 0.18 ± 0.02), and performing on a par with the bioactivity-based TFD2SimRefMCS baseline (median BEDROC of 0.31 ± 0.02). The improved performance of the AtNN and TFD2SimRefMCS baseline is mostly observed on test set ligands that bind proteins similar to proteins observed in the training set. On a more challenging subset of flexible molecules, the bioactivity-unaware baselines showed median BEDROCs up to 0.02, while AtNNs and TFD2SimRefMCS showed median BEDROCs between 0.09 and 0.13. When performing rigid ligand re-docking of PDBbind ligands with GOLD using the 1% top-ranked conformers, ComENet-ranked conformers showed a higher successful docking rate than bioactivity-unaware baselines, with a rate of 0.48 ± 0.02 compared to the CSD probability baseline with a rate of 0.39 ± 0.02. Similarly, in a pharmacophore searching experiment, selecting the 20% top-ranked conformers ranked by ComENet showed a higher hit rate compared to baselines. Hence, the approach presented here uses AtNNs successfully to focus conformer ensembles towards bioactive-like conformations, representing an opportunity to reduce computational expense in virtual screening applications on known targets that require input conformations.
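The evaluation described above (rank conformers by predicted ARMSD, then measure early enrichment of bioactive-like conformers with BEDROC) can be sketched as follows. This is a minimal illustration, not the authors' code: it uses the standard Truchon-Bayly BEDROC definition, the value alpha = 20 is an assumed common default that the abstract does not state, and the function names and ARMSD values are hypothetical. Only the 1 Å bioactive-like threshold comes from the abstract.

```python
import math


def bedroc(active_flags_ranked, alpha=20.0):
    """BEDROC for a ranked list of booleans (True = active), best-ranked first."""
    n_total = len(active_flags_ranked)
    n_active = sum(active_flags_ranked)
    if n_active == 0 or n_active == n_total:
        raise ValueError("BEDROC needs at least one active and one inactive item")
    ratio = n_active / n_total
    # Robust initial enhancement (RIE): exponentially weighted ranks of the actives.
    sum_exp = sum(
        math.exp(-alpha * rank / n_total)
        for rank, is_active in enumerate(active_flags_ranked, start=1)
        if is_active
    )
    random_term = (1.0 / n_total) * (1 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1)
    rie = sum_exp / (n_active * random_term)
    # Rescale RIE to [0, 1] using its values for the best and worst rankings.
    rie_max = (1 - math.exp(-alpha * ratio)) / (ratio * (1 - math.exp(-alpha)))
    rie_min = (1 - math.exp(alpha * ratio)) / (ratio * (1 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


def bedroc_from_armsd(predicted_armsd, true_armsd, threshold=1.0, alpha=20.0):
    """Rank conformers by ascending predicted ARMSD; actives have true ARMSD < threshold."""
    order = sorted(range(len(predicted_armsd)), key=lambda i: predicted_armsd[i])
    flags = [true_armsd[i] < threshold for i in order]
    return bedroc(flags, alpha)


# Hypothetical ARMSD values (in Å) for ten conformers of one ligand.
true_armsd = [0.4, 0.9, 1.5, 2.0, 2.5, 3.0, 1.2, 1.8, 2.2, 2.8]
perfect = bedroc_from_armsd(true_armsd, true_armsd)
worst = bedroc_from_armsd([-t for t in true_armsd], true_armsd)
print(f"perfect ranking: {perfect:.2f}, worst ranking: {worst:.2f}")
```

A perfect predictor places both bioactive-like conformers first and scores 1, while the reversed ranking scores 0; the medians reported in the abstract (for example 0.29 for ComENet) lie between these extremes.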

Apollo (Cambridge)

Applying atomistic neural networks to bias conformer ensembles towards bioactive-like conformations

Author: Andreas Bender
Benoit Baillif
Ilenia Giangreco
Jason Cole
Patrick McCabe
Publication venue: BMC
Publication date: 01/12/2023

Abstract Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among the set of low-energy conformers generated. However, there is currently no general method to prioritise these likely target-bound conformations in the ensemble. In this work, we trained atomistic neural networks (AtNNs) on 3D information of generated conformers of a curated subset of PDBbind ligands to predict the ARMSD to their closest bioactive conformation, and evaluated the early enrichment of bioactive-like conformations when ranking conformers by AtNN prediction. AtNN ranking was compared with bioactivity-unaware baselines such as ascending Sage force field energy ranking, and a slower bioactivity-based baseline ranking by ascending Torsion Fingerprint Deviation to the Maximum Common Substructure to the most similar molecule in the training set (TFD2SimRefMCS). On test sets from random ligand splits of PDBbind, ranking conformers using ComENet, the AtNN encoding the most 3D information, leads to early enrichment of bioactive-like conformations with a median BEDROC of 0.29 ± 0.02, outperforming the best bioactivity-unaware Sage energy ranking baseline (median BEDROC of 0.18 ± 0.02), and performing on a par with the bioactivity-based TFD2SimRefMCS baseline (median BEDROC of 0.31 ± 0.02). The improved performance of the AtNN and TFD2SimRefMCS baseline is mostly observed on test set ligands that bind proteins similar to proteins observed in the training set. On a more challenging subset of flexible molecules, the bioactivity-unaware baselines showed median BEDROCs up to 0.02, while AtNNs and TFD2SimRefMCS showed median BEDROCs between 0.09 and 0.13. 
When performing rigid ligand re-docking of PDBbind ligands with GOLD using the 1% top-ranked conformers, ComENet ranked conformers showed a higher successful docking rate than bioactivity-unaware baselines, with a rate of 0.48 ± 0.02 compared to CSD probability baseline with a rate of 0.39 ± 0.02. Similarly, on a pharmacophore searching experiment, selecting the 20% top-ranked conformers ranked by ComENet showed higher hit rate compared to baselines. Hence, the approach presented here uses AtNNs successfully to focus conformer ensembles towards bioactive-like conformations, representing an opportunity to reduce computational expense in virtual screening applications on known targets that require input conformations

Directory of Open Access Journals

Alloxan Derivatives as Inhibitors of Matrix Metalloproteinase-2: Theoretical Calculations and Experimental Results

Author: Angelo Carotti
Ilenia Giangreco
Gianluca Lattanzi
Marco Catto
Orazio Nicolotti
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

Matrix metalloproteinases (MMPs) are a family of structurally related zinc-containing endopeptidases involved in tissue remodelling and degradation of the extracellular matrix. The failure of common synthetic inhibitors makes the design of new selective and potent MMP inhibitors an extreme challenge in health care for the treatment of various pathological states such as inflammation, arthritis, and cancer. In this context, over-expression of MMP-2 is thought to be responsible for the occurrence of many different human tumours and inflammatory processes involving the hydrolysis of type IV collagen, the main component of the basement membrane. A series of studies therefore focused on the design of new potential inhibitors biased towards MMP-2: virtual screening campaigns on several large chemical libraries resulted in a number of attractive hits. Interestingly, a shortlist of alloxan-like structures was selected with inhibition constants in the nM range. In this respect, we investigated a series of complexes of MMP-2 with alloxan inhibitors by thermodynamic integration in all-atom molecular dynamics simulations. We thus obtained quantitative differences in binding free energies for a list of alloxan compounds. On this basis, we were able to elucidate the molecular rationale for the remarkable inhibition exerted by these compounds, with the ultimate aim of driving the synthesis of new, more potent and selective derivatives that are at present awaiting further experimental investigation through enzymatic assays.

Elsevier - Publisher Connector

Archivio istituzionale della ricerca - Università di Bari

Mining the Cambridge Structural Database for Matched Molecular Crystal Structures: A Systematic Exploration of Isostructurality

Author: Elizabeth Thomas (137168)
Ilenia Giangreco (342729)
Jason C. Cole (2079700)

The Cambridge Structural Database (CSD) is the world leading collection of small-molecule crystal structures and represents an invaluable resource for crystal engineers. It enables structures to be readily compared and new insights to be gained from the comparison. In order to search the database for pairs of structures that are related by the same chemical transformation, and to systematically investigate the effect of this transformation on crystal packing, a repository of matched molecular crystal structures has been derived from the CSD. This makes it easy to find all pairs of structures differing by the same chemical change or, alternatively, all available chemical modifications to a given CSD entry. Our analysis shows one of the many possible applications of these data. An extensive, yet not exhaustive, exploration of isostructurality across the entire CSD has been carried out with the aim of identifying packing features within crystals that maintain isostructurality. With particular focus on terminal chemical modifications observed between single-component structures with Z′ equal to 1, packing similarity has been calculated with an enhanced version of existing software. Across the entire data set of approximately 125 000 matched molecular pairs, 4% of the pairs were isostructural. Several cases showed an enrichment with respect to this baseline value, and examples have been discussed to illustrate some of the questions which can be asked and how they can be answered using the data set. This will open up avenues of research for the future and increase our understanding of the impact of functional groups on crystal packing

FigShare

Alloxan Derivatives as Inhibitors of Matrix Metalloproteinase-2: Theoretical Calculations and Experimental Results

Author: Angelo Carotti
Gianluca Lattanzi
Ilenia Giangreco
Marco Catto
Orazio Nicolotti
Publication venue: 'Elsevier BV'

Crossref

An Extensive and Diverse Set of Molecular Overlays for the Validation of Pharmacophore Programs

Author: David A. Cosgrove (1347696)
Ilenia Giangreco (342729)
Martin J. Packer (1716334)

The pharmacophore hypothesis plays a central role in both the design and optimization of drug-like ligands. Pharmacophore patterns are invoked to explain the binding affinity of ligands and to enable the design of chemically distinct scaffolds that show affinity for a protein target of interest. The importance of pharmacophores in rationalizing ligand affinity has led to numerous algorithms that seek to overlay ligands based on their pharmacophoric features. All such algorithms must be validated with respect to known ligand overlays, usually by extracting ligand overlay sets from the Protein Data Bank (PDB). This validation step creates the problem of which of the known overlays to select and from which proteins. The large number of structures and protein families in the PDB makes it difficult to establish a definitive overlay set; as a result, validation studies have rarely employed the same data sets. We have therefore undertaken an exhaustive analysis of the RCSB PDB to identify 121 distinct ligand overlay sets. We have defined a robust protein overlay protocol, which is free from subjective interpretation over which residues to include, and we have analyzed each overlay set on the basis of whether they provide evidence for the pharmacophore hypothesis. Our final data set spans a broad range of structural types and degrees of difficulty and includes overlays that any algorithm should be able to reproduce, as well as some for which there is very weak evidence for a conserved pharmacophore at all. We provide this set in the hope that it will prove definitive, at least until the PDB is greatly enriched with further structures or with radically different protein folds and families. Upon publication, the data set will be available for free download from the Web site of the Cambridge Crystallographic Data Centre

FigShare