30 research outputs found
Protein pocket and ligand shape comparison and its application in virtual screening
Understanding molecular recognition is one major requirement for drug discovery and design. Physicochemical and shape complementarity between two binding partners is the driving force during complex formation. In this study, the impact of shape within this process is analyzed. Protein binding pockets and co-crystallized ligands are represented by normalized principal moments of inertia ratios (NPRs). The corresponding descriptor space is triangular, with its corners occupied by spherical, discoid, and elongated shapes. An analysis of a selected set of sc-PDB complexes suggests that pockets and bound ligands avoid spherical shapes, which are, however, prevalent in small unoccupied pockets. Furthermore, a direct shape comparison confirms previous studies that on average only one third of a pocket is filled by its bound ligand, supplemented by a 50 % subpocket coverage. In this study, we found that shape complementarity is expressed by low pairwise shape distances in NPR space, short distances between the centers-of-mass, and small deviations in the angle between the first principal ellipsoid axes. Furthermore, it is assessed how different binding pocket parameters are related to bioactivity and binding efficiency of the co-crystallized ligand. In addition, the performance of different shape and size parameters of pockets and ligands is evaluated in a virtual screening scenario performed on four representative target
Short linear motif candidates in the cell entry system used by SARS-CoV-2 and their potential therapeutic implications
The first reported receptor for SARS-CoV-2 on host cells was the angiotensin-converting enzyme 2 (ACE2). However, the viral spike protein also has an RGD motif, suggesting that cell surface integrins may be co-receptors. We examined the sequences of ACE2 and integrins with the Eukaryotic Linear Motif (ELM) resource and identified candidate short linear motifs (SLiMs) in their short, unstructured, cytosolic tails with potential roles in endocytosis, membrane dynamics, autophagy, cytoskeleton, and cell signaling. These SLiM candidates are highly conserved in vertebrates and may interact with the μ2 subunit of the endocytosis-associated AP2 adaptor complex, as well as with various protein domains (namely, I-BAR, LC3, PDZ, PTB, and SH2) found in human signaling and regulatory proteins. Several motifs overlap in the tail sequences, suggesting that they may act as molecular switches, such as in response to tyrosine phosphorylation status. Candidate LC3-interacting region (LIR) motifs are present in the tails of integrin β3 and ACE2, suggesting that these proteins could directly recruit autophagy components. Our findings identify several molecular links and testable hypotheses that could uncover mechanisms of SARS-CoV-2 attachment, entry, and replication against which it may be possible to develop host-directed therapies that dampen viral infection and disease progression. Several of these SLiMs have now been validated to mediate the predicted peptide interactions.
Affiliation: Mészáros, Bálint. European Molecular Biology Laboratory; Germany
Affiliation: Sámano Sánchez, Hugo. European Molecular Biology Laboratory; Germany
Affiliation: Alvarado Valverde, Jesús. European Molecular Biology Laboratory; Germany. Ruprecht Karls Universitat Heidelberg; Germany
Affiliation: Čalyševa, Jelena. European Molecular Biology Laboratory; Germany. Ruprecht Karls Universitat Heidelberg; Germany
Affiliation: Martinez Perez, Elizabeth. Fundación Instituto Leloir; Argentina. European Molecular Biology Laboratory; Germany. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Alves, Renato. European Molecular Biology Laboratory; Germany
Affiliation: Shields, Denis C. Universidad de Dublin; Ireland
Affiliation: Kumar, Manjeet. European Molecular Biology Laboratory; Germany
Affiliation: Rippmann, Friedrich. Computational Chemistry & Biology; Germany
Affiliation: Chemes, Lucia Beatriz. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas; Argentina
Affiliation: Gibson, Toby James. European Molecular Biology Laboratory; Germany
Context-enriched molecule representations improve few-shot drug discovery
A central task in computational drug discovery is to construct models from
known active molecules to find further promising molecules for subsequent
screening. However, typically only very few active molecules are known.
Therefore, few-shot learning methods have the potential to improve the
effectiveness of this critical phase of the drug discovery process. We
introduce a new method for few-shot drug discovery. Its main idea is to enrich
a molecule representation by knowledge about known context or reference
molecules. Our novel concept for molecule representation enrichment is to
associate molecules from both the support set and the query set with a large
set of reference (context) molecules through a Modern Hopfield Network.
Intuitively, this enrichment step is analogous to a human expert who would
associate a given molecule with familiar molecules whose properties are known.
The enrichment step reinforces and amplifies the covariance structure of the
data, while simultaneously removing spurious correlations arising from the
decoration of molecules. Our approach is compared with other few-shot methods
for drug discovery on the FS-Mol benchmark dataset. On FS-Mol, our approach
outperforms all compared methods and therefore sets a new state-of-the-art for
few-shot learning in drug discovery. An ablation study shows that the
enrichment step of our method is key to improving predictive quality. In
a domain shift experiment, we further demonstrate the robustness of our method.
Code is available at https://github.com/ml-jku/MHNfs
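The enrichment idea described in this abstract can be sketched as a single retrieval step of a modern Hopfield network: a molecule embedding attends over a set of context embeddings via a softmax, and the retrieved pattern is added back to the embedding. This is a minimal sketch under stated assumptions, not the MHNfs implementation: the function names, the residual addition, and the omission of learned projections and normalization are all simplifications chosen here for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(query, context, beta=1.0):
    """One modern Hopfield update: softmax(beta * context @ query) weighted sum of context rows."""
    scores = [beta * sum(q * c for q, c in zip(query, row)) for row in context]
    weights = softmax(scores)
    return [sum(w * row[k] for w, row in zip(weights, context))
            for k in range(len(query))]

def enrich(query, context, beta=1.0):
    """Hypothetical enrichment: add the retrieved context pattern to the original embedding."""
    retrieved = hopfield_retrieve(query, context, beta)
    return [q + r for q, r in zip(query, retrieved)]
```

With a large beta the retrieval approaches the single most similar context pattern; with beta = 0 it returns the plain average of all context patterns. In the setting of the abstract, the same step would be applied to both support-set and query-set molecules.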
KiSSim: Predicting off-targets from structural similarities in the kinome
Protein kinases are among the most important drug targets because their dysregulation can cause cancer, inflammatory, and degenerative diseases. Developing selective inhibitors is challenging due to the highly conserved binding sites across the roughly 500 human kinases. Thus, detecting subtle similarities on a structural level can help to explain and predict off-targets among the kinase family.
Here, we present the kinase-focused and subpocket-enhanced KiSSim fingerprint (Kinase Structural Similarity). The fingerprint builds on the KLIFS pocket definition, composed of 85 residues aligned across all available protein kinase structures, which enables residue-by-residue comparison without a computationally expensive alignment. The residues' physicochemical and spatial properties are encoded within their structural context including key subpockets at the hinge region, the DFG motif, and the front pocket.
Since structure was found to contain information complementary to sequence, we used the fingerprint to calculate all-against-all similarities within the structurally covered kinome. Thereby, we could identify off-targets that are unexpected if solely considering the sequence-based kinome tree grouping; for example, Erlotinib’s known kinase off-targets SLK and LOK show high similarities to the key target EGFR (TK group) though belonging to the STE group. KiSSim reflects profiling data better or at least as well as other approaches such as KLIFS pocket sequence identity, KLIFS interaction fingerprints (IFPs), or SiteAlign. To rationalize observed (dis)similarities, the fingerprint values can be visualized in 3D by coloring structures with residue and feature resolution.
We believe that the KiSSim fingerprint is a valuable addition to the kinase research toolbox to guide off-target and polypharmacology prediction. The method is distributed as an open-source Python package on GitHub and as a conda package: https://github.com/volkamerlab/kissi
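Because every fingerprint in this scheme covers the same pre-aligned pocket positions, two structures can be compared position by position without an alignment step. The sketch below illustrates that idea only; it does not use the kissim package or reproduce its distance measure. The function names, the mean absolute difference, the use of None for missing residues, and the toy values are all assumptions made for illustration.

```python
from itertools import combinations

def fingerprint_distance(fp_a, fp_b):
    """Mean absolute difference over positions present in both fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must cover the same aligned positions")
    diffs = [abs(a - b) for a, b in zip(fp_a, fp_b)
             if a is not None and b is not None]  # skip missing residues
    if not diffs:
        raise ValueError("no shared positions to compare")
    return sum(diffs) / len(diffs)

def all_against_all(fingerprints):
    """Return {(name_1, name_2): distance} for every unordered pair of structures."""
    return {(n1, n2): fingerprint_distance(fingerprints[n1], fingerprints[n2])
            for n1, n2 in combinations(sorted(fingerprints), 2)}

# Toy data with hypothetical structure names and made-up feature values.
toy = {
    "kinase_A": [0.0, 1.0, 2.0, None],
    "kinase_B": [0.0, 1.0, 3.0, 5.0],
    "kinase_C": [4.0, 1.0, None, 5.0],
}
distances = all_against_all(toy)
closest_pair = min(distances, key=distances.get)
```

Ranking the pairs by distance then gives a simple list of candidate structural neighbors, which is the kind of output an off-target analysis would start from.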
Avoiding hERG-liability in drug design via synergetic combinations of different (Q)SAR methodologies and data sources: a case study in an industrial setting
In this paper, we explore the impact of combining different in silico prediction approaches and data sources on the predictive performance of the resulting system. We use inhibition of the hERG ion channel target as the endpoint for this study as it constitutes a key safety concern in drug development and a potential cause of attrition. We will show that combining data sources can improve the relevance of the training set with regard to the target chemical space, leading to improved performance. Similarly, we will demonstrate that combining multiple statistical models together, and with expert systems, can lead to positive synergistic effects when taking into account the confidence in the predictions of the merged systems. The best combinations analyzed display a good hERG predictivity. Finally, this work demonstrates the suitability of the SOHN methodology for building models in the context of receptor-based endpoints like hERG inhibition when using the appropriate pharmacophoric descriptors.
Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization
Matched molecular pair (MMP) analyses are widely used in compound
optimization projects to gain insights into structure–activity
relationships (SAR). The analysis is traditionally done via statistical
methods but can also be employed together with machine learning (ML)
approaches to extrapolate to novel compounds. The MMP/ML method introduced
here combines a fragment-based MMP implementation with different
machine learning methods to obtain automated SAR decomposition and
prediction. To test the prediction capabilities and model transferability,
two different compound optimization scenarios were designed: (1) “new
fragments”, which occurs when exploring new fragments for a
defined compound series, and (2) “new static core and transformations”,
which resembles, for instance, the identification of a new compound
series. Very good results were achieved by all employed machine learning
methods, especially for the new fragments case, but overall deep neural
network models performed best, allowing reliable predictions also
for the new static core and transformations scenario, where comprehensive
SAR knowledge of the compound series is missing. Furthermore, we show
that models trained on all available data have a higher generalizability
compared to models trained on focused series and can extend beyond
chemical space covered in the training data. Thus, coupling MMP with
deep neural networks provides a promising approach to make high quality
predictions on various data sets and in different compound optimization
scenarios.
Metacomputing in Practice: A Distributed Compute Server for Pharmaceutical Industry
We describe a distributed high-performance compute server that has been implemented for running compute-intensive applications on a mixture of HPC systems interconnected by Inter- and Intranet. With a practical industrial background, our work focusses on high availability, efficient job load-balancing, security, and the easy integration of HPC computing into the daily work-flow at pharmaceutical companies. The work was done in the course of the ESPRIT project Phase (A Distributed Pharmaceutical Application Server). The client software is implemented in Java. All results are displayed in a web browser and can be forwarded to the next stage of applications used in the drug design cycle. The server software handles the job load-balancing between the participating HPC nodes and is capable of managing multi-site applications. Our environment currently supports four key applications that are used in rational drug design and drug target identification. They range from the automatic functiona..