6 research outputs found

On How AI Needs to Change to Advance the Science of Drug Discovery

Authors: Didi Kieran; Zečević Matej
Publication date: 23/12/2022

Research around AI for Science has seen significant success since the rise of deep learning models over the past decade, even with longstanding challenges such as protein structure prediction. However, this fast development inevitably made their flaws apparent -- especially in domains of reasoning where understanding the cause-effect relationship is important. One such domain is drug discovery, in which such understanding is required to make sense of data otherwise plagued by spurious correlations. Said spuriousness only becomes worse with the ongoing trend of ever-increasing amounts of data in the life sciences and thereby restricts researchers in their ability to understand disease biology and create better therapeutics. Therefore, to advance the science of drug discovery with AI it is becoming necessary to formulate the key problems in the language of causality, which allows the explication of modelling assumptions needed for identifying true cause-effect relationships. In this attention paper, we present causal drug discovery as the craft of creating models that ground the process of drug discovery in causal reasoning.

Comment: Main paper: 6 pages, References: 1.5 pages. Main paper: 3 figures

arXiv.org e-Print Archive

Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models?

Authors: Blundell Tom; Didi Kieran; Harris Charles; Jamasb Arian R.; Joshi Chaitanya K.; Liò Pietro; Mathis Simon V.
Publication date: 14/08/2023

Deep generative models for structure-based drug design (SBDD), where molecule generation is conditioned on a 3D protein pocket, have received considerable interest in recent years. These methods offer the promise of higher-quality molecule generation by explicitly modelling the 3D interaction between a potential drug and a protein receptor. However, previous work has primarily focused on the quality of the generated molecules themselves, with limited evaluation of the 3D molecule poses that these methods produce, with most work simply discarding the generated pose and only reporting a "corrected" pose after redocking with traditional methods. Little is known about whether generated molecules satisfy known physical constraints for binding and the extent to which redocking alters the generated interactions. We introduce PoseCheck, an extensive analysis of multiple state-of-the-art methods and find that generated molecules have significantly more physical violations and fewer key interactions compared to baselines, calling into question the implicit assumption that providing rich 3D structure information improves molecule complementarity. We make recommendations for future research tackling identified failure modes and hope our benchmark can serve as a springboard for future SBDD generative modelling work to have a real-world impact

arXiv.org e-Print Archive

Dynamics-Informed Protein Design with Structure Conditioning

Authors: Didi Kieran; Jamnik Mateja; Komorowska Urszula Julia; Liò Pietro; Mathis Simon; Vargas Francisco
Publication venue: Department of Computer Science and Technology
Publication date: 24/04/2024

Current protein generative models are able to design novel backbones with desired shapes or functional motifs. However, despite the importance of a protein’s dynamical properties for its function, conditioning on dynamical properties remains elusive. We present a new approach to protein generative modeling by leveraging Normal Mode Analysis that enables us to capture dynamical properties too. We introduce a method for conditioning the diffusion probabilistic models on protein dynamics, specifically on the lowest non-trivial normal mode of oscillation. Our method, similar to the classifier guidance conditioning, formulates the sampling process as being driven by conditional and unconditional terms. However, unlike previous works, we approximate the conditional term with a simple analytical function rather than an external neural network, thus making the eigenvector calculations approachable. We present the corresponding SDE theory as a formal justification of our approach. We extend our framework to conditioning on structure and dynamics at the same time, enabling scaffolding of the dynamical motifs. We demonstrate the empirical effectiveness of our method by turning the open-source unconditional protein diffusion model Genie into the conditional model with no retraining. Generated proteins exhibit the desired dynamical and structural properties while still being biologically plausible. Our work represents a first step towards incorporating dynamical behaviour in protein design and may open the door to designing more flexible and functional proteins in the future

Apollo (Cambridge)

MISATO: machine learning dataset of protein-ligand complexes for structure-based drug discovery.

Authors: Benassou Sabrina; Didi Kieran; Kesselheim Stefan; Kitel Radosław; Liò Pietro; Menezes Filipe; Merdivan Erinc; Mourão André Santos Dias; Piraud Marie; Popowicz Grzegorz M; Sattler Michael; Siebenmorgen Till; Theis Fabian J
Publication venue: Nat Comput Sci
Publication date: 30/05/2024

Acknowledgements: This work received funding from BMWi ZIM KK 5197901TS0 (T.S., F.M., G.M.P.) and BMBF, SUPREME, 031L0268 (T.S., F.M., G.M.P.). This work was supported by the Helmholtz Association’s Initiative and Networking Fund on the HAICORE@FZJ partition. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Funder: BMWi ZIM. KK 5197901TS0

Large language models have greatly enhanced our ability to understand biology and chemistry, yet robust methods for structure-based drug discovery, quantum chemistry and structural biology are still sparse. Precise biomolecule-ligand interaction datasets are urgently needed for large language models. To address this, we present MISATO, a dataset that combines quantum mechanical properties of small molecules and associated molecular dynamics simulations of ~20,000 experimental protein-ligand complexes with extensive validation of experimental data. Starting from the existing experimental structures, semi-empirical quantum mechanics was used to systematically refine these structures. A large collection of molecular dynamics traces of protein-ligand complexes in explicit water is included, accumulating over 170 μs. We give examples of machine learning (ML) baseline models proving an improvement of accuracy by employing our data. An easy entry point for ML experts is provided to enable the next generation of drug discovery artificial intelligence models

Apollo (Cambridge)

Biomolecular condensate phase diagrams with a combinatorial microdroplet platform.

Authors: Acker Julia; Alberti Simon; Arter William E; Borodavka Alexander; Didi Kieran; Erkamp Nadia A; Franzmann Titus M; St George-Hyslop Peter; Guillén-Boixet Jordina; Hyman Anthony A; Knowles Tuomas PJ; Krainer Georg; Kuster David; Nixon-Abell Jonathan; Qamar Seema; Qi Runzhang; Welsh Timothy J
Publication venue: Nat Commun
Publication date: 23/01/2023

The assembly of biomolecules into condensates is a fundamental process underlying the organisation of the intracellular space and the regulation of many cellular functions. Mapping and characterising phase behaviour of biomolecules is essential to understand the mechanisms of condensate assembly, and to develop therapeutic strategies targeting biomolecular condensate systems. A central concept for characterising phase-separating systems is the phase diagram. Phase diagrams are typically built from numerous individual measurements sampling different parts of the parameter space. However, even when performed in microwell plate format, this process is slow, low throughput and requires significant sample consumption. To address this challenge, we present here a combinatorial droplet microfluidic platform, termed PhaseScan, for rapid and high-resolution acquisition of multidimensional biomolecular phase diagrams. Using this platform, we characterise the phase behaviour of a wide range of systems under a variety of conditions and demonstrate that this approach allows the quantitative characterisation of the effect of small molecules on biomolecular phase transitions

Apollo (Cambridge)

Biomolecular condensate phase diagrams with a combinatorial microdroplet platform.

Authors: Acker Julia; Alberti Simon; Arter William E; Borodavka Alexander; Didi Kieran; Erkamp Nadia A; Franzmann Titus M; St George-Hyslop Peter; Guillén-Boixet Jordina; Hyman Anthony A; Knowles Tuomas PJ; Krainer Georg; Kuster David; Nixon-Abell Jonathan; Qamar Seema; Qi Runzhang; Welsh Timothy J
Publication venue: Nat Commun
Publication date: 21/12/2022

Funder: See main manuscript file.

The assembly of biomolecules into condensates is a fundamental process underlying the organisation of the intracellular space and the regulation of many cellular functions. Mapping and characterising phase behaviour of biomolecules is essential to understand the mechanisms of condensate assembly, and to develop therapeutic strategies targeting biomolecular condensate systems. A central concept for characterising phase-separating systems is the phase diagram. Phase diagrams are typically built from numerous individual measurements sampling different parts of the parameter space. However, even when performed in microwell plate format, this process is slow, low throughput and requires significant sample consumption. To address this challenge, we present here a combinatorial droplet microfluidic platform, termed PhaseScan, for rapid and high-resolution acquisition of multidimensional biomolecular phase diagrams. Using this platform, we characterise the phase behaviour of a wide range of systems under a variety of conditions and demonstrate that this approach allows the quantitative characterisation of the effect of small molecules on biomolecular phase transitions

Apollo (Cambridge)