Fragment Based Drug Discovery: Practical Implementation Based on <sup>19</sup>F NMR Spectroscopy
Fragment based drug discovery (FBDD) is a widely used tool for
discovering novel therapeutics. NMR is a powerful means for implementing
FBDD, and several approaches have been proposed utilizing <sup>1</sup>H–<sup>15</sup>N heteronuclear single quantum coherence (HSQC)
as well as one-dimensional <sup>1</sup>H and <sup>19</sup>F NMR to
screen compound mixtures against a target of interest. While proton-based
NMR methods of fragment screening (FBS) have been well documented
and are widely used, the use of <sup>19</sup>F detection in FBS has
been only recently introduced (Vulpetti et al. <i>J. Am. Chem.
Soc.</i> <b>2009</b>, <i>131</i> (36), 12949–12959)
with the aim of targeting “fluorophilic” sites in proteins.
Here, we demonstrate a more general use of <sup>19</sup>F NMR-based
fragment screening in several areas: as a key tool for rapid and sensitive
detection of fragment hits, as a method for the rapid development
of structure–activity relationship (SAR) on the hit-to-lead
path using in-house libraries and/or commercially available compounds,
and as a quick and efficient means of assessing target druggability.
MELLODDY: cross pharma federated learning at unprecedented scale unlocks benefits in QSAR without compromising proprietary information
Federated multi-partner machine learning can be an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource intensive. In the landmark MELLODDY project, each of ten pharmaceutical companies realized aggregated improvements on its own classification and/or regression models through federated learning. To this end, they leveraged a novel implementation extending multi-task learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma dataset of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point towards an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with saturating returns. Markedly higher improvements were observed for pharmacokinetics and safety panel assay-based task subsets.
MELLODDY: Cross-pharma Federated Learning at Unprecedented Scale Unlocks Benefits in QSAR without Compromising Proprietary Information.
Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma data set of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate the predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point toward an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with a saturating return. Markedly higher improvements were observed for the pharmacokinetics and safety panel assay-based task subsets.
Development of in silico models to predict viscosity and mouse clearance using a comprehensive analytical data set collected on 83 scaffold-consistent monoclonal antibodies
Biologic drug discovery pipelines are designed to deliver protein therapeutics that have exquisite functional potency and selectivity while also manifesting biophysical characteristics suitable for manufacturing, storage, and convenient administration to patients. The ability to use computational methods to predict biophysical properties from protein sequence, potentially in combination with high throughput assays, could decrease timelines and increase the success rates for therapeutic developability engineering by eliminating lengthy and expensive cycles of recombinant protein production and testing. To support development of high-quality predictive models for antibody developability, we designed a sequence-diverse panel of 83 effector functionless IgG1 antibodies displaying a range of biophysical properties, produced and formulated each protein under standard platform conditions, and collected a comprehensive package of analytical data, including in vitro assays and in vivo mouse pharmacokinetics. We used this robust training data set to build machine learning classifier models that can predict complex protein behavior from these data and features derived from predicted and/or experimental structures. Our models predict with 87% accuracy whether viscosity at 150 mg/mL is above or below a threshold of 15 centipoise (cP) and with 75% accuracy whether the area under the plasma drug concentration–time curve (AUC<sub>0–672 h</sub>) in normal mouse is above or below a threshold of 3.9 × 10<sup>6</sup> h × ng/mL.
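The above/below-threshold framing in this abstract can be illustrated with a minimal Python sketch. Only the two thresholds (15 cP and 3.9 × 10^6 h × ng/mL) come from the abstract. The function names, the sample measurements, and the predictions are hypothetical, and treating a value exactly at the threshold as "below" is an assumption the abstract does not settle. This is not the authors' model, only the labeling and accuracy arithmetic.

```python
# Thresholds stated in the abstract.
VISCOSITY_THRESHOLD_CP = 15.0   # centipoise, viscosity at 150 mg/mL
AUC_THRESHOLD = 3.9e6           # h × ng/mL, AUC over 0–672 h in mouse


def label_above(values, threshold):
    """Return True for each measurement strictly above the threshold.

    Assumption: a value equal to the threshold counts as "below".
    """
    return [v > threshold for v in values]


def accuracy(predicted, observed):
    """Fraction of binary predictions that match the observed labels."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one label is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical measurements and classifier outputs, for illustration only.
measured_viscosity_cp = [8.2, 22.5, 14.9, 31.0]
predicted_high = [False, True, True, True]

observed_high = label_above(measured_viscosity_cp, VISCOSITY_THRESHOLD_CP)
viscosity_accuracy = accuracy(predicted_high, observed_high)
```

With these made-up values, three of the four predictions agree with the thresholded measurements. The same two functions apply unchanged to the clearance endpoint by passing `AUC_THRESHOLD`.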