PROteolysis TArgeting Chimeras (PROTACs) are an emerging therapeutic modality
that eliminates a protein of interest (POI) by marking it for degradation by the
proteasome. Recent developments in artificial intelligence (AI) suggest that
deep generative models can assist with the de novo design of molecules with
desired properties, yet their application to PROTAC design remains largely
unexplored. We show that a graph-based generative model can be used to propose
novel PROTAC-like structures from empty graphs. Our model can be guided towards
the generation of large molecules (30--140 heavy atoms) predicted to degrade a
POI through policy-gradient reinforcement learning (RL). Rewards during RL are
computed by a boosted-tree surrogate model that predicts a molecule's
degradation potential for each POI. Using this approach, we steer the
generative model towards compounds with a higher predicted likelihood of
degradation activity. Despite being trained on sparse public data, the
generative model proposes molecules with substructures found in known
degraders. After fine-tuning, predicted activity against a challenging POI
increases from 50% to >80% with near-perfect chemical validity for sampled
compounds, suggesting this is a promising approach for the optimization of
large, PROTAC-like molecules for targeted protein degradation.

Comment: Presented at NeurIPS 2022 AI4Science Workshop
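
The reward-guided fine-tuning idea described above can be illustrated with a toy policy-gradient (REINFORCE) loop. This is only a minimal sketch under strong assumptions, not the method used in this work: the policy here is a single categorical distribution over a handful of abstract tokens rather than a graph-based generative model, and `surrogate_reward` is a hypothetical stand-in for the boosted-tree activity predictor. All function names, the `favoured` token, and the hyperparameters are invented for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def surrogate_reward(tokens, favoured=2):
    """Hypothetical stand-in for a learned activity predictor.

    Returns the fraction of tokens equal to an arbitrary `favoured` token;
    a real surrogate would score a whole molecule instead.
    """
    return sum(1 for t in tokens if t == favoured) / len(tokens)


def reinforce(num_tokens=4, seq_len=8, batch=32, steps=300, lr=1.0, seed=0):
    """Toy REINFORCE loop with a batch-mean baseline.

    Returns the final logits and the mean reward observed at each step.
    """
    rng = random.Random(seed)
    logits = [0.0] * num_tokens
    history = []
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a batch of token sequences from the current policy.
        samples = [
            rng.choices(range(num_tokens), weights=probs, k=seq_len)
            for _ in range(batch)
        ]
        rewards = [surrogate_reward(s) for s in samples]
        baseline = sum(rewards) / batch
        # Accumulate advantage * grad log p(a), where for a softmax policy
        # grad log p(a) with respect to logit k is 1[k == a] - probs[k].
        grad = [0.0] * num_tokens
        for seq, r in zip(samples, rewards):
            advantage = r - baseline
            for a in seq:
                for k in range(num_tokens):
                    indicator = 1.0 if k == a else 0.0
                    grad[k] += advantage * (indicator - probs[k])
        # Gradient ascent on the expected reward.
        scale = lr / (batch * seq_len)
        logits = [l + scale * g for l, g in zip(logits, grad)]
        history.append(baseline)
    return logits, history


if __name__ == "__main__":
    final_logits, mean_rewards = reinforce()
    print("final token probabilities:", softmax(final_logits))
    print("mean reward, first vs last step:", mean_rewards[0], mean_rewards[-1])
```

In this sketch the mean reward rises as the policy concentrates probability on the rewarded token, which mirrors, in a very reduced form, how a surrogate score can steer a generative model towards samples it predicts to be active.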