Bioinformatic Analysis and Refactoring of Higher Anthracyclines

Abstract

Anthracyclines, renowned for their potent antibiotic and antitumor properties, are emblematic aromatic polyketides with diverse therapeutic applications, particularly in cancer treatment. The crypticity of biosynthetic gene clusters (BGCs) limits the identification of novel anthracyclines, and higher anthracyclines remain poorly characterized because of their complex glycosylation patterns. A streamlined synthetic biology framework is therefore essential for refactoring and validating these sophisticated biosynthetic assemblies and for facilitating the exploration and engineering of novel anthracycline analogs. This thesis establishes an integrated pipeline to address these limitations in synthetic anthracycline refactoring. The framework combines automated homology detection with Cblaster, redundancy filtering with Cagecat Cleaner, Clinker comparison, KSβ-domain-based phylogenetic analysis, and PRISM-based glycosylation prediction with a modular BioBrick-based pathway refactoring system integrated in Streptomyces coelicolor M1152ΔmatAB for the decilorubicin and keyicin pathways. UHPLC analysis revealed successful 11-hydroxylation and 10-decarboxylation in the decilorubicin pathway. In the keyicin pathway, by contrast, 1-hydroxylation was successful but glycosylation was absent, so subsequent tailoring terminated at the aglycone. Interestingly, AlphaFold3 modelling and InterProScan domain analysis revealed that Kyc52 shows Rossmann-like folding consistent with a glycosyltransferase. A BLASTp similarity search suggested that an auxiliary partner protein is required for glycosylation activation, which could indicate 7-O-glycosylation instead of 1-O-glycosylation; the absence of such an activator could have hindered glycosylation. The bioinformatics pipeline and scalable modular refactoring established in this study provide the groundwork for infrastructure that enables cryptic BGC identification, thereby accelerating the discovery and engineering of higher anthracyclines as novel therapeutic scaffolds.

Last time updated on 30/12/2025

This paper was published in National Library of Finland DSpace Services.

Licence: CC BY 4.0