Machine learning analysis of RB-TnSeq fitness data predicts functional gene modules in Pseudomonas putida KT2440
There is growing interest in engineering Pseudomonas putida KT2440 as a microbial chassis for the conversion of renewable and waste-based feedstocks, and metabolic engineering of P. putida relies on the understanding of the functional relationships between genes. In this work, independent component analysis (ICA) was applied to a compendium of existing fitness data from randomly barcoded transposon insertion sequencing (RB-TnSeq) of P. putida KT2440 grown in 179 unique experimental conditions. ICA identified 84 independent groups of genes, which we call fModules (functional modules), where gene members displayed shared functional influence in a specific cellular process. This machine learning-based approach both successfully recapitulated previously characterized functional relationships and established hitherto unknown associations between genes. Selected gene members from fModules for hydroxycinnamate metabolism and stress resistance, acetyl coenzyme A assimilation, and nitrogen metabolism were validated with engineered mutants of P. putida. Additionally, functional gene clusters from ICA of RB-TnSeq data sets were compared with regulatory gene clusters from prior ICA of RNAseq data sets to draw connections between gene regulation and function. Because ICA profiles the functional role of several distinct gene networks simultaneously, it can reduce the time required to annotate gene function relative to manual curation of RB-TnSeq data sets.
IMPORTANCE: This study demonstrates a rapid, automated approach for elucidating functional modules within complex genetic networks. While Pseudomonas putida randomly barcoded transposon insertion sequencing data were used as a proof of concept, this approach is applicable to any organism with existing functional genomics data sets and may serve as a useful tool for many valuable applications, such as guiding metabolic engineering efforts in other microbes or understanding functional relationships between virulence-associated genes in pathogenic microbes. Furthermore, this work demonstrates that comparison of data obtained from independent component analysis of transcriptomics and gene fitness datasets can elucidate regulatory-functional relationships between genes, which may have utility in a variety of applications, such as metabolic modeling, strain engineering, or identification of antimicrobial drug targets.
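The comparison between functional gene clusters (from fitness data) and regulatory gene clusters (from transcriptomics) can be illustrated with a simple set-overlap score. The sketch below is only one plausible way to do this and is not taken from the study: the Jaccard index, the `min_overlap` threshold, the function names, and all module and gene names are assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard index of two gene sets: |a & b| / |a | b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(f_modules, reg_modules, min_overlap=0.2):
    """For each functional module, find the regulatory module with the highest
    Jaccard index and keep the pair if the score reaches min_overlap.

    min_overlap is an arbitrary illustrative cutoff, not a published value.
    """
    matches = {}
    for f_name, f_genes in f_modules.items():
        scored = [(jaccard(f_genes, r_genes), r_name)
                  for r_name, r_genes in reg_modules.items()]
        score, r_name = max(scored, default=(0.0, None))
        if r_name is not None and score >= min_overlap:
            matches[f_name] = (r_name, round(score, 3))
    return matches


# Hypothetical placeholder modules; these are not real P. putida gene sets.
f_modules = {
    "fModule_1": {"gene_A", "gene_B", "gene_C"},
    "fModule_2": {"gene_D", "gene_E"},
}
reg_modules = {
    "regulon_X": {"gene_A", "gene_B", "gene_F"},
    "regulon_Y": {"gene_G"},
}

print(best_matches(f_modules, reg_modules))
```

A module pair with a high overlap would suggest that genes sharing a fitness phenotype are also co-regulated; pairs with no overlap are simply dropped here, although in practice they may be just as informative.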
Multiplexed fitness profiling by RB-TnSeq elucidates pathways for lignin-related aromatic catabolism in Sphingobium sp. SYK-6
Summary: Bioconversion of lignin-related aromatic compounds relies on robust catabolic pathways in microbes. Sphingobium sp. SYK-6 (SYK-6) is a well-characterized aromatic catabolic organism that has served as a model for microbial lignin conversion, and its utility as a biocatalyst could potentially be further improved by genome-wide metabolic analyses. To this end, we generate a randomly barcoded transposon insertion mutant (RB-TnSeq) library to study gene function in SYK-6. The library is enriched under dozens of enrichment conditions to quantify gene fitness. Several known aromatic catabolic pathways are confirmed, and RB-TnSeq affords additional detail on the genome-wide effects of each enrichment condition. Selected genes are further examined in SYK-6 or Pseudomonas putida KT2440, leading to the identification of new gene functions. The findings from this study further elucidate the metabolism of SYK-6, while also providing targets for future metabolic engineering in this organism or other hosts for the biological valorization of lignin.
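To make "quantifying gene fitness" concrete: in RB-TnSeq, fitness is generally derived from the change in barcode abundance before and after growth under a condition, expressed as a log2 ratio. The following is a simplified sketch of that idea, not the pipeline used in the study. The pseudocount, the `min_before` read filter, the unweighted mean across barcodes, and all barcode and gene names are assumptions; published pipelines typically add normalization and quality-control steps that are omitted here.

```python
import math
from collections import defaultdict


def strain_fitness(count_before, count_after, pseudocount=1.0):
    """log2 change in abundance of one barcoded mutant strain.

    The pseudocount avoids log2(0) for strains that drop out entirely.
    """
    return math.log2(count_after + pseudocount) - math.log2(count_before + pseudocount)


def gene_fitness(barcode_counts, barcode_to_gene, min_before=3):
    """Average strain fitness over all barcodes inserted in each gene.

    barcode_counts:  {barcode: (reads_before, reads_after)}
    barcode_to_gene: {barcode: gene}; unmapped barcodes are skipped
    min_before:      illustrative filter for barcodes with too few starting reads
    """
    per_gene = defaultdict(list)
    for barcode, (before, after) in barcode_counts.items():
        gene = barcode_to_gene.get(barcode)
        if gene is None or before < min_before:
            continue
        per_gene[gene].append(strain_fitness(before, after))
    return {gene: sum(vals) / len(vals) for gene, vals in per_gene.items()}


# Hypothetical read counts; not data from the study.
barcode_counts = {
    "BC1": (7, 0),    # strain lost under the condition
    "BC2": (15, 1),
    "BC3": (3, 3),    # no change in abundance
    "BC4": (1, 50),   # too few starting reads, filtered out
    "BC5": (10, 10),  # barcode not mapped to any gene
}
barcode_to_gene = {"BC1": "gene_A", "BC2": "gene_A", "BC3": "gene_B", "BC4": "gene_B"}

print(gene_fitness(barcode_counts, barcode_to_gene))
```

Under this convention a strongly negative value means that disrupting the gene impairs growth in that condition, which is how a catabolic gene would be expected to appear when its substrate is the sole carbon source.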