6 research outputs found

Benchmarks in antimicrobial peptide prediction are biased due to the selection of negative data

Author: Bakala Laura
Burdukiewicz Michal
Cooke Ira R.
Fingerhut Legana C.H.W.
Gagat Przemyslaw
Kala Jakub
Kolenda Rafal
Mackiewicz Pawel
Pietluch Filip
Rafacz Dominik
Rodiger Stefan
Sidorczuk Katarzyna
Slowik Jadwiga
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2022

Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target not only microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared with traditional antibiotics, AMPs have been attracting ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and, consequently, many computational tools for AMP prediction have recently been developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data sets, i.e. those produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server, AMPBenchmark, for fair model benchmarking. AMPBenchmark is available at http://BioGenies.info/AMPBenchmark
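
The counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article: the abstract does not say what accounts for the factor left over after multiplying architectures by sampling methods, so reading it as a number of replicate negative samples per method is an assumption, and the `sampling_N` labels are hypothetical placeholders. The sketch also enumerates the train/benchmark pairings of sampling methods, to show how few of them share the same method, which is the case the abstract reports as favouring model performance.

```python
from itertools import product

# Figures quoted in the abstract.
N_ARCHITECTURES = 12
N_SAMPLING_METHODS = 11
N_MODELS = 660

# Assumption: the factor not named in the abstract is the number of
# replicate negative samples drawn per sampling method.
replicates, remainder = divmod(N_MODELS, N_ARCHITECTURES * N_SAMPLING_METHODS)

# Hypothetical labels standing in for the 11 negative sampling methods.
methods = [f"sampling_{i}" for i in range(1, N_SAMPLING_METHODS + 1)]

# Every (training method, benchmark method) pairing for one architecture.
pairs = list(product(methods, repeat=2))

# Pairings in which training and benchmark negatives share a method.
same_method = [(train, bench) for train, bench in pairs if train == bench]

print(f"implied replicates per combination: {replicates} (remainder {remainder})")
print(f"train/benchmark pairings: {len(pairs)}, same-method: {len(same_method)}")
```

Under these assumptions, only 11 of the 121 pairings use matching training and benchmark sampling methods, so a benchmark built with a single sampling method tests each model on just one cell of this grid.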

ResearchOnline at James Cook University

PubMed Central

Diposit Digital de Documents de la UAB

Characterization of signal and transit peptides based on motif composition and taxon-specific patterns

Author: Filip Pietluch
Katarzyna Sidorczuk
Paweł Mackiewicz
Przemysław Gagat
Publication venue: Nature Portfolio
Publication date: 01/09/2023

Targeting peptides, or presequences, are N-terminal extensions of proteins that encode information about their cellular localization. They include signal peptides (SPs), which target proteins to the endoplasmic reticulum, and transit peptides (TPs), which direct proteins to the organelles of endosymbiotic origin: chloroplasts and mitochondria. TPs were hypothesized to have evolved from antimicrobial peptides (AMPs), which are responsible for the host defence against microorganisms, including bacteria, fungi and viruses. In this study, we performed comprehensive bioinformatic analyses of amino acid motifs of targeting peptides and AMPs using a curated set of experimentally verified proteins. We identified motifs frequently occurring in each type of presequence, showing specific patterns associated with their amino acid composition, and investigated their position within the presequence. We also compared motif patterns among different taxonomic groups and identified taxon-specific features, providing some evolutionary insights. Considering the functional relevance and many practical applications of targeting peptides and AMPs, we believe that our analyses will prove useful for their design and for a better understanding of protein import mechanisms and presequence evolution.

Directory of Open Access Journals

Prediction of protein subplastid localization and origin with PlastoGram

Author: Burdukiewicz Michał
Gagat Przemysław
Kała Jakub
Mackiewicz Paweł
Nielsen Henrik
Pietluch Filip
Sidorczuk Katarzyna
Publication date: 01/01/2023

Due to their complex history, plastids possess proteins encoded in the nuclear and plastid genomes. Moreover, these proteins localize to various subplastid compartments. Since protein localization is associated with its function, prediction of subplastid localization is one of the most important steps in plastid protein annotation, providing insight into their potential function. Therefore, we create a novel manually curated data set of plastid proteins and build an ensemble model for prediction of protein subplastid localization. Moreover, we discuss problems associated with the task, e.g. data set sizes and homology reduction. PlastoGram classifies proteins as nuclear- or plastid-encoded and predicts their localization considering: envelope, stroma, thylakoid membrane or thylakoid lumen; for the latter, the import pathway is also predicted. We also provide an additional function to differentiate nuclear-encoded inner and outer membrane proteins. PlastoGram is available as a web server and as an R package. The code used for the described analyses is also available.

Diposit Digital de Documents de la UAB

Testing Antimicrobial Properties of Selected Short Amyloids

Author: Alicja Seniuk
Anna Duda-Madej
Filip Pietluch
Michał Burdukiewicz
Michał Ostrówka
Paweł Mackiewicz
Przemysław Gagat
Publication venue: MDPI AG
Publication date: 01/01/2023

Amyloids and antimicrobial peptides (AMPs) have many similarities, e.g., both kill microorganisms by destroying their membranes, form aggregates, and modulate the innate immune system. Given these similarities and the fact that the antimicrobial properties of short amyloids have not yet been investigated, we chose a group of potentially antimicrobial short amyloids to verify their impact on bacterial and eukaryotic cells. We used AmpGram, a best-performing AMP classification model, and selected the ten amyloids with the highest AMP probability for our experimental research. Our results indicate that four of the tested amyloids (VQIVCK, VCIVYK, KCWCFT, and GGYLLG) formed aggregates under the conditions routinely used to evaluate peptide antimicrobial properties, but none of the tested amyloids exhibited antimicrobial or cytotoxic properties. Accordingly, they should be included in the negative data sets used to train next-generation AMP prediction models based on experimentally confirmed AMP and non-AMP sequences. In the article, we also emphasize the importance of reporting non-AMPs, given that only a handful of such sequences have been officially confirmed.

Directory of Open Access Journals

Prediction of protein subplastid localization and origin with PlastoGram

Author: Filip Pietluch
Henrik Nielsen
Jakub Kała
Katarzyna Sidorczuk
Michał Burdukiewicz
Paweł Mackiewicz
Przemysław Gagat
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2023

Due to their complex history, plastids possess proteins encoded in the nuclear and plastid genomes. Moreover, these proteins localize to various subplastid compartments. Since protein localization is associated with its function, prediction of subplastid localization is one of the most important steps in plastid protein annotation, providing insight into their potential function. Therefore, we create a novel manually curated data set of plastid proteins and build an ensemble model for prediction of protein subplastid localization. Moreover, we discuss problems associated with the task, e.g. data set sizes and homology reduction. PlastoGram classifies proteins as nuclear- or plastid-encoded and predicts their localization considering: envelope, stroma, thylakoid membrane or thylakoid lumen; for the latter, the import pathway is also predicted. We also provide an additional function to differentiate nuclear-encoded inner and outer membrane proteins. PlastoGram is available as a web server at https://biogenies.info/PlastoGram and as an R package at https://github.com/BioGenies/PlastoGram. The code used for the described analyses is available at https://github.com/BioGenies/PlastoGram-analysis

Directory of Open Access Journals

Online Research Database In Technology

Conference Report: Why R? 2019

Author: Burdukiewicz Michal
Chilimoniuk Jaroslaw
Jessen Leon Eyrich
Kosinski Marcin
Pietluch Filip
Rafacz Dominik
Roediger Stefan
Sidorczuk Katarzyna
Wojcik Piotr
Publication date: 01/01/2020

Online Research Database In Technology