1,092 research outputs found

    Machine learning and mapping algorithms applied to proteomics problems

    Proteins provide evidence that a given gene is expressed, and machine learning algorithms can be applied to various proteomics problems in order to gain information about the underlying biology. This dissertation applies machine learning algorithms to proteomics data in order to predict whether or not a given peptide is observable by mass spectrometry, whether a given peptide can serve as a cell penetrating peptide, and then utilizes the peptides observed through mass spectrometry to aid in the structural annotation of the chicken genome. Peptides observed by mass spectrometry are used to identify proteins, and being able to accurately predict which peptides will be seen can allow researchers to analyze to what extent a given protein is observable. Cell penetrating peptides can possibly be utilized to allow targeted small molecule delivery across cellular membranes and possibly serve a role as drug delivery peptides. Peptides and proteins identified through mass spectrometry can help refine computational gene models and improve structural genome annotations

    Interpretable molecular encodings and representations for machine learning tasks

    Molecular encodings and their usage in machine learning models have demonstrated significant breakthroughs in biomedical applications, particularly in the classification of peptides and proteins. To this end, we propose a new encoding method: Interpretable Carbon-based Array of Neighborhoods (iCAN). Designed to address machine learning models' need for more structured and less flexible input, it captures the neighborhoods of carbon atoms in a counting array and improves the utility of the resulting encodings for machine learning models. The iCAN method provides interpretable molecular encodings and representations, enabling the comparison of molecular neighborhoods, identification of repeating patterns, and visualization of relevance heat maps for a given data set. When reproducing a large biomedical peptide classification study, it outperforms its predecessor encoding. When extended to proteins, it outperforms a lead structure-based encoding on 71% of the data sets. Our method offers interpretable encodings that can be applied to all organic molecules, including exotic amino acids, cyclic peptides, and larger proteins, making it highly versatile across various domains and data sets. This work establishes a promising new direction for machine learning in peptide and protein classification in biomedicine and healthcare, potentially accelerating advances in drug discovery and disease diagnosis

    Transmembrane protein topology prediction using support vector machines

    Background: Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. Results: We present a support vector machine-based (SVM) TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from http://bioinf.cs.ucl.ac.uk/psipred/. Conclusion: The high accuracy of TM topology prediction, which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, makes this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins

    Combination of gemcitabine with cell-penetrating peptides: A pharmacokinetic approach using in silico tools

    Gemcitabine is an anticancer drug used to treat a wide range of solid tumors and is a first line treatment for pancreatic cancer. Our group has previously developed novel conjugates of gemcitabine with cell-penetrating peptides (CPP), and here we report some preliminary data regarding the pharmacokinetics of gemcitabine, two gemcitabine-CPP conjugates and respective CPP gathered from GastroPlus™, and analyze these results considering our previous evaluation of gemcitabine release and conjugates’ bioactivity. Additionally, seeking to shed some light on the relation between the penetration ability of CPP and their physicochemical properties, chemical descriptors for the 20 natural amino acids were calculated, a new principal property scale (z-scale) was created and CPP prediction models were developed, establishing quantitative structure-activity relationships (QSAR). The z-scores of the peptides conjugated with gemcitabine are presented and analyzed with the aforementioned data. This work has been financed by Fundo Europeu de Desenvolvimento Regional (FEDER) funds through the COMPETE 2020 - Operational Programme for Competitiveness and Internationalisation (POCI) and Portugal 2020, and Portuguese funds through Fundação para a Ciência e a Tecnologia (FCT, Portugal), in the framework of the project “Institute for Research and Innovation in Health Sciences” (POCI-01-0145-FEDER-007274), and through grant UID/QUI/50006/2019 (LAQV-REQUIMTE). NV also acknowledges support from FCT and FEDER (European Union), award number IF/00092/2014/CP1255/CT0004. AF thanks FCT for a doctoral fellowship (PD/BD/135120/2017)

    Delivering Signal-Altering Bacterial Effector Proteins to Mammalian Cells Using Cell-Penetrating Peptide Technology

    A major role of the mitogen activated protein kinase (MAPK) pathway in eukaryotes is to activate the bacterial pathogen defense response upon the detection of bacterial products in the environment. This defensive signaling results in the induction of inflammation, the transcription of antimicrobial peptides, and the modulation of the cell cycle and cell survival. Some Gram-negative bacteria have evolved needle-like structures called Type III Secretion Systems (T3SS) that secrete signal-altering molecules into the host cell to interrupt signaling pathways that would otherwise lead to the elimination of the bacterial infection. These signal-altering molecules are known as bacterial effector proteins (BEPs). Bacterial effectors YopJ and VopA have been shown to interfere with specific signaling molecules in the MAPK pathway, effectively inducing apoptosis in mammalian intestinal endothelial cells. In this study, we deliver these proteins to colon cancer cells to artificially induce cell death, using a novel cell-penetrating peptide (CPP) delivery system called TAT-CaM. Here, we show that the TAT-CaM system is capable of delivering YopJ into mammalian cells and that YopJ is capable of inducing cell death once delivered. Although we encountered issues with reproducibility, we believe that TAT-CaM-YopJ could be effective in inducing cell death in cancer cells in a reproducible manner after experimental adjustments

    Classification and identification of conopeptides using hidden Markov models and position-specific scoring matrices

    The electronic version of the dissertation does not include the publications. Conopeptides are small proteins found in the venom of cone snails (Conus sp.), which live in warm seas. Cone snails are predators that feed on worms, other molluscs and fish. They shoot their prey with a venom-filled harpoon; the venom paralyzes the prey, which the snail then swallows whole. The fast immobilization results from the mixture of many different conopeptides in the venom. Conopeptides are synthesized as prepropeptides consisting of a signal sequence for transport into the venom duct, a propeptide that facilitates proper folding, and the mature peptide. Conopeptides are grouped into superfamilies according to their signal sequences. Scientists study conopeptides in the hope of finding new drug candidates. Conopeptides are highly specific modulators of ion channels in nerve and muscle cells and therefore have potential as painkillers or muscle relaxants. The first aim of this study was to develop a method for finding and classifying conopeptides in large sets of sequences. Classification is important because identifying proteins similar to a newly discovered protein reveals a great deal about its properties. We used two types of models for classification and identification: profile hidden Markov models (pHMMs) and position-specific scoring matrices (PSSMs). Because signal peptides are highly conserved within a superfamily, classification is 100% sensitive and 100% specific when the signal peptide is present. By combining the pHMMs and PSSMs we also obtained 91% sensitivity for the classification of mature peptides, which is better than with other methods. The second aim of this study was to find conopeptides in the genome and venom duct transcriptome of Conus consors. We used multiple methods, including the pHMMs, to locate the conopeptides. We discovered 214 conopeptides in the genome, 187 of which were novel. We also described the exon-intron structure of 15 conopeptide genes from 13 different superfamilies. Gene structure may influence the generation of conopeptide diversity
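
As a rough illustration of the PSSM idea mentioned in this abstract, the following minimal Python sketch builds a log-odds position-specific scoring matrix from a few aligned sequences and scans a longer sequence for the best-scoring window. It is not the dissertation's pipeline: the function names, the toy sequences, the uniform background frequency, and the pseudocount of 1 are all assumptions made for the example, and the profile HMM component is not covered.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build a log2-odds PSSM from equal-length, gap-free aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    # Assumption: uniform background frequency for every amino acid.
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score(pssm, seq):
    """Sum the per-position log-odds scores of a sequence of matching length."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must equal PSSM width")
    return sum(column[aa] for column, aa in zip(pssm, seq))


def best_window(pssm, seq):
    """Return (score, start) of the highest-scoring window in a longer sequence."""
    width = len(pssm)
    return max(
        (score(pssm, seq[i:i + width]), i)
        for i in range(len(seq) - width + 1)
    )


# Hypothetical toy alignment, not real conopeptide signal sequences.
toy_family = ["MKLTC", "MKLTC", "MRLTC", "MKLSC"]
toy_pssm = build_pssm(toy_family)
print(score(toy_pssm, "MKLTC"), score(toy_pssm, "GGGGG"))
print(best_window(toy_pssm, "AAMKLTCAA"))
```

In a classification setting, one such matrix would be built per family and a query assigned to the family whose matrix scores it highest; any score threshold for rejecting non-members would have to be calibrated on real data.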

    In silico screening, analysis, and modelling for a novel anticancer peptide

    Cancer is currently one of the leading causes of mortality and morbidity worldwide. Most anticancer therapies rely on small molecule drugs (<0.5 kDa). As with all small molecule drugs, chemotherapy is highly toxic and presents many off-target side effects. Peptide drugs offer improved specificity and are cheaper and more accessible to manufacture. In this study, we have developed a support vector machine (SVM) model in order to detect peptide sequences with potential anticancer activity through scanning the Red Sea Metagenomic library. Furthermore, we conducted an in silico study in order to analyze one of the peptides returned by the SVM pipeline and assessed its cytotoxicity and the mode of cell death by conducting MTT and Annexin V staining assays, respectively. We observed that the selected anticancer peptide contains the C-terminal portion of the homeodomain structure of human Pax6, an Antennapedia homeodomain region, and can bind DNA. Furthermore, we observed dose-response cytotoxicity of HepG2 cells with our peptide. No such cytotoxicity was observed in HeLa cells; a morphological change, however, was observed. We examined the cytotoxicity of our drug against 1BR-hTERT normal skin cells. Our peptide drug induced dose-dependent cytotoxicity that was markedly weaker than that observed in treated cancer cells. Together, our data illustrate the isolation of one peptide drug candidate from the AUC Red Sea metagenomic library; furthermore, we were able to observe the selective dose-dependent reduction of HepG2 cell viability

    Identification of a Short Cell-Penetrating Peptide from Bovine Lactoferricin for Intracellular Delivery of DNA in Human A549 Cells

    Cell-penetrating peptides (CPPs) have been shown to deliver cargos, including protein, DNA, RNA, and nanomaterials, in fully active forms into live cells. Most of the CPP sequences in use today are based on non-native proteins that may be immunogenic. Here we demonstrate that the L5a CPP (RRWQW) from bovine lactoferricin (LFcin), stably and noncovalently complexed with plasmid DNA and prepared at an optimal nitrogen/phosphate ratio of 12, is able to efficiently enter human lung cancer A549 cells. The L5a CPP delivered a plasmid containing the enhanced green fluorescent protein (EGFP) coding sequence that was subsequently expressed in cells, as revealed by real-time PCR and fluorescence microscopy at the mRNA and protein levels, respectively. Treatment with calcium chloride increased the level of gene expression without affecting CPP-mediated transfection efficiency. Zeta-potential analysis revealed that positive electrostatic interactions of CPP/DNA complexes correlated with CPP-mediated transport. The L5a and L5a/DNA complexes were not cytotoxic. This biomimetic LFcin L5a represents one of the shortest effective CPPs and could be a promising lead peptide with lower immunogenicity for DNA delivery in gene therapy

    Cell Penetrating Peptide Adsorption on Magnetite and Silica Surfaces: A Computational Investigation

    Magnetic nanoparticles (MNPs) represent one of the most promising materials as they can act as a versatile platform in the field of bionanotechnology for enhanced imaging, diagnosis, and treatment of various diseases. Silica is the most common compound for preparing coated iron oxide NPs since it improves colloidal stability and the binding affinity for various organic molecules. Biomolecules such as cell penetrating peptides (CPPs) might be employed to decorate MNPs, combining their promising physicochemical properties with a cell penetrating ability. In this work, a computational investigation of the adsorption of the Antennapedia homeodomain-derived penetrating peptide (pAntp) on silica and magnetite (MAG) surfaces is presented. By employing umbrella sampling molecular dynamics, we provide a quantitative estimate of the pAntp-surface adsorption free energy to highlight the influence of the surface hydroxylation state on the adsorption mechanism. The interaction between peptide and surface was shown to be mainly driven by electrostatics. For the MAG surface, an important contribution of van der Waals (VdW) attraction was also observed. Our data suggest that a competitive mechanism between MNPs and the cell membrane might partially prevent the CPP from carrying out its membrane penetrating function