1,142 research outputs found

    How to find simple and accurate rules for viral protease cleavage specificities

    BACKGROUND: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases, and some have been applied to extract cleavage rules from data. However, the methods proposed so far for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way. RESULTS: A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C virus (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, in agreement with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than the previous state of the art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than those previously obtained with rule extraction methods. CONCLUSION: A rule extraction methodology based on searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data while being more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.

    HIV Drug Resistant Prediction and Featured Mutants Selection using Machine Learning Approaches

    HIV/AIDS is widespread and ranks as the sixth biggest killer worldwide. Moreover, because of the rapid replication rate of HIV and its lack of a proofreading mechanism, drug resistance is common and is one of the causes of treatment failure. Although drug resistance tests are available to patients and help in choosing more effective drugs, such experiments may take up to two weeks to finish and are expensive. Thanks to rapid advances in computing, drug resistance prediction using machine learning is feasible. To predict HIV drug resistance accurately, two main tasks need to be solved: how to encode the protein structure, extracting the most useful information and feeding it into the machine learning tools; and which kinds of machine learning tools to choose. In our research, we first proposed a new protein encoding algorithm that converts proteins of various sizes into a fixed-size vector. This algorithm enables feeding the protein structure information to most state-of-the-art machine learning algorithms. Next, we proposed a new classification algorithm based on sparse representation. Mean shift and quantile regression were then included to help extract feature information from the data. Our results show that encoding protein structure using our newly proposed method is very efficient and gives consistently higher accuracy regardless of the type of machine learning tool. Furthermore, our new classification algorithm based on sparse representation is the first application of sparse representation to biological data, and its results are comparable to those of other state-of-the-art classification algorithms, for example ANN, SVM and multiple regression. Finally, mean shift and quantile regression provided us with the potentially most important drug-resistant mutants, and such results might help biologists and chemists determine which mutants are the most representative candidates for further research.

    Alfaviiruse mittestruktuurne proteaas ja tema liitvalgust substraat: täiuslikult korraldatud kooselu reeglid

    The electronic version of the dissertation does not include the publications. Alphaviruses from the Togaviridae family are RNA viruses that may cause arthritic syndromes and encephalitis. The alphavirus replication strategy relies on the production of replicase proteins initially in the form of the non-structural (ns) polyprotein precursor P1234, which during the course of replication is proteolytically processed by the virus-encoded nsP2 protease in a temporally regulated manner. The studies that form the basis of this thesis led to the identification of the requirements for substrate specificity of the nsP2 protease and revealed a novel mechanism for the regulation of processing, based on specific communication between distant parts of the viral polyprotein that are brought together during assembly of the replication complex. It was concluded that the order of alphaviral ns-polyprotein processing depends mostly on the configuration of the replication complex, imposed by intermolecular interactions meant to guarantee timely cleavages. The alphaviral protease therefore emerges as an integral part of a sophisticated signaling mechanism, in which the regulatory task of the protease consists of monitoring the succession and completion of the events of viral infection. Once the conformational changes within the replicase induced by the respective replication status allow the presentation of the scissile bond and/or other essential determinants of substrate recognition, such as exosites, local protease signaling is initiated, which apparently leads to further reconfiguration of the viral replication complex. Combined, the studies unveiled the decisive role played by the macromolecular assembly-dependent component of substrate recognition in addition to the sequence-dependent component; the combination of the two may be expected to constitute the basis of regulation in multi-site proteolytic systems in general. The described findings and their interpretations are expected to provide essential grounds and directions for further studies on the restriction of alphaviral replication, either by affecting the center of viral proteolytic activity or by intervening in its regulation through targeting intramolecular interactions.

    A Sequence and Structure Based Method to Predict Putative Substrates, Functions and Regulatory Networks of Endo Proteases

    BACKGROUND: Proteases play a central role in cellular homeostasis and are responsible for the spatio-temporal regulation of function. Many putative proteases have recently been identified through genomic approaches, leading to a surge in global profiling attempts to characterize their function. Through such efforts and others it has become evident that many proteases play non-traditional roles. Accordingly, the number and variety of protease substrates are expected to be much larger than previously assumed. In line with such global profiling attempts, we present here a method for the prediction of natural substrates of endo proteases (human proteases used as an example) by employing short peptide sequences as specificity determinants. METHODOLOGY/PRINCIPAL FINDINGS: Our method incorporates specificity determinants unique to individual enzymes and two physiologically relevant filters, namely solvent-accessible surface area (a parameter dependent on protein three-dimensional structure) and subcellular localization. By incorporating such hitherto unused principles in prediction methods, together with a novel ligand docking strategy to mimic substrate binding at the active site of the enzyme and GO functions, we identify and perform subjective validation on putative substrates of matriptase and highlight new functions of the enzyme. Using relative solvent accessibility to rank order, we show how new protease regulatory networks and enzyme cascades can be created. CONCLUSION: We believe that our physiologically relevant computational approach would be a very useful complementary method in current attempts to profile proteases (endo proteases in particular) and their substrates. In addition, by using functional annotations, we have demonstrated how normal and unknown functions of a protease can be envisaged. We have developed a network which can be integrated to create a proteolytic world. This network can in turn be extended to integrate other regulatory networks to build a system-wide knowledge of the proteome.

    From the test tube to the World Wide Web - The cleavage specificity of the proteasome

    This dissertation deals with proteasomes (from 'protease' and Greek 'soma' = protein-chopping body) and their role in the regulation of immune responses. Proteasomes are barrel-shaped molecular machines (enzymes) that are found in every cell of the body. Their job is to chop up proteins, much like a garden shredder that cuts twigs and branches into small pieces. The small protein pieces can be transported to the cell surface to be presented to T-cells, immune cells that constitute a part of the white blood cells. If a body cell is 'sick' (i.e. it has turned into a tumor cell or is infected by pathogens such as viruses or bacteria), the protein fragments on the cell surface look different. They can therefore activate T-cells to kill the diseased cell for the good of the whole organism. During the research for my Diploma thesis (April-Dec. 1997) and my Ph.D. thesis (Jan. 1998-Dec. 2000) I tried to find out in detail how proteins are cleaved by proteasomes. I was lucky and could determine some of the rules that proteasomes follow to chop up proteins. These rules were used as a basis for the prediction of proteasome cleavages. My results have important implications for vaccine development and the prediction of immune responses.

    Strategies for Computational Protein Design with Application to the Development of a Biomolecular Tool-kit for Single Molecule Protein Sequencing

    One of the key properties of proteins is that they exhibit remarkable affinities and specificities for small-molecule and peptide binding partners. To improve the success rate of rational, computational protein design and widen the scope of potential applications, it is useful to define generalized strategies and automated methodology to improve and/or alter the affinity and specificity of interactions. I have implemented several strategies for engineering protein-small molecule interactions, including improvement of substrate accessibility, stabilization of the bound state, truncation and surface engineering, and transplantation of residue-level, native (or native-like) interactions. Each strategy was applied to one or more model proteins, and the resulting changes in affinity, specificity, and activity were characterized experimentally. Finally, we designed a biomolecular tool-kit consisting of 17 engineered proteins for amino acid side-chain recognition and a single enzyme to catalyze the Edman degradation. We profiled the affinity and specificity of each protein, and implemented a computational framework that demonstrates its utility for amino acid calling in a single molecule protein sequencing assay.

    Structural Mechanism of Substrate Specificity In Human Cytidine Deaminase Family APOBEC3s

    APOBEC3s (A3s) are a family of human cytidine deaminases that play important roles in both innate immunity and cancer. A3s protect host cells against retroviruses and retrotransposons by deaminating cytosine to uracil on foreign pathogenic genomes. However, when mis-regulated, A3s can cause heterogeneities in the host genome and thus promote cancer and the development of therapeutic resistance. The family consists of seven members with either one (A3A, A3C and A3H) or two zinc-binding domains (A3B, A3D, A3F and A3G). Despite overall similarity, A3 proteins have distinct deamination activity and substrate specificity. Over the past years, several crystal and NMR structures of apo A3s and DNA/RNA-bound A3s have been determined. These structures have suggested the importance of the loops around the active site for nucleotide specificity and binding. However, the structural mechanism underlying A3 activity and substrate specificity requires further examination. Using a combination of computational molecular modeling and parallel molecular dynamics (pMD) simulations followed by experimental verification, I investigated the roles of active site residues and surrounding loops in determining the substrate specificity and RNA versus DNA binding among A3s. Starting with A3B, I revealed the structural basis and gatekeeper residue for DNA binding. I also identified a unique auto-inhibited conformation in A3B that restricts access to the active site and may underlie its lower catalytic activity compared to the highly similar A3A. In addition, I investigated the structural mechanism of substrate specificity and the ssDNA binding conformation in A3s. I found an interdependence between substrate conformation and specificity. Specifically, the linear DNA conformation helps accommodate the CC dinucleotide motif, while the U-shaped conformation prefers TC. I also identified the molecular mechanisms of substrate sequence specificity at the -1’ and -2’ positions. Characterization of substrate binding to A3A revealed that intra-DNA interactions may be responsible for the specificity in A3A. Finally, I investigated the structural mechanism for exclusion of RNA from A3G catalytic activity using similar methods. Overall, the comprehensive analysis of A3s in this thesis sheds light on the structural mechanism of substrate specificity and broadens the understanding of the molecular interactions underlying the biological function of these enzymes. These results have implications for designing specific A3 inhibitors as well as base editing systems for gene therapy.

    E-selectin as anti-inflammatory drug target : expression, purification and characterization for structural studies : assay development for antagonists evaluation

    E-, P- and L-selectin belong to the C-type lectin family of cell adhesion molecules that initiate the inflammatory response. Inflammation per se is a physiological defense mechanism, but excessive leukocyte extravasation leads to numerous pathological and disease states, as well as metastatic cancer spread. Leukocyte tethering and rolling toward the inflammatory site start with the interaction of selectins and the carbohydrate epitope of their glycoprotein ligands, sialyl Lewis x. Therefore, inhibitors of the selectin-ligand interaction are of high pharmaceutical interest as potent anti-inflammatory agents. The tetrasaccharide sialyl Lewis x serves as a lead structure in the chemical and computational search for selectin antagonists. Structural NMR and X-ray studies have indicated the binding mode of sialyl Lewis x with E- and P-selectin, but improved structural studies with second and third generation antagonists are missing. We expressed recombinant human E-, P- and L-selectin/IgG as secreted proteins in a mammalian expression system and purified them to homogeneity. Activity of the proteins was confirmed with blocking monoclonal antibodies, and ligand binding was confirmed by NMR. Bioassays were developed in cell-free and cell-based formats with E-selectin/IgG to evaluate the inhibitory potencies of in-house synthesized selectin antagonists. Due to variation and instabilities on a day-to-day and batch-to-batch basis, the assays were used only for preliminary antagonist screening. To enable further structural studies, we developed a new system for the expression of a truncated form of human E-selectin (lectin and EGF-like domains). Initially we tried to express these two domains in E. coli, but refolding of the expressed inclusion bodies was inefficient. Therefore, the lectin and EGF-like domains of human E-selectin were expressed in secreted form in baculovirus-infected insect cells with a flag-epitope on the C-terminus. The expressed protein (LecEGFFlag) was monomeric in solution, correctly folded and active, as confirmed in the reaction with monoclonal blocking antibodies and by NMR studies. The protein was expressed in two distinct glycosylation forms, with apparent molecular weights of 19.96 kDa and 21.15 kDa. In addition, we developed for the first time a cell-free assay with the truncated form of E-selectin (the aforementioned LecEGFFlag) for the evaluation of E-selectin inhibitors. In a proof-of-concept manner, three different E-selectin antagonists were tested, and the IC50 values obtained were in close agreement with published results. The reproducibility and stability of the assay on a day-to-day and batch-to-batch basis make it suitable not only for preliminary screening, but also for quantifying the inhibitory potencies of E-selectin antagonists. The developed system is also suitable for the expression and similar characterization of P- and L-selectin.