Structure-based function prediction of uncharacterized protein using binding sites comparison.
A challenge in structural genomics is prediction of the function of uncharacterized proteins. When proteins cannot be related to other proteins of known activity, identification of function based on sequence or structural homology is impossible, and in such cases it would be useful to assess structurally conserved binding sites in connection with the protein's function. In this paper, we propose the function of a protein of unknown activity, the Tm1631 protein from Thermotoga maritima, by comparing its predicted binding site to a library containing thousands of candidate structures. The comparison revealed numerous similarities with nucleotide binding sites, including, specifically, a DNA-binding site of endonuclease IV. We constructed a model of this Tm1631 protein with a DNA ligand from the newly found similar binding site using ProBiS, and validated this model by molecular dynamics. The interactions predicted by the Tm1631-DNA model corresponded to those known to be important in the endonuclease IV-DNA complex model, and the corresponding binding free energies calculated from these models were in close agreement. We thus propose that Tm1631 is a DNA-binding enzyme with endonuclease activity that recognizes DNA lesions in which at least two consecutive nucleotides are unpaired. Our approach is general and can be applied to any protein of unknown function. It might also be useful to guide experimental determination of the function of uncharacterized proteins.
Metabolism of halogenated compounds by Rhodococcus UKMP-5M
Members of the genus Rhodococcus are well known for their high metabolic capacity to degrade a wide range of organic compounds, from simple hydrocarbons to more recalcitrant compounds such as polychlorinated biphenyls. Their ability to display novel enzymatic capabilities for the transformation of many hazardous contaminants in the environment makes them potential candidates for bioremediation. Rhodococcus UKMP-5M, an actinomycete isolated in Peninsular Malaysia, shows great potential for the degradation of cyanide, hydrocarbons and phenolic compounds. In the present study, the capacity of this strain to degrade halogenated compounds was explored.
Preliminary investigations showed that R. UKMP-5M was not able to utilise any of the halogenated compounds tested as sole carbon and energy source, but resting cells of R. UKMP-5M were able to dechlorinate several compounds, including chloroalkanes, chloroalcohols and chloroacids, and the activity was threefold higher when the cells were grown in the presence of 1-chlorobutane (1-CB). Therefore, 1-CB was chosen as a substrate to unravel the mechanism of dehalogenation in R. UKMP-5M. In contrast to the classic hydrolytic route for the assimilation of 1-CB in many organisms, R. UKMP-5M was able to metabolise and release chloride from 1-CB, but was unable to use the product of 1-CB metabolism as a growth substrate.
On comparing the protein profiles of the induced and non-induced cells of R. UKMP-5M, two types of monooxygenases were identified in the induced condition that were not present in the uninduced sample. The strict oxygen requirement for dechlorination of 1-CB and the identification of monooxygenases in the induced protein extract suggest that 1-CB dehalogenation is likely to be catalysed by a monooxygenase. In addition to these monooxygenases, a protein that was later identified as an amidohydrolase (Ah) was also found to be induced when the cells were exposed to 1-CB. Therefore, Ah from R. UKMP-5M was cloned and expressed in E. coli to test the ability of the purified Ah to release chloride from 1-CB. The heterologous expression of Ah in E. coli resulted in the formation of inclusion bodies, and western blot analyses further confirmed that no soluble form of Ah was present. Multiple attempts to obtain a soluble and functionally active Ah were not successful. Therefore, on-column refolding was carried out to obtain a biologically active Ah. A 3D model based on structural homology was predicted as a preliminary step to characterize this protein. However, when assayed with 1-CB, Ah was found not to catalyze dehalogenation. All results of this thesis suggest that metabolism of 1-CB by R. UKMP-5M proceeds via γ-butyrolactone, which acts as a potent intracellular electrophile that covalently modifies proteins and nucleic acids. The findings from this research are important for determining the metabolic capacity of a Malaysian Rhodococcus in the dehalogenation of halogenated compounds and its potential application in bioremediation.
Structure-Based Genome Scale Function Prediction and Reconstruction of the Mycobacterium tuberculosis Metabolic Network
Due to vast improvements in sequencing methods over the past few decades, the availability of genomic data is rapidly increasing, thus bringing about the need for functional characterization tools. Considering the breadth of data involved, functional assays would be impractical and only a computational method could afford fast and cost-effective functional annotations. Therefore, homology-based computational methods are routinely used to assign putative molecular functions that can later be confirmed with targeted experiments. These methods are particularly well suited to predict the function of enzymes because most metabolic pathways are conserved across organisms. However, the current methods have limitations, especially when considering enzymes that have very low sequence and structure homology to well-annotated enzymes.
We hypothesized that two enzymes with the same molecular function shared significant sequence homology in the region surrounding the active site, even if they appear diverged at the global sequence level. First, we have investigated the limits of sequence and structure conservation for enzymes with the same function during divergent evolution. The goal of this was to determine the sequence identity threshold beyond which functional annotations should not be transferred between two sequences; that is the level of homology beyond which the pair of proteins would not be expected to have the same function. Our analysis, which compares several models of sequence evolution, shows that the sequences of orthologous proteins catalyzing the same reaction rarely diverge beyond 30% identity, even after approximately 3.5 billion years of evolution. As for structure conservation, enzymes catalyzing the same reactions rarely diverge beyond 3 Å root-mean-square distance (RMSD). We have also explored sequence conservation constraints as a function of the distance to the active site. Although residues closer to the protein active site (within a radius of 10 Å around the catalytic residues) are mutating significantly slower, the requirement to preserve the molecular function also constrains residues at other parts of the protein.
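To make the two thresholds in this paragraph concrete, the following is a minimal sketch of how percent sequence identity and RMSD are commonly computed. It is not the authors' pipeline: the function names are hypothetical, the sequences are assumed to be already aligned, the coordinates are assumed to be already superposed, and identity is taken over gap-free columns, which is only one of several conventions in use.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both strings come from an existing pairwise alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between paired (x, y, z) coordinates.

    Assumption: the two structures are already superposed and the atoms
    are listed in corresponding order; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    print(percent_identity("ACDE-F", "ACDQGF"))
    print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 0, 3)]))
```

Under the abstract's findings, a pair scoring below roughly 30% identity or above roughly 3 Å RMSD would fall outside the range in which same-function orthologs are usually found.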
From these results, we have developed a structure-based function prediction method where we employ active site conservation in addition to global sequence homology for functional characterization. We then integrated this method with a probabilistic whole-genome function prediction framework previously developed in the Vitkup group, GLOBUS. The original version of GLOBUS uses sampling of probability space to assign functions to all putative metabolic genes in an input genome by considering sequence homology to known enzymes, gene-gene context and EC co-occurrence. Applying this novel method to the whole-genome metabolic reconstruction of Mycobacterium tuberculosis, we made several novel predictions for genes with apparent links to pathogenesis. Notably, our predictions allowed us to reconstruct the cholesterol degradation pathway in M. tuberculosis, which has been implicated in bacterial persistence in the literature but remains to be fully characterized. This pathway is absent from previously published metabolic models of M. tuberculosis. Our new model can now be used to simulate different environments and conditions in order to gain a better understanding of the metabolic adaptability of M. tuberculosis during pathogenesis.
Novel pharmacophore clustering methods for protein binding site comparison
Proteins perform diverse functions within cells. Some of the functions depend on the protein being involved in a protein complex, interacting with other proteins or with other entities (ligands) through specific binding sites on their surface. Comparison of protein binding sites has potential benefits in many research fields, including drug promiscuity studies, polypharmacology and immunology. While multiple methods have been proposed for comparing binding sites, they tend to focus on comparing very similar proteins and have only been developed for small specific datasets or very targeted applications. None of these methods make use of the powerful representation afforded by 3D complex-based pharmacophores. A pharmacophore model provides a description of a binding site, consisting of a group of chemical features arranged in three-dimensional space, that can be used to represent biological activities.
Two different pharmacophore comparison and clustering methods based on the Iterative Closest Point (ICP) algorithm are proposed: a 3-dimensional ICP pharmacophore clustering method, and an N-dimensional ICP pharmacophore clustering method. These methods are complemented by a series of data pre-processing methods for input data preparation. The implementation of the methods takes computational representations (pharmacophores) of single molecule or protein complexes as input and produces distance matrices that can be visualised as dendrograms. The methods integrate both alignment-dependent and alignment-independent concepts.
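For readers unfamiliar with ICP, the sketch below shows the basic loop the algorithm is built on: match each source point to its nearest target point, solve for the best rigid transform, apply it, and repeat. It is a deliberately simplified 2D toy, not the thesis's 3D or N-dimensional method. It ignores pharmacophore feature types, and all function names are hypothetical.

```python
import math


def nearest(point, targets):
    """Return the target point closest to `point` (brute-force search)."""
    return min(targets, key=lambda t: (t[0] - point[0]) ** 2 + (t[1] - point[1]) ** 2)


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = 0.0
    cross = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x, y, u, v = x - sx, y - sy, u - dx, v - dy  # centre both point sets
        dot += x * u + y * v
        cross += x * v - y * u
    theta = math.atan2(cross, dot)  # closed-form optimal rotation in 2D
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty


def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iterations=20):
    """Align source to target; return aligned points and the RMS residual."""
    current = list(source)
    for _ in range(iterations):
        matched = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matched)
        current = transform(current, theta, tx, ty)
    sq = 0.0
    for p in current:
        q = nearest(p, target)
        sq += (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return current, math.sqrt(sq / len(current))


if __name__ == "__main__":
    target = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 6.0), (-2.0, 2.0)]
    source = transform(target, 0.1, 0.3, -0.2)  # slightly rotated and shifted copy
    aligned, residual = icp_2d(source, target)
    print(residual)
```

One plausible use, stated here as an assumption rather than as the thesis's procedure, is to take the final residual of each pairwise alignment as an entry in the distance matrix that is then clustered and drawn as a dendrogram.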
Both clustering methods were successfully evaluated using a 31 globulin-binding steroid dataset and a 41 antibody-antigen dataset, and were able to handle a larger dataset of 159 protein homodimers. For the steroid dataset, the resulting classification of ligands shows good correspondence with a classification based on binding affinity. For the antibody-antigen dataset, the classification of antigens reflected both antigen type and binding antibody. The application to homodimers demonstrated the ability of both clustering methods to handle a larger dataset, and the possibility of visualising N-D pairwise comparisons using structural superposition of binding sites.
Structural analysis of virulence factors and essential proteins from Clostridium difficile
Clostridium difficile is a Gram-positive, anaerobic, endospore-forming bacterium that produces several virulence factors, most prominently the secreted protein toxins Toxin A (TcdA) and Toxin B (TcdB). Clostridium difficile infections (CDI) are often hospital acquired and antibiotic-associated. Treatment of CDI currently involves taking broad-spectrum antibiotics, e.g. vancomycin. Due to the extremely high relapse rate of CDI after antibiotic treatment, the emergence of new highly virulent C. difficile strains and the threatening antibiotic resistance, the need for new therapeutic treatment methods for CDI is more urgent than ever before. To develop new therapeutics, a detailed knowledge of the molecular processes inside the pathogen as well as a comprehensive structural and functional knowledge of its virulence factors and proteins involved in infection is essential.
The aim of this thesis was therefore the structural characterization of several important proteins of Clostridium difficile: its main virulence factor TcdB; proteins that are involved in basic cellular processes, i.e. growth and sporulation (CD1219 and CD1823); and the so-called diffocin proteins CD1363 and CD1364, bacteriophage tail-like proteins that act as bacteriocins.
Full-length genes of the respective proteins (and truncated fragments of TcdB) were cloned into expression vectors, and the proteins were expressed in E. coli or Bacillus megaterium and purified by affinity chromatography, ion exchange chromatography and size exclusion chromatography. Pure protein samples were used for structural analysis by small-angle X-ray scattering (SAXS) and X-ray crystallography. SAXS envelopes were calculated for all proteins in this thesis, and crystal structures were determined for CD1219, CD1823, CD1363 and CD1364. Based on the crystal structures of the proteins, hypotheses about their molecular function could be derived.