110 research outputs found
Modeling structure and flexibility of Candida antarctica lipase B in organic solvents
Background: The structure and flexibility of Candida antarctica lipase B (CALB) in water and five different organic solvent models were investigated using multiple molecular dynamics simulations to describe the effect of solvents on structure and dynamics. Interactions of the solvents with the protein and the distribution of water molecules at the protein surface were examined. Results: The simulated structure was independent of the solvent and deviated little from the crystal structure. However, the hydrophilic surface of CALB in non-polar solvents decreased by 10% in comparison to water, while the hydrophobic surface increased slightly, by 1%. The flexibility depended strongly on the dielectric constant of the solvent, with high flexibility in water and low flexibility in organic solvents. With decreasing dielectric constant, the number of surface-bound water molecules increased significantly and a spanning water network of increasing size was formed. Conclusion: The reduced flexibility of Candida antarctica lipase B in organic solvents is caused not only by solvent-protein interactions, but mainly by the formation of a spanning water network resulting from less mobile and slowly exchanging water molecules at the protein surface.
Analysis of the distribution of functionally relevant rare codons
Background: The substitution of rare codons with more frequent codons is a commonly applied method in heterologous gene expression to increase protein yields. However, in some cases these substitutions decrease protein solubility or activity. To predict these functionally relevant rare codons, a method was developed which is based on an analysis of multisequence alignments of homologous protein families. Results: The method successfully predicts the experimentally determined functionally relevant codons in fatty acid binding protein and chloramphenicol acetyltransferase. However, the analysis of 16 homologous protein families belonging to the α/β hydrolase fold showed that functionally relevant rare codons share no common location with respect to the tertiary and secondary structure. Conclusion: A systematic analysis of multisequence alignments of homologous protein families can be used to predict rare codons with a potential impact on protein expression. Our analysis showed that most genes contain at least one putative rare-codon-rich region. Rare codons located near those regions should be excluded when improving protein expression by exchanging rare codons for more frequent ones.
Standardized data, scalable documentation, sustainable storage: EnzymeML as a basis for FAIR data management in biocatalysis
The often reported reproducibility crisis in the biomedical sciences also applies to enzymology and biocatalysis, and mainly results from incomplete reporting of reaction conditions. In this Concept article, an infrastructure based on EnzymeML is sketched, which enables reporting, exchange, and storage of enzymatic data according to the FAIR data principles. EnzymeML is a novel data exchange format for enzymology and biocatalysis, which facilitates the application of the STRENDA Guidelines and thus makes data on enzyme-catalyzed reactions findable, accessible, interoperable, and reusable. EnzymeML enables the comprehensive documentation of metadata, thus fostering reproducibility and replicability in enzymology and biocatalysis. An EnzymeML Application Programming Interface integrates electronic lab notebooks with modelling platforms and databases on enzymatic reactions, and thus enables the seamless flow of enzymatic data from measurement to modelling to publication, without the need for manual intervention such as reformatting or editing. EnzymeML serves as a valuable tool for the design of biocatalytic experiments and contributes to the vision of a unified research data infrastructure for catalysis research. Funding: Deutsche Forschungsgemeinschaft (DFG)
Meta-analysis of viscosity of aqueous deep eutectic solvents and their components
Deep eutectic solvents (DES) formed by quaternary ammonium salts and hydrogen bond donors are a promising green alternative to organic solvents. Their high viscosity at ambient temperatures can limit biocatalytic applications and therefore requires fine-tuning by adjusting water content and temperature. Here, we performed a meta-analysis of the impact of water content and temperature on the viscosities of four deep eutectic solvents (glyceline, reline, N,N-diethylethanol ammonium chloride-glycerol, N,N-diethylethanol ammonium chloride-ethylene glycol), their components (choline chloride, urea, glycerol, ethylene glycol), methanol, and pure water. We analyzed the viscosity data by an automated workflow, using Arrhenius and Vogel-Fulcher-Tammann-Hesse models. The consistency and completeness of experimental data and metadata were used as an essential criterion of data quality. We found that viscosities were reported for different temperature ranges, half the time without specifying a method of desiccation, and in almost half of the reports without specifying experimental errors. We found that the viscosity of the pure components varied widely, but that all aqueous mixtures except reline have a similar excess activation energy of viscous flow, E_η(excess) = 3-5 kJ/mol, whereas reline had a negative excess activation energy, E_η(excess) = -19 kJ/mol. The data and workflows used are accessible at https://doi.org/10.15490/FAIRDOMHUB.1.STUDY.767.1
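As a rough illustration of the Arrhenius analysis named in the abstract above, the following Python sketch fits the standard Arrhenius form ln η = ln η0 + E_η/(RT) to viscosity-temperature pairs by ordinary least squares. The function names and the synthetic data points are hypothetical and are not taken from the published workflow; the Vogel-Fulcher-Tammann-Hesse model and the excess activation energy are not covered here.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def arrhenius_viscosity(eta0, ea, temp_k):
    """Arrhenius viscosity eta = eta0 * exp(Ea / (R * T))."""
    return eta0 * math.exp(ea / (R * temp_k))


def fit_arrhenius(temps_k, viscosities):
    """Fit ln(eta) = ln(eta0) + Ea / (R * T) by linear least squares.

    Returns (eta0, Ea), with Ea in J/mol and eta0 in the units of the input.
    """
    xs = [1.0 / t for t in temps_k]          # regressor: 1/T
    ys = [math.log(v) for v in viscosities]  # response: ln(eta)
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                        # slope equals Ea / R
    intercept = y_mean - slope * x_mean      # intercept equals ln(eta0)
    return math.exp(intercept), slope * R


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed parameters
    # (illustrative values only, not measurements from the meta-analysis).
    temps = [293.15, 303.15, 313.15, 323.15, 333.15]
    etas = [arrhenius_viscosity(1.0e-3, 30000.0, t) for t in temps]
    eta0, ea = fit_arrhenius(temps, etas)
    print(f"eta0 = {eta0:.3e}, Ea = {ea / 1000:.2f} kJ/mol")
```

Because the model is linear in 1/T after taking logarithms, a closed-form least-squares fit is sufficient and no optimisation library is needed; real data with experimental errors would call for weighting and uncertainty estimates.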
The Thiamine diphosphate dependent Enzyme Engineering Database: A tool for the systematic analysis of sequence and structure relations
Background: Thiamine diphosphate (ThDP)-dependent enzymes form a vast and diverse class of proteins, catalyzing a wide variety of enzymatic reactions including the formation or cleavage of carbon-sulfur, carbon-oxygen, carbon-nitrogen, and especially carbon-carbon bonds. Although very diverse in sequence and domain organisation, they share two common protein domains, the pyrophosphate (PP) and the pyrimidine (PYR) domain. For the comprehensive and systematic comparison of protein sequences and structures, the Thiamine diphosphate (ThDP)-dependent Enzyme Engineering Database (TEED) was established. Description: The TEED (http://www.teed.uni-stuttgart.de) contains 12048 sequence entries which were assigned to 9443 different proteins and 379 structure entries. Proteins were assigned to 8 different superfamilies and 63 homologous protein families. For each family, the TEED offers multisequence alignments, phylogenetic trees, and family-specific HMM profiles. The conserved pyrophosphate (PP) and pyrimidine (PYR) domains have been annotated, which allows the analysis of sequence similarities for a broad variety of proteins. Human ThDP-dependent enzymes are known to be involved in many diseases. 20 different proteins and over 40 single nucleotide polymorphisms (SNPs) of human ThDP-dependent enzymes were identified in the TEED. Conclusions: The online accessible version of the TEED has been designed to serve as a navigation and analysis tool for the large and diverse family of ThDP-dependent enzymes.
Structural classification by the Lipase Engineering Database: a case study of Candida antarctica lipase A
Background: The Lipase Engineering Database (LED) integrates information on sequence, structure and function of lipases, esterases and related proteins with the α/β hydrolase fold. A new superfamily for Candida antarctica lipase A (CALA) was introduced, including the recently published crystal structure of CALA. Since CALA has a highly divergent sequence in comparison to other α/β hydrolases, the Lipase Engineering Database was used to classify CALA within the already established classification system. This involved the comparison of CALA to similar structures as well as sequence-based comparisons against the content of the LED. Results: The new release 3.0 (December 2009) of the Lipase Engineering Database contains 24783 sequence entries for 18585 proteins as well as 656 experimentally determined protein structures, including the structure of CALA. In comparison to the previous release [1] with 4322 protein and 167 structure entries, this update represents a significant increase in data volume. By comparing CALA to representative structures from all superfamilies, a structure from the deacetylase superfamily was found to be most similar to the structure of CALA. While the α/β hydrolase fold is conserved in both proteins, the major difference is found in the cap region. Sequence alignments between both proteins show a sequence similarity of only 15%. A multisequence alignment of both protein families was used to create hidden Markov models for the cap region of CALA and showed that the cap region of CALA is unique among all proteins of the α/β hydrolase fold. By specifically comparing the substrate binding pocket of CALA to other binding pockets of α/β hydrolases, the binding pocket of Candida rugosa lipase was identified as being highly similar. This similarity also applied to the lid of Candida rugosa lipase in comparison to the potential lid of CALA. Conclusion: The LED serves as a valuable tool for the systematic analysis of single proteins or protein families. The updated release 3.0 was used for the evaluation of α/β hydrolases. The HTML version of the database with new features is available at http://www.led.uni-stuttgart.de and provides sequences, structures and a set of analysis tools including phylogenetic trees and HMM profiles.
The Lactamase Engineering Database: a critical survey of TEM sequences in public databases
Background: TEM β-lactamases are the main cause of resistance against β-lactam antibiotics. Sequence information about TEM β-lactamases is mainly found in the NCBI peptide database and the TEM mutation table at http://www.lahey.org/Studies/temtable.asp. While the TEM mutation table is manually curated by experts in the lactamase field, who guarantee reliable and consistent information, the rapidly growing sequence and annotation information from the NCBI peptide database is sometimes inconsistent. Therefore, the Lactamase Engineering Database (LacED) has been developed to collect the TEM β-lactamase sequences from the NCBI peptide database and the TEM mutation table, systematically compare sequence information and naming, identify inconsistencies, and thus provide a versatile tool for reconciliation of data and for an investigation of the sequence-function relationship. Description: The LacED currently provides 2399 sequence entries and 37 structure entries. Sequence information on 150 different TEM β-lactamases was derived from the TEM mutation table, which provides a unique number to each protein classified as TEM β-lactamase. 293 TEM-like proteins were found in the NCBI protein database, but only 113 TEM β-lactamases were common to both data sets. The 180 TEM β-lactamases from the NCBI protein database which have not yet been assigned to a TEM number fall into three classes: (1) 89 proteins from microbial organisms and 35 proteins from cloning or expression vectors had a new mutation profile; (2) 55 proteins had inconsistent annotation in terms of TEM assignment or reported mutation profile; (3) 39 proteins are fragments. The LacED is web accessible at http://www.LacED.uni-stuttgart.de and contains multisequence alignments, structure information and reconciled annotation of TEM β-lactamases. The LacED is updated weekly and supplies all data for download. Conclusion: The Lactamase Engineering Database enables a systematic analysis of TEM β-lactamase sequence and annotation data from different data sources, and thus provides a valuable tool to identify inconsistencies in sequences from the NCBI peptide database, to detect TEM β-lactamases with a novel mutation profile, and to identify new amino acid positions at which mutations can occur.
Prediction and analysis of the modular structure of cytochrome P450 monooxygenases
Background: Cytochrome P450 monooxygenases (CYPs) form a vast and diverse family of highly variable sequences. They catalyze a wide variety of oxidative reactions and are therefore of great relevance in drug development and biotechnological applications. Despite their differences in sequence and substrate specificity, the structures of CYPs are highly similar. Although CYPs have been a research focus for years, the factors mediating selectivity and activity remain vague. Description: This systematic comparison of CYPs based on the Cytochrome P450 Engineering Database (CYPED) involved sequence and structure analysis of more than 8000 sequences. 31 structures were used to generate a reliable structure-based HMM profile in order to predict structurally conserved regions. It was therefore possible to automatically transfer these modules to CYP sequences without any secondary structure information, to analyze substrate-interacting residues and to compare interaction sites with redox partners. Conclusions: Functionally relevant structural sites of CYPs were predicted. Regions involved in substrate binding were analyzed in all sequences in the CYPED. For all CYPs that require a reductase, two reductase interaction sites were identified and classified according to their length. The newly gained insights promise an improvement of engineered enzyme properties for potential biotechnological application. The annotated sequences are accessible in the current version of the CYPED. The prediction tool can be applied to any CYP sequence via the web interface at http://www.cyped.uni-stuttgart.de/cgi-bin/strpred/dosecpred.pl.
The Cytochrome P450 Engineering Database: integration of biochemical properties
Background: Cytochrome P450 monooxygenases (CYPs) form a vast and diverse enzyme class of particular interest in drug development and with high biotechnological potential. Although very diverse in sequence, they share a common structural fold. For the comprehensive and systematic comparison of protein sequences and structures, the Cytochrome P450 Engineering Database (CYPED) was established. It was built on an extensible data model that allows its functions to be readily enhanced. Description: The new version of the CYPED contains information on sequences and structures of 8613 and 47 proteins, respectively, which strictly follow Nelson's classification rules for homologous families and superfamilies. To gain biochemical information on substrates and inhibitors, the CYPED was linked to the Cytochrome P450 Knowledgebase (CPK). To overcome differences in the data model and inconsistencies in the content of CYPED and CPK, a metric was established based on sequence similarity to link protein sequences as primary keys. In addition, the annotation of structurally and functionally relevant residues was extended by a reliable prediction of conserved secondary structure elements and by information on the effect of single nucleotide polymorphisms. Conclusion: The online accessible version of the CYPED at http://www.cyped.uni-stuttgart.de provides a valuable tool for the analysis of sequences, structures and their relationships to biochemical properties.
- …