20,835 research outputs found
Prediction of solubility on recombinant expression of Plasmodium falciparum erythrocyte membrane protein 1 domains in Escherichia coli
BACKGROUND: Cellular interactions elicited by Plasmodium falciparum erythrocyte membrane protein antigen 1 (PfEMP1) are brought about by multiple DBL (Duffy binding like), CIDR (cysteine-rich interdomain region) and C2 domain types. Elucidation of the functional and structural characteristics of these domains is contingent on the abundant availability of recombinant protein in a soluble form. A priori prediction of PfEMP1 domains of the 3D7 genome strain, most likely to be expressed in the soluble form in Escherichia coli was computed and proven experimentally. METHODS: A computational analysis correlating sequence-dependent features to likelihood for expression in soluble form was computed and predictions were validated by the colony filtration blot method for rapid identification of soluble protein expression in E. coli. RESULTS: Solubility predictions for all constituent PfEMP1 domains in the decreasing order of their probability to be expressed in a soluble form (% mean solubility) are as follows: ATS (56.7%) > CIDR1α (46.8%) > CIDR2β (42.9%) > DBL2-4γ (31.7%) > DBL2β + C2 (30.6%) > DBL1α (24.9%) > DBL2-7ε (23.1%) > DBL2-5δ (14.8%). The length of the domains does not correlate to their probability for successful expression in the soluble form. Immunoblot analysis probing for soluble protein confirmed the differential in solubility predictions. CONCLUSION: The acidic terminal segment (ATS) and CIDR α/β domain types are suitable for recombinant expression in E. coli while all DBL subtypes (α, β, γ, δ, ε) are a poor choice for obtaining soluble protein on recombinant expression in E. coli. This study has relevance for researchers pursuing functional and structural studies on PfEMP1 domains
Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome
We evaluate a version of the recently-proposed classification system named
Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space
of sequences of generic objects. The ODSE system was originally presented
as a classification system for patterns represented as labeled graphs. However,
since ODSE is founded on the dissimilarity space representation of the input
data, the classifier can be easily adapted to any input domain where it is
possible to define a meaningful dissimilarity measure. Here we demonstrate the
effectiveness of the ODSE classifier for sequences by considering an
application dealing with the recognition of the solubility degree of the
Escherichia coli proteome. Solubility, or analogously aggregation propensity,
is an important property of protein molecules, which is intimately related to
the mechanisms underlying the chemico-physical process of folding. Each protein
of our dataset is initially associated with a solubility degree and it is
represented as a sequence of symbols, denoting the 20 amino acid residues. The
computational results obtained here, which we stress were achieved
with no context-dependent tuning of the ODSE system, confirm the validity and
generality of the ODSE-based approach for structured data classification. Comment: 10 pages, 49 references
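The dissimilarity space representation mentioned in this abstract can be illustrated with a minimal sketch: each sequence is mapped to a vector of its dissimilarities to a fixed set of prototype sequences, and any vector-based classifier is then applied. The sketch assumes a plain Levenshtein distance, hand-picked prototypes, and a 1-nearest-neighbour rule; the sequences, labels, and function names are hypothetical, and none of the optimization that gives ODSE its name is performed.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit insertion, deletion and substitution costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def embed(sequence: str, prototypes: Sequence[str]) -> List[int]:
    """Represent a sequence by its dissimilarity to each prototype."""
    return [edit_distance(sequence, p) for p in prototypes]


def nearest_label(vector: Sequence[int],
                  train_vectors: Sequence[Sequence[int]],
                  train_labels: Sequence[str]) -> str:
    """1-nearest-neighbour rule in the embedded (dissimilarity) space."""
    def squared_distance(u: Sequence[int], v: Sequence[int]) -> int:
        return sum((x - y) ** 2 for x, y in zip(u, v))

    best = min(range(len(train_vectors)),
               key=lambda k: squared_distance(vector, train_vectors[k]))
    return train_labels[best]


# Hypothetical toy data: sequences and labels are invented for illustration.
PROTOTYPES = ["AAAA", "CCCC"]
TRAIN = [("AAAC", "soluble"), ("CCCA", "aggregation_prone")]

TRAIN_VECTORS = [embed(seq, PROTOTYPES) for seq, _ in TRAIN]
TRAIN_LABELS = [label for _, label in TRAIN]

QUERY_LABEL = nearest_label(embed("AAAA", PROTOTYPES), TRAIN_VECTORS, TRAIN_LABELS)
```

Because only the dissimilarity function touches the raw input, swapping `edit_distance` for a graph dissimilarity would leave the rest of the pipeline unchanged, which is the adaptability the abstract points to.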
Toward a multilevel representation of protein molecules: comparative approaches to the aggregation/folding propensity problem
This paper builds upon the fundamental work of Niwa et al. [34], which
provides the unique possibility to analyze the relative aggregation/folding
propensity of the elements of the entire Escherichia coli (E. coli) proteome in
a cell-free standardized microenvironment. The difficulty of the problem comes
from the superposition of the driving forces of intra- and inter-molecular
interactions, and it is mirrored by the evidence of shifts from folding to
aggregation phenotypes caused by single-point mutations [10]. Here we apply several
state-of-the-art classification methods coming from the field of structural
pattern recognition, with the aim to compare different representations of the
same proteins gathered from the Niwa et al. data base; such representations
include sequences and labeled (contact) graphs enriched with chemico-physical
attributes. Through this comparison, we are also able to identify some interesting
general properties of proteins. Notably, (i) we suggest a threshold around 250
residues discriminating "easily foldable" from "hardly foldable" molecules
consistent with other independent experiments, and (ii) we highlight the
relevance of contact graph spectra for folding behavior discrimination and
characterization of the E. coli solubility data. The soundness of the
experimental results presented in this paper is proved by the statistically
relevant relationships discovered among the chemico-physical description of
proteins and the developed cost matrix of substitution used in the various
discrimination systems. Comment: 17 pages, 3 figures, 46 references
Solution structure of a bacterial microcompartment targeting peptide and its application in the construction of an ethanol bioreactor
Targeting of proteins to bacterial microcompartments (BMCs) is mediated by an 18-amino-acid peptide sequence. Herein, we report the solution structure of the N-terminal targeting peptide (P18) of PduP, the aldehyde dehydrogenase associated with the 1,2-propanediol utilization metabolosome from Citrobacter freundii. The solution structure reveals the peptide to have a well-defined helical conformation along its whole length. Saturation transfer difference and transferred NOE NMR has highlighted the observed interaction surface on the peptide with its main interacting shell protein, PduK. By tagging both a pyruvate decarboxylase and an alcohol dehydrogenase with targeting peptides, it has been possible to direct these enzymes to empty BMCs in vivo and to generate an ethanol bioreactor. Not only are the purified, redesigned BMCs able to transform pyruvate into ethanol efficiently, but the strains containing the modified BMCs produce elevated levels of alcohol
One-class classifiers based on entropic spanning graphs
One-class classifiers offer valuable tools to assess the presence of outliers
in data. In this paper, we propose a design methodology for one-class
classifiers based on entropic spanning graphs. Our approach also allows
non-numeric data to be processed by means of an embedding
procedure. The spanning graph is learned on the embedded input data and the
resulting partition of vertices defines the classifier. The final partition is
derived by exploiting a criterion based on mutual information minimization.
Here, we compute the mutual information by using a convenient formulation
provided in terms of the α-Jensen difference. Once training is
completed, in order to associate a confidence level with the classifier
decision, a graph-based fuzzy model is constructed. The fuzzification process
is based only on topological information of the vertices of the entropic
spanning graph. As such, the proposed one-class classifier is also suitable for
data characterized by complex geometric structures. We provide experiments on
well-known benchmarks containing both feature vectors and labeled graphs. In
addition, we apply the method to the protein solubility recognition problem by
considering several representations for the input samples. Experimental results
demonstrate the effectiveness and versatility of the proposed method with
respect to other state-of-the-art approaches. Comment: Extended and revised version of the paper "One-Class Classification
Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN,
Vancouver, Canada
Mechanistic behaviour and molecular interactions of heat shock protein 47 (HSP47)
This project involves the study of heat shock protein 47 (HSP47), which is a molecular chaperone crucial for collagen biosynthesis. It exhibits a high degree of sequence homology with members of the serine protease inhibitor (serpin) superfamily, though HSP47 does not possess inhibitory activity. It is a single-substrate chaperone, and binds only to collagen. ‘Knock-out’ of the hsp47 gene impairs the secretion of correctly folded collagen triple helix molecules, leading to embryonic lethality in mice. Thus the aim of this project was to elucidate the specific mechanism that governs the binding to and release from collagen at the molecular level, known as the ‘pH-switch mechanism’. Emphasis is placed on histidine (His) residues, as the HSP47-collagen dissociation pH is similar to the pKa of the imidazole side chain of His residues. Site-directed mutagenesis was used to mutate surface His residues, based on a mouse HSP47 homology model. The effects of the mutations on the behaviour of HSP47 were then assessed by collagen binding assays and structural analyses with circular dichroism (CD). All mutants were found to have good solubility and to retain their ability to bind collagen like wild-type HSP47 in batch assays, but perturbed behaviour was seen in column experiments. Mutation of the His residue at position 191 (H191) causes a shift in the collagen dissociation pH, while mutation of H197 and/or H198 disrupts the specific HSP47-collagen interaction. H191, H197 and H198 are predicted to be located in the region near the C-terminus of strand 3 of β-sheet A (s3A) in the homology model, a region specifically known as the ‘breach cluster’ in serpin nomenclature. The extent of conformational rearrangement of this region was further investigated by means of intrinsic tryptophan fluorescence spectroscopy using a series of single tryptophan (Trp) mutants.
Results from analyses performed on the mutants did not contradict the observation seen in His mutational work, as Trp residues in the ‘breach’ cluster are likely to be located in the dynamic region of HSP47 pH-triggered conformational change. In conclusion, this study establishes the importance of His residues in the ‘breach cluster’ to HSP47 pH-switch behaviour. Finally, a model for HSP47 pH-switch mechanism was proposed from data obtained via mutagenesis experiments. The model is hoped to assist future research into HSP47 cellular behaviour and will also be of great use in therapeutic applications involving the molecular chaperone
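The link drawn in this abstract between the dissociation pH and the pKa of the histidine imidazole side chain follows from the Henderson-Hasselbalch relation: near the pKa, a small change in pH produces a large change in the fraction of protonated side chains. The sketch below is an illustration under stated assumptions, not a calculation from the thesis. It uses a textbook pKa of about 6.0 for the free imidazole side chain, whereas the pKa of a His residue inside a folded protein can differ substantially; the function name and the pH values are hypothetical.

```python
def protonated_fraction(ph: float, pka: float = 6.0) -> float:
    """Fraction of a titratable group in its protonated form at a given pH.

    Henderson-Hasselbalch: pH = pKa + log10([base] / [acid]),
    so [acid] / ([acid] + [base]) = 1 / (1 + 10 ** (pH - pKa)).
    The default pKa of 6.0 is an assumed textbook value for free histidine.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative pH values only; they are not taken from the abstract.
for ph in (5.5, 6.0, 6.5, 7.4):
    print(f"pH {ph:.1f}: {protonated_fraction(ph):.2f} protonated")
```

At a pH equal to the pKa exactly half of the groups are protonated, which is why residues with a pKa near the observed dissociation pH are natural candidates for a pH switch.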
Increased protein stability and decreased protein turnover in the Caenorhabditis elegans Ins/IGF-1 daf-2 mutant
In Caenorhabditis elegans, cellular proteostasis is likely essential for longevity. Autophagy has been shown to be essential for lifespan extension of daf-2 insulin/IGF mutants. Therefore, it can be hypothesized that daf-2 mutants achieve this phenotype by increasing protein turnover. However, such a mechanism would exert a substantial energy cost. By using classical S-35 pulse-chase labeling, we observed that protein synthesis and degradation rates are decreased in young adults of the daf-2 insulin/IGF mutants. Although reduction of protein turnover may be energetically favorable, it may lead to accumulation and aggregation of damaged proteins. As this has been shown not to be the case in daf-2 mutants, another mechanism must exist to maintain proteostasis in this strain. We observed that proteins isolated from daf-2 mutants are more soluble in acidic conditions due to increased levels of trehalose. This suggests that trehalose may decrease the potential for protein aggregation and increase proteostasis in the daf-2 mutants. We postulate that daf-2 mutants save energy by decreasing protein turnover rates and instead stabilize their proteome by trehalose
Expression and purification of an adenylation domain from a eukaryotic nonribosomal peptide synthetase: Using structural genomics tools for a challenging target
Nonribosomal peptide synthetases (NRPSs) are large multimodular and multidomain enzymes that are involved in synthesising an array of molecules that are important in human and animal health. NRPSs are found in both bacteria and fungi but most of the research to date has focused on the bacterial enzymes. This is largely due to the technical challenges in producing active fungal NRPSs, which stem from their large size and multidomain nature. In order to target fungal NRPS domains for biochemical and structural characterisation, we tackled this challenge by using the cloning and expression tools of structural genomics to screen the many variables that can influence the expression and purification of proteins. Using these tools we have screened 32 constructs containing 16 different fungal NRPS domains or domain combinations for expression and solubility. Two of these yielded soluble protein with one, the third adenylation domain of the SidN NRPS (SidNA3) from the grass endophyte Neotyphodium lolii, being tractable for purification using Ni-affinity resin. The initial purified protein exhibited poor solution behaviour but optimisation of the expression construct and the buffer conditions used for purification, resulted in stable recombinant protein suitable for biochemical characterisation, crystallisation and structure determination
Preliminary X-ray diffraction analysis of YqjH from Escherichia coli: a putative cytoplasmic ferri-siderophore reductase
YqjH is a cytoplasmic FAD-containing protein from Escherichia coli; based on homology to ViuB of Vibrio cholerae, it potentially acts as a ferri-siderophore reductase. This work describes its overexpression, purification, crystallization and structure solution at 3.0 Å resolution. YqjH shares high sequence similarity with a number of known siderophore-interacting proteins and its structure was solved by molecular replacement using the siderophore-interacting protein from Shewanella putrefaciens as the search model. The YqjH structure resembles those of other members of the NAD(P)H:flavin oxidoreductase superfamily