Inferring stabilizing mutations from protein phylogenies: application to influenza hemagglutinin
One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach is able to predict published experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution.
Native-state stability determines the extent of degradation relative to secretion of protein variants from Pichia pastoris.
We have investigated the relationship between the stability and secreted yield of a series of mutational variants of human lysozyme (HuL) in Pichia pastoris. We show that genes directly involved in the unfolded protein response (UPR), ER-associated degradation (ERAD) and ER-phagy are transcriptionally up-regulated more quickly and to higher levels in response to expression of more highly-destabilised HuL variants, and that those variants are secreted to lower yield. We also show that the less stable variants are retained within the cell and may also be targeted for degradation. To explore the relationship between stability and secretion further, two different single-chain-variable-fragment (scFv) antibodies were also expressed in P. pastoris, but only one of the scFvs gave rise to secreted protein. The non-secreted scFv was detected within the cell and the UPR indicators were pronounced, as they were for the poorly-secreted HuL variants. The non-secreted scFv was modified by changing either the framework regions or the linker to improve the predicted stability of the scFv; secretion was then achieved and the levels of UPR indicators were lowered. Our data support the hypothesis that less stable proteins are targeted for degradation over secretion and that this accounts for the decrease in the yields observed. We discuss the secretion of proteins in relation to lysozyme amyloidosis, in particular, and optimised protein secretion, in general.
Stabilisation of the Fc Fragment of Human IgG1 by Engineered Intradomain Disulfide Bonds
We report the stabilization of the human IgG1 Fc fragment by engineered intradomain disulfide bonds. One of these bonds, which connects the N-terminus of the CH3 domain with the F-strand, led to an increase of the melting temperature of this domain by 10°C as compared to the CH3 domain in the context of the wild-type Fc region. Another engineered disulfide bond, which connects the BC loop of the CH3 domain with the D-strand, resulted in an increase of Tm of 5°C. Combined in one molecule, both intradomain disulfide bonds led to an increase of the Tm of about 15°C. None of these mutations had any impact on the thermal stability of the CH2 domain. Importantly, the binding of neonatal Fc receptor was also not influenced by the mutations. Overall, the stabilized CH3 domains described in this report provide an excellent basic scaffold for the engineering of Fc fragments for antigen-binding or other desired additional or improved properties. Additionally, we have introduced the intradomain disulfide bonds into an IgG Fc fragment engineered in C-terminal loops of the CH3 domain for binding to Her2/neu, and observed an increase of the Tm of the CH3 domain by 7.5°C for CysP4, 15.5°C for CysP2 and 19°C for the CysP2 and CysP4 disulfide bonds combined in one molecule.
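The Tm shifts quoted above can be checked for additivity with simple arithmetic. The sketch below is only an illustration: the helper name `additivity_gap` is hypothetical, and the "about 15°C" figure is treated as exactly 15 for the comparison.

```python
def additivity_gap(single_shifts, combined_shift):
    """Return (sum of single-bond Tm shifts, combined shift minus that sum), in °C."""
    expected = sum(single_shifts)
    return expected, combined_shift - expected


# Values taken from the abstract; "about 15" is treated as 15.0 here (assumption).
wild_type_fc = additivity_gap([10.0, 5.0], 15.0)
her2_binding_fc = additivity_gap([7.5, 15.5], 19.0)

if __name__ == "__main__":
    print("wild-type Fc context:", wild_type_fc)
    print("Her2/neu-binding Fc context:", her2_binding_fc)
```

Under these assumptions the two bonds are roughly additive in the wild-type Fc context, whereas in the Her2/neu-binding Fc the combined shift (19°C) falls 4°C short of the sum of the single-bond shifts (23°C).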
A review of estimation of distribution algorithms in bioinformatics
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics applications. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.
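To make the idea of a probabilistic model learnt from promising solutions concrete, here is a minimal sketch of one of the simplest EDA variants, a univariate marginal distribution algorithm (UMDA), applied to the toy OneMax problem (maximise the number of 1 bits). It is not taken from the review; the function name, parameter values and the clamping bounds are illustrative assumptions.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=30, generations=50, seed=0):
    """Univariate EDA on OneMax: sample, select the best, re-estimate marginals."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: each bit is 1 with probability 0.5
    best = None
    for _ in range(generations):
        # Sample a population from the current probabilistic model.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        # Select the most promising solutions (fitness = number of ones).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # Learn the new model: per-bit frequency among the selected solutions,
        # clamped to [0.05, 0.95] (an assumption) so no bit becomes fixed.
        probs = [
            min(max(sum(ind[i] for ind in selected) / n_select, 0.05), 0.95)
            for i in range(n_bits)
        ]
    return best, probs


if __name__ == "__main__":
    best, probs = umda_onemax()
    print("best fitness:", sum(best))
```

Unlike a genetic algorithm, there is no crossover or mutation operator here: new candidates come only from sampling the learnt model. More complex EDA variants replace the independent per-bit marginals with models that capture dependencies between variables.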
Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen
The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca’s large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated to provide a comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.
Nh3D: A reference dataset of non-homologous protein structures
Abstract
Background
The statistical analysis of protein structures requires datasets in which structural features can be considered independently distributed, i.e. not related through common ancestry, and that fulfil minimal requirements regarding the experimental quality of the structures they contain. However, non-redundant datasets based on sequence similarity invariably contain distantly related homologues. Here we provide a reference dataset of non-homologous protein domains, assuming that structural dissimilarity at the topology level is incompatible with recognizable common ancestry. The dataset is based on domains at the Topology level of the CATH database, which hierarchically classifies all protein structures. It contains the best refined representatives of each Topology level, validates structural dissimilarity and removes internally duplicated fragments. The compilation of Nh3D is fully scripted.
Results
The current Nh3D list contains 570 domains with a total of 90780 residues. It covers more than 70% of folds at the Topology level of the CATH database and represents more than 90% of the structures in the PDB that have been classified by CATH. We observe that even though all protein pairs are structurally dissimilar, some pairwise sequence identities after global alignment are greater than 30%.
Conclusion
Nh3D is freely available as a reference dataset for the statistical analysis of sequence and structure features of proteins in the PDB. Regularly updated versions of Nh3D and the corresponding PDB-formatted coordinate sets are accessible from our Web site (http://www.schematikon.org).
Preferential species-restricted heavy/light chain pairing in rat/mouse quadromas. Implications for a single-step purification of bispecific antibodies.
Conventional mouse/mouse or rat/rat hybrid-hybridoma supernatants contain up to 10 different IgG molecules consisting of various combinations of heavy and light chains. Hence, the yield of functional bispecific Ab is low, and purification is often complicated, hampering a general preclinical evaluation of, e.g., bispecific Ab-mediated tumor immunotherapy in animal models. In experiments to overcome this drawback, we found that fusion of rat with mouse hybridomas opens the possibility of large scale production of bispecific Ab due to the increased incidence of correctly paired Ab and facilitated purification. In essence, rat/mouse quadroma-derived bispecific Ab have the following advantages: 1) enrichment of functional bispecific Ab because of preferential species-restricted heavy/light chain pairing (observed in four of four rat-mouse quadromas) in contrast to the random pairing in conventional mouse/mouse or rat/rat quadromas, and 2) a possible one-step purification of the quadroma supernatant with protein A. This simple chromatography step does not bind unwanted variants with parental rat/rat heavy chain configuration, and the desired rat/mouse bispecific Ab are retained, which can then easily be separated from parental mouse Ab by sequential pH elution.
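The figure of up to 10 different IgG molecules follows from counting chain combinations under random pairing. The sketch below enumerates them, assuming each IgG consists of two heavy/light arms that are interchangeable and that the quadroma expresses two heavy and two light chains; the chain labels (`Ha`, `Hb`, `La`, `Lb`) and the function name are hypothetical.

```python
from itertools import combinations_with_replacement, product


def igg_species(heavy_chains, light_chains):
    """Enumerate IgG molecules as unordered pairs of heavy/light arms."""
    # Every heavy chain can pair with every light chain to form one arm.
    arms = [f"{h}/{l}" for h, l in product(heavy_chains, light_chains)]
    # An IgG has two arms; the order of the arms does not matter.
    return list(combinations_with_replacement(arms, 2))


# Hypothetical labels: parental antibody a contributes Ha and La, b contributes Hb and Lb.
species = igg_species(["Ha", "Hb"], ["La", "Lb"])

if __name__ == "__main__":
    for first_arm, second_arm in species:
        print(first_arm, "+", second_arm)
    print("total:", len(species))
```

Four possible arms give 4 × 5 / 2 = 10 unordered pairs, of which only one (`Ha/La` with `Hb/Lb`) is the correctly paired bispecific molecule. Species-restricted heavy/light pairing, as described above, would remove the mismatched arms from this enumeration.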