259 research outputs found

    Metazoans evolved by taking domains from soluble proteins to expand intercellular communication network

    A central question in animal evolution is how multicellular animals evolved from unicellular ancestors. We hypothesize that membrane proteins must be key players in the development of multicellularity because they are well positioned to form the cell-cell contacts and to provide the intercellular communication required for the creation of complex organisms. Here we find that a major mechanism for the necessary increase in membrane protein complexity in the transition from non-metazoan to metazoan life was the new incorporation of domains from soluble proteins. The membrane proteins that have incorporated soluble domains in metazoans are enriched in many of the functions unique to multicellular organisms such as cell-cell adhesion, signaling, immune defense and developmental processes. They also show enhanced protein-protein interaction (PPI) network complexity and centrality, suggesting an important role in the cellular diversification found in complex organisms. Our results expose an evolutionary mechanism that contributed to the development of higher life forms.

    Structural Insights into the Evolution of a Non-Biological Protein: Importance of Surface Residues in Protein Fold Optimization

    Phylogenetic profiling of amino acid substitution patterns in proteins has led many to conclude that most structural information is carried by interior core residues that are solvent inaccessible. This conclusion is based on the observation that buried residues generally tolerate only conserved sequence changes, while surface residues allow more diverse chemical substitutions. This notion is now changing as it has become apparent that both core and surface residues play important roles in protein folding and stability. Unfortunately, the ability to identify specific mutations that will lead to enhanced stability remains a challenging problem. Here we discuss two mutations that emerged from an in vitro selection experiment designed to improve the folding stability of a non-biological ATP binding protein. These mutations alter two solvent accessible residues, and dramatically enhance the expression, solubility, thermal stability, and ligand binding affinity of the protein. The significance of both mutations was investigated individually and together, and the X-ray crystal structures of the parent sequence and double mutant protein were solved to a resolution limit of 2.8 and 1.65 Å, respectively. Comparative structural analysis of the evolved protein to proteins found in nature reveals that our non-biological protein evolved certain structural features shared by many thermophilic proteins. This experimental result suggests that protein fold optimization by in vitro selection offers a viable approach to generating stable variants of many naturally occurring proteins whose structures and functions are otherwise difficult to study.

    Somatic Mutations Reveal Lineage Relationships and Age-Related Mutagenesis in Human Hematopoiesis

    Mutation accumulation during life can contribute to hematopoietic dysfunction; however, the underlying dynamics are unknown. Somatic mutations in blood progenitors can provide insight into the rate and processes underlying this accumulation, as well as the developmental lineage tree and stem cell division numbers. Here, we catalog mutations in the genomes of human-bone-marrow-derived and umbilical-cord-blood-derived hematopoietic stem and progenitor cells (HSPCs). We find that mutations accumulate gradually during life with approximately 14 base substitutions per year. The majority of mutations were acquired after birth and could be explained by the constant activity of various endogenous mutagenic processes, which also explains the mutation load in acute myeloid leukemia (AML). Using these mutations, we construct a developmental lineage tree of human hematopoiesis, revealing a polyclonal architecture and providing evidence that developmental clones exhibit multipotency. Our approach highlights features of human native hematopoiesis and its implications for leukemogenesis.
    The authors would like to thank the Hartwig Medical Foundation (Amsterdam, the Netherlands) for facilitating low-input whole-genome sequencing, P.J. Coffer for providing umbilical cord blood samples, and P.J. Campbell and D.C. Wedge for sharing scripts. This study was financially supported by an EMBO long-term fellowship to F.G.O. (ALTF 655-2016), an ERC starting grant (ERC2014-STG637904) to I.V., a VIDI grant of the Netherlands Organisation for Scientific Research (NWO) (no. 016.Vidi.171.023) to R.v.B., funding from Worldwide Cancer Research (WCR) (no. 16-0193) to R.v.B., and NIH grants HL128850-01A1 and P01HL13147 to F.D.C. F.D.C. is a scholar of the Howard Hughes Medical Institute and the Leukemia and Lymphoma Society.

    Potential for early warning of viral influenza activity in the community by monitoring clinical diagnoses of influenza in hospital emergency departments

    Background: Although syndromic surveillance systems are gaining acceptance as useful tools in public health, doubts remain about whether the anticipated early warning benefits exist. Many assessments of this question do not adequately account for the confounding effects of autocorrelation and trend when comparing surveillance time series and few compare the syndromic data stream against a continuous laboratory-based standard. We used time series methods to assess whether monitoring of daily counts of Emergency Department (ED) visits assigned a clinical diagnosis of influenza could offer earlier warning of increased incidence of viral influenza in the population compared with surveillance of daily counts of positive influenza test results from laboratories. Methods: For the five-year period 2001 to 2005, time series were assembled of ED visits assigned a provisional ED diagnosis of influenza and of laboratory-confirmed influenza cases in New South Wales (NSW), Australia. Poisson regression models were fitted to both time series to minimise the confounding effects of trend and autocorrelation and to control for other calendar influences. To assess the relative timeliness of the two series, cross-correlation analysis was performed on the model residuals. Modelling and cross-correlation analysis were repeated for each individual year. Results: Using the full five-year time series, short-term changes in the ED time series were estimated to precede changes in the laboratory series by three days. For individual years, the estimate was between three and 18 days. The time advantage estimated for the individual years 2003–2005 was consistently between three and four days. Conclusion: Monitoring time series of ED visits clinically diagnosed with influenza could potentially provide three days' early warning compared with surveillance of laboratory-confirmed influenza. When current laboratory processing and reporting delays are taken into account this time advantage is even greater.
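The cross-correlation step described in the abstract above can be illustrated with a minimal sketch. It assumes two residual series are already available (here they are synthetic, not the NSW data), and the function names `cross_correlation` and `best_lead` are hypothetical. The Poisson regression that produces the residuals is not reproduced.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ccf = {}
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]   # earlier values of the leading series
        b = y[lag:]             # later values of the lagging series
        ccf[lag] = float(np.corrcoef(a, b)[0, 1])
    return ccf

def best_lead(x, y, max_lag=21):
    """Lag (in days) at which x best predicts y."""
    ccf = cross_correlation(x, y, max_lag)
    return max(ccf, key=ccf.get)

# Synthetic residuals (an assumption for illustration only):
# the "laboratory" series is the "ED" series delayed by 3 days plus noise.
rng = np.random.default_rng(0)
n_days, true_lead = 400, 3
ed_resid = rng.normal(size=n_days)
lab_resid = np.empty(n_days)
lab_resid[true_lead:] = ed_resid[:-true_lead]
lab_resid[:true_lead] = rng.normal(size=true_lead)
lab_resid += 0.3 * rng.normal(size=n_days)

lead_days = best_lead(ed_resid, lab_resid)
print("estimated lead (days):", lead_days)
```

Working on residuals rather than raw counts is the point of the design: shared seasonality and trend would otherwise inflate the correlation at every lag.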

    Nature of protein family signatures: Insights from singular value analysis of position-specific scoring matrices

    Position-specific scoring matrices (PSSMs) are useful for detecting weak homology in protein sequence analysis, and they are thought to contain some essential signatures of the protein families. In order to elucidate what kind of ingredients constitute such family-specific signatures, we apply singular value decomposition to a set of PSSMs and examine the properties of dominant right and left singular vectors. The first right singular vectors were correlated with various amino acid indices including relative mutability, amino acid composition in protein interior, hydropathy, or turn propensity, depending on proteins. A significant correlation between the first left singular vector and a measure of site conservation was observed. It is shown that the contribution of the first singular component to the PSSMs acts to disfavor potentially but falsely functionally important residues at conserved sites. The second right singular vectors were highly correlated with hydrophobicity scales, and the corresponding left singular vectors with contact numbers of protein structures. It is suggested that sequence alignment with a PSSM is essentially equivalent to threading supplemented with functional information. The presented method may be used to separate functionally important sites from structurally important ones, and thus it may be a useful tool for predicting protein functions. Comment: 22 pages, 7 figures, 4 tables
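The decomposition described in the abstract above can be sketched as follows. The sketch assumes a PSSM stored as a sites-by-20 matrix and uses a synthetic matrix, not a real protein family profile. The names `site_weight` and `aa_profile` are hypothetical stand-ins for a per-site conservation measure and an amino acid index.

```python
import numpy as np

# Toy PSSM (an assumption for illustration): rows are alignment sites,
# columns are the 20 amino acids. It is built as a rank-1 signal plus noise.
rng = np.random.default_rng(0)
n_sites, n_aa = 50, 20
site_weight = rng.uniform(0.5, 2.0, size=n_sites)   # stand-in for site conservation
aa_profile = rng.normal(size=n_aa)                   # stand-in for an amino acid index
pssm = np.outer(site_weight, aa_profile) + 0.1 * rng.normal(size=(n_sites, n_aa))

# Singular value decomposition: pssm = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(pssm, full_matrices=False)

first_left = U[:, 0]    # one value per site
first_right = Vt[0]     # one value per amino acid
explained = s**2 / np.sum(s**2)

# Singular vectors are defined up to sign, so compare absolute correlations.
r_site = abs(np.corrcoef(first_left, site_weight)[0, 1])
r_aa = abs(np.corrcoef(first_right, aa_profile)[0, 1])
print(f"first component explains {explained[0]:.3f} of the squared norm")
print(f"|r| with site weights: {r_site:.3f}, with amino acid profile: {r_aa:.3f}")
```

With rows as sites, left singular vectors carry per-site information and right singular vectors carry per-residue-type information, which matches how the abstract pairs them with conservation and amino acid indices.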

    Molecular Basis of NDM-1, a New Antibiotic Resistance Determinant

    The New Delhi Metallo-β-lactamase (NDM-1) was first reported in 2009 in a Swedish patient. A recent study reported that NDM-1-positive strains of Klebsiella pneumoniae or Escherichia coli were highly resistant to all antibiotics tested except tigecycline and colistin. These can no longer be relied on to treat infections, and NDM-1 has therefore become a potential major global health threat.

    Structure-based statistical analysis of transmembrane helices

    Recent advances in determination of the high-resolution structure of membrane proteins now enable analysis of the main features of amino acids in transmembrane (TM) segments in comparison with amino acids in water-soluble helices. In this work, we conducted a large-scale analysis of the prevalent locations of amino acids by using a data set of 170 structures of integral membrane proteins obtained from the MPtopo database and 930 structures of water-soluble helical proteins obtained from the protein data bank. Large hydrophobic amino acids (Leu, Val, Ile, and Phe) plus Gly were clearly prevalent in TM helices whereas polar amino acids (Glu, Lys, Asp, Arg, and Gln) were less frequent in this type of helix. The distribution of amino acids along TM helices was also examined. As expected, hydrophobic and slightly polar amino acids are commonly found in the hydrophobic core of the membrane whereas aromatic (Trp and Tyr), Pro, and the hydrophilic amino acids (Asn, His, and Gln) occur more frequently in the interface regions. Charged amino acids are also statistically prevalent outside the hydrophobic core of the membrane, and whereas acidic amino acids are frequently found at both cytoplasmic and extra-cytoplasmic interfaces, basic amino acids cluster at the cytoplasmic interface. These results strongly support the experimentally demonstrated biased distribution of positively charged amino acids (that is, the so-called positive-inside rule) with structural data.

    Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage

    Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, as well as either the CCR5 (R5) or CXCR4 (X4) co-receptors, which interact primarily with the third hypervariable loop (V3 loop) of gp120. Determination of HIV-1 affinity for either the R5 or X4 co-receptor on host cells facilitates the inclusion of co-receptor antagonists as a part of patient treatment strategies. A dataset of 1193 distinct gp120 V3 loop peptide sequences (989 R5-utilizing, 204 X4-capable) is utilized to train predictive classifiers based on implementations of random forest, support vector machine, boosted decision tree, and neural network machine learning algorithms. An in silico mutagenesis procedure employing multibody statistical potentials, computational geometry, and threading of variant V3 sequences onto an experimental structure, is used to generate a feature vector representation for each variant whose components measure environmental perturbations at corresponding structural positions. Results: Classifier performance is evaluated based on stratified 10-fold cross-validation, stratified dataset splits (2/3 training, 1/3 validation), and leave-one-out cross-validation. Best reported values of sensitivity (85%), specificity (100%), and precision (98%) for predicting X4-capable HIV-1 virus, overall accuracy (97%), Matthews correlation coefficient (89%), balanced error rate (0.08), and ROC area (0.97) all reach critical thresholds, suggesting that the models outperform six other state-of-the-art methods and come closer to competing with phenotype assays. Conclusions: The trained classifiers provide instantaneous and reliable predictions regarding HIV-1 co-receptor usage, requiring only translated V3 loop genotypes as input. Furthermore, the novelty of these computational mutagenesis based predictor attributes distinguishes the models as orthogonal and complementary to previous methods that utilize sequence, structure, and/or evolutionary information. The classifiers are available online at http://proteins.gmu.edu/automute.

    Fully automated high-quality NMR structure determination of small 2H-enriched proteins

    Determination of high-quality small protein structures by nuclear magnetic resonance (NMR) methods generally requires acquisition and analysis of an extensive set of structural constraints. The process generally demands extensive backbone and sidechain resonance assignments, and weeks or even months of data collection and interpretation. Here we demonstrate rapid and high-quality protein NMR structure generation using CS-Rosetta with a perdeuterated protein sample made at a significantly reduced cost using new bacterial culture condensation methods. Our strategy provides the basis for a high-throughput approach for routine, rapid, high-quality structure determination of small proteins. As an example, we demonstrate the determination of a high-quality 3D structure of a small 8 kDa protein, E. coli cold shock protein A (CspA), using <4 days of data collection and fully automated data analysis methods together with CS-Rosetta. The resulting CspA structure is highly converged and in excellent agreement with the published crystal structure, with a backbone RMSD value of 0.5 Å, an all atom RMSD value of 1.2 Å to the crystal structure for well-defined regions, and RMSD value of 1.1 Å to crystal structure for core, non-solvent exposed sidechain atoms. Cross validation of the structure with 15N- and 13C-edited NOESY data obtained with a perdeuterated 15N, 13C-enriched 13CH3 methyl protonated CspA sample confirms that essentially all of these independently-interpreted NOE-based constraints are already satisfied in each of the 10 CS-Rosetta structures. By these criteria, the CS-Rosetta structure generated by fully automated analysis of data for a perdeuterated sample provides an accurate structure of CspA. This represents a general approach for rapid, automated structure determination of small proteins by NMR.