
    Folding factors and partners for the intrinsically disordered protein Micro-Exon Gene 14 (MEG-14)

    The micro-exon genes (MEG) of Schistosoma mansoni, a parasite responsible for the second most widely spread tropical disease, code for small secreted proteins with sequences unique to the Schistosoma genera. Bioinformatics analyses suggest the soluble domain of the MEG-14 protein will be largely disordered, and using synchrotron radiation circular dichroism spectroscopy, its secondary structure was shown to be essentially completely unfolded in aqueous solution. It does, however, show a strong propensity to fold into more ordered structures under a wide range of conditions. Partial folding was produced by increasing temperature (in a reversible process), contrary to the behavior of most soluble proteins. Furthermore, significant folding was observed in the presence of negatively charged lipids and detergents, but not in zwitterionic or neutral lipids or detergents. Absorption onto a surface followed by dehydration stimulated it to fold into a helical structure, as it did when the aqueous solution was replaced by nonaqueous solvents. Hydration of the dehydrated folded protein was accompanied by complete unfolding. These results support the identification of MEG-14 as a classic intrinsically disordered protein, and open the possibility of its interaction/folding with different partners and factors being related to multifunctional roles and states within the host

    NODAL/TGFβ signalling mediates the self-sustained stemness induced by PIK3CAH1047R homozygosity in pluripotent stem cells

    Activating PIK3CA mutations are known “drivers” of human cancer and developmental overgrowth syndromes. We recently demonstrated that the "hotspot" PIK3CAH1047R variant exerts unexpected allele dose-dependent effects on stemness in human pluripotent stem cells (hPSCs). In the present study, we combine high-depth transcriptomics, total proteomics and reverse-phase protein arrays to reveal potentially disease-related alterations in heterozygous cells, and to assess the contribution of activated TGFβ signalling to the stemness phenotype of homozygous PIK3CAH1047R cells. We demonstrate signalling rewiring as a function of oncogenic PI3K signalling strength, and provide experimental evidence that self-sustained stemness is causally related to enhanced autocrine NODAL/TGFβ signalling. A significant transcriptomic signature of TGFβ pathway activation in heterozygous PIK3CAH1047R was observed but was modest and was not associated with the stemness phenotype seen in homozygous mutants. Notably, the stemness gene expression in homozygous PIK3CAH1047R iPSCs was reversed by pharmacological inhibition of NODAL/TGFβ signalling, but not by pharmacological PI3Kα pathway inhibition. Altogether, this provides the first in-depth analysis of PI3K signalling in human pluripotent stem cells and directly links strong PI3K activation to developmental NODAL/TGFβ signalling. This work illustrates the importance of allele dosage and expression when artificial systems are used to model human genetic disease caused by activating PIK3CA mutations

    Prediction of phosphotyrosine signaling networks using a scoring matrix-assisted ligand identification approach

    Systematic identification of binding partners for modular domains such as Src homology 2 (SH2) is important for understanding the biological function of the corresponding SH2 proteins. We have developed a worldwide web-accessible computer program dubbed SMALI for scoring matrix-assisted ligand identification for SH2 domains and other signaling modules. The current version of SMALI harbors 76 unique scoring matrices for SH2 domains derived from screening oriented peptide array libraries. These scoring matrices are used to search a protein database for short peptides preferred by an SH2 domain. An experimentally determined cut-off value is used to normalize an SMALI score, therefore allowing for direct comparison in peptide-binding potential for different SH2 domains. SMALI employs distinct scoring matrices from Scansite, a popular motif-scanning program. Moreover, SMALI contains built-in filters for phosphoproteins, Gene Ontology (GO) correlation and colocalization of subject and query proteins. Compared to Scansite, SMALI exhibited improved accuracy in identifying binding peptides for SH2 domains. Applying SMALI to a group of SH2 domains identified hundreds of interactions that overlap significantly with known networks mediated by the corresponding SH2 proteins, suggesting SMALI is a useful tool for facile identification of signaling networks mediated by modular domains that recognize short linear peptide motifs
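
The matrix-based scan described in this abstract can be sketched roughly as below. This is a minimal illustration and not SMALI's actual code: the matrix values, the cut-off, the three-position window after the phosphotyrosine, and the function names `score_site` and `scan` are all hypothetical, and dividing the raw score by the cut-off is only one plausible reading of "normalize".

```python
from typing import Dict, List, Tuple

# Hypothetical position-specific scores for the residues at +1..+3 after a
# tyrosine. The numbers are made up for illustration, not taken from SMALI.
MATRIX: List[Dict[str, float]] = [
    {"E": 1.2, "V": 0.8},  # position +1
    {"E": 0.9, "N": 1.5},  # position +2
    {"I": 1.1, "V": 0.7},  # position +3
]
CUTOFF = 2.0  # hypothetical cut-off for this one domain


def score_site(seq: str, y_index: int) -> float:
    """Sum the matrix scores of the residues that follow the tyrosine.

    Residues missing from a matrix column contribute 0.
    """
    total = 0.0
    for offset, column in enumerate(MATRIX, start=1):
        pos = y_index + offset
        if pos >= len(seq):
            break  # the window runs past the end of the sequence
        total += column.get(seq[pos], 0.0)
    return total


def scan(seq: str) -> List[Tuple[int, float]]:
    """Return (index, normalized score) for tyrosines scoring at or above the cut-off."""
    hits = []
    for i, aa in enumerate(seq):
        if aa != "Y":
            continue
        relative = score_site(seq, i) / CUTOFF  # assumed normalization
        if relative >= 1.0:
            hits.append((i, relative))
    return hits


if __name__ == "__main__":
    print(scan("AAYENIAAYGGG"))
```

Because each score is expressed relative to its own domain's cut-off, values from different matrices land on a common scale, which is the property the abstract attributes to the normalization step.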

    Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila

    BACKGROUND: Compositionally biased (CB) regions are stretches in protein sequences made from mainly a distinct subset of amino acid residues; such regions are frequently associated with a structural role in the cell, or with protein disorder. RESULTS: We derived a procedure for the exhaustive assignment and classification of CB regions, and have applied it to thirteen metazoan proteomes. Sequences are initially scanned for the lowest-probability subsequences (LPSs) for single amino-acid types; subsequently, an exhaustive search for LPSs for multiple residue types is performed iteratively until convergence, to define CB region boundaries. We analysed > 40,000 CB regions with > 20 million residues; strikingly, nine single-/double-residue biases are universally abundant, and are consistently highly ranked across both vertebrates and invertebrates. To home in on subpopulations of CB regions of interest in human and D. melanogaster, we analysed CB region lengths, conservation, inferred functional categories and predicted protein disorder, and filtered for coiled coils and protein structures. In particular, we found that some of the universally abundant CB regions have significant associations with transcription and nuclear localization in human and Drosophila, and are also predicted to be moderately or highly disordered. Focussing on Q-based biased regions, we found that these regions are typically only well conserved within mammals (appearing in 60–80% of orthologs), with shorter human transcription-related CB regions being unconserved outside of mammals; they are also preferentially linked to protein domains such as the homeodomain and glucocorticoid-receptor DNA-binding domain. In general, only ~40–50% of residues in these human and Drosophila CB regions have predicted protein disorder. CONCLUSION: This data is of use for the further functional characterization of genes, and for structural genomics initiatives
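
The single-residue step of the search described in this abstract might look something like the sketch below. It assumes a binomial model with a fixed background frequency `f`, which the abstract does not specify. The function names, the window-length limits and the brute-force scan are illustrative choices rather than the authors' procedure.

```python
from math import comb
from typing import Tuple


def binomial_tail(n: int, k: int, f: float) -> float:
    """P(X >= k) for X ~ Binomial(n, f): chance of k or more hits in n residues."""
    return sum(comb(n, i) * f**i * (1 - f) ** (n - i) for i in range(k, n + 1))


def lowest_probability_subsequence(
    seq: str, residue: str, f: float, min_len: int = 5, max_len: int = 50
) -> Tuple[float, int, int]:
    """Return (probability, start, end) of the window least likely under the model.

    `end` is exclusive. Every window of length min_len..max_len is tried.
    """
    # Prefix counts give the residue count of any window in O(1).
    prefix = [0]
    for aa in seq:
        prefix.append(prefix[-1] + (aa == residue))

    best = (1.0, 0, 0)
    for start in range(len(seq)):
        last_end = min(len(seq), start + max_len)
        for end in range(start + min_len, last_end + 1):
            k = prefix[end] - prefix[start]
            p = binomial_tail(end - start, k, f)
            if p < best[0]:
                best = (p, start, end)
    return best


if __name__ == "__main__":
    demo = "ACDEFGHIKL" + "Q" * 10 + "MNPRSTVWYA"
    print(lowest_probability_subsequence(demo, "Q", f=0.05))
```

The multiple-residue stage mentioned in the abstract would repeat a search of this kind over combinations of residue types until the region boundaries stop changing. That stage is not shown here.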

    DNA resection in eukaryotes: deciding how to fix the break

    DNA double-strand breaks are repaired by different mechanisms, including homologous recombination and nonhomologous end-joining. DNA-end resection, the first step in recombination, is a key step that contributes to the choice of DSB repair. Resection, an evolutionarily conserved process that generates single-stranded DNA, is linked to checkpoint activation and is critical for survival. Failure to regulate and execute this process results in defective recombination and can contribute to human disease. Here, I review recent findings on the mechanisms of resection in eukaryotes, from yeast to vertebrates, provide insights into the regulatory strategies that control it, and highlight the consequences of both its impairment and its deregulation

    Predicting mostly disordered proteins by using structure-unknown protein data

    BACKGROUND: Predicting intrinsically disordered proteins is important in structural biology because they are thought to carry out various cellular functions even though they have no stable three-dimensional structure. We know the structures of far more ordered proteins than disordered proteins. The structural distribution of proteins in nature can therefore be inferred to differ from that of proteins whose structures have been determined experimentally. We know many more protein sequences than we do protein structures, and many of the known sequences can be expected to be those of disordered proteins. Thus it would be efficient to use the information of structure-unknown proteins in order to avoid training data sparseness. We propose a novel method for predicting which proteins are mostly disordered by using spectral graph transducer and training with a huge amount of structure-unknown sequences as well as structure-known sequences. RESULTS: When the proposed method was evaluated on data that included 82 disordered proteins and 526 ordered proteins, its sensitivity was 0.723 and its specificity was 0.977. It resulted in a Matthews correlation coefficient 0.202 points higher than that obtained using FoldIndex, 0.221 points higher than that obtained using the method based on plotting hydrophobicity against the number of contacts and 0.07 points higher than that obtained using support vector machines (SVMs). To examine robustness against training data sparseness, we investigated the correlation between two results obtained when the method was trained on different datasets and tested on the same dataset. The correlation coefficient for the proposed method is 0.14 higher than that for the method using SVMs. 
    When the proposed SGT-based method was compared with four per-residue predictors (VL3, GlobPlot, DISOPRED2 and IUPred (long)), its sensitivity was 0.834 for disordered proteins, which is 0.052–0.523 higher than that of the per-residue predictors, and its specificity was 0.991 for ordered proteins, which is 0.036–0.153 higher than that of the per-residue predictors. The proposed method was also evaluated on data that included 417 partially disordered proteins. It predicted the frequency of disordered proteins to be 1.95% for the proteins with 5%–10% disordered sequences, 1.46% for the proteins with 10%–20% disordered sequences and 16.57% for proteins with 20%–40% disordered sequences. CONCLUSION: The proposed method, which utilizes the information of structure-unknown data, predicts disordered proteins more accurately than other methods and is less affected by training data sparseness
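
For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how sensitivity, specificity and the Matthews correlation coefficient are computed from a confusion matrix. The counts in the example are toy values chosen for easy arithmetic. They are not the paper's data, and "disordered" is assumed to be the positive class.

```python
from math import sqrt


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives (here: disordered proteins) that are recovered."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives (here: ordered proteins) that are recovered."""
    return tn / (tn + fp)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy confusion matrix (hypothetical counts, not from the study).
    tp, fn, tn, fp = 8, 2, 18, 2
    print(sensitivity(tp, fn), specificity(tn, fp), matthews_cc(tp, tn, fp, fn))
```

The coefficient uses all four cells of the confusion matrix, which makes it a common choice for imbalanced test sets such as one with far more ordered than disordered proteins.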

    Control of COVID-19 Outbreaks under Stochastic Community Dynamics, Bimodality, or Limited Vaccination

    Reaching population immunity against COVID-19 is proving difficult even in countries with high vaccination levels. Thus, it is critical to identify limits of control and effective measures against future outbreaks. The effects of nonpharmaceutical interventions (NPIs) and vaccination strategies are analyzed with a detailed community-specific agent-based model (ABM). The authors demonstrate that the threshold for population immunity is not a unique number, but depends on the vaccination strategy. Prioritizing highly interactive people diminishes the risk for an infection wave, while prioritizing the elderly minimizes fatalities when vaccinations are low. Control over COVID-19 outbreaks requires adaptive combination of NPIs and targeted vaccination, exemplified for Germany for January–September 2021. Bimodality emerges from the heterogeneity and stochasticity of community-specific human–human interactions and infection networks, which can render the effects of limited NPIs uncertain. The authors' simulation platform can process and analyze dynamic COVID-19 epidemiological situations in diverse communities worldwide to predict pathways to population immunity even with limited vaccination.

    Notable sequence homology of the ORF10 protein introspects the architecture of SARS-CoV-2

    The current Coronavirus Disease 19 (COVID-19) pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), shows similar pathology to MERS and SARS-CoV, with a current estimated fatality rate of 1.4%. Open reading frame 10 (ORF10) is a unique SARS-CoV-2 accessory protein, which contains eleven cytotoxic T lymphocyte (CTL) epitopes each of nine amino acids in length. Twenty-two unique SARS-CoV-2 ORF10 variants have been identified based on missense mutations found in sequence databases. Some of these mutations are predicted to decrease the stability of ORF10. In silico physicochemical and structural comparative analyses were carried out on SARS-CoV-2 and Pangolin-CoV ORF10 proteins, which share 97.37% amino acid (aa) homology. Though there is a high degree of ORF10 protein similarity between SARS-CoV-2 and Pangolin-CoV, there are differences between these two ORF10 proteins related to their sub-structure (loop/coil region), solubility, antigenicity and shift from strand to coil at aa position 26 (tyrosine). SARS-CoV-2 ORF10, which is apparently expressed in vivo since reactive T cell clones are found in convalescent patients, should be monitored for changes which could correlate with the pathogenesis of COVID-19

    Characterization of the Interaction between the Cohesin Subunits Rad21 and SA1/2

    The cohesin complex is responsible for the fidelity of chromosomal segregation during mitosis. It consists of four core subunits, namely Rad21/Mcd1/Scc1, Smc1, Smc3, and one of the yeast Scc3 orthologs SA1 or SA2. Sister chromatid cohesion is generated during DNA replication and maintained until the onset of anaphase. Among the many proposed models of the cohesin complex, the "core" cohesin subunits Smc1, Smc3, and Rad21 are almost universally displayed as a tripartite ring. However, other than its supportive role in the cohesin ring, little is known about the fourth core subunit SA1/SA2. To gain deeper insight into the function of SA1/SA2 in the cohesin complex, we have mapped the interactive regions of SA2 and Rad21 in vitro and ex vivo. Whereas SA2 interacts with Rad21 through a broad region (301–750 aa), Rad21 binds to SA proteins through two SA-binding motifs on Rad21, namely the N-terminal (NT) and middle part (MP) SA-binding motifs, located at 60–81 aa of the N-terminus and 383–392 aa of the MP of Rad21, respectively. The MP SA-binding motif is a 10 amino acid, α-helical motif. Deletion of these 10 amino acids or mutation of three conserved amino acids (L385, F389, and T390) in this α-helical motif significantly hinders Rad21 from physically interacting with SA1/2. Besides the MP SA-binding motif, the NT SA-binding motif is also important for SA1/2 interaction. Although mutations on both SA-binding motifs disrupt Rad21-SA1/2 interaction, they had no apparent effect on the Smc1-Smc3-Rad21 interaction. However, the Rad21-Rad21 dimerization was reduced by the mutations, indicating potential involvement of the two SA-binding motifs in the formation of the two-ring handcuff for chromosomal cohesion. Furthermore, mutant Rad21 proteins failed to significantly rescue precocious chromosome separation caused by depletion of endogenous Rad21 in mitotic cells, further indicating the physiological significance of the two SA-binding motifs of Rad21

    A microscale protein NMR sample screening pipeline

    As part of efforts to develop improved methods for NMR protein sample preparation and structure determination, the Northeast Structural Genomics Consortium (NESG) has implemented an NMR screening pipeline for protein target selection, construct optimization, and buffer optimization, incorporating efficient microscale NMR screening of proteins using a micro-cryoprobe. The process is feasible because the newest generation probe requires only small amounts of protein, typically 30–200 μg in 8–35 μl volume. Extensive automation has been made possible by the combination of database tools, mechanization of key process steps, and the use of a micro-cryoprobe that gives excellent data while requiring little optimization and manual setup. In this perspective, we describe the overall process used by the NESG for screening NMR samples as part of a sample optimization process, assessing optimal construct design and solution conditions, as well as for determining protein rotational correlation times in order to assess protein oligomerization states. Database infrastructure has been developed to allow for flexible implementation of new screening protocols and harvesting of the resulting output. The NESG micro NMR screening pipeline has also been used for detergent screening of membrane proteins. Descriptions of the individual steps in the NESG NMR sample design, production, and screening pipeline are presented in the format of a standard operating procedure