88 research outputs found

    Predicting mostly disordered proteins by using structure-unknown protein data

    BACKGROUND: Predicting intrinsically disordered proteins is important in structural biology because they are thought to carry out various cellular functions even though they have no stable three-dimensional structure. We know the structures of far more ordered proteins than disordered proteins. The structural distribution of proteins in nature can therefore be inferred to differ from that of proteins whose structures have been determined experimentally. We know many more protein sequences than we do protein structures, and many of the known sequences can be expected to be those of disordered proteins. Thus it would be efficient to use the information of structure-unknown proteins in order to avoid training data sparseness. We propose a novel method for predicting which proteins are mostly disordered by using a spectral graph transducer (SGT) and training with a large number of structure-unknown sequences as well as structure-known sequences. RESULTS: When the proposed method was evaluated on data that included 82 disordered proteins and 526 ordered proteins, its sensitivity was 0.723 and its specificity was 0.977. It resulted in a Matthews correlation coefficient 0.202 points higher than that obtained using FoldIndex, 0.221 points higher than that obtained using the method based on plotting hydrophobicity against the number of contacts, and 0.07 points higher than that obtained using support vector machines (SVMs). To examine robustness against training data sparseness, we investigated the correlation between two results obtained when the method was trained on different datasets and tested on the same dataset. The correlation coefficient for the proposed method is 0.14 higher than that for the method using SVMs. When the proposed SGT-based method was compared with four per-residue predictors (VL3, GlobPlot, DISOPRED2 and IUPred (long)), its sensitivity was 0.834 for disordered proteins, which is 0.052–0.523 higher than that of the per-residue predictors, and its specificity was 0.991 for ordered proteins, which is 0.036–0.153 higher than that of the per-residue predictors. The proposed method was also evaluated on data that included 417 partially disordered proteins. It predicted the frequency of disordered proteins to be 1.95% for the proteins with 5%–10% disordered sequences, 1.46% for the proteins with 10%–20% disordered sequences and 16.57% for proteins with 20%–40% disordered sequences. CONCLUSION: The proposed method, which utilizes the information of structure-unknown data, predicts disordered proteins more accurately than other methods and is less affected by training data sparseness.
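The sensitivity, specificity and Matthews correlation coefficient quoted in this abstract are standard functions of a binary confusion matrix. The sketch below shows how they are computed, treating "disordered" as the positive class. The function names and the counts in the usage lines are hypothetical and are not taken from the paper, which does not report its confusion matrices here.

```python
import math


def sensitivity(tp, fn):
    """Fraction of truly positive (disordered) proteins predicted positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly negative (ordered) proteins predicted negative."""
    return tn / (tn + fp)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 60, 20, 500, 20
print(sensitivity(tp, fn), specificity(tn, fp), matthews_corrcoef(tp, tn, fp, fn))
```

Because the coefficient uses all four cells of the confusion matrix, it is less easily inflated by class imbalance than accuracy alone, which may be why it is reported for a dataset with far more ordered than disordered proteins.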

    Inherent Structural Disorder and Dimerisation of Murine Norovirus NS1-2 Protein

    Human noroviruses are highly infectious viruses that cause the majority of acute, non-bacterial epidemic gastroenteritis cases worldwide. The first open reading frame of the norovirus RNA genome encodes a polyprotein that is cleaved by the viral protease into six non-structural proteins. The first non-structural protein, NS1-2, lacks any significant sequence similarity to other viral or cellular proteins, and limited information is available about the function and biophysical characteristics of this protein. Bioinformatic analyses identified an inherently disordered region (residues 1–142) in the highly divergent N-terminal region of the norovirus NS1-2 protein. Expression and purification of the NS1-2 protein of Murine norovirus confirmed these predictions by identifying several features typical of an inherently disordered protein. These were a biased amino acid composition with enrichment in the disorder-promoting residues serine and proline, a lack of predicted secondary structure, a hydrophilic nature, an aberrant electrophoretic migration, an increased Stokes radius similar to that predicted for a protein from the pre-molten globule family, a high sensitivity to thermolysin proteolysis and a circular dichroism spectrum typical of an inherently disordered protein. The purification of the NS1-2 protein also identified the presence of an NS1-2 dimer in Escherichia coli and transfected HEK293T cells. Inherent disorder provides significant advantages, including structural flexibility and the ability to bind to numerous targets, allowing a single protein to have multiple functions. These advantages, combined with the potential functional advantages of multimerisation, suggest a multi-functional role for the NS1-2 protein.

    Polycation-π Interactions Are a Driving Force for Molecular Recognition by an Intrinsically Disordered Oncoprotein Family

    Molecular recognition by intrinsically disordered proteins (IDPs) commonly involves specific localized contacts and target-induced disorder-to-order transitions. However, some IDPs remain disordered in the bound state, a phenomenon coined "fuzziness", often characterized by IDP polyvalency, sequence-insensitivity and a dynamic ensemble of disordered bound-state conformations. Besides the above general features, specific biophysical models for fuzzy interactions are mostly lacking. The transcriptional activation domain of the Ewing's Sarcoma oncoprotein family (EAD) is an IDP that exhibits many features of fuzziness, with multiple EAD aromatic side chains driving molecular recognition. Considering the prevalent role of cation-π interactions at various protein-protein interfaces, we hypothesized that EAD-target binding involves polycation-π contacts between a disordered EAD and basic residues on the target. Herein we evaluated the polycation-π hypothesis via functional and theoretical interrogation of EAD variants. The experimental effects of a range of EAD sequence variations, including aromatic number, aromatic density and charge perturbations, all support the cation-π model. Moreover, the activity trends observed are well captured by a coarse-grained EAD chain model and a corresponding analytical model based on interaction between EAD aromatics and surface cations of a generic globular target. EAD-target binding, in the context of pathological Ewing's Sarcoma oncoproteins, is thus seen to be driven by a balance between EAD conformational entropy and favorable EAD-target cation-π contacts. Such a highly versatile mode of molecular recognition offers a general conceptual framework for promiscuous target recognition by polyvalent IDPs. © 2013 Song et al.

    Totally laparoscopic versus conventional ileoanal pouch procedure – design of a single-centre, expertise based randomised controlled trial to compare the laparoscopic and conventional surgical approach in patients undergoing primary elective restorative proctocolectomy- LapConPouch-Trial

    BACKGROUND: Restorative proctocolectomy is increasingly being performed minimally invasively, but a totally laparoscopic technique has not yet been compared to the standard open technique in a randomized study. METHODS/DESIGN: This is a two-armed, single-centre, expertise-based, preoperatively randomized, patient-blinded study. It is designed as a two-group parallel superiority study. Power calculation revealed 80 patients per group in order to recruit the 65 patients to be analysed for the primary endpoint. The primary objective is to investigate intra-operative blood loss and the need for blood transfusions. We hypothesise that intra-operative blood loss and the need for peri-operative blood transfusions are significantly higher in the conventional group. Additionally, a set of surgical and non-surgical parameters related to the operation will be analysed as secondary objectives. These will include operative time, complications, postoperative pain, lung function, postoperative length of hospital stay, a cosmetic score and pre- and postoperative quality of life. DISCUSSION: The trial will answer the question of whether there is indeed an advantage in the laparoscopic group with regard to blood loss and the need for blood transfusions. Moreover, it will generate data on the safety and potential advantages and disadvantages of the minimally invasive approach.

    Long-Term Surgical Recurrence, Morbidity, Quality of Life, and Body Image of Laparoscopic-Assisted vs. Open Ileocolic Resection for Crohn’s Disease: A Comparative Study

    PURPOSE: Several studies have compared conventional open ileocolic resection with a laparoscopic-assisted approach. However, long-term outcome after laparoscopic-assisted ileocolic resection remains to be determined. This study was designed to compare long-term results of surgical recurrence, quality of life, body image, and cosmesis in patients who underwent laparoscopic-assisted or open ileocolic resection for Crohn's disease. METHODS: Seventy-eight consecutive patients who underwent ileocolic resection during the period 1995 to 1998 were analyzed; 48 underwent a conventional open approach in the Academic Medical Centre (Amsterdam, The Netherlands) and 30 underwent a laparoscopic-assisted approach in the Leiden University Medical Centre (Leiden, The Netherlands). Primary outcome parameters were reoperation and readmission rate. Secondary outcome parameters were quality of life, body image, and cosmesis. RESULTS: The two groups were comparable for characteristics of sex, age, and immunosuppressive therapy. Seventy-one patients had a complete follow-up of median 8.5 years. Resection for recurrent Crohn's disease was performed in 6 of 27 (22 percent) and 10 of 44 (23 percent) patients in the laparoscopic and open groups, respectively. Reoperations for incisional hernia were only performed after conventional open ileocolic resection (3/44 = 6.8 percent). Quality of life and body image were comparable, but cosmesis scores were significantly higher in the laparoscopic group. CONCLUSIONS: Despite small numbers, we found that surgical recurrence and quality of life after laparoscopic-assisted and open ileocolic resection were comparable. Incisional hernias occurred only after open ileocolic resection, and laparoscopic-assisted ileocolic resection resulted in a significantly better cosmesis.

    Predictive Power Estimation Algorithm (PPEA) - A New Algorithm to Reduce Overfitting for Genomic Biomarker Discovery

    Toxicogenomics promises to aid in predicting adverse effects, understanding the mechanisms of drug action or toxicity, and uncovering unexpected or secondary pharmacology. However, modeling adverse effects using high-dimensional, high-noise genomic data is prone to over-fitting. Models constructed from such data sets often consist of a large number of genes with no obvious functional relevance to the biological effect the model intends to predict, which can make the modeling results challenging to interpret. To address these issues, we developed a novel algorithm, Predictive Power Estimation Algorithm (PPEA), which estimates the predictive power of each individual transcript through an iterative two-way bootstrapping procedure. By repeatedly enforcing that the sample number is larger than the transcript number in each iteration of modeling and testing, PPEA reduces the potential risk of over-fitting. We show with three different case studies that: (1) PPEA can quickly derive a reliable rank order of predictive power of individual transcripts in a relatively small number of iterations, (2) the top-ranked transcripts tend to be functionally related to the phenotype they are intended to predict, (3) using only the most predictive top-ranked transcripts greatly facilitates development of a multiplex assay such as qRT-PCR as a biomarker, and (4) more importantly, a small number of genes identified from the top-ranked transcripts are highly predictive of phenotype, as their expression changes distinguished adverse from nonadverse effects of compounds in completely independent tests. Thus, we believe that the PPEA model effectively addresses the over-fitting problem and can be used to facilitate genomic biomarker discovery for predictive toxicology and drug responses.
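The abstract gives only the outline of the procedure: resample both samples and transcripts, keep the number of transcripts below the number of samples, model and test in each iteration, and accumulate a per-transcript score. The sketch below illustrates that general idea and is not the published PPEA. The nearest-centroid classifier, the scoring rule (mean held-out accuracy of the models a transcript took part in), all function names, all parameter values and the toy data are assumptions made for illustration.

```python
import random


def nearest_centroid_accuracy(X, y, train_idx, test_idx, features):
    """Fit class centroids on the training rows, return accuracy on the test rows."""
    centroids = {}
    for label in {y[i] for i in train_idx}:
        rows = [i for i in train_idx if y[i] == label]
        centroids[label] = [sum(X[i][f] for i in rows) / len(rows) for f in features]
    correct = 0
    for i in test_idx:
        def sq_dist(label):
            return sum((X[i][f] - c) ** 2 for f, c in zip(features, centroids[label]))
        correct += min(centroids, key=sq_dist) == y[i]
    return correct / len(test_idx)


def bootstrap_predictive_power(X, y, n_iter=500, n_train=12, n_features=3, seed=0):
    """Score each transcript (column) by the mean held-out accuracy of the
    small models it was part of. n_features < n_train keeps the transcript
    number below the sample number in every iteration."""
    rng = random.Random(seed)
    n_samples, n_transcripts = len(X), len(X[0])
    assert n_features < n_train < n_samples
    totals = [0.0] * n_transcripts
    counts = [0] * n_transcripts
    for _ in range(n_iter):
        train_idx = rng.sample(range(n_samples), n_train)   # resample samples
        train_set = set(train_idx)
        test_idx = [i for i in range(n_samples) if i not in train_set]
        features = rng.sample(range(n_transcripts), n_features)  # resample transcripts
        acc = nearest_centroid_accuracy(X, y, train_idx, test_idx, features)
        for f in features:
            totals[f] += acc
            counts[f] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def make_toy_data(n_samples=20, n_transcripts=30, seed=1):
    """Synthetic data: column 0 tracks the label, all other columns are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[20.0 * y[i]] + [rng.gauss(0, 1) for _ in range(n_transcripts - 1)]
         for i in range(n_samples)]
    return X, y


X, y = make_toy_data()
scores = bootstrap_predictive_power(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print("top transcript:", ranking[0])
```

On the toy data the informative column should rank first, because every small model that includes it classifies the held-out samples correctly, while models built only from noise columns hover near chance.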

    The Unconserved Groucho Central Region Is Essential for Viability and Modulates Target Gene Specificity

    Groucho (Gro) is a Drosophila corepressor required by numerous DNA-binding repressors, many of which are distributed in gradients and provide positional information during development. Gro contains well-conserved domains at its N- and C-termini, and a poorly conserved central region that includes the GP, CcN, and SP domains. All lethal point mutations in gro map to the conserved regions, leading to speculation that the unconserved central domains are dispensable. However, our sequence analysis suggests that the central domains are disordered, leading us to suspect that the lack of lethal mutations in this region reflects a lack of order rather than an absence of essential functions. In support of this conclusion, genomic rescue experiments with Gro deletion variants demonstrate that the GP and CcN domains are required for viability. Misexpression assays using these same deletion variants show that the SP domain prevents unrestrained and promiscuous repression by Gro, while the GP and CcN domains are indispensable for repression. Deletion of the GP domain leads to loss of nuclear import, while deletion of the CcN domain leads to complete loss of repression. Changes in Gro activity levels reset the threshold concentrations at which graded repressors silence target gene expression. We conclude that co-regulators such as Gro are not simply permissive components of the repression machinery, but cooperate with graded DNA-binding factors in setting borders of gene expression. We suspect that disorder in the Gro central domains may provide the flexibility that allows this region to mediate multiple interactions required for repression.

    Reduction in Structural Disorder and Functional Complexity in the Thermal Adaptation of Prokaryotes

    Genomic correlates of evolutionary adaptation to very low or very high optimal growth temperature (OGT) values have been the subject of many studies. Whereas these provided a protein-structural rationale of the activity and stability of globular proteins/enzymes, the point has been neglected that adaptation to extreme temperatures could also have resulted from an increased use of intrinsically disordered proteins (IDPs), which are resistant to these conditions in vitro. Contrary to these expectations, we found a conspicuously low level of structural disorder in bacteria of very high (and very low) OGT values. This paucity of disorder does not reflect phylogenetic relatedness, i.e. it is a result of genuine adaptation to extreme conditions. Because intrinsic disorder correlates with important regulatory functions, we asked how these bacteria could exist without IDPs by studying transcription factors, known to harbor a lot of function-related intrinsic disorder. Hyperthermophiles have far fewer transcription factors, and these have reduced disorder compared to their mesophilic counterparts. On the other hand, we found by systematic categorization of proteins with long disordered regions that there are certain functions, such as translation and ribosome biogenesis, that depend on structural disorder even in hyperthermophiles. In all, our observations suggest that adaptation to extreme conditions is achieved by a significant functional simplification, apparent at both the level of the genome and individual genes/proteins.

    Malleable Machines in Transcription Regulation: The Mediator Complex

    The Mediator complex provides an interface between gene-specific regulatory proteins and the general transcription machinery including RNA polymerase II (RNAP II). The complex has a modular architecture (Head, Middle, and Tail) and cryoelectron microscopy analysis suggested that it undergoes dramatic conformational changes upon interactions with activators and RNAP II. These rearrangements have been proposed to play a role in the assembly of the preinitiation complex and also to contribute to the regulatory mechanism of Mediator. In analogy to many regulatory and transcriptional proteins, we reasoned that Mediator might also utilize intrinsically disordered regions (IDRs) to facilitate structural transitions and transmit transcriptional signals. Indeed, a high prevalence of IDRs was found in various subunits of Mediator from both Saccharomyces cerevisiae and Homo sapiens, especially in the Tail and the Middle modules. The level of disorder increases from yeast to man, although in both organisms it significantly exceeds that of multiprotein complexes of a similar size. IDRs can contribute to Mediator's function in three different ways: they can individually serve as target sites for multiple partners having distinctive structures; they can act as malleable linkers connecting globular domains that impart modular functionality on the complex; and they can also facilitate assembly and disassembly of complexes in response to regulatory signals. Short segments of IDRs, termed molecular recognition features (MoRFs) and distinguished by a high protein–protein interaction propensity, were identified in 16 and 19 subunits of the yeast and human Mediator, respectively. In Saccharomyces cerevisiae, the functional roles of 11 MoRFs have been experimentally verified, and those in the Med8/Med18/Med20 and Med7/Med21 complexes were structurally confirmed. Although the Saccharomyces cerevisiae and Homo sapiens Mediator sequences are only weakly conserved, the arrangements of the disordered regions and their embedded interaction sites are quite similar in the two organisms. All of these data suggest an integral role for intrinsic disorder in Mediator's function.