On the Importance of Polar Interactions for Complexes Containing Intrinsically Disordered Proteins
<div><p>There is a growing recognition of the importance of proteins with large intrinsically disordered (ID) segments in cell signaling and regulation. ID segments in these proteins often harbor regions that mediate molecular recognition. Coupled folding and binding of the recognition regions has been proposed to confer high specificity to interactions involving ID segments. However, researchers recently questioned the origin of the interaction specificity of ID proteins because of the overrepresentation of hydrophobic residues in their interaction interfaces. Here, we focused on the role of polar and charged residues in interactions mediated by ID segments. Making use of the extended nature of most ID segments when in complex with globular proteins, we first identified large numbers of complexes between globular proteins and ID segments by using radius-of-gyration-based selection criteria. Consistent with previous studies, we found the interfaces of these complexes to be enriched in hydrophobic residues, and that these residues contribute significantly to the stability of the interaction interface. However, our analyses also show that polar interactions play a larger role in these complexes than in structured protein complexes. Computational alanine scanning and salt-bridge analysis indicate that interfaces in ID complexes are highly complementary with respect to electrostatics, more so than interfaces of globular proteins. Follow-up calculations of the electrostatic contributions to the free energy of binding uncovered significantly stronger Coulombic interactions in complexes harboring ID segments than in structured protein complexes. However, they are counter-balanced by even higher polar-desolvation penalties. We propose that polar interactions are a key contributing factor to the observed high specificity of ID segment-mediated interactions.</p></div>
Box plots of changes in free energy of binding (ΔΔG<sub>bind</sub>) in the alanine scan.
<p>Free energy changes for hydrophobic residues, charged residues, and only those charged residues that form salt bridges are shown in (a), (b), and (c), respectively. Asterisks identify distributions that are significantly different (<i>p</i> values < 0.05; Wilcoxon test). The results for residues in the ID segments, ID binding partners (BPs), and 3D complex proteins are shown in red, magenta, and grey, respectively.</p>
Interface residue composition.
<p>(a) The residue composition at the core regions of complex interfaces. (b) The residue composition at the rim regions of complex interfaces. The interface residue compositions of ID segments, ID binding partners (BPs), and 3D complex proteins are shown in red, magenta, and grey, respectively.</p>
Interface characteristics.
<p><sup>a</sup>Average number of hydrogen bonds or salt bridges per complex interface.</p>
<p><sup>b</sup>Average SASA buried in the interface (1/2 of the sum of 2 sides).</p>
<p><sup>c</sup>Significance of the difference between ID and 3D complexes (Wilcoxon test).</p>
Box plots of electrostatic components of the binding free energy.
<p>(a) Electrostatic contribution to the desolvation free energy of binding. (b) Coulombic interaction energy of binding. (c) Total electrostatic free energy of binding. Asterisks identify distributions that are significantly different (<i>p</i> values < 0.05; Wilcoxon test). Electrostatic contributions to binding are shown for ID complexes and 3D complexes in red and grey, respectively. (d–f) Nup50/importin-α2 is an example of a complex that involves burial of extensive polar surfaces (PDB: 2C1M <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003192#pcbi.1003192-Matsuura1" target="_blank">[81]</a>). The surface of importin-α2 was generated using a probe radius of 1.5 Å. The surface is colored using the electrostatic potential map of importin-α2 that was generated by Delphi. The ID segment, Nup50, is represented by the cartoon ribbon structure with ARG, LYS, and HIS residues colored in blue and GLU and ASP residues colored in red. (d) The full view of the interacting region of the Nup50/importin-α2 complex. (e) Nup50 (37–46) contains a high concentration of positively charged residues that bind to an acidic region of importin-α2. (f) The positively charged residues of the N-terminus of Nup50 are complementary to an acidic surface on its binding partner.</p>
Radius of gyration (Rg) of interacting proteins.
<p>(a) Rg as a function of protein length for ID segments that interact with partner molecules (red squares, n = 52) and globular proteins of the 3D complex dataset (black triangles, n = 762). 3D complex proteins that are disulfide-rich domains (n = 29) or coiled coils (n = 42) are enclosed in dark blue squares and green circles, respectively. The Rg/N threshold of 0.26 Å is represented by the dotted line. (b) Ribbon structure of the ID segment of p27 (red) that “wraps” around its complex partners cyclin A (grey) and Cdk2 (gold) (p27; PDB: 1JSU chain C, Rg = 21 Å, N = 69). (c) Ribbon structure of one of the 3D complexes, α-chymotrypsin (grey) and eglin c (gold) (bovine α-chymotrypsin; PDB: 1ACB chain E, Rg = 16 Å, N = 241). (d) Ribbon structure of the coiled coil EB1 (EB1; PDB: 1WU9 chain A, Rg = 20 Å, N = 59).</p>
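<p>The Rg/N selection criterion in this caption can be illustrated with a minimal sketch. It assumes one coordinate per residue (for example the Cα atom) and an unweighted radius of gyration; the source does not state which atoms or weights were used, so the function names and these choices are illustrative assumptions. Only the 0.26 Å threshold and the Rg and N values in the usage lines come from the caption.</p>

```python
import math

# Threshold quoted in the caption (Rg/N, in Å per residue).
RG_PER_RESIDUE_THRESHOLD = 0.26


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points.

    Assumption: one point per residue (e.g. the C-alpha atom); the source
    may have used all atoms and/or mass weighting.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one coordinate")
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def exceeds_threshold(rg, n_residues, threshold=RG_PER_RESIDUE_THRESHOLD):
    """True if Rg/N is above the threshold, i.e. the chain is extended."""
    return rg / n_residues > threshold


# Values taken from the caption:
print(exceeds_threshold(21, 69))   # p27 (1JSU chain C)
print(exceeds_threshold(16, 241))  # alpha-chymotrypsin (1ACB chain E)
print(exceeds_threshold(20, 59))   # EB1 coiled coil (1WU9 chain A)
```

<p>With the caption's values, p27 and the EB1 coiled coil lie above the threshold and α-chymotrypsin lies below it, which matches the caption's separate marking of coiled coils among the 3D complex proteins.</p>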
Box plot of the fraction of disordered residues in the selected ID complex dataset (Rg/N>0.26 Å) and two controls.
<p>Disorder content was calculated using Disopred2. The first control consists of all the structures in the non-redundant PDB dataset. The second control consists of the polypeptides of the non-redundant PDB dataset that have an Rg/N < 0.26 Å while bound to a large protein partner. Asterisks identify distributions that are significantly different (<i>p</i> values < 0.05; Wilcoxon test). The box plot identifies the middle 50% of the data, the median, and the extreme points. The entire set of data points is divided into quartiles, and the inter-quartile range (IQR) is calculated as the difference between x<sub>0.75</sub> and x<sub>0.25</sub>. The range between x<sub>0.25</sub> and x<sub>0.75</sub>, which covers the 25% of the data points on either side of the median (x<sub>0.50</sub>), is displayed as a filled box. The horizontal line represents the median. Data points more than 1.5·IQR above x<sub>0.75</sub> or below x<sub>0.25</sub> are outliers and are shown as hollow circles.</p>
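<p>The box plot quantities described in this caption can be computed as in the following sketch. The quartile interpolation method and the placement of the 1.5·IQR fences relative to the box edges follow a common convention and are assumptions here; the plotting software used in the source is not stated, and the function name is hypothetical.</p>

```python
import statistics


def box_plot_summary(values):
    """Quartiles, IQR and outliers for one box in a box plot.

    Assumptions: quartiles use the 'inclusive' interpolation method, and
    outliers are points more than 1.5*IQR beyond the box edges.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "outliers": outliers,
    }


# Made-up example data, not taken from the source.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```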
List of datasets analyzed and the number of structures in each dataset.
<p><sup>a</sup>The numbers of structures used in the continuum electrostatic calculations are in brackets.</p>
Residue composition of the proteins in the selected ID set relative to the 3D complex dataset.
<p>The averaged percentage of each residue in the 3D complex dataset is subtracted from the corresponding averaged percentage in the ID dataset. Positive and negative values indicate an enrichment and depletion, respectively, of a specific residue in the ID complex set with respect to the 3D complex dataset. Amino acids are sorted according to their ranking in protein chain flexibility <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003192#pcbi.1003192-Romero1" target="_blank">[41]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003192#pcbi.1003192-Vihinen1" target="_blank">[80]</a>. The residue compositions for the ID segments (only residues with coordinates in the PDB) and extended ID segments (30 residues on each end) are shown in red and yellow, respectively.</p>
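<p>The relative composition described in this caption amounts to a per-residue difference of averaged percentages. The sketch below assumes that percentages are computed per sequence and then averaged over each dataset; the source does not spell out the averaging, and the function names and toy sequences are hypothetical.</p>

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def percent_composition(sequence):
    """Percentage of each of the 20 standard residues in one sequence."""
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(sequences):
    """Average the per-sequence percentages over a dataset (assumed scheme)."""
    comps = [percent_composition(s) for s in sequences]
    return {aa: sum(c[aa] for c in comps) / len(comps) for aa in AMINO_ACIDS}


def composition_difference(id_sequences, reference_sequences):
    """ID percentage minus reference percentage for each residue.

    Positive values mean enrichment in the ID set, negative values depletion.
    """
    id_comp = mean_composition(id_sequences)
    ref_comp = mean_composition(reference_sequences)
    return {aa: id_comp[aa] - ref_comp[aa] for aa in AMINO_ACIDS}


# Toy sequences for illustration only.
diff = composition_difference(["KKEE"], ["AAKK"])
print(diff["E"], diff["A"], diff["K"])
```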
Computational Identification of MoRFs in Protein Sequences Using Hierarchical Application of Bayes Rule
<div><p>Motivation</p><p>Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is often the binding to globular protein domains via sequence elements known as molecular recognition features (MoRFs). Development of computational tools for the identification of candidate MoRF locations in amino acid sequences is an important task and an area of growing interest. Given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical usage, which leaves a significant need and room for improvement. In this work, we introduce MoRF<sub>CHiBi_Web</sub>, which predicts MoRF locations in protein sequences with higher accuracy compared to current MoRF predictors.</p><p>Methods</p><p>Three distinct and largely independent property scores are computed with component predictors and then combined to generate the final MoRF propensity scores. The first score reflects the likelihood of sequence windows to harbour MoRFs and is based on amino acid composition and sequence similarity information. It is generated by MoRF<sub>CHiBi</sub> using small windows of up to 40 residues in size. The second score identifies long stretches of protein disorder and is generated by ESpritz with the DisProt option. Lastly, the third score reflects residue conservation and is assembled from PSSM files generated by PSI-BLAST. These propensity scores are processed and then hierarchically combined using Bayes rule to generate the final MoRF<sub>CHiBi_Web</sub> predictions.</p><p>Results</p><p>MoRF<sub>CHiBi_Web</sub> was tested on three datasets. Results show that MoRF<sub>CHiBi_Web</sub> outperforms previously developed predictors by generating less than half the false positive rate for the same true positive rate at practical threshold values. 
This level of accuracy paired with its relatively high processing speed makes MoRF<sub>CHiBi_Web</sub> a practical tool for MoRF prediction.</p><p>Availability</p><p><a href="http://morf.chibi.ubc.ca:8080/morf/" target="_blank">http://morf.chibi.ubc.ca:8080/morf/</a>.</p></div>
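<p>The hierarchical combination of propensity scores with Bayes rule, as summarized in the Methods paragraph above, might look roughly like the following sketch. It is not the published MoRF<sub>CHiBi_Web</sub> procedure: it assumes that each score is already a per-residue probability computed under the same prior, and that the scores are conditionally independent given the class. The function names, the prior of 0.5, and the example values are all hypothetical.</p>

```python
def bayes_combine(p1, p2, prior=0.5):
    """Combine two posterior probabilities for the same event with Bayes rule.

    Assumptions (illustrative, not from the source): p1 and p2 were each
    obtained under the same prior, and the underlying evidence is
    conditionally independent given the class.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    positive = p1 * p2 / prior
    negative = (1.0 - p1) * (1.0 - p2) / (1.0 - prior)
    total = positive + negative
    if total == 0.0:
        raise ValueError("scores are contradictory (one is 0, the other 1)")
    return positive / total


def hierarchical_combine(morf_score, disorder_score, conservation_score,
                         prior=0.5):
    """Combine two scores first, then fold in the third (assumed order)."""
    first = bayes_combine(morf_score, disorder_score, prior)
    return bayes_combine(first, conservation_score, prior)


# Hypothetical per-residue scores.
print(hierarchical_combine(0.8, 0.8, 0.5))
```

<p>In this formulation a score equal to the prior is uninformative and leaves the other score unchanged, while two agreeing scores reinforce each other.</p>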