24 research outputs found

Disordered Binding Regions and Linear Motifs—Bridging the Gap between Two Models of Molecular Recognition

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)
Publication date: 03/10/2012

Intrinsically disordered proteins (IDPs) exist without the presence of a stable tertiary structure in isolation. These proteins are often involved in molecular recognition processes via their disordered binding regions that can recognize partner molecules by undergoing a coupled folding and binding process. The specific properties of disordered binding regions give way to specific, yet transient interactions that enable IDPs to play central roles in signaling pathways and act as hubs of protein interaction networks. An alternative model of protein-protein interactions with largely overlapping functional properties is offered by the concept of linear interaction motifs. This approach focuses on distilling a short consensus sequence pattern from proteins with a common interaction partner. These motifs often reside in disordered regions and are considered to mediate the interaction roughly independent from the rest of the protein. Although a connection between linear motifs and disordered binding regions has been established through common examples, the complementary nature of the two concepts has yet to be fully explored. In many cases the sequence based definition of linear motifs and the structural context based definition of disordered binding regions describe two aspects of the same phenomenon. To gain insight into the connection between the two models, prediction methods were utilized. We combined the regular expression based prediction of linear motifs with the disordered binding region prediction method ANCHOR, each specialized for either model to get the best of both worlds. The thorough analysis of the overlap of the two methods offers a bioinformatics tool for more efficient binding site prediction that can serve a wide range of practical implications. At the same time it can also shed light on the theoretical connection between the two co-existing interaction models.
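
The combination described in the abstract (a regular expression scan whose hits are kept only when they overlap a predicted disordered binding region) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the pattern `L..LL` is a simplified LxxLL-style stand-in rather than an exact ELM definition, the sequence is made up, and `binding_regions` is a hypothetical list of spans standing in for the output of a predictor such as ANCHOR.

```python
import re

# Illustrative pattern only: a simplified LxxLL-style motif, NOT an exact ELM regex.
# The lookahead lets overlapping matches be reported as well.
MOTIF = re.compile(r"(?=(L..LL))")


def scan_motif(sequence, pattern=MOTIF):
    """Return half-open (start, end) spans of all motif matches in the sequence."""
    return [(m.start(1), m.end(1)) for m in pattern.finditer(sequence)]


def overlaps(span, regions):
    """True if the span shares at least one residue with any region."""
    start, end = span
    return any(start < r_end and r_start < end for r_start, r_end in regions)


def filter_hits(hits, binding_regions):
    """Keep only motif hits that overlap a predicted binding region."""
    return [hit for hit in hits if overlaps(hit, binding_regions)]


# Made-up sequence and a hypothetical predicted binding region (half-open span).
sequence = "MKTLAALLGSPEEKHKILHRLLQDSS"
binding_regions = [(15, 26)]

hits = scan_motif(sequence)
kept = filter_hits(hits, binding_regions)
print("all hits:", hits)
print("hits overlapping a predicted binding region:", kept)
```

Treating any shared residue as an overlap is one possible criterion; a stricter rule (for example, requiring the whole motif to lie inside the region) would only change the `overlaps` helper.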

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

Efficiency of ANCHOR for individual LIG motifs.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

The total number of annotated instances for each of the ligand binding motifs that have at least three independent instances in the ELM database. Dark red bars show the number of instances overlapping ANCHOR predicted binding regions. Stars mark the motifs for which the recovery rate is significantly higher than that expected by chance alone (see Methods).

The Francis Crick Institute

Efficiency of ANCHOR on linear motifs with respect to structural context.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

Instances are classified according to the predicted disorder status of their flanking sequential environment. Motif instances with both N- and C-terminal flanking regions predicted by IUPred as ordered are classified as ‘Ordered’, instances with one or both flanking regions predicted to be disordered are classified as ‘Mixed’ or ‘Disordered’, respectively.
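
The three-way classification in this caption can be written down as a small helper. The sketch below is illustrative only: the caption does not say how a flank's disorder status is derived from per-residue scores, so `flank_is_disordered` (mean score above a 0.5 cutoff) is a hypothetical rule introduced here, and the score lists are made up.

```python
def flank_is_disordered(scores, cutoff=0.5):
    """Hypothetical rule: call a flank disordered if its mean score exceeds the cutoff."""
    return sum(scores) / len(scores) > cutoff


def classify_context(n_flank_disordered, c_flank_disordered):
    """Map the disorder status of the two flanks to the caption's three classes."""
    disordered_flanks = int(n_flank_disordered) + int(c_flank_disordered)
    return ("Ordered", "Mixed", "Disordered")[disordered_flanks]


# Made-up per-residue disorder scores for the two flanks of one motif instance.
n_flank_scores = [0.2, 0.3, 0.25, 0.4]
c_flank_scores = [0.7, 0.8, 0.65, 0.9]

label = classify_context(
    flank_is_disordered(n_flank_scores),
    flank_is_disordered(c_flank_scores),
)
print(label)
```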

The Francis Crick Institute

The predictive power of ANCHOR as a filter in motif searches.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

Left: fraction of known instances of ligand binding motifs recognized by ANCHOR. Right: the reduction in the number of ligand binding motif hits in the eukaryotic sequences of UniProt.

The Francis Crick Institute

Results of motif scans in the three domains of life.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

A: the number of motif hits found for the four motif groups (CLV – cleavage sites, LIG – generic ligand binding motifs, MOD – modification sites, TRG – target signals) in the eukaryotic (blue), bacterial (green) and archaeal (red) proteins included in the UniProt database. As the sizes of the three databases differ, the number of actual hits in each prokaryotic set was scaled by the ratio of the number of residues in the datasets. B: The average number of motif hits per protein for the three databases covering the three domains of life. Again, hit numbers in the prokaryotic sets are corrected for the different number of residues compared to the eukaryotic dataset. Coloring is identical to that of part A (red – archaea, green – bacteria, blue – eukaryotes). C: The upper bars show the number of hits found in the three domains of life for PCNA, PDZ and cyclin binding motifs (the average hits per protein for the three motifs are shown with vertical lines in part B; note that there are three different PDZ binding motifs, each shown with a separate line in part B, but only their cumulative numbers are shown in part C). Lower bars show the actual number of corresponding partner domains that can serve as interaction partners for these motifs in the same datasets. Domain occurrences were taken from the PFAM database. Prokaryotic hit numbers are corrected for the different number of proteins, and the coloring scheme follows that of parts A and B.
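
The size correction mentioned in parts A and B amounts to rescaling raw hit counts by a ratio of residue counts. A minimal sketch follows, assuming the prokaryotic counts are scaled up to the size of the eukaryotic dataset; the function name and all numbers are invented for illustration and are not taken from the paper.

```python
def scale_hits(raw_hits, residues_in_set, residues_in_reference):
    """Rescale a raw hit count to what it would be in a dataset of the reference size."""
    if residues_in_set <= 0:
        raise ValueError("residues_in_set must be positive")
    return raw_hits * residues_in_reference / residues_in_set


# Invented numbers: a small prokaryotic set compared against a larger reference set.
reference_residues = 4_000_000   # hypothetical eukaryotic residue count
archaeal_residues = 1_000_000    # hypothetical archaeal residue count
archaeal_raw_hits = 50

scaled = scale_hits(archaeal_raw_hits, archaeal_residues, reference_residues)
print(scaled)
```

The same helper applies to the protein-count correction in part C if protein counts are passed in place of residue counts.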

The Francis Crick Institute

Efficiency of ANCHOR on linear motifs with respect to bound secondary structure.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

Motifs are classified according to the secondary structure they adopt upon binding to their partner domain. The efficiency of ANCHOR was calculated for each structural class separately and compared to the average efficiency calculated on all instances. The differences between the average and the secondary structure-specific efficiencies were assessed using a standard χ² test. The resulting p-values are quoted for all four structural classes.
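
One way such a comparison could be set up is a 2×2 Pearson χ² test on recovered versus missed instances. The caption does not specify the exact contingency table, so the layout below (one structural class against all remaining instances, no continuity correction) is an assumption, and the counts are invented. For one degree of freedom the p-value follows from the complementary error function.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Invented counts: (recovered, missed) for one structural class and for all other instances.
class_recovered, class_missed = 30, 10
other_recovered, other_missed = 10, 30

chi2, p = chi2_2x2(class_recovered, class_missed, other_recovered, other_missed)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```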

The Francis Crick Institute

Application to whole proteome scans.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

Results of applying ANCHOR as a filter for scanning the human proteome for instances of the nuclear receptor interacting motif (LIG_NRBOX). A: number of proteins matching the motif; B–D: fraction of proteins containing NRBOX matches with biological process, cellular component and molecular function GO annotations (B, C and D, respectively) matching the annotations of true NRBOX instances (black boxes), with other annotations (grey boxes), and with no annotations (white boxes). The height of the bars in B–D represents 100% of all found motifs, and thus in each sub-figure the complete left bar stands for 7,897 proteins and the complete bar on the right stands for 1,623. The two different numbers of hits are scaled to accurately represent enrichments of correctly annotated proteins.
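
The two protein counts quoted in this caption allow a quick estimate of how strongly the filter shrinks the candidate set. The sketch below assumes that the left bar (7,897) is the unfiltered motif scan and the right bar (1,623) is the ANCHOR-filtered scan; the caption implies but does not state this, and the helper name is introduced here for illustration.

```python
def reduction_fraction(hits_before, hits_after):
    """Fraction of hits removed by a filter, relative to the unfiltered count."""
    if hits_before <= 0:
        raise ValueError("hits_before must be positive")
    return 1 - hits_after / hits_before


# Counts from the caption; which bar is filtered is an assumption (see lead-in).
unfiltered_proteins = 7897
filtered_proteins = 1623

reduction = reduction_fraction(unfiltered_proteins, filtered_proteins)
print(f"fraction of matching proteins removed: {reduction:.3f}")
```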

The Francis Crick Institute

Examples of true motif instances with ANCHOR predictions.

Author: Bálint Mészáros (130991)
István Simon (24760)
Zsuzsanna Dosztányi (57464)

A: Three instances of the nuclear receptor binding motif (LIG_NRBOX) in the human nuclear receptor coactivator 2 protein (NCOA2). Left: IUPred (red) and ANCHOR (blue) predictions for the 601–800 region of NCOA2. Red bars mark the motif instances, with the black box showing the instance for which the corresponding bound structure is shown. Right: the structure of NCOA2 (salmon) with the motif shown in red, bound to the glucocorticoid receptor (grey) (structure 1m2z). B: MAP kinase binding motif (LIG_MAPK_1) in the rhodanese domain of the human DUS6 protein. Left: IUPred (red) and ANCHOR (blue) predictions, with the red bar and black box indicating the position of the motif. Right: the structure of DUS6 in monomeric form (structure 1hzm) with the motif shown in red.

The Francis Crick Institute

Alignment of six representative members of the Mgm101p sequence family.

Author: David C. Hayward (198611)
George Desmond Clark-Walker (281878)
Zsuzsanna Dosztányi (57464)

The C-terminal extensions of Dictyostelium discoideum and Naegleria gruberi, which lack sequence conservation, were omitted from the alignment.

The Francis Crick Institute

Complementation of the temperature sensitive mutant.

Author: David C. Hayward (198611)
George Desmond Clark-Walker (281878)
Zsuzsanna Dosztányi (57464)

A GlyYP plate with (1) M2915-7C mgm101-1ts, and M2915-7C transformed with pCXJ22 plasmids containing (2) S. cerevisiae MGM101, (3) A. millepora MGM101 with a S. cerevisiae mitochondrial targeting signal (A.m.ID-A.m.C) (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0056465#pone-0056465-g003" target="_blank">Fig. 3</a>), (4) the S. cerevisiae intrinsically disordered (ID) domain joined to the A. millepora core region (S.c.ID-A.m.C) and (5) the A. millepora ID region joined to the S. cerevisiae core region (A.m.ID-S.c.C). The constructs all have a mitochondrial targeting signal sequence as shown in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0056465#pone.0056465.s002" target="_blank">Figure S2</a>. The plate was incubated at 35°C for 3 days before being photographed.

The Francis Crick Institute