Using a structural and logics systems approach to infer bHLH–DNA binding specificity determinants

By Federico De Masi, Christian A. Grove, Anastasia Vedenko, Andreu Alibés, Stephen S. Gisselbrecht, Luis Serrano, Martha L. Bulyk and Albertha J. M. Walhout


Numerous efforts are underway to determine gene regulatory networks that describe physical relationships between transcription factors (TFs) and their target DNA sequences. Members of paralogous TF families typically recognize similar DNA sequences. Knowledge of the molecular determinants of protein–DNA recognition by paralogous TFs is of central importance for understanding how small differences in DNA specificities can dictate target gene selection. Previously, we determined the in vitro DNA binding specificities of 19 Caenorhabditis elegans basic helix-loop-helix (bHLH) dimers using protein binding microarrays. These TFs bind E-box (CANNTG) and E-box-like sequences. Here, we combine these data with logics, bHLH–DNA co-crystal structures and computational modeling to infer which bHLH monomer can interact with which CAN E-box half-site, and we identify a critical residue in the protein that dictates this specificity. Validation experiments using mutant bHLH proteins provide support for our inferences. Our study provides insights into the mechanisms of DNA recognition by bHLH dimers as well as a blueprint for system-level studies of the DNA binding determinants of other TF families in different model organisms and humans.
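To make the E-box and half-site terminology concrete, the following is a minimal Python sketch that scans a DNA sequence for CANNTG matches and reports each half-site in the CAN orientation. It is only an illustration of the motif definition, not the method used in the study; the function name `find_e_boxes` and the example sequence are hypothetical.

```python
import re

# Watson-Crick complements, used to reverse-complement the second half-site.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_e_boxes(sequence):
    """Return (position, e_box, left_half, right_half) for each CANNTG match.

    Both half-sites are reported in the CAN orientation: the left half is
    read directly, and the right half (NTG) is reverse-complemented.
    Hypothetical helper for illustration only.
    """
    sequence = sequence.upper()
    hits = []
    # A lookahead is used so that overlapping matches are also reported.
    for match in re.finditer(r"(?=(CA[ACGT]{2}TG))", sequence):
        e_box = match.group(1)
        left_half = e_box[:3]
        right_half = e_box[3:].translate(COMPLEMENT)[::-1]
        hits.append((match.start(), e_box, left_half, right_half))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing two E-boxes.
    for hit in find_e_boxes("ttCACGTGaaCAGCTGcc"):
        print(hit)
```

Reporting both halves in the same orientation makes it easy to see whether an E-box is composed of two identical half-sites (as in CACGTG) or two different ones (as in CATCTG), which is the distinction that matters when asking which monomer contacts which half-site.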

Topics: Computational Biology
Publisher: Oxford University Press
Provided by: PubMed Central
