thesis

Decoding the protein-DNA recognition rules

Abstract

The C2H2 zinc finger (ZF) transcription factors (TFs) form the largest family of DNA-binding proteins in eukaryotes. TFs are key proteins in gene regulation that bind to specific DNA sites. A major obstacle to understanding the molecular basis of transcriptional regulation is the lack of a recognition code for protein-DNA interactions. We aim to understand the molecular mechanisms of DNA recognition and to quantitatively estimate recognition rules for TF-DNA interactions. We identified key residues that play an important role in ZF-DNA interactions and found that, prior to binding, they are prealigned to the conformations observed in the bound state. We also identified a binding site for Cl⁻ ions that corresponds to the pocket where DNA phosphates are most deeply buried in ZF-DNA complexes. Bound ions constrain the conformations of important residues, consistent with observations of increased binding affinity with increased ionic strength in protein-DNA interactions. These results suggest a general mechanism in which ZFs, through their key residues, rapidly form encounter complexes amenable to a fast readout of the DNA. We developed a novel experimentally based approach using crystal structures and binding data on the C2H2 ZFs and decoded ten fundamental specific interactions for protein-DNA recognition. These are five hydrogen bonds, three desolvation penalties, a non-polar energy, and a novel water accessibility factor. The code is applied to three data sets with a total of 89 ZF mutants on three ZFs of EGR. Guided by simulations of individual ZFs, we mapped the interactions onto homology models with all feasible intra- and intermolecular bonds and selected the structure with the lowest free energy for each ZF. The interactions reproduce changes in affinity of 35 mutants of finger I (FI) (R² = 0.99), 23 mutants of FII (R² = 0.97) and 31 human ZFs on FIII (R² = 0.95).
The method predicts bound ZF-DNA complexes for all mutants, decoding the molecular basis of ZF-DNA specificity. These findings reveal recognition rules that depend on DNA sequence/structure, molecular water at the interface, and induced fit of the C2H2 TFs. In summary, our method provides the first robust framework to decode the molecular basis of TF binding to DNA.
