The binding of a transcription factor (TF) to a DNA operator site can
initiate or repress the expression of a gene. Computational prediction of sites
recognized by a TF has traditionally relied upon knowledge of several cognate
sites, rather than an ab initio approach. Here, we examine the possibility of
using structure-based energy calculations that require no knowledge of bound
sites but rather start with the structure of a protein-DNA complex. We study
the E. coli TF PurR and explore the extent to which atomistic models of
protein-DNA complexes can be used to distinguish between cognate and
non-cognate DNA sites. Particular emphasis is placed on systematic evaluation
of this approach by comparing its performance with bioinformatic methods and by
testing it against random decoys and sites of homologous TFs. We also examine a
set of experimental mutations in both the DNA and the protein. Using our explicit
estimates of energy, we show that the specificity for PurR is dominated by
direct protein-DNA interactions, and weakly influenced by bending of DNA.

Comment: 26 pages, 3 figures