We present a general framework to study the thermodynamic denaturation of
double-stranded DNA under superhelical stress. We report calculations of
position- and size-dependent opening probabilities for bubbles along the
sequence. Our results are obtained from transfer-matrix solutions of the
Zimm-Bragg model for unconstrained DNA and of a self-consistent linearization
of the Benham model for superhelical DNA. The numerical efficiency of our
method allows for the analysis of entire genomes and of random sequences of
corresponding length (106−109 base pairs). We show that, at physiological
conditions, opening in superhelical DNA is strongly cooperative with average
bubble sizes of 102−103 base pairs (bp), and orders of magnitude higher
than in unconstrained DNA. In heterogeneous sequences, the average degree of
base-pair opening is self-averaging, while bubble localization and statistics
are dominated by sequence disorder. Compared to random sequences with identical
GC-content, genomic DNA has a significantly increased probability to open large
bubbles under superhelical stress. These bubbles are frequently located
directly upstream of transcription start sites.Comment: to be appeared in Physical Review