Ab initio RNA secondary structure predictions have long dismissed helices
interior to loops, so-called pseudoknots, despite their structural importance.
Here, we report that many pseudoknots can be predicted through long-time-scale
RNA folding simulations that follow the stochastic closing and opening of
individual RNA helices. The numerical efficacy of these stochastic simulations
relies on an O(n^2) clustering algorithm which computes time averages over a
continuously updated set of n reference structures. Applying this exact
stochastic clustering approach, we typically obtain a 5- to 100-fold simulation
speed-up for RNA sequences up to 400 bases, while the effective acceleration
can be as high as 100,000-fold for short multistable molecules (<150 bases). We
performed extensive folding statistics on random and natural RNA sequences, and
found that pseudoknots are unevenly distributed among RNA structures and
account for up to 30% of base pairs in G+C-rich RNA sequences (Online RNA
folding kinetics server including pseudoknots: http://kinefold.u-strasbg.fr/).
Comment: 6 pages, 5 figures
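
As a rough illustration of what "stochastic closing and opening of individual RNA helices" can mean in practice, the sketch below runs a generic Gillespie-style kinetic Monte Carlo on a toy set of helices and reports time-averaged occupancies. It is not the clustering algorithm of the paper: the helices are assumed independent (no steric or pseudoknot constraints), and the function name, rate constants, and parameters are hypothetical choices made only for this example.

```python
import math
import random


def gillespie_helix_trajectory(close_rates, open_rates, t_max, seed=0):
    """Toy kinetic Monte Carlo for independent two-state helices.

    close_rates[i] / open_rates[i] are hypothetical rate constants for
    helix i (assumption: helices do not interact). Returns the fraction
    of simulated time each helix spent closed.
    """
    rng = random.Random(seed)
    n = len(close_rates)
    closed = [False] * n          # all helices start open
    time_closed = [0.0] * n
    t = 0.0
    while t < t_max:
        # Each helix has exactly one available move in its current state.
        rates = [open_rates[i] if closed[i] else close_rates[i]
                 for i in range(n)]
        total = sum(rates)
        # Exponentially distributed waiting time until the next event.
        wait = rng.expovariate(total) if total > 0 else math.inf
        dt = min(wait, t_max - t)
        for i in range(n):
            if closed[i]:
                time_closed[i] += dt
        if wait >= t_max - t:
            break                 # no further event before t_max
        t += wait
        # Choose which helix flips, with probability proportional to its rate.
        i = rng.choices(range(n), weights=rates)[0]
        closed[i] = not closed[i]
    return [tc / t_max for tc in time_closed]
```

For a single helix in this toy model, the long-time closed fraction approaches close_rate / (close_rate + open_rate), which gives a simple sanity check on the time averages.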