Probability estimation is an elementary building block of every statistical
data compression algorithm. In practice, probability estimation is often based
on relative letter frequencies that are scaled down when their sum becomes too
large. Such algorithms are attractive in terms of memory requirements, running
time and practical performance. However, there is still a lack of theoretical
understanding. In this work we formulate a typical probability estimation
algorithm based on relative frequencies and frequency discount, Algorithm RFD.
Our main contribution is its theoretical analysis. We show that the excess code
length it requires relative to an arbitrary piecewise stationary model, with
either bounded or unbounded letter probabilities, is small. This theoretically confirms the
recency effect of periodic frequency discount, which has often been observed
empirically.
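
To make the idea of relative frequencies with periodic discount concrete, the following is a minimal Python sketch. It is not the paper's Algorithm RFD: the class name, the initial count of 1, the sum limit of 16, and the discount factor of 0.5 are all illustrative assumptions. It compares the ideal code length with and without discount on a toy piecewise stationary sequence.

```python
import math


class DiscountedFrequencyEstimator:
    """Relative-frequency estimator with periodic discount (illustrative only).

    All parameter names and defaults are assumptions made for this sketch,
    not values taken from the paper.
    """

    def __init__(self, alphabet, limit=16.0, init=1.0, discount=0.5):
        # Every letter starts with a positive count, so no probability is zero.
        self.counts = {a: init for a in alphabet}
        self.limit = limit
        self.discount = discount

    def prob(self, symbol):
        # Estimated probability is the relative frequency of the letter.
        return self.counts[symbol] / sum(self.counts.values())

    def update(self, symbol):
        self.counts[symbol] += 1.0
        # Scale all counts down once their sum exceeds the limit.
        if sum(self.counts.values()) > self.limit:
            for a in self.counts:
                self.counts[a] *= self.discount


def code_length(estimator, sequence):
    """Ideal code length in bits: sum of -log2 of the predicted probabilities."""
    bits = 0.0
    for symbol in sequence:
        bits += -math.log2(estimator.prob(symbol))
        estimator.update(symbol)
    return bits


# Toy piecewise stationary input: the source changes after 100 letters.
sequence = "a" * 100 + "b" * 20

with_discount = DiscountedFrequencyEstimator("ab")
no_discount = DiscountedFrequencyEstimator("ab", limit=float("inf"))

bits_with = code_length(with_discount, sequence)
bits_without = code_length(no_discount, sequence)
print(f"with discount:    {bits_with:.1f} bits")
print(f"without discount: {bits_without:.1f} bits")
```

In this toy setting the discounted estimator adapts to the change of source faster, because old counts lose weight each time the sum is scaled down; this is the recency effect referred to above. The example illustrates the mechanism only and says nothing about the bounds proved in this work.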