Probability estimation is essential for every statistical data compression
algorithm. In practice probability estimation should be adaptive, recent
observations should receive a higher weight than older observations. We present
a probability estimation method based on exponential smoothing that satisfies
this requirement and runs in constant time per letter. Our main contribution is
a theoretical analysis in case of a binary alphabet for various smoothing rate
sequences: We show that the redundancy w.r.t. a piecewise stationary model with
s segments is O(sn) for any bit sequence of length n, an
improvement over redundancy O(snlogn) of previous
approaches with similar time complexity