From the output produced by a memoryless deletion channel from a uniformly
random input of known length n, one obtains a posterior distribution on the
channel input. The difference between the Shannon entropy of this distribution
and that of the uniform prior measures the amount of information about the
channel input which is conveyed by the output of length m, and it is natural
to ask for which outputs this is extremized. This question was posed in a
previous work, where it was conjectured on the basis of experimental data that
the entropy of the posterior is minimized and maximized by the constant strings
000… and 111… and the alternating strings
0101… and 1010… respectively. In the present
work we confirm the minimization conjecture in the asymptotic limit using
results from hidden word statistics. We show how the analytic-combinatorial
methods of Flajolet, Szpankowski and Vall\'ee for dealing with the hidden
pattern matching problem can be applied to resolve the case of fixed output
length and n→∞, by obtaining estimates for the entropy in
terms of the moments of the posterior distribution and establishing its
minimization via a measure of autocorrelation.Comment: 11 pages, 2 figure