61 research outputs found

    Penney's game between many players

    We recall a combinatorial derivation of the generating functions for the winning probabilities of each of the many participants in Penney's game and show a generalization of Conway's formula to this case. Comment: 6 pages
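
For orientation, the classical two-player form of Conway's formula can be sketched as follows. This is only the standard two-player case with a fair coin, not the many-player generalization the abstract describes; the function names `leading_number` and `second_player_win_probability` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def leading_number(a: str, b: str) -> int:
    """Conway's correlation of a with b: add 2**(k-1) for every k such that
    the last k symbols of a equal the first k symbols of b."""
    total = 0
    for k in range(1, min(len(a), len(b)) + 1):
        if a[-k:] == b[:k]:
            total += 2 ** (k - 1)
    return total


def second_player_win_probability(a: str, b: str) -> Fraction:
    """Probability that pattern b appears before pattern a in fair coin tosses.

    Conway's formula gives the odds in favour of b as (AA - AB) : (BB - BA).
    Assumes a and b are distinct and neither is a substring of the other.
    """
    aa, ab = leading_number(a, a), leading_number(a, b)
    bb, ba = leading_number(b, b), leading_number(b, a)
    return Fraction(aa - ab, (aa - ab) + (bb - ba))


print(second_player_win_probability("HHH", "THH"))  # 7/8
print(second_player_win_probability("HTH", "HHT"))  # 2/3
```

The two calls reproduce the well-known results that THH beats HHH with odds 7:1 and HHT beats HTH with odds 2:1.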

    A Proof of Entropy Minimization for Outputs in Deletion Channels via Hidden Word Statistics

    From the output produced by a memoryless deletion channel from a uniformly random input of known length $n$, one obtains a posterior distribution on the channel input. The difference between the Shannon entropy of this distribution and that of the uniform prior measures the amount of information about the channel input which is conveyed by the output of length $m$, and it is natural to ask for which outputs this is extremized. This question was posed in a previous work, where it was conjectured on the basis of experimental data that the entropy of the posterior is minimized and maximized by the constant strings $\texttt{000}\ldots$ and $\texttt{111}\ldots$ and the alternating strings $\texttt{0101}\ldots$ and $\texttt{1010}\ldots$, respectively. In the present work we confirm the minimization conjecture in the asymptotic limit using results from hidden word statistics. We show how the analytic-combinatorial methods of Flajolet, Szpankowski and Vallée for dealing with the hidden pattern matching problem can be applied to resolve the case of fixed output length and $n \rightarrow \infty$, by obtaining estimates for the entropy in terms of the moments of the posterior distribution and establishing its minimization via a measure of autocorrelation. Comment: 11 pages, 2 figures
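
The quantity in question can be explored numerically for small $n$. The sketch below is a brute-force illustration, not the analytic method of the paper. It assumes independent deletions with a fixed probability, so that for a uniform prior the posterior weight of an input $x$ is proportional to the number of ways the output $y$ occurs in $x$ as a subsequence (the deletion-probability factor is the same for every $x$ of length $n$ and cancels). The function names are illustrative.

```python
from itertools import product
from math import log2


def count_embeddings(x: str, y: str) -> int:
    """Number of ways y occurs in x as a (not necessarily contiguous) subsequence."""
    # dp[j] = number of ways to match the first j symbols of y so far
    dp = [1] + [0] * len(y)
    for c in x:
        # go right to left so each symbol of x is used at most once per match
        for j in range(len(y), 0, -1):
            if y[j - 1] == c:
                dp[j] += dp[j - 1]
    return dp[len(y)]


def posterior_entropy(y: str, n: int) -> float:
    """Shannon entropy (in bits) of the posterior over binary inputs of length n,
    given output y, under a uniform prior. Requires len(y) <= n."""
    weights = [
        count_embeddings("".join(bits), y) for bits in product("01", repeat=n)
    ]
    total = sum(weights)
    return -sum((w / total) * log2(w / total) for w in weights if w > 0)


for y in ("00", "01"):
    print(y, round(posterior_entropy(y, 3), 4))
```

For $n = 3$ this gives about 1.79 bits for the constant output `00` and about 1.92 bits for `01`, consistent with the direction of the conjecture in this small case. The enumeration is exponential in $n$, so it is only practical for short inputs.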

    Highly Scalable Algorithms for Robust String Barcoding

    String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial-size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome-based selection results in a number of distinguishers nearly matching the information-theoretic lower bounds for the problem.