
The most probable string: an algorithmic study

By Colin de la Higuera and Jose Oncina Carratalá

Abstract

The problem of finding the consensus (most probable) string for a distribution generated by a weighted finite automaton or a probabilistic grammar is related to a number of important questions, such as computing the distance between two distributions or finding the best (most probable) translation given a probabilistic finite state transducer. The problem is undecidable with general weights and is NP-hard if the automaton is probabilistic. We give a pseudo-polynomial algorithm that solves a decision problem directly associated with the consensus string: it decides whether there is a (reasonably short) string whose probability is larger than a given bound, in time polynomial in the size of this bound, both for probabilistic finite automata and probabilistic context-free grammars. We also study a randomized algorithm solving the same problem. Finally, we report links between the length of the consensus string and the probability of this string.

The first author acknowledges partial support by the Région des Pays de la Loire. The second author thanks the Spanish CICyT for partial support of this work through projects TIN2009-14205-C04-C1, and the program Consolider Ingenio 2010 (CSD2007-00018). This work was supported in part by the IST Programme of the European Community, under the Pascal 2 Network of Excellence, IST-2007-216886.
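The decision problem named in the abstract (is there a string whose probability exceeds a given bound?) can be illustrated with a small sketch. This is not the authors' algorithm; it is a minimal illustration that assumes a probabilistic finite automaton given as plain dictionaries and relies only on the fact that a string can never be more probable than any of its prefixes. The function name `strings_above`, the dictionary encoding, the `max_len` guard and the two-state example automaton are all hypothetical choices made for this example.

```python
from fractions import Fraction


def strings_above(init, trans, final, alphabet, p, max_len=50):
    """Return {string: probability} for every string with probability > p.

    init:  {state: initial probability}
    trans: {(state, symbol): {next_state: transition probability}}
    final: {state: stopping probability}
    Prefixes are explored by increasing length. A prefix whose total
    probability mass is <= p is dropped, because every string that extends
    it has a probability no larger than that mass.
    max_len is a safety guard for this sketch only (an assumption).
    """
    found = {}
    frontier = [("", dict(init))]  # (prefix, forward weight per state)
    for _ in range(max_len + 1):
        next_frontier = []
        for prefix, fwd in frontier:
            # Probability of stopping exactly after reading `prefix`.
            prob = sum(w * final.get(q, 0) for q, w in fwd.items())
            if prob > p:
                found[prefix] = prob
            for a in alphabet:
                nxt = {}
                for q, w in fwd.items():
                    for q2, t in trans.get((q, a), {}).items():
                        nxt[q2] = nxt.get(q2, 0) + w * t
                # sum(nxt.values()) is the probability of the prefix + a.
                if sum(nxt.values()) > p:
                    next_frontier.append((prefix + a, nxt))
        frontier = next_frontier
        if not frontier:
            break
    return found


# Hypothetical two-state automaton over {a, b}; exact arithmetic via Fraction.
F = Fraction
init = {0: F(1)}
final = {0: F(0), 1: F(1, 2)}
trans = {
    (0, "a"): {1: F(3, 5)},
    (0, "b"): {0: F(2, 5)},
    (1, "a"): {1: F(1, 2)},
}
found = strings_above(init, trans, final, "ab", F(1, 10))
best = max(found, key=found.get)
```

For this toy automaton and the bound 1/10, `found` holds the three strings `a`, `aa` and `ba` with probabilities 3/10, 3/20 and 3/25, so `best` is `a`. Prefixes of a given length are disjoint events, so fewer than 1/p of them can survive the pruning at each length, which is what keeps the search small when the bound is not tiny.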

Topics: Probabilistic automata, Parsing, Lenguajes y Sistemas Informáticos
Publisher: Oxford University Press (OUP)
Year: 2013
DOI identifier: 10.1093/logcom/exs049
OAI identifier: oai:rua.ua.es:10045/29316