Polynomial Time Algorithms for Multi-Type Branching Processes and Stochastic Context-Free Grammars
We show that one can approximate the least fixed point solution for a
multivariate system of monotone probabilistic polynomial equations in time
polynomial in both the encoding size of the system of equations and in
log(1/\epsilon), where \epsilon > 0 is the desired additive error bound of the
solution. (The model of computation is the standard Turing machine model.)
We use this result to resolve several open problems regarding the
computational complexity of computing key quantities associated with some
classic and heavily studied stochastic processes, including multi-type
branching processes and stochastic context-free grammars.
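To make the object of study concrete, here is a minimal sketch of a system x = P(x) of monotone probabilistic polynomial equations, with its least fixed point approximated by plain iteration from the zero vector. The monomial representation and the names `evaluate` and `kleene_lfp` are assumptions of this sketch, not taken from the paper. This naive iteration can converge very slowly in critical cases (for example x = 0.5 + 0.5x^2), so it should not be read as the polynomial-time method the abstract refers to.

```python
from typing import List, Sequence, Tuple

# A monomial is (coefficient, exponents); a polynomial is a list of monomials.
# Coefficients are nonnegative and sum to at most 1 in each equation.
Monomial = Tuple[float, Tuple[int, ...]]
Polynomial = List[Monomial]


def evaluate(poly: Polynomial, x: Sequence[float]) -> float:
    """Evaluate one polynomial at the point x."""
    total = 0.0
    for coeff, exps in poly:
        term = coeff
        for value, exponent in zip(x, exps):
            term *= value ** exponent
        total += term
    return total


def kleene_lfp(system: List[Polynomial], iterations: int = 200) -> List[float]:
    """Iterate x <- P(x) starting from 0; the iterates increase monotonically
    toward the least fixed point because all coefficients are nonnegative."""
    x = [0.0] * len(system)
    for _ in range(iterations):
        x = [evaluate(poly, x) for poly in system]
    return x


# Hypothetical two-type branching process (extinction probabilities):
#   x0 = 0.5  + 0.5  * x0 * x1
#   x1 = 0.25 + 0.75 * x1^2
example_system: List[Polynomial] = [
    [(0.5, (0, 0)), (0.5, (1, 1))],
    [(0.25, (0, 0)), (0.75, (0, 2))],
]

if __name__ == "__main__":
    print(kleene_lfp(example_system))
```

For this example the second equation has roots 1/3 and 1, and the iteration approaches the smaller root 1/3; substituting it into the first equation gives x0 = 0.6.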
Structure preserving transformations on non-left-recursive grammars
We will be concerned with grammar covers. The first part of this paper presents a general framework for covers. The second part introduces a transformation from non-left-recursive grammars to grammars in Greibach normal form. An investigation of the structure-preserving properties of this transformation, which also serves as an illustration of our framework for covers, is presented.
From left-regular to Greibach normal form grammars
Each context-free grammar can be transformed to a context-free grammar in Greibach normal form, that is, a context-free grammar where each right-hand side of a production begins with a terminal symbol and the remainder of the right-hand side consists of nonterminal symbols. In this short paper we show that for a left-regular grammar G we can obtain a right-regular grammar G' (which is by definition in Greibach normal form) which left-to-right covers G (in this case left parses of G' can be mapped by a homomorphism onto right parses of G). Moreover, it is possible to obtain a context-free grammar G'' in Greibach normal form which right covers the left-regular grammar G (in this case right parses of G'' are mapped onto right parses of G).
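The definition of Greibach normal form given above can be checked mechanically. The following is a small sketch under an assumed grammar encoding (a dict from nonterminal to a list of right-hand sides, each a list of symbols); the function name `is_greibach` and the example grammars are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set

Grammar = Dict[str, List[List[str]]]


def is_greibach(productions: Grammar, nonterminals: Set[str]) -> bool:
    """Return True if every right-hand side is one terminal followed by
    zero or more nonterminals."""
    for alternatives in productions.values():
        for rhs in alternatives:
            if not rhs:
                return False  # empty right-hand sides are not allowed here
            head, tail = rhs[0], rhs[1:]
            if head in nonterminals:
                return False  # must begin with a terminal
            if any(symbol not in nonterminals for symbol in tail):
                return False  # the remainder must be nonterminals only
    return True


# A right-regular grammar (S -> a S | b) satisfies the definition.
right_regular: Grammar = {"S": [["a", "S"], ["b"]]}

# A left-regular grammar (S -> S a | b) does not: "S a" begins with a nonterminal.
left_regular: Grammar = {"S": [["S", "a"], ["b"]]}

if __name__ == "__main__":
    print(is_greibach(right_regular, {"S"}))
    print(is_greibach(left_regular, {"S"}))
```

The check only tests the shape of the productions; it says nothing about whether one grammar covers another.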
Genomics and proteomics: a signal processor's tour
The theory and methods of signal processing are becoming increasingly important in molecular biology. Digital filtering techniques, transform domain methods, and Markov models have played important roles in gene identification, biological sequence analysis, and alignment. This paper contains a brief review of molecular biology, followed by a review of the applications of signal processing theory. This includes the problem of gene finding using digital filtering, and the use of transform domain methods in the study of protein binding spots. The relatively new topic of noncoding genes, and the associated problem of identifying ncRNA buried in DNA sequences, are also described. This includes a discussion of hidden Markov models and context-free grammars. Several new directions in genomic signal processing are briefly outlined at the end.
Grammars with two-sided contexts
In a recent paper (M. Barash, A. Okhotin, "Defining contexts in context-free
grammars", LATA 2012), the authors introduced an extension of the context-free
grammars equipped with an operator for referring to the left context of the
substring being defined. This paper proposes a more general model, in which
context specifications may be two-sided, that is, both the left and the right
contexts can be specified by the corresponding operators. The paper gives the
definitions and establishes the basic theory of such grammars, leading to a
normal form and a parsing algorithm working in time O(n^4), where n is the
length of the input string. Comment: In Proceedings AFL 2014, arXiv:1405.527
Type-driven semantic interpretation and feature dependencies in R-LFG
Once one has enriched LFG's formal machinery with the linear logic mechanisms
needed for semantic interpretation as proposed by Dalrymple et al., it is
natural to ask whether these make any existing components of LFG redundant. As
Dalrymple and her colleagues note, LFG's f-structure completeness and coherence
constraints fall out as a by-product of the linear logic machinery they propose
for semantic interpretation, thus making those f-structure mechanisms
redundant. Given that linear logic machinery or something like it is
independently needed for semantic interpretation, it seems reasonable to
explore the extent to which it is capable of handling feature structure
constraints as well.
R-LFG represents the extreme position that all linguistically required
feature structure dependencies can be captured by the resource-accounting
machinery of a linear or similar logic independently needed for semantic
interpretation, making LFG's unification machinery redundant. The goal is to
show that LFG linguistic analyses can be expressed as clearly and perspicuously
using the smaller set of mechanisms of R-LFG as they can using the much larger
set of unification-based mechanisms in LFG: if this is the case then we will
have shown that positing these extra f-structure mechanisms is not
linguistically warranted. Comment: 30 pages, to appear in the "Glue Language" volume edited by
Dalrymple, uses tree-dvips, ipa, epic, eepic, fullnam
Certified Context-Free Parsing: A formalisation of Valiant's Algorithm in Agda
Valiant (1975) developed an algorithm for recognition of context-free
languages. As of today, it remains the algorithm with the best asymptotic
complexity for this purpose. In this paper, we present an algebraic
specification, implementation, and proof of correctness of a generalisation of
Valiant's algorithm. The generalisation can be used for recognition, parsing or
generic calculation of the transitive closure of upper triangular matrices. The
proof is certified by the Agda proof assistant. The certification is
representative of state-of-the-art methods for specification and proofs in
proof assistants based on type-theory. As such, this paper can be read as a
tutorial for the Agda system.
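The view of recognition as a transitive closure of an upper triangular matrix can be illustrated with a naive sketch. Entry (i, j) of the matrix holds the nonterminals that derive the substring from position i to j, and the closure is computed by repeating a grammar-driven matrix product until nothing changes. This is the simple fixed-point version, not Valiant's divide-and-conquer reduction to fast matrix multiplication and not the Agda development; the grammar encoding (Chomsky normal form rules as tuples) and all function names are assumptions of this sketch.

```python
from typing import List, Set, Tuple

Matrix = List[List[Set[str]]]
BinaryRule = Tuple[str, str, str]   # (A, B, C) encodes A -> B C
TerminalRule = Tuple[str, str]      # (A, a) encodes A -> a


def multiply(x: Matrix, y: Matrix, rules: List[BinaryRule]) -> Matrix:
    """Grammar-driven product: A is in out[i][j] if A -> B C with
    B in x[i][k] and C in y[k][j] for some split point k."""
    n = len(x)
    out: Matrix = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            if not x[i][k]:
                continue
            for j in range(k + 1, n):
                for a, b, c in rules:
                    if b in x[i][k] and c in y[k][j]:
                        out[i][j].add(a)
    return out


def closure(t: Matrix, rules: List[BinaryRule]) -> Matrix:
    """Repeat t <- t union (t * t) until a fixed point is reached."""
    n = len(t)
    changed = True
    while changed:
        changed = False
        product = multiply(t, t, rules)
        for i in range(n):
            for j in range(n):
                new = product[i][j] - t[i][j]
                if new:
                    t[i][j] |= new
                    changed = True
    return t


def recognizes(word: str, start: str, binary: List[BinaryRule],
               terminal: List[TerminalRule]) -> bool:
    n = len(word)
    t: Matrix = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, ch in enumerate(word):
        t[i][i + 1] = {a for a, symbol in terminal if symbol == ch}
    closure(t, binary)
    return start in t[0][n]


# Hypothetical grammar for { a^n b^n : n >= 1 } in Chomsky normal form.
BINARY = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")]
TERMINAL = [("A", "a"), ("B", "b")]

if __name__ == "__main__":
    for w in ["ab", "aabb", "aab", "abab"]:
        print(w, recognizes(w, "S", BINARY, TERMINAL))
```

Only entries above the diagonal are ever filled, which is why the matrix is upper triangular.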
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs. Comment: 45 pages. Slightly shortened version to appear in Computational
Linguistics 2