11,124 research outputs found

    Splittability of bilexical context-free grammars is undecidable

    Get PDF
    Bilexical context-free grammars (2-LCFGs) have proved to be accurate models for statistical natural language parsing. Existing dynamic programming algorithms used to parse sentences under these models have running time of O(|w|^4), where w is the input string. A 2-LCFG is splittable if the left arguments of a lexical head are always independent of the right arguments, and vice versa. When a 2-LCFGs is splittable, parsing time can be asymptotically improved to O(|w|^3). Testing this propertyis therefore of central interest to parsing efficiency. In this article, however, we show the negative result that splittability of 2-LCFGs is undecidable.Publisher PDFPeer reviewe

    Multi-Head Finite Automata: Characterizations, Concepts and Open Problems

    Full text link
    Multi-head finite automata were introduced in (Rabin, 1964) and (Rosenberg, 1966). Since that time, a vast literature on computational and descriptional complexity issues on multi-head finite automata documenting the importance of these devices has been developed. Although multi-head finite automata are a simple concept, their computational behavior can be already very complex and leads to undecidable or even non-semi-decidable problems on these devices such as, for example, emptiness, finiteness, universality, equivalence, etc. These strong negative results trigger the study of subclasses and alternative characterizations of multi-head finite automata for a better understanding of the nature of non-recursive trade-offs and, thus, the borderline between decidable and undecidable problems. In the present paper, we tour a fragment of this literature

    Digraph Complexity Measures and Applications in Formal Language Theory

    Full text link
    We investigate structural complexity measures on digraphs, in particular the cycle rank. This concept is intimately related to a classical topic in formal language theory, namely the star height of regular languages. We explore this connection, and obtain several new algorithmic insights regarding both cycle rank and star height. Among other results, we show that computing the cycle rank is NP-complete, even for sparse digraphs of maximum outdegree 2. Notwithstanding, we provide both a polynomial-time approximation algorithm and an exponential-time exact algorithm for this problem. The former algorithm yields an O((log n)^(3/2))- approximation in polynomial time, whereas the latter yields the optimum solution, and runs in time and space O*(1.9129^n) on digraphs of maximum outdegree at most two. Regarding the star height problem, we identify a subclass of the regular languages for which we can precisely determine the computational complexity of the star height problem. Namely, the star height problem for bideterministic languages is NP-complete, and this holds already for binary alphabets. Then we translate the algorithmic results concerning cycle rank to the bideterministic star height problem, thus giving a polynomial-time approximation as well as a reasonably fast exact exponential algorithm for bideterministic star height.Comment: 19 pages, 1 figur

    Phrase structure grammars as indicative of uniquely human thoughts

    Get PDF
    I argue that the ability to compute phrase structure grammars is indicative of a particular kind of thought. This type of thought that is only available to cognitive systems that have access to the computations that allow the generation and interpretation of the structural descriptions of phrase structure grammars. The study of phrase structure grammars, and formal language theory in general, is thus indispensable to studies of human cognition, for it makes explicit both the unique type of human thought and the underlying mechanisms in virtue of which this thought is made possible

    Concepts of structural underspecification in Bantu and Romance

    Get PDF

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Full text link
    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences.Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biolog

    VP-fronting in Czech and Polish : a case study in corpus-oriented grammar research

    Get PDF
    Fronting of an infinite VP across a finite main verb - akin to German "VP-topicalization" - can be found also in Czech and Polish. The paper discusses evidence from large corpora for this process and some of its properties, both syntactic and information-structural. Based on this case, criteria for more user-friedly searching and retrieval of corpus data in syntactic research are being developed
    corecore