
    Cooperating Distributed Grammar Systems of Finite Index Working in Hybrid Modes

    We study cooperating distributed grammar systems working in hybrid modes in connection with the finite index restriction in two different ways: firstly, we investigate cooperating distributed grammar systems working in hybrid modes which characterize programmed grammars with the finite index restriction; looking at the number of components of such systems, we obtain surprisingly rich lattice structures for the inclusion relations between the corresponding language families. Secondly, we impose the finite index restriction on cooperating distributed grammar systems working in hybrid modes themselves, which leads us to new characterizations of programmed grammars of finite index.
    Comment: In Proceedings AFL 2014, arXiv:1405.527

    An approach to computing downward closures

    The downward closure of a word language is the set of all (not necessarily contiguous) subwords of its members. It is well-known that the downward closure of any language is regular. While the downward closure appears to be a powerful abstraction, algorithms for computing a finite automaton for the downward closure of a given language have been established only for a few language classes. This work presents a simple general method for computing downward closures. For language classes that are closed under rational transductions, it is shown that the computation of downward closures can be reduced to checking a certain unboundedness property. This result is used to prove that downward closures are computable for (i) every language class with effectively semilinear Parikh images that is closed under rational transductions, (ii) matrix languages, and (iii) indexed languages (equivalently, languages accepted by higher-order pushdown automata of order 2).
    Comment: Full version of contribution to ICALP 2015. Comments welcome.
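    As a concrete illustration of the definition above (not of the paper's algorithm), the following Python sketch enumerates the downward closure of a single word by brute force. The function name is ours, and brute-force enumeration only makes sense for finite languages; the paper's contribution is computing a finite automaton for the closure of an infinite language.

        from itertools import combinations

        def downward_closure(word: str) -> set[str]:
            # All scattered (not necessarily contiguous) subwords of `word`,
            # including the empty word.
            return {
                "".join(word[i] for i in idx)
                for r in range(len(word) + 1)
                for idx in combinations(range(len(word)), r)
            }

        print(sorted(downward_closure("aba")))
        # ['', 'a', 'aa', 'ab', 'aba', 'b', 'ba']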

    Computation of distances for regular and context-free probabilistic languages

    Several mathematical distances between probabilistic languages have been investigated in the literature, motivated by applications in language modeling, computational biology, syntactic pattern matching and machine learning. In most cases, only pairs of probabilistic regular languages were considered. In this paper we extend the previous results to pairs of languages generated by a probabilistic context-free grammar and a probabilistic finite automaton.
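    To make "distance between probabilistic languages" concrete, here is a minimal Python sketch (ours, not the paper's construction) of one such distance, the L2 distance, for two probabilistic languages given as explicit finite distributions over strings. The paper works directly with grammars and automata rather than enumerated supports.

        import math

        def l2_distance(p: dict[str, float], q: dict[str, float]) -> float:
            # Euclidean (L2) distance between two string distributions
            # given with explicit, finite support.
            support = set(p) | set(q)
            return math.sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2
                                 for w in support))

        p = {"ab": 0.5, "abb": 0.5}
        q = {"ab": 0.25, "abb": 0.25, "abbb": 0.5}
        print(l2_distance(p, q))  # sqrt(0.0625 + 0.0625 + 0.25) ~ 0.612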

    Criticality in Formal Languages and Statistical Physics

    We show that the mutual information between two symbols, as a function of the number of symbols between the two, decays exponentially in any probabilistic regular grammar, but can decay like a power law for a context-free grammar. This result about formal languages is closely related to a well-known result in classical statistical mechanics that there are no phase transitions in dimensions fewer than two. It is also related to the emergence of power-law correlations in turbulence and cosmological inflation through recursive generative processes. We elucidate these physics connections and comment on potential applications of our results to machine learning tasks like training artificial recurrent neural networks. Along the way, we introduce a useful quantity which we dub the rational mutual information and discuss generalizations of our claims involving more complicated Bayesian networks.
    Comment: Replaced to match final published version. Discussion improved, references added.
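    The regular-grammar half of the claim can be checked numerically: a stationary first-order Markov chain is a simple probabilistic regular process, and the mutual information I(X_0; X_d) computed from its transition matrix decays exponentially in d. The sketch below is ours, not the paper's code, and the function name is an assumption.

        import numpy as np

        def mutual_information_at_distance(T: np.ndarray, d: int) -> float:
            # I(X_0; X_d) in bits for a stationary first-order Markov chain
            # with row-stochastic transition matrix T.
            vals, vecs = np.linalg.eig(T.T)
            pi = np.real(vecs[:, np.argmax(np.real(vals))])
            pi = pi / pi.sum()                   # stationary distribution
            joint = pi[:, None] * np.linalg.matrix_power(T, d)  # P(X_0=a, X_d=b)
            indep = np.outer(pi, pi)             # P(X_0=a) * P(X_d=b)
            mask = joint > 0
            return float(np.sum(joint[mask] * np.log2(joint[mask] / indep[mask])))

        T = np.array([[0.9, 0.1],
                      [0.2, 0.8]])
        for d in (1, 2, 4, 8, 16):
            print(d, mutual_information_at_distance(T, d))  # falls off exponentially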

    Descriptional Complexity of Three-Nonterminal Scattered Context Grammars: An Improvement

    Recently, it has been shown that every recursively enumerable language can be generated by a scattered context grammar with no more than three nonterminals. However, in that construction, the maximal number of nonterminals simultaneously rewritten during a derivation step depends on many factors, such as the cardinality of the alphabet of the generated language and the structure of the generated language itself. This paper improves the result by showing that the maximal number of nonterminals simultaneously rewritten during any derivation step can be limited by a small constant regardless of other factors.
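    For readers unfamiliar with the formalism: a scattered context rule (A1, ..., An) -> (w1, ..., wn) rewrites n nonterminals in a single derivation step, in the given order but not necessarily adjacent. The Python sketch below is ours; it always picks the leftmost applicable occurrences, whereas the general formalism permits any order-respecting choice, and it assumes single-character nonterminals.

        def apply_scattered_rule(sentential_form: str, rule):
            # One derivation step for a scattered context rule
            # (A1, ..., An) -> (w1, ..., wn): rewrite occurrences of
            # A1, ..., An appearing in this order, left to right.
            lhs, rhs = rule
            out, pos = list(sentential_form), 0
            for A, w in zip(lhs, rhs):
                while pos < len(out) and out[pos] != A:
                    pos += 1
                if pos == len(out):
                    return None  # the rule is not applicable
                out[pos] = w
                pos += 1
            return "".join(out)

        # The rule (A, B) -> (aA, bB) rewrites two nonterminals in one step.
        print(apply_scattered_rule("AcB", (("A", "B"), ("aA", "bB"))))  # aAcbB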

    Flexibility and Interaction at a Distance: A Mixed-Model Environment For Language Learning

    This article reports on the process of design and development of two language courses for university students at beginning levels of competence. Following a preliminary experience in a low-tech environment for distance language learning and teaching, and a thorough review of the available literature, we identified two major challenges that would need to be addressed in our design: (1) a necessity to build sufficient flexibility into the materials to cater to a variety of learners' styles, interests and skill levels, thereby sustaining learners' motivation; and (2) a need to design materials that would present the necessary requisites of authenticity and interactivity identified in the examined literature, in spite of the reduced opportunities for face-to-face communication. In response to these considerations, we designed and developed learning materials and tasks to be distributed on CD-ROM, complemented by a WebCT component for added interactivity and task authenticity. Although only part of the original design was implemented, and further research is needed to assess the impact of our environment on learning outcomes, the results of preliminary evaluations are encouraging.

    The Unsupervised Acquisition of a Lexicon from Continuous Speech

    We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.
    Comment: 27-page technical report
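    The paper's full system learns the lexicon and the encoding jointly; as a much smaller illustration of the decoding half, the Python sketch below segments a character string into lexicon entries by minimizing total code length (-log2 of each entry's probability), a Viterbi-style dynamic program. The lexicon, probabilities, and names are assumptions for illustration only.

        import math

        def segment(text: str, lexicon: dict[str, float]) -> list[str]:
            # Split `text` into lexicon entries so the total code length,
            # -log2(probability) bits per entry, is minimal. Assumes `text`
            # is segmentable with the given lexicon.
            n = len(text)
            best = [math.inf] * (n + 1)  # best[i]: cheapest code for text[:i]
            back = [-1] * (n + 1)
            best[0] = 0.0
            for i in range(1, n + 1):
                for j in range(i):
                    w = text[j:i]
                    if w in lexicon:
                        cost = best[j] - math.log2(lexicon[w])
                        if cost < best[i]:
                            best[i], back[i] = cost, j
            words, i = [], n
            while i > 0:
                words.append(text[back[i]:i])
                i = back[i]
            return words[::-1]

        lex = {"the": 0.3, "red": 0.2, "there": 0.1, "d": 0.05, "dog": 0.2}
        print(segment("thereddog", lex))  # ['the', 'red', 'dog']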