Cooperating Distributed Grammar Systems of Finite Index Working in Hybrid Modes
We study cooperating distributed grammar systems working in hybrid modes in
connection with the finite index restriction in two different ways: firstly, we
investigate cooperating distributed grammar systems working in hybrid modes
which characterize programmed grammars with the finite index restriction;
looking at the number of components of such systems, we obtain surprisingly
rich lattice structures for the inclusion relations between the corresponding
language families. Secondly, we impose the finite index restriction on
cooperating distributed grammar systems working in hybrid modes themselves,
which leads us to new characterizations of programmed grammars of finite index.
Comment: In Proceedings AFL 2014, arXiv:1405.527
An approach to computing downward closures
The downward closure of a word language is the set of all (not necessarily
contiguous) subwords of its members. It is well known that the downward closure
of any language is regular. While the downward closure appears to be a powerful
abstraction, algorithms for computing a finite automaton for the downward
closure of a given language have been established only for a few language
classes.
This work presents a simple general method for computing downward closures.
For language classes that are closed under rational transductions, it is shown
that the computation of downward closures can be reduced to checking a certain
unboundedness property.
This result is used to prove that downward closures are computable for (i)
every language class that has effectively semilinear Parikh images and is closed
under rational transductions, (ii) matrix languages, and (iii) indexed
languages (equivalently, languages accepted by higher-order pushdown automata
of order 2).
Comment: Full version of contribution to ICALP 2015. Comments welcome
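As an illustration of the definition only, the downward closure of a finite language can be enumerated by brute force. This is a minimal sketch, not the method of the paper, which handles infinite languages by reduction to an unboundedness check; the function names `subwords` and `downward_closure` are hypothetical.

```python
from itertools import combinations


def subwords(word):
    """Return all (not necessarily contiguous) subwords of `word`.

    Includes the empty word and `word` itself. Exponential in len(word),
    so this is only suitable for short illustrative inputs.
    """
    positions = range(len(word))
    return {
        "".join(word[i] for i in chosen)
        for k in range(len(word) + 1)
        for chosen in combinations(positions, k)
    }


def downward_closure(language):
    """Union of the subword sets of every word in a FINITE language."""
    closure = set()
    for word in language:
        closure |= subwords(word)
    return closure
```

For example, `downward_closure({"ab", "ba"})` yields the empty word together with `a`, `b`, `ab` and `ba`.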
Computation of distances for regular and context-free probabilistic languages
Several mathematical distances between probabilistic languages have been investigated in the literature, motivated by applications in language modeling, computational biology, syntactic pattern matching and machine learning. In most cases, only pairs of probabilistic regular languages were considered. In this paper we extend the previous results to pairs of languages generated by a probabilistic context-free grammar and a probabilistic finite automaton.
Postprint. Peer reviewed
Criticality in Formal Languages and Statistical Physics
We show that the mutual information between two symbols, as a function of the
number of symbols between the two, decays exponentially in any probabilistic
regular grammar, but can decay like a power law for a context-free grammar.
This result about formal languages is closely related to a well-known result in
classical statistical mechanics that there are no phase transitions in
dimensions fewer than two. It is also related to the emergence of power-law
correlations in turbulence and cosmological inflation through recursive
generative processes. We elucidate these physics connections and comment on
potential applications of our results to machine learning tasks like training
artificial recurrent neural networks. Along the way, we introduce a useful
quantity which we dub the rational mutual information and discuss
generalizations of our claims involving more complicated Bayesian networks.
Comment: Replaced to match final published version. Discussion improved,
references added
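The quantity discussed in this abstract, the mutual information between two symbols as a function of their separation, can be estimated from a sample sequence. The sketch below uses a plain plug-in (frequency-count) estimator, which is an assumption made here for illustration; it is not the paper's "rational mutual information", and the function name `mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(seq, d):
    """Plug-in estimate (in bits) of I(X_i; X_{i+d}) from one sequence.

    `seq` is any indexable sequence of hashable symbols and `d` is the
    positive offset between the two positions. Probabilities are estimated
    by relative frequencies, so the estimate is biased for short sequences.
    """
    pairs = [(seq[i], seq[i + d]) for i in range(len(seq) - d)]
    n = len(pairs)
    if n == 0:
        raise ValueError("sequence too short for the requested separation")
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) p(b)) )
    return sum(
        (c / n) * log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )
```

Evaluating this over a range of `d` on text sampled from a grammar gives an empirical decay curve that can be compared against exponential or power-law behaviour.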
Descriptional Complexity of Three-Nonterminal Scattered Context Grammars: An Improvement
Recently, it has been shown that every recursively enumerable language can be
generated by a scattered context grammar with no more than three nonterminals.
However, in that construction, the maximal number of nonterminals
simultaneously rewritten during a derivation step depends on many factors, such
as the cardinality of the alphabet of the generated language and the structure
of the generated language itself. This paper improves the result by showing
that the maximal number of nonterminals simultaneously rewritten during any
derivation step can be limited by a small constant regardless of other factors.
Flexibility and Interaction at a Distance: A Mixed-Model Environment For Language Learning
This article reports on the process of design and development of two language courses for university students at beginning levels of competence. Following a preliminary experience in a low-tech environment for distance language learning and teaching, and a thorough review of the available literature, we identified two major challenges that would need to be addressed in our design:
(1) a necessity to build sufficient flexibility into the materials to cater to a variety of learners' styles, interests and skill levels, therefore sustaining learners' motivation; and
(2) a need to design materials that would present the necessary requisites of authenticity and interactivity identified in the examined literature, in spite of the reduced opportunities for face-to-face communication.
In response to these considerations, we designed and developed learning materials and tasks to be distributed on CD-ROM, complemented by a WebCT component for added interactivity and task authenticity. Although only part of the original design was implemented, and further research is needed to assess the impact of our environment on learning outcomes, the results of preliminary evaluations are encouraging.
The Unsupervised Acquisition of a Lexicon from Continuous Speech
We present an unsupervised learning algorithm that acquires a
natural-language lexicon from raw speech. The algorithm is based on the optimal
encoding of symbol sequences in an MDL framework, and uses a hierarchical
representation of language that overcomes many of the problems that have
stymied previous grammar-induction procedures. The forward mapping from symbol
sequences to the speech stream is modeled using features based on articulatory
gestures. We present results on the acquisition of lexicons and language models
from raw speech, text, and phonetic transcripts, and demonstrate that our
algorithm compares very favorably to other reported results with respect to
segmentation performance and statistical efficiency.
Comment: 27 page technical report