40,345 research outputs found
Weakly Restricted Stochastic Grammars
A new type of stochastic grammar is introduced for investigation: weakly restricted stochastic grammars. In this paper we will concentrate on the consistency problem. To find conditions for stochastic grammars to be consistent, the theory of multitype Galton-Watson branching processes and generating functions is of central importance.
The unrestricted stochastic grammar formalism generates the same class of languages as the weakly restricted formalism. The inside-outside algorithm is adapted for use with weakly restricted grammars
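The link between consistency and branching processes can be illustrated with a minimal sketch for an ordinary stochastic context-free grammar (not the weakly restricted formalism of the paper, whose details are not given here). It assumes the standard branching-process criterion: build the first-moment matrix whose entry (i, j) is the expected number of occurrences of nonterminal j produced by one rewrite of nonterminal i; a spectral radius below 1 is sufficient for consistency, and a value above 1 indicates inconsistency when every nonterminal is reachable and productive. The rule format and the function names `expectation_matrix` and `spectral_radius_estimate` are hypothetical, chosen for illustration.

```python
def expectation_matrix(rules, nonterminals):
    """First-moment matrix of a stochastic CFG.

    rules: list of (lhs, probability, rhs_symbols) tuples (hypothetical format).
    Entry [i][j] is the expected count of nonterminal j in one rewrite of i.
    """
    index = {nt: i for i, nt in enumerate(nonterminals)}
    n = len(nonterminals)
    m = [[0.0] * n for _ in range(n)]
    for lhs, prob, rhs in rules:
        for sym in rhs:
            if sym in index:  # terminals do not contribute
                m[index[lhs]][index[sym]] += prob
    return m


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_radius_estimate(m, squarings=6):
    """Estimate the spectral radius as ||M^k||^(1/k) with k = 2**squarings.

    Uses the maximum row-sum norm; by Gelfand's formula this converges to
    the spectral radius from above as k grows, so it is only an estimate.
    """
    power = m
    for _ in range(squarings):
        power = matmul(power, power)
    norm = max(sum(abs(x) for x in row) for row in power)
    return norm ** (1.0 / (2 ** squarings))


# Toy grammar: S -> S S with probability p, S -> a with probability 1 - p.
def toy_rules(p):
    return [("S", p, ["S", "S"]), ("S", 1.0 - p, ["a"])]


rho_low = spectral_radius_estimate(expectation_matrix(toy_rules(0.4), ["S"]))
rho_high = spectral_radius_estimate(expectation_matrix(toy_rules(0.7), ["S"]))
```

For the toy grammar the matrix is the single entry 2p, so the estimate is below 1 for p = 0.4 and above 1 for p = 0.7, matching the known behaviour of this grammar.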
Process Definition of Adhesive HLR Systems (Long Version)
Process models of graph transformation systems are based on the concept of occurrence grammars, which are a generalization of Petri net processes given by occurrence nets. Recently, subobject transformation systems were proposed as an abstract framework for occurrence grammars in adhesive categories, but they are restricted to monomorphic matches for transformation steps. In this paper we review the construction of STSs as processes for plain graph grammars and present an extension to weak adhesive HLR categories with non-monomorphic matching, such that e.g. attributed graph grammars are included
Sequential and asynchronous processes driven by stochastic or quantum grammars and their application to genomics: a survey
We present the formalism of sequential and asynchronous processes defined in
terms of random or quantum grammars and argue that these processes have
relevance in genomics. To make the article accessible to the
non-mathematicians, we keep the mathematical exposition as elementary as
possible, focusing on some general ideas behind the formalism and stating the
implications of the known mathematical results. We close with a set of open
challenging problems. Comment: Presented at the European Congress on Mathematical and Theoretical
Biology, Dresden 18--22 July 200
Exploiting lattice structures in shape grammar implementations
The ability to work with ambiguity and compute new designs based on both defined and emergent shapes are unique advantages of shape grammars. Realizing these benefits in design practice requires the implementation of general purpose shape grammar interpreters that support: (a) the detection of arbitrary subshapes in arbitrary shapes and (b) the application of shape rules that use these subshapes to create new shapes. The complexity of currently available interpreters results from their combination of shape computation (for subshape detection and the application of rules) with computational geometry (for the geometric operations needed to generate new shapes). This paper proposes a shape grammar implementation method for three-dimensional circular arcs represented as rational quadratic Bézier curves based on lattice theory that reduces this complexity by separating steps in a shape computation process from the geometrical operations associated with specific grammars and shapes. The method is demonstrated through application to two well-known shape grammars: Stiny's triangles grammar and Jowers and Earl's trefoil grammar. A prototype computer implementation of an interpreter kernel has been built and its application to both grammars is presented. The use of Bézier curves in three dimensions opens the possibility to extend shape grammar implementations to cover the wider range of applications that are needed before practical implementations for use in real life product design and development processes become feasible
Assumptions behind grammatical approaches to code-switching: when the blueprint is a red herring
Many of the so-called ‘grammars’ of code-switching are based on various underlying assumptions, e.g. that informal speech can be adequately or appropriately described in terms of ‘grammar’; that deep, rather than surface, structures are involved in code-switching; that one ‘language’ is the ‘base’ or ‘matrix’; and that constraints derived from existing data are universal and predictive. We question these assumptions on several grounds. First, ‘grammar’ is arguably distinct from the processes driving speech production. Second, the role of grammar is mediated by the variable, poly-idiolectal repertoires of bilingual speakers. Third, in many instances of CS the notion of a ‘base’ system is either irrelevant, or fails to explain the facts. Fourth, sociolinguistic factors frequently override ‘grammatical’ factors, as evidence from the same language pairs in different settings has shown. No principles proposed to date account for all the facts, and it seems unlikely that ‘grammar’, as conventionally conceived, can provide definitive answers. We conclude that rather than seeking universal, predictive grammatical rules, research on CS should focus on the variability of bilingual grammars
Phrase structure grammars as indicative of uniquely human thoughts
I argue that the ability to compute phrase structure grammars is indicative of a particular kind of thought. This type of thought is only available to cognitive systems that have access to the computations that allow the generation and interpretation of the structural descriptions of phrase structure grammars. The study of phrase structure grammars, and formal language theory in general, is thus indispensable to studies of human cognition, for it makes explicit both the unique type of human thought and the underlying mechanisms in virtue of which this thought is made possible
Toric grammars: a new statistical approach to natural language modeling
We propose a new statistical model for computational linguistics. Rather than
trying to estimate directly the probability distribution of a random sentence
of the language, we define a Markov chain on finite sets of sentences with many
finite recurrent communicating classes and define our language model as the
invariant probability measures of the chain on each recurrent communicating
class. This Markov chain, which we call a communication model, recombines at
each step randomly the set of sentences forming its current state, using some
grammar rules. When the grammar rules are fixed and known in advance instead of
being estimated on the fly, we can prove supplementary mathematical properties.
In particular, we can prove in this case that all states are recurrent states,
so that the chain defines a partition of its state space into finite recurrent
communicating classes. We show that our approach is a decisive departure from
Markov models at the sentence level and discuss its relationships with Context
Free Grammars. Although the toric grammars we use are closely related to
Context Free Grammars, the way we generate the language from the grammar is
qualitatively different. Our communication model has two purposes. On the one
hand, it is used to define indirectly the probability distribution of a random
sentence of the language. On the other hand it can serve as a (crude) model of
language transmission from one speaker to another speaker through the
communication of a (large) set of sentences
A Note on Zipf's Law, Natural Languages, and Noncoding DNA regions
In Phys. Rev. Letters (73:2, 5 Dec. 94), Mantegna et al. conclude on the
basis of Zipf rank frequency data that noncoding DNA sequence regions are more
like natural languages than coding regions. We argue on the contrary that an
empirical fit to Zipf's ``law'' cannot be used as a criterion for similarity to
natural languages. Although DNA is presumably an ``organized system of
signs'' in Mandelbrot's (1961) sense, an observation of statistical features of
the sort presented in the Mantegna et al. paper does not shed light on the
similarity between DNA's ``grammar'' and natural language grammars, just as the
observation of exact Zipf-like behavior cannot distinguish between the
underlying processes of tossing an sided die or a finite-state branching
process. Comment: compressed uuencoded postscript file: 14 page
On the Vocabulary of Grammar-Based Codes and the Logical Consistency of Texts
The article presents a new interpretation for Zipf-Mandelbrot's law in
natural language which rests on two areas of information theory. Firstly, we
construct a new class of grammar-based codes and, secondly, we investigate
properties of strongly nonergodic stationary processes. The motivation for the
joint discussion is to prove a proposition with a simple informal statement: If
a text of length describes independent facts in a repetitive way
then the text contains at least different words, under
suitable conditions on . In the formal statement, two modeling postulates
are adopted. Firstly, the words are understood as nonterminal symbols of the
shortest grammar-based encoding of the text. Secondly, the text is assumed to
be emitted by a finite-energy strongly nonergodic source whereas the facts are
binary IID variables predictable in a shift-invariant way. Comment: 24 pages, no figure