Merging two Hierarchies of Internal Contextual Grammars with Subregular Selection
In this paper, we continue the research on the power of contextual grammars
with selection languages from subfamilies of the family of regular languages.
In the past, two independent hierarchies have been obtained for external and
internal contextual grammars, one based on selection languages defined by
structural properties (finite, monoidal, nilpotent, combinational, definite,
ordered, non-counting, power-separating, suffix-closed, commutative, circular,
or union-free languages), the other one based on selection languages defined by
resources (number of non-terminal symbols, production rules, or states needed
for generating or accepting them). In a previous paper, the language families
of these hierarchies for external contextual grammars were compared and the
hierarchies merged. In the present paper, we compare the language families of
these hierarchies for internal contextual grammars and merge these hierarchies.
Comment: In Proceedings NCMA 2023, arXiv:2309.07333. arXiv admin note: text
overlap with arXiv:2309.02768, arXiv:2208.1472
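The derivation relation underlying the abstract above can be sketched concretely: in an internal contextual grammar, a word x1 x2 x3 derives x1 u x2 v x3 whenever the inner factor x2 belongs to the selection language associated with the context (u, v). The sketch below is illustrative only (function names are not from the paper) and assumes a single context pair whose regular selection language is given as a regex:

```python
import re

def internal_derivation_steps(word, u, v, selector_regex):
    """Yield every word reachable in one internal derivation step:
    word = x1 x2 x3 with x2 in the selection language gives x1 u x2 v x3."""
    sel = re.compile(selector_regex)
    results = set()
    for i in range(len(word) + 1):
        for j in range(i, len(word) + 1):
            inner = word[i:j]
            if sel.fullmatch(inner):  # the inner factor x2 must lie in the selector
                results.add(word[:i] + u + inner + v + word[j:])
    return results

# One derivation step on "ab" with context (a, b) selected by the
# regular language a*b*:
print(sorted(internal_derivation_steps("ab", "a", "b", "a*b*")))  # → ['aabb', 'abab']
```

Restricting `selector_regex` to a subregular family (finite, definite, combinational, etc.) is exactly the knob the hierarchies in the abstract vary.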
Semi-bracketed contextual grammars
Bracketed and fully bracketed contextual grammars were introduced to bring the concept of a tree structure to strings by associating a pair of parentheses with the contexts adjoined in a derivation. In this paper, we show that these grammars fail to generate all the basic non-context-free languages and thus cannot serve as a syntactical model for natural languages. To overcome this failure, we introduce a new class of fully bracketed contextual grammars, called semi-bracketed contextual grammars, in which the selectors can also be non-minimally Dyck-covered languages. We show that the tree structure of the derived strings is still preserved in this variant. When this new grammar is combined with the maximality feature, the generative power of these grammars increases to the extent of covering the family of context-free languages and some basic non-context-free languages, thus possessing many properties of the so-called `MCS formalism'.
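The selectors in these grammars are constrained via Dyck languages (well-balanced bracket strings). As a minimal point of reference, membership in the Dyck language over a single bracket pair can be checked in linear time; the sketch below illustrates only plain Dyck membership, not the "non-minimally Dyck covered" condition defined in the paper:

```python
def is_dyck(s, open_b="[", close_b="]"):
    """Check membership in the Dyck language over one bracket pair:
    no prefix has more closing than opening brackets, and the whole
    string is balanced."""
    depth = 0
    for ch in s:
        if ch == open_b:
            depth += 1
        elif ch == close_b:
            depth -= 1
            if depth < 0:  # a close with no matching open
                return False
    return depth == 0

print(is_dyck("[[][]]"))  # → True
print(is_dyck("[]]["))    # → False
```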
A new automaton for parsing semi-bracketed contextual grammars
Bracketed and fully bracketed contextual grammars were introduced to bring the concept of a tree structure to strings by associating a pair of parentheses with the contexts adjoined in a derivation. However, these grammars fail to generate the basic non-context-free languages and thus cannot provide a syntactical representation of natural languages. To overcome this problem, a new variant called semi-bracketed contextual grammars was introduced recently, in which the selectors can also be non-minimally Dyck-covered strings. The membership problem for the new variant was left open. In this paper, we propose a parsing algorithm (for non-projected strings) for maximal semi-bracketed contextual grammars. In the process, we introduce a new automaton called the k-queue Self Modifying Weighted Automaton (k-quSMWA).
Latent-Variable PCFGs: Background and Applications
Latent-variable probabilistic context-free grammars are
latent-variable models that are based on context-free grammars.
Nonterminals are associated with latent states that provide
contextual information during the top-down rewriting process of
the grammar.
We survey a few of the techniques used to estimate such grammars
and to parse text with them. We also describe what the latent
states represent in English Penn Treebank parsing, and give an
overview of extensions of these grammars and related models.
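The refinement described above, where each nonterminal is paired with a latent state and rule probabilities are defined over the refined symbols, can be illustrated with a toy fragment. All symbols and probabilities below are made up for illustration; summing out the latent states recovers an ordinary PCFG rule probability:

```python
# Illustrative latent-variable PCFG fragment (all numbers invented):
# each nonterminal X is refined into latent states X[0], X[1], and
# binary-rule probabilities are defined over the refined symbols.
rule_prob = {
    # (parent, state) -> {((child1, s1), (child2, s2)): probability}
    ("NP", 0): {(("DT", 0), ("NN", 0)): 0.7, (("DT", 0), ("NN", 1)): 0.3},
    ("NP", 1): {(("DT", 1), ("NN", 0)): 0.4, (("DT", 1), ("NN", 1)): 0.6},
}

def marginal_rule_prob(parent, children, parent_dist):
    """Sum out the latent states to get the probability that `parent`
    rewrites to `children`, given a distribution over the parent's
    latent state."""
    total = 0.0
    for h, p_h in parent_dist.items():
        for kids, p in rule_prob.get((parent, h), {}).items():
            if tuple(sym for sym, _ in kids) == tuple(children):
                total += p_h * p
    return total

# With a uniform parent-state distribution, the marginal probability of
# NP -> DT NN in this toy fragment:
print(marginal_rule_prob("NP", ["DT", "NN"], {0: 0.5, 1: 0.5}))  # → 1.0
```

Estimation methods surveyed in such work (EM over latent annotations, spectral methods) differ in how they fit tables like `rule_prob`; the marginalization step itself is common to all of them.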
On external presentations of infinite graphs
The vertices of a finite state system are usually a subset of the natural
numbers. Most algorithms over these systems use this fact only to select
vertices.
For infinite state systems, however, the situation is different: in
particular, for such systems with a finite description, each state of the
system is a configuration of some machine, and most algorithmic approaches
rely on the structure of these configurations. Such characterisations are
called internal. In order to apply an algorithm that detects a structural
property (like identifying connected components), one may first have to
transform the system to fit the description the algorithm requires. The
problem with internal characterisations is that they hide structural
properties, and each solution becomes ad hoc relative to the form of the
configurations.
By contrast, external characterisations avoid explicit naming of the
vertices. Such characterisations are mostly defined via graph
transformations. In this paper we present two kinds of external
characterisation: deterministic graph rewriting, which characterises
regular graphs, deterministic context-free languages, and rational graphs;
and inverse substitution from a generator (like the complete binary tree),
which provides characterisations of prefix-recognizable graphs, the Caucal
Hierarchy, and rational graphs. We illustrate how these characterisations
provide an efficient tool for the representation of infinite state systems.
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDEs (accurate syntax highlighting and on-the-fly error detection).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notions of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.
Comment: master thesis
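The linear-time top-down idea mentioned above can be illustrated with memoized recursive descent (packrat-style parsing), where caching each (rule, position) result bounds the work per input position. This is only a generic sketch under that assumption, not Amethyst's structured-grammar machinery, and the grammar here is a hypothetical balanced-parenthesis language S -> '(' S ')' S | ε:

```python
from functools import lru_cache

def parse_balanced(text):
    """Memoized top-down recognizer for balanced parentheses.
    S(pos) returns the position reached after greedily matching S,
    falling back to the empty alternative on failure."""
    @lru_cache(maxsize=None)
    def S(pos):
        if pos < len(text) and text[pos] == "(":
            inner = S(pos + 1)
            if inner < len(text) and text[inner] == ")":
                return S(inner + 1)
            return pos  # unmatched '(' : fall back to the empty alternative
        return pos      # empty alternative
    return S(0) == len(text)

print(parse_balanced("(()())"))  # → True
print(parse_balanced("(()"))     # → False
```

Because `S` is cached per position and the input is scanned left to right, each position is expanded at most once, which is the memoization argument behind linear-time packrat parsers.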
Structural selection in implicit learning of artificial grammars
In the contextual cueing paradigm, Endo and Takeda (in Percept Psychophys 66:293–302, 2004) provided evidence that implicit learning involves selection of the aspect of a structure that is most useful to one’s task. The present study attempted to replicate this finding in artificial grammar learning to investigate whether or not implicit learning commonly involves such a selection. Participants in Experiment 1 were presented with an induction task that could be facilitated by several characteristics of the exemplars. For some participants, those characteristics included a perfectly predictive feature. The results suggested that the aspect of the structure that was most useful to the induction task was selected and learned implicitly. Experiment 2 provided evidence that, although salience affected participants’ awareness of the perfectly predictive feature, selection for implicit learning was mainly based on usefulness.