Parsing Based on State Grammars
This thesis focuses on parsing based on state grammars. The goal is to create a program that can load a grammar from an input file, build an LL table from that grammar, and then parse a given input using the table. On this foundation, the thesis studies the properties of parsing methods based on these grammars. Testing also covers grammatical structures that are not context-free.
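The table-driven parsing step this abstract describes can be sketched in miniature. The toy grammar, table entries, and function names below are illustrative assumptions, not taken from the thesis:

```python
# Hypothetical sketch of table-driven LL(1) parsing, the core step the thesis
# automates. The toy grammar and its LL table are invented for illustration.
GRAMMAR = {
    ("S", "a"): ["a", "S"],   # S -> a S
    ("S", "b"): ["b"],        # S -> b
}

def ll_parse(tokens):
    """Parse a token list with a stack and an LL(1) table; True iff accepted."""
    stack = ["$", "S"]                      # end marker, then start symbol
    tokens = tokens + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top == look:                     # terminal (or $) matches the input
            pos += 1
        elif (top, look) in GRAMMAR:        # expand nonterminal via the table
            stack.extend(reversed(GRAMMAR[(top, look)]))
        else:
            return False                    # no table entry: syntax error
    return pos == len(tokens)

print(ll_parse(["a", "a", "b"]))  # True
print(ll_parse(["a", "a"]))       # False
```

A state grammar would additionally thread a state through each derivation step; this sketch shows only the plain LL mechanics of the table lookup.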
Developmental constraints on learning artificial grammars with fixed, flexible and free word order
Human learning, although highly flexible and efficient, is constrained in ways that facilitate or impede the acquisition of certain systems of information. Some such constraints, active during infancy and childhood, have been proposed to account for the apparent ease with which typically developing children acquire language. In a series of experiments, we investigated the role of developmental constraints on learning artificial grammars with a distinction between shorter and relatively frequent words (‘function words,’ F-words) and longer and less frequent words (‘content words,’ C-words). We constructed 4 finite-state grammars, in which the order of F-words, relative to C-words, was either fixed (F-words always occupied the same positions in a string), flexible (every F-word always followed a C-word), or free. We exposed adults (N = 84) and kindergarten children (N = 100) to strings from each of these artificial grammars, and we assessed their ability to recognize strings with the same structure, but a different vocabulary. Adults were better at recognizing strings when regularities were available (i.e., fixed and flexible order grammars), while children were better at recognizing strings from the grammars consistent with the attested distribution of function and content words in natural languages (i.e., flexible and free order grammars). These results provide evidence for a link between developmental constraints on learning and linguistic typology.
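The fixed vs. flexible word-order contrast can be illustrated with a small string generator. The vocabularies, string lengths, and probabilities below are invented for the example and are not the experiment's actual materials:

```python
import random

# Illustrative sketch of the fixed vs. flexible F-word placement described
# above; all vocabulary items and parameters are made up for this example.
F_WORDS = ["fo", "ba"]                  # short, frequent "function" words
C_WORDS = ["tulim", "rikona", "pestu"]  # longer, rarer "content" words

def fixed_order(rng):
    # Fixed grammar: F-words always occupy the same positions (here 1st, 3rd).
    return [rng.choice(F_WORDS), rng.choice(C_WORDS),
            rng.choice(F_WORDS), rng.choice(C_WORDS)]

def flexible_order(rng):
    # Flexible grammar: every F-word immediately follows a C-word,
    # but the overall positions vary from string to string.
    out = []
    for _ in range(2):
        out.append(rng.choice(C_WORDS))
        if rng.random() < 0.5:
            out.append(rng.choice(F_WORDS))
    return out

rng = random.Random(0)
print(" ".join(fixed_order(rng)))
print(" ".join(flexible_order(rng)))
```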
Parsing Based on State Grammars
This work describes the properties of state grammars and n-limited state grammars, with a focus on non-determinism in parsing based on such grammars. In particular, it examines the problems caused by allowing erasing productions and by possible occurrences of recursion. Based on an analysis of these problems, it proposes possible solutions, which are then applied in the design of a practically oriented parallel parsing method. This method is significantly faster than sequential parsing with backtracking.
Parsing Based on State Grammars
This bachelor's thesis introduces syntax-directed translation based on state grammars. The theoretical part presents the formal models needed to understand syntax analysis based on state grammars; the most important of these are the deep pushdown transducer and the translation grammar constructed from a state grammar, both of which can be used for syntax analysis. The practical part focuses mainly on bottom-up syntax analysis using state grammars and its implementation.
On state-alternating context-free grammars
State-alternating context-free grammars are introduced, and the language classes obtained from them are compared to the classes of the Chomsky hierarchy as well as to some well-known complexity classes. In particular, state-alternating context-free grammars are compared to alternating context-free grammars (Theoret. Comput. Sci. 67 (1989) 75–85) and to alternating pushdown automata. Further, various derivation strategies are considered, and their influence on the expressive power of (state-)alternating context-free grammars is investigated.
Toric grammars: a new statistical approach to natural language modeling
We propose a new statistical model for computational linguistics. Rather than
trying to estimate directly the probability distribution of a random sentence
of the language, we define a Markov chain on finite sets of sentences with many
finite recurrent communicating classes and define our language model as the
invariant probability measures of the chain on each recurrent communicating
class. This Markov chain, which we call a communication model, randomly
recombines at each step the set of sentences forming its current state,
using some grammar rules. When the grammar rules are fixed and known in
advance instead of
being estimated on the fly, we can prove supplementary mathematical properties.
In particular, we can prove in this case that all states are recurrent states,
so that the chain defines a partition of its state space into finite recurrent
communicating classes. We show that our approach is a decisive departure from
Markov models at the sentence level and discuss its relationships with Context
Free Grammars. Although the toric grammars we use are closely related to
Context Free Grammars, the way we generate the language from the grammar is
qualitatively different. Our communication model has two purposes. On the one
hand, it is used to define indirectly the probability distribution of a random
sentence of the language. On the other hand, it can serve as a (crude) model
of language transmission from one speaker to another through the
communication of a (large) set of sentences.
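A loose, hypothetical illustration of a chain whose state is a set of sentences recombined around shared material; the swap rule below is invented for the sketch and is not the authors' actual toric-grammar recombination:

```python
import random

# Toy illustration (NOT the paper's rules): the chain's state is a list of
# sentences, and each step randomly recombines two of them by swapping the
# material that follows a shared word. The multiset of words is conserved.
def step(sentences, rng):
    s1, s2 = rng.sample(sentences, 2)
    w1, w2 = s1.split(), s2.split()
    shared = set(w1) & set(w2)
    if not shared:
        return sentences                      # no recombination possible
    w = rng.choice(sorted(shared))            # pick a shared pivot word
    i, j = w1.index(w), w2.index(w)
    new1 = " ".join(w1[:i + 1] + w2[j + 1:])  # swap suffixes after the pivot
    new2 = " ".join(w2[:j + 1] + w1[i + 1:])
    return [s for s in sentences if s not in (s1, s2)] + [new1, new2]

state = ["the dog chased the cat", "a bird saw the dog"]
rng = random.Random(1)
for _ in range(3):
    state = step(state, rng)
print(state)
```

Because each step only moves words between sentences, states reachable from one another share a word multiset, which hints at why the chain's state space decomposes into communicating classes.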
Macro Grammars and Holistic Triggering for Efficient Semantic Parsing
To learn a semantic parser from denotations, a learning algorithm must search
over a combinatorially large space of logical forms for ones consistent with
the annotated denotations. We propose a new online learning algorithm that
searches faster as training progresses. The two key ideas are using macro
grammars to cache the abstract patterns of useful logical forms found thus far,
and holistic triggering to efficiently retrieve the most relevant patterns
based on sentence similarity. On the WikiTableQuestions dataset, we first
expand the search space of an existing model to improve the state-of-the-art
accuracy from 38.7% to 42.7%, and then use macro grammars and holistic
triggering to achieve an 11x speedup and an accuracy of 43.7%. Comment: EMNLP 201
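The holistic-triggering idea of retrieving cached patterns by sentence similarity can be sketched as follows; the Jaccard token-overlap measure and the cache contents are assumptions for illustration, not the paper's exact method:

```python
# Hedged sketch of "holistic triggering": cache abstract patterns keyed by a
# trigger sentence, then retrieve the patterns whose triggers are most similar
# to a new sentence. The similarity measure is an assumption for this example.
def similarity(a, b):
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)        # Jaccard token overlap

def retrieve(cache, sentence, k=2):
    """Return the k cached (trigger, pattern) pairs most similar to sentence."""
    ranked = sorted(cache, key=lambda tp: similarity(tp[0], sentence),
                    reverse=True)
    return ranked[:k]

# Hypothetical cache of previously useful abstract logical-form patterns.
cache = [
    ("how many points did x score", "count(rows where player = x)"),
    ("who scored the most points", "argmax(player, points)"),
    ("when was x born", "lookup(x, birth_date)"),
]
hits = retrieve(cache, "how many goals did messi score")
print(hits[0][1])  # count(rows where player = x)
```

Restricting the search to the retrieved macro patterns, rather than the full space of logical forms, is what yields the speedup the abstract reports.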
Minimally-Supervised Morphological Segmentation using Adaptor Grammars
This paper explores the use of Adaptor Grammars, a nonparametric Bayesian modelling framework, for minimally supervised morphological segmentation. We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. In the model selection method, we train unsupervised Adaptor Grammars using an over-articulated metagrammar, then use a small labelled data set to select which potential morph boundaries identified by the metagrammar should be returned in the final output. We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and is competitive with state-of-the-art semi-supervised systems. Moreover, this method provides the potential to tune performance according to different evaluation metrics or downstream tasks.
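The model-selection step of keeping only the boundary types a small labelled set supports can be sketched roughly; the notion of a boundary "source", the example words, and the precision threshold are all invented for illustration, not the paper's metagrammar:

```python
# Rough sketch of the model-selection idea: the metagrammar over-proposes
# boundaries, each tagged with a source; keep only sources whose boundaries
# agree well enough with a small gold-labelled set. Everything here is a
# simplified stand-in for the actual Adaptor Grammar machinery.
def select_sources(candidates, gold, min_precision=0.5):
    """candidates: (word, boundary_index, source) triples from the metagrammar.
    gold: word -> set of true boundary indices, for a small labelled set."""
    stats = {}
    for word, idx, src in candidates:
        if word in gold:                       # score only labelled words
            hit, tot = stats.get(src, (0, 0))
            stats[src] = (hit + (idx in gold[word]), tot + 1)
    return {s for s, (h, t) in stats.items() if t and h / t >= min_precision}

candidates = [
    ("walking", 4, "suffix"),  # walk|ing  (correct)
    ("walked", 4, "suffix"),   # walk|ed   (correct)
    ("walking", 2, "prefix"),  # wa|lking  (wrong)
]
gold = {"walking": {4}, "walked": {4}}
print(select_sources(candidates, gold))  # {'suffix'}
```

Raising or lowering the threshold is one way such a method could be tuned toward different evaluation metrics, as the abstract suggests.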