17 research outputs found
Random Generation of Nondeterministic Finite-State Tree Automata
Algorithms for (nondeterministic) finite-state tree automata (FTAs) are often
tested on random FTAs, in which all internal transitions are equiprobable. The
run-time results obtained in this manner are usually overly optimistic as most
such generated random FTAs are trivial in the sense that the number of states
of an equivalent minimal deterministic FTA is extremely small. It is
demonstrated that nontrivial random FTAs are obtained only for a narrow band of
transition probabilities. Moreover, an analytic analysis yields a formula to
approximate the transition probability that yields the most complex random
FTAs, which should be used in experiments.
Comment: In Proceedings TTATT 2013, arXiv:1311.5058. Andreas Maletti and
Daniel Quernheim were financially supported by the German Research Foundation
(DFG) grant MA/4959/1-
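The effect described in this abstract can be tried out with a minimal sketch. It assumes a ranked alphabet given as a hypothetical `ranks` dictionary and a single transition probability `p` applied independently to every possible transition; the function names are illustrative and not taken from the paper. The sketch only counts the states reachable in the bottom-up subset construction, without minimizing, so it gives an upper bound on the size of the minimal deterministic FTA.

```python
import itertools
import random


def random_fta(n_states, ranks, p, rng):
    """Draw each possible transition (symbol, children, target) with probability p."""
    delta = set()
    for sym, rank in ranks.items():
        for children in itertools.product(range(n_states), repeat=rank):
            for target in range(n_states):
                if rng.random() < p:
                    delta.add((sym, children, target))
    return delta


def determinize(ranks, delta):
    """Bottom-up subset construction; returns the set of reachable state subsets."""
    # Index the transitions by (symbol, child states) for quick lookup.
    by_key = {}
    for sym, children, target in delta:
        by_key.setdefault((sym, children), set()).add(target)

    reached = set()
    changed = True
    while changed:
        changed = False
        for sym, rank in ranks.items():
            # Iterate over a snapshot so that new subsets can be added safely.
            for subsets in itertools.product(list(reached), repeat=rank):
                target = set()
                for children in itertools.product(*subsets):
                    target |= by_key.get((sym, children), set())
                target = frozenset(target)
                if target not in reached:
                    reached.add(target)
                    changed = True
    return reached


if __name__ == "__main__":
    rng = random.Random(0)
    ranks = {"a": 0, "f": 2}  # assumed alphabet: one constant, one binary symbol
    for p in (0.0, 0.1, 0.3, 0.5, 1.0):
        size = len(determinize(ranks, random_fta(4, ranks, p, rng)))
        print(f"p={p}: {size} reachable deterministic states")
```

At the extremes `p = 0` and `p = 1` the construction collapses to a single subset (the empty set and the full state set, respectively), which is consistent with the claim that nontrivial automata arise only for intermediate probabilities.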
Statistical language models within the algebra of weighted rational languages
Statistical language models are an important tool in natural language processing. They represent prior knowledge about a certain language, which is usually gained from a set of samples called a corpus. In this paper, we present a novel way of creating N-gram language models using weighted finite automata. The construction of these models is formalised within the algebra underlying weighted finite automata and expressed in terms of weighted rational languages and transductions. Besides the algebra, we make use of five special constant weighted transductions which rely only on the alphabet and the model parameter N. In addition, we discuss efficient implementations of these transductions in terms of virtual constructions.
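For orientation, an N-gram model can be viewed as a weighted automaton whose states are the last N-1 words. The sketch below builds such an automaton by plain relative-frequency counting. This is a swapped-in, simpler technique and not the algebraic construction with constant weighted transductions that the abstract describes; the function names and the padding symbols `<s>` and `</s>` are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def ngram_automaton(corpus, n):
    """Return arcs {(history, word): (next_history, probability)} estimated by counting."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        padded = ["<s>"] * (n - 1) + list(sentence) + ["</s>"]
        for i in range(n - 1, len(padded)):
            history = tuple(padded[i - n + 1:i])
            counts[history][padded[i]] += 1

    arcs = {}
    for history, nexts in counts.items():
        total = sum(nexts.values())
        for word, count in nexts.items():
            # The next state drops the oldest word and appends the new one.
            next_history = (history + (word,))[1:]
            arcs[(history, word)] = (next_history, count / total)
    return arcs


def sentence_prob(arcs, sentence, n):
    """Multiply arc weights along the unique path for the sentence (0.0 if no path)."""
    state = ("<s>",) * (n - 1)
    prob = 1.0
    for word in list(sentence) + ["</s>"]:
        if (state, word) not in arcs:
            return 0.0
        state, weight = arcs[(state, word)]
        prob *= weight
    return prob
```

With the toy corpus `[["a", "b"], ["a", "c"]]` and `n = 2`, the sentence `["a", "b"]` follows a path whose arc weights are 1, 0.5 and 1. No smoothing is applied, so unseen N-grams receive probability zero.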
The Acquisition of Programming Skills from Textbooks
We present a computer model for the acquisition of programming languages from textbooks. Starting from a verbal description of the notational conventions that are used to describe the syntactic form of programming commands, a meta grammar is generated that parses concrete command descriptions and builds up grammar rules for those commands. These rules are realized as definite clause grammar rules that capture the syntax of these commands. They can be used to parse and generate syntactically correct examples of a command. However, to solve real programming problems, the semantics of a command and of its parameters also need to be acquired. This is accomplished by the natural language parsing of the explanations given in the text and the augmentation of the definite clause command grammars with semantic structures.
Weaving the Semantic Web: Extracting and Representing the Content of Pathology Reports
Schlangen D, Hanneforth T, Stede M. Weaving the Semantic Web: Extracting and Representing the Content of Pathology Reports. In: Proceedings of the GLDV Conference 2005 (GLDV05). Bonn, Germany; 2005
Pushing for weighted tree automata
A weight normalization procedure, commonly called pushing, is introduced for
weighted tree automata (wta) over commutative semifields. The normalization
preserves the recognized weighted tree language even for nondeterministic wta,
but it is most useful for bottom-up deterministic wta, where it can be used for
minimization and equivalence testing. In both applications a careful selection
of the weights to be redistributed followed by normalization allows a reduction
of the general problem to the corresponding problem for bottom-up deterministic
unweighted tree automata. This approach was already successfully used by Mohri
and Eisner for the minimization of deterministic weighted string automata.
Moreover, the new equivalence test for two wta $M$ and $M'$ runs in time
$O((|M| + |M'|) \log (|Q| + |Q'|))$, where $Q$ and $Q'$ are the states of $M$ and $M'$,
respectively, which improves the previously best run-time $O(|M| \cdot |M'|)$.
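The claim that pushing preserves the recognized weighted tree language can be illustrated with a small sketch over the semifield of rationals. It assumes that every state `q` is given some nonzero weight `lam[q]`; how the paper selects these weights is not reproduced here, and the data layout and function names are hypothetical. Each transition weight is multiplied by the weights of its child states and divided by the weight of its target state, and each final weight is multiplied by the weight of its state, so the factors cancel along every run.

```python
from fractions import Fraction
from math import prod


def push(transitions, final, lam):
    """Redistribute weights; transitions maps (symbol, children, target) to a weight."""
    new_transitions = {}
    for (sym, children, target), w in transitions.items():
        scale = prod((lam[q] for q in children), start=Fraction(1))
        new_transitions[(sym, children, target)] = w * scale / lam[target]
    new_final = {q: w * lam[q] for q, w in final.items()}
    return new_transitions, new_final


def weight(tree, transitions, final):
    """Weight of a tree (symbol, [subtrees]), summed over all runs."""
    def run_weights(node):
        sym, kids = node
        kid_tables = [run_weights(k) for k in kids]
        table = {}
        for (s, children, target), w in transitions.items():
            if s != sym or len(children) != len(kids):
                continue
            acc = w
            for tbl, q in zip(kid_tables, children):
                acc *= tbl.get(q, 0)
            if acc:
                table[target] = table.get(target, 0) + acc
        return table

    table = run_weights(tree)
    return sum(table[q] * final.get(q, 0) for q in table)


# A toy nondeterministic wta with two states (all numbers are made up).
transitions = {
    ("a", (), 0): Fraction(1, 2),
    ("a", (), 1): Fraction(1, 3),
    ("f", (0, 1), 0): Fraction(2),
    ("f", (1, 1), 1): Fraction(5),
}
final = {0: Fraction(3), 1: Fraction(1)}
lam = {0: Fraction(7), 1: Fraction(1, 4)}
tree = ("f", [("a", []), ("a", [])])

pushed_transitions, pushed_final = push(transitions, final, lam)
```

Evaluating `weight(tree, transitions, final)` and `weight(tree, pushed_transitions, pushed_final)` gives the same value, since every non-root state contributes its factor once as a child and once, inverted, as a target.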