Derived-Term Automata of Multitape Expressions with Composition
Rational expressions are powerful tools to define automata, but often restricted to single-tape automata. Our goal is to unleash their expressive power for transducers, and more generally, any multitape automaton; for instance
$(a^+ \mid x + b^+ \mid y)^*$. We generalize the construction of the derived-term automaton by using expansions. This approach generates small automata, and even allows us to support a composition operator.
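As a rough illustration of what a derived-term automaton is, the sketch below builds one for an ordinary single-tape expression using Antimirov-style partial derivatives. It is not the expansion-based multitape construction the abstract describes; the class names (`Zero`, `One`, `Sym`, `Sum`, `Cat`, `Star`) and the functions `pderiv` and `derived_term_automaton` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Expression constructors. Frozen dataclasses are hashable, so terms can be set members.
@dataclass(frozen=True)
class Zero: pass            # the empty language
@dataclass(frozen=True)
class One: pass             # the empty word
@dataclass(frozen=True)
class Sym:
    a: str
@dataclass(frozen=True)
class Sum:
    l: object
    r: object
@dataclass(frozen=True)
class Cat:
    l: object
    r: object
@dataclass(frozen=True)
class Star:
    e: object

def nullable(e):
    """True iff the empty word belongs to the language of e."""
    if isinstance(e, (One, Star)):
        return True
    if isinstance(e, (Zero, Sym)):
        return False
    if isinstance(e, Sum):
        return nullable(e.l) or nullable(e.r)
    return nullable(e.l) and nullable(e.r)  # Cat

def cat(e, f):
    """Concatenation that simplifies One . f to f, keeping terms small."""
    return f if isinstance(e, One) else Cat(e, f)

def pderiv(e, a):
    """Set of derived terms (partial derivatives) of e by the letter a."""
    if isinstance(e, (Zero, One)):
        return set()
    if isinstance(e, Sym):
        return {One()} if e.a == a else set()
    if isinstance(e, Sum):
        return pderiv(e.l, a) | pderiv(e.r, a)
    if isinstance(e, Star):
        return {cat(d, e) for d in pderiv(e.e, a)}
    out = {cat(d, e.r) for d in pderiv(e.l, a)}  # Cat
    if nullable(e.l):
        out |= pderiv(e.r, a)
    return out

def derived_term_automaton(e, alphabet):
    """States are derived terms; a transition s --a--> t exists iff t is in pderiv(s, a)."""
    states, todo, trans = {e}, [e], set()
    while todo:
        s = todo.pop()
        for a in alphabet:
            for t in pderiv(s, a):
                trans.add((s, a, t))
                if t not in states:
                    states.add(t)
                    todo.append(t)
    finals = {s for s in states if nullable(s)}
    return states, trans, finals

def accepts(e, word):
    """Run the automaton on the fly, without building it first."""
    current = {e}
    for a in word:
        current = {t for s in current for t in pderiv(s, a)}
    return any(nullable(s) for s in current)
```

For instance, `derived_term_automaton(Star(Cat(Sym("a"), Sym("b"))), "ab")` yields a two-state automaton for $(ab)^*$, which hints at why derived-term constructions tend to produce small automata.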
Automata and rational expressions
This text is an extended version of the chapter 'Automata and rational
expressions' in the AutoMathA Handbook that will appear soon, published by the
European Science Foundation and edited by Jean-Éric Pin.
Regular Expressions and Transducers over Alphabet-invariant and User-defined Labels
We are interested in regular expressions and transducers that represent word
relations in an alphabet-invariant way---for example, the set of all word pairs
u,v where v is a prefix of u independently of what the alphabet is. Current
software systems of formal language objects do not have a mechanism to define
such objects. We define transducers in which transition labels involve what we
call set specifications, some of which are alphabet invariant. In fact, we give
a broader definition of automata-type objects, called labelled graphs, where
each transition label can be any string, as long as that string represents a
subset of a certain monoid. Then, the behaviour of the labelled graph is a
subset of that monoid. We do the same for regular expressions. We obtain
extensions of a few classic algorithmic constructions on ordinary regular
expressions and transducers at the broad level of labelled graphs and in such a
way that the computational efficiency of the extended constructions is not
sacrificed. For regular expressions with set specs we obtain the corresponding
partial derivative automata. For transducers with set specs we obtain further
algorithms that can be applied to questions about independent regular
languages, in particular the witness version of the independent property
satisfaction question.
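The prefix example from this abstract can be sketched as a two-state transducer whose labels never mention a concrete alphabet. The label names `"same"` and `"drop"` are hypothetical stand-ins invented for this sketch, not the set-specification syntax of the paper: `"same"` copies any one input symbol to the output, and `"drop"` reads any one input symbol and outputs nothing.

```python
# Hypothetical alphabet-invariant labels:
#   "same": read any symbol x from u and write the same x to v
#   "drop": read any symbol x from u and write nothing
TRANSITIONS = {
    (0, "same"): 0,   # still inside the common prefix
    (0, "drop"): 1,   # first symbol of u that lies beyond v
    (1, "drop"): 1,   # remaining symbols of u
}
FINAL = {0, 1}

def related(u, v):
    """True iff the pair (u, v) is accepted, i.e. v is a prefix of u."""
    configs = {(0, 0, 0)}          # (state, position in u, position in v)
    seen = set(configs)
    while configs:
        state, i, j = configs.pop()
        if i == len(u) and j == len(v) and state in FINAL:
            return True
        for (src, label), dst in TRANSITIONS.items():
            if src != state or i >= len(u):
                continue
            if label == "same" and j < len(v) and u[i] == v[j]:
                nxt = (dst, i + 1, j + 1)
            elif label == "drop":
                nxt = (dst, i + 1, j)
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                configs.add(nxt)
    return False
```

Because the labels only compare symbols for equality, the same three transitions work for strings, lists of integers, or any other symbol type, for example `related("abc", "ab")` and `related([1, 2, 3], [1, 2])`.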
Sound and complete axiomatizations of coalgebraic language equivalence
Coalgebras provide a uniform framework to study dynamical systems, including
several types of automata. In this paper, we make use of the coalgebraic view
on systems to investigate, in a uniform way, under which conditions calculi
that are sound and complete with respect to behavioral equivalence can be
extended to a coarser coalgebraic language equivalence, which arises from a
generalised powerset construction that determinises coalgebras. We show that
soundness and completeness are established by proving that expressions modulo
axioms of a calculus form the rational fixpoint of the given type functor. Our
main result is that the rational fixpoint of the functor $FT$, where $T$ is a
monad describing the branching of the systems (e.g. non-determinism, weights,
probability etc.), has as a quotient the rational fixpoint of the
"determinised" type functor $\bar{F}$, a lifting of $F$ to the category of
$T$-algebras. We apply our framework to the concrete example of weighted
automata, for which we present a new sound and complete calculus for weighted
language equivalence. As a special case, we obtain non-deterministic automata,
where we recover Rabinovich's sound and complete calculus for language
equivalence. Comment: Corrected version of published journal article
Scalable verification of probabilistic networks
This paper presents McNetKAT, a scalable tool for verifying
probabilistic network programs. McNetKAT is based on a
new semantics for the guarded and history-free fragment
of Probabilistic NetKAT in terms of finite-state, absorbing
Markov chains. This view allows the semantics of all programs to be computed exactly, enabling construction of an
automatic verification tool. Domain-specific optimizations
and a parallelizing backend enable McNetKAT to analyze
networks with thousands of nodes, automatically reasoning
about general properties such as probabilistic program equivalence and refinement, as well as networking properties such
as resilience to failures. We evaluate McNetKAT's scalability using real-world topologies, compare its performance
against state-of-the-art tools, and develop an extended case
study on a recently proposed data center network design.
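To make the "computed exactly" claim concrete: for a finite absorbing Markov chain with transient-to-transient matrix $Q$ and transient-to-absorbing matrix $R$, the absorption probabilities $B$ satisfy $(I - Q)B = R$. The sketch below solves that system with exact rational arithmetic. It is the textbook computation, not McNetKAT's implementation, and the function name `absorption_probabilities` is an assumption of this example.

```python
from fractions import Fraction

def absorption_probabilities(Q, R):
    """Solve (I - Q) B = R exactly by Gauss-Jordan elimination.

    Q[i][j]: probability of moving from transient state i to transient state j.
    R[i][k]: probability of moving from transient state i to absorbing state k.
    Returns B, where B[i][k] is the probability of ending in k when started in i.
    """
    n, m = len(Q), len(R[0])
    # Augmented matrix [I - Q | R] over the rationals.
    A = [[Fraction(int(i == j)) - Fraction(Q[i][j]) for j in range(n)]
         + [Fraction(R[i][k]) for k in range(m)]
         for i in range(n)]
    for col in range(n):
        # I - Q is invertible for an absorbing chain, so a nonzero pivot exists.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]
```

As a made-up usage example, a single transient state that retries with probability 1/4, delivers with probability 1/2, and drops with probability 1/4 is encoded as `Q = [[Fraction(1, 4)]]` and `R = [[Fraction(1, 2), Fraction(1, 4)]]`. The result is an exact pair of fractions rather than a floating-point approximation.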
Elements of computability, decidability, and complexity (Third edition)
These lecture notes are intended to introduce the reader to the
basic notions of computability theory, decidability, and complexity. More
information on these subjects can be found in classical books such as [Cut80,Dav58,Her69,HoU79,Rog67].
The results reported in these notes are taken from those books and in various
parts we closely follow their style of presentation. The reader is encouraged
to look at those books for improving his/her knowledge on these topics. Some
parts of the chapter on complexity are taken from the lecture notes of a
beautiful course given by Prof. Leslie Valiant at Edinburgh University,
Scotland, in 1979. It was, indeed, a very stimulating and enjoyable course.
For the notions of Predicate Calculus we have used in this book the reader
may refer to [Men87].
I would like to thank Dr. Maurizio Proietti at IASI-CNR (Roma, Italy),
my colleagues, and my students at the University of Roma Tor Vergata and,
in particular, Michele Martone. They have been for me a source of continuous
inspiration and enthusiasm.
Finally, I would like to thank Dr. Gioacchino Onorati and Lorenzo Costantini
of the Aracne Publishing Company for their helpful cooperation.
Contributions to the Theory of Finite-State Based Grammars
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs:
(i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction.
(ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable.
(iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing.
Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammar
Proceedings of the Eindhoven FASTAR Days 2004 : Eindhoven, The Netherlands, September 3-4, 2004
The Eindhoven FASTAR Days (EFD) 2004 were organized by the Software Construction group of the Department of Mathematics and Computer Science at the Technische Universiteit Eindhoven. On September 3rd and 4th 2004, over thirty participants, hailing from the Czech Republic, Finland, France, The Netherlands, Poland and South Africa, gathered at the Department to attend the EFD. The EFD were organized in connection with the research on finite automata by the FASTAR Research Group, which is centered in Eindhoven and at the University of Pretoria, South Africa. FASTAR (Finite Automata Systems: Theoretical and Applied Research) is an international research group that aims to lead in all areas related to finite state systems. The work in FASTAR includes both core and applied parts of this field. The EFD therefore focused on the field of finite automata, with an emphasis on practical aspects and applications. Eighteen presentations, mostly on subjects within this field, were given by researchers as well as students from participating universities and industrial research facilities. This report contains the proceedings of the conference, in the form of papers for twelve of the presentations at the EFD. Most of them were initially reviewed and distributed as handouts during the EFD. After the EFD took place, the papers were revised for publication in these proceedings. We would like to thank the participants for their attendance and presentations, making the EFD 2004 as successful as they were. Based on this success, it is our intention to make the EFD into a recurring event. Eindhoven, December 2004. Loek Cleophas, Bruce W. Watson
ON EXPRESSIVENESS, INFERENCE, AND PARAMETER ESTIMATION OF DISCRETE SEQUENCE MODELS
Huge neural autoregressive sequence models have achieved impressive performance across different applications, such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., consistency and coherency of generated texts) continue to exist, regardless of the parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models, a family of energy-based sequence models (Ch 4) whose sequence weights can be evaluated efficiently, and also perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability; and how such a problem is generally unfixable without knowledge of the true sequence distribution (Ch 5).
In the second part of the thesis, we study practical sequence model families and algorithms based on theoretical findings in the first part of the thesis. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite state transducers with the introduction of mark strings, allowing scoring transduction paths in a finite state transducer with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs, and combine these weighted relations together with various operations.