MSO definable string transductions and two-way finite state transducers
String transductions that are definable in monadic second-order (MSO) logic
(without the use of parameters) are exactly those realized by deterministic
two-way finite state transducers. Nondeterministic MSO definable string
transductions (i.e., those definable with the use of parameters) correspond to
compositions of two nondeterministic two-way finite state transducers that have
the finite visit property. Both families of MSO definable string transductions
are characterized in terms of Hennie machines, i.e., two-way finite state
transducers with the finite visit property that are allowed to rewrite their
input tape. Comment: 63 pages, LaTeX2e. Extended abstract presented at 26-th ICALP, 199
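As an illustration of what a deterministic two-way finite state transducer can compute, the following minimal sketch simulates one that maps a word w to w followed by its reversal, a mapping a one-way machine cannot realize. The endmarkers `<` and `>`, the state names, and the helper `run_two_way` are assumptions made for this example and are not taken from the paper; the input word is assumed not to contain the endmarker characters.

```python
# Sketch of a deterministic two-way finite state transducer (hypothetical names).
LEFT, RIGHT = "<", ">"  # assumed endmarkers delimiting the input tape


def delta(state, symbol):
    """Transition function: returns (next_state, head_move, output_string)."""
    if state == "fwd":  # first pass: copy the input from left to right
        if symbol == LEFT:
            return ("fwd", +1, "")
        if symbol == RIGHT:
            return ("bwd", -1, "")  # turn around at the right endmarker
        return ("fwd", +1, symbol)
    if state == "bwd":  # second pass: copy the input from right to left
        if symbol == LEFT:
            return ("done", 0, "")
        return ("bwd", -1, symbol)
    raise ValueError(f"no transition from state {state!r}")


def run_two_way(delta, start, finals, word, max_steps=10_000):
    """Run the transducer on `word`; the tape itself is never rewritten."""
    tape = LEFT + word + RIGHT
    state, pos, out = start, 0, []
    for _ in range(max_steps):
        if state in finals:
            return "".join(out)
        state, move, emitted = delta(state, tape[pos])
        out.append(emitted)
        pos += move
    raise RuntimeError("step budget exceeded")


print(run_two_way(delta, "fwd", {"done"}, "abc"))  # prints abccba
```

Each tape cell is visited exactly twice here, so this particular machine also has the finite visit property mentioned in the abstract.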
Languages Generated by Iterated Idempotencies.
The rewrite relation with parameters m and n, and with the possible length limit = k or ≤ k, is denoted by ω^m_n, =kω^m_n, or ≤kω^m_n, respectively. The idempotency languages generated from a starting word w by the respective operations are w[…]. Other special cases of idempotency languages besides duplication have also come up in different contexts. The investigations of Ito et al. about insertion and deletion, i.e., operations that are also observed in DNA molecules, have established that the two corresponding relations both preserve regularity.
Our investigations about idempotency relations and languages start out from the case of a uniform length bound. For these relations =kω^m_n, the conditions for confluence are characterized completely. The question of regularity is also answered for all the languages w[…] are more complicated and belong to the class of context-free languages.
For a general length bound, i.e., for the relations ≤kω^m_n, confluence does not hold so frequently. This complexity of the relations also results in more complicated languages, which are often non-regular, as for example the languages […].
Without any length bound, idempotency relations have a very complicated structure. Over alphabets of one or two letters we still characterize the conditions for confluence. Over three or more letters, in contrast, only a few cases are solved. We determine the combinations of parameters that result in the regularity of w[…].
In a second chapter some more involved questions are solved for the special case of duplication. First we shed some light on the reasons why it is so difficult to determine the context-freeness of duplication languages. We show that they fulfill all pumping properties and that they are very dense. Therefore all the standard tools to prove non-context-freeness do not apply here.
The concept of root in Formal Language Theory is frequently used to describe the reduction of a word to another one, which is in some sense elementary. For example, there are primitive roots, periodicity roots, etc. Elementary in connection with duplication are square-free words, i.e., words that do not contain any repetition. Thus we define the duplication root of w to consist of all the square-free words from which w can be reached via the duplication relation. Besides some general observations we prove the decidability of the question whether the duplication root of a language is finite.
Then we devise a code which is robust under duplication of its code words. This would keep the result of a computation from being destroyed by duplications in the code words. We determine the exact conditions under which infinite such codes exist: over an alphabet of two letters they exist for a length bound of 2, over three letters already for a length bound of 1.
We also apply duplication to entire languages rather than to single words; then it is interesting to determine whether regular and context-free languages are closed under this operation. We show that the regular languages are closed under uniformly bounded duplication, while they are not closed under duplication with a general length bound. The context-free languages are closed under both operations. The thesis concludes with a list of open problems related to the thesis' topics.
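The duplication root described above can be explored with a small brute-force sketch. It assumes unbounded duplication (any factor u may be replaced by uu) and works backwards: it repeatedly replaces a square uu by u until only square-free words remain. The function names `reductions` and `duplication_root` are hypothetical, and the exhaustive search is meant for short words only; it is not the thesis' own procedure.

```python
def reductions(word):
    """Yield every word obtained by replacing one square uu in `word` by u."""
    n = len(word)
    for length in range(1, n // 2 + 1):
        for i in range(n - 2 * length + 1):
            if word[i:i + length] == word[i + length:i + 2 * length]:
                yield word[:i + length] + word[i + 2 * length:]


def duplication_root(word):
    """Return all square-free words from which `word` is reachable by duplications."""
    seen, stack, root = {word}, [word], set()
    while stack:
        current = stack.pop()
        successors = set(reductions(current))
        if not successors:  # no square left: `current` is square-free
            root.add(current)
        for shorter in successors:
            if shorter not in seen:
                seen.add(shorter)
                stack.append(shorter)
    return root


print(sorted(duplication_root("abcbabcbc")))  # prints ['abc', 'abcbabc']
```

The example word shows that a duplication root need not be a single word: undoing the square bcbc first leaves the square-free word abcbabc, while undoing the square abcbabcb first leads on to abc.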
Logic and Automata
Mathematical logic and automata theory are two scientific disciplines with a fundamentally close relationship. The authors of Logic and Automata take the occasion of the sixtieth birthday of Wolfgang Thomas to present a tour d'horizon of automata theory and logic. The twenty papers in this volume cover many different facets of logic and automata theory, emphasizing the connections to other disciplines such as games, algorithms, and semigroup theory, as well as discussing current challenges in the field.
26. Theorietag Automaten und Formale Sprachen 23. Jahrestagung Logik in der Informatik: Tagungsband
The Theorietag is the annual meeting of the special interest group on automata and formal languages (Fachgruppe Automaten und Formale Sprachen) of the Gesellschaft für Informatik and was first held in 1991 in Magdeburg. Since 1996, the Theorietag has been accompanied by a one-day workshop with invited talks. The annual meeting of the special interest group on logic in computer science (Fachgruppe Logik in der Informatik) of the Gesellschaft für Informatik was first held in 1993 in Leipzig. The yearly business meetings of the two groups also take place during their annual meetings. This year, the Theorietag of the Fachgruppe Automaten und Formale Sprachen is being held jointly with the annual meeting of the Fachgruppe Logik in der Informatik for the first time. The joint event, held from 4 to 7 October at the Tagungshotel Tannenfelde near Neumünster, was organized by the Dependable Systems group (Arbeitsgruppe Zuverlässige Systeme) of the Department of Computer Science at Christian-Albrechts-Universität Kiel. During the meeting, a workshop open to all interested participants will take place. In Tannenfelde, Christoph Löding (Aachen), Tomás Masopust (Dresden), Henning Schnoor (Kiel), Nicole Schweikardt (Berlin), and Georg Zetzsche (Paris) will give invited talks on their current work. In addition, participants will give 26 talks, 17 at the Theorietag Automaten und Formale Sprachen and nine at the Jahrestagung Logik in der Informatik. This volume contains short versions of all contributions. We thank the Gesellschaft für Informatik, Christian-Albrechts-Universität zu Kiel, and the Tagungshotel Tannenfelde for supporting this Theorietag. Special thanks go to the organizing team: Maike Bradler, Philipp Sieweck, Joel Day. Kiel, October 2016. Florin Manea, Dirk Nowotka and Thomas Wilk
Probabilistic graph formalisms for meaning representations
In recent years, many datasets have become available that represent natural language
semantics as graphs. To use these datasets in natural language processing (NLP), we
require probabilistic models of graphs. Finite-state models have been very successful
for NLP tasks on strings and trees because they are probabilistic and composable. Are
there equivalent models for graphs? In this thesis, we survey several graph formalisms,
focusing on whether they are probabilistic and composable, and we contribute several
new results. In particular, we study the directed acyclic graph automata languages
(DAGAL), the monadic second-order graph languages (MSOGL), and the hyperedge
replacement languages (HRL). We prove that DAGAL cannot be made probabilistic,
we explain why MSOGL also most likely cannot be made probabilistic, and we review
the fact that HRL are not composable. We then review a subfamily of HRL and
MSOGL: the regular graph languages (RGL; Courcelle 1991), which have not been
widely studied, and particularly have not been studied in an NLP context. Although
Courcelle (1991) only sketches a proof, we present a full, more NLP-accessible proof
that RGL are a subfamily of MSOGL. We prove that RGL are probabilistic and composable,
and we provide a novel Earley-style parsing algorithm for them that runs in
time linear in the size of the input graph. We compare RGL to two other new formalisms:
the restricted DAG languages (RDL; Björklund et al. 2016) and the tree-like
languages (TLL; Matheja et al. 2015). We show that RGL and RDL are incomparable;
TLL and RDL are incomparable; and either RGL are incomparable to TLL, or RGL
are contained within TLL. This thesis provides a clearer picture of this field from an
NLP perspective, and suggests new theoretical and empirical research directions.