Strict Locality and Phonological Maps

Author: Chandlee Jane; Heinz Jeffrey
Publication venue: Haverford Scholarship
Publication date: 01/01/2018

Haverford College: Haverford Scholarship

Canonical Algebraic Generators in Automata Learning

Author: Zetzsche Stefan
Publication date: 08/08/2023

arXiv.org e-Print Archive

Canonical Algebraic Generators in Automata Learning

Author: Zetzsche Stefan Jens
Publication venue: UCL (University College London)
Publication date: 28/08/2023

Many methods for the verification of complex computer systems require the existence of a tractable mathematical abstraction of the system, often in the form of an automaton. In reality, however, such a model is hard to come up with, in particular manually. Automata learning is a technique that can automatically infer an automaton model from a system by observing its behaviour. The majority of automata learning algorithms are based on the so-called L* algorithm. The acceptor learned by L* has an important property: it is canonical, in the sense that it is, up to isomorphism, the unique deterministic finite automaton of minimal size accepting a given regular language. Establishing a similar result for other classes of acceptors, often with side-effects, is of great practical importance. Non-deterministic finite automata, for instance, can be exponentially more succinct than deterministic ones, allowing verification to scale. Unfortunately, identifying a canonical size-minimal non-deterministic acceptor of a given regular language is in general not possible: it can happen that a regular language is accepted by two non-isomorphic non-deterministic finite automata of minimal size. In particular, it is thus unclear which one of the automata should be targeted by a learning algorithm. In this thesis, we further explore the issue and identify (sub-)classes of acceptors that admit canonical size-minimal representatives. In more detail, the contributions of this thesis are threefold. First, we expand the automata (learning) theory of Guarded Kleene Algebra with Tests (GKAT), an efficiently decidable logic expressive enough to model simple imperative programs. In particular, we present GL*, an algorithm that learns the unique size-minimal GKAT automaton for a given deterministic language, and prove that GL* is more efficient than an existing variation of L*. We implement both algorithms in OCaml and compare them on example programs. Second, we present a category-theoretical framework based on generators, bialgebras, and distributive laws, which identifies, for a wide class of automata with side-effects in a monad, canonical target models for automata learning. Apart from recovering examples from the literature, we discover a new canonical acceptor of regular languages, and present a unifying minimality result. Finally, we show that the construction underlying our framework is an instance of a more general theory. First, we see that deriving a minimal bialgebra from a minimal coalgebra can be realized by applying a monad on a category of subobjects with respect to an epi-mono factorisation system. Second, we explore the abstract theory of generators and bases for algebras over a monad: we discuss bases for bialgebras, the product of bases, generalise the representation theory of linear maps, and compare our ideas to a coalgebra-based approach

UCL Discovery

Bibliographie

Publication date: 01/01/1984

University of Szeged

Proceedings of the 21st Conference on Formal Methods in Computer-Aided Design – FMCAD 2021

Publication venue: TU Wien Academic Press
Publication date: 18/10/2021

The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing

Directory of Open Access Books (DOAB)

Tools and Algorithms for the Construction and Analysis of Systems

Publication venue: Springer Science and Business Media LLC

This open access two-volume set constitutes the proceedings of the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The total of 60 regular papers presented in these volumes was carefully reviewed and selected from 155 submissions. The papers are organized in topical sections as follows: Part I: Program verification; SAT and SMT; Timed and Dynamical Systems; Verifying Concurrent Systems; Probabilistic Systems; Model Checking and Reachability; and Timed and Probabilistic Systems. Part II: Bisimulation; Verification and Efficiency; Logic and Proof; Tools and Case Studies; Games and Automata; and SV-COMP 2020

OAPEN Library

Unsupervised multilingual learning

Author: Snyder Benjamin
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2010

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 241-254). For centuries, scholars have explored the deep links among human languages. In this thesis, we present a class of probabilistic models that exploit these links as a form of naturally occurring supervision. These models allow us to substantially improve performance for core text processing tasks, such as morphological segmentation, part-of-speech tagging, and syntactic parsing. Besides these traditional NLP tasks, we also present a multilingual model for lost language decipherment. We test this model on the ancient Ugaritic language. Our results show that we can automatically uncover much of the historical relationship between Ugaritic and Biblical Hebrew, a known related language. By Benjamin Snyder. Ph.D.

DSpace@MIT

Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks

Author: Baliga Nitin S; Bonneau Richard; Reiss David J
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The learning of global genetic regulatory networks from expression data is a severely under-constrained problem that is aided by reducing the dimensionality of the search space by means of clustering genes into putatively co-regulated groups, as opposed to those that are simply co-expressed. Because genes may be co-regulated only across a subset of all observed experimental conditions, biclustering (clustering of genes and conditions) is more appropriate than standard clustering. Co-regulated genes are also often functionally (physically, spatially, genetically, and/or evolutionarily) associated, and such a priori known or pre-computed associations can provide support for appropriately grouping genes. One important association is the presence of one or more common cis-regulatory motifs. In organisms where these motifs are not known, their de novo detection, integrated into the clustering algorithm, can help to guide the process towards more biologically parsimonious solutions. RESULTS: We have developed an algorithm, cMonkey, that detects putative co-regulated gene groupings by integrating the biclustering of gene expression data and various functional associations with the de novo detection of sequence motifs. CONCLUSION: We have applied this procedure to the archaeon Halobacterium NRC-1, as part of our efforts to decipher its regulatory network. In addition, we used cMonkey on public data for three organisms in the other two domains of life: Helicobacter pylori, Saccharomyces cerevisiae, and Escherichia coli. The biclusters detected by cMonkey both recapitulated known biology and enabled novel predictions (some for Halobacterium were subsequently confirmed in the laboratory). For example, it identified the bacteriorhodopsin regulon, assigned additional genes to this regulon with apparently unrelated function, and detected its known promoter motif. We have performed a thorough comparison of cMonkey results against other clustering methods, and find that cMonkey biclusters are more parsimonious with all available evidence for co-regulation

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Foundations of Software Science and Computation Structures

Publication venue: Springer Science and Business Media LLC

This open access book constitutes the proceedings of the 24th International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 28 regular papers presented in this volume were carefully reviewed and selected from 88 submissions. They deal with research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems

OAPEN Library

Sublinear-Time Cellular Automata and Connections to Complexity Theory

Author: Casagrande Viapiana Modanese Augusto
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 24/11/2022

The field of distributed computing studies models in which several computational units coordinate to achieve a common goal while having only limited resources at their disposal, be it time, space, or communication capacity. The main object of study of this dissertation is arguably the simplest such model of all: (one-dimensional) cellular automata. Our goal is to gain a better overview of the capabilities and limitations of the model and its variants in the case where the total processing time is significantly smaller than the size of the input (i.e., sublinear time). We carry out our analysis from the standpoint of complexity theory and, in doing so, also establish connections between cellular automata and other areas such as distributed computing and streaming algorithms. Sublinear-time cellular automata. A cellular automaton (CA) consists of identical cells arranged next to one another along a line. Each cell is essentially a very primitive computational unit (namely a deterministic finite automaton) that can interact with its two neighbors. The computation arises from updating the states of the cells according to the same state transition function, which is applied simultaneously everywhere in the automaton. The variants we consider include shrinking CAs, which are (in a sense) dynamically reconfigurable, as well as a probabilistic variant in which every cell is equipped with access to a fair coin. Despite considerable interest in linear- and real-time CAs, the sublinear-time case appears to have been largely neglected by the scientific community. We review the modest amount of existing prior work on the topic and further develop the techniques stemming from it so that their range of applications becomes substantially broader. Among other things, these efforts yield a time hierarchy theorem for the deterministic model. In addition, we transfer techniques for proving lower bounds from complexity theory to the model of shrinking CAs and develop new techniques tailored to probabilistic sublinear-time CAs. A connection to hardness magnification. One connection to complexity theory that we establish in the course of our investigations is a hardness magnification theorem for shrinking CAs. Here, hardness magnification refers to a line of recent work showing that even slightly non-trivial lower bounds can have very impressive consequences in complexity theory. Our theorem is a variation of a recent result by McKay, Murray, and Williams (STOC, 2019) for streaming algorithms. As we show, the statement can equally be formulated in terms of shrinking CAs, which also provably strengthens it. A connection to sliding-window algorithms. We link the distributed cellular automaton model with the sequential streaming algorithm model. As we show, (certain variants of) CAs can be simulated by streaming algorithms that are subject to certain locality restrictions. Concretely, the current state of the algorithm is completely determined by the contents of a fixed-size window containing the last few symbols processed by the algorithm. Accordingly, we call this restricted form of streaming algorithm a sliding-window algorithm. We show that sliding-window algorithms can simulate CAs very efficiently and, in particular, in such a way that their space complexity is closely tied to the time complexity of the simulated CA. Derandomization results. We show derandomization results for the model of sliding-window algorithms that draw randomness from a binary random source. To this end, we rely on the robust machinery of branching programs, which constitute the standard approach to derandomizing space-bounded machines in complexity theory. As an application, we obtain derandomization results for probabilistic sublinear-time CAs via the connection mentioned above. Prediction problem for fungal sandpiles. A final problem that we address, which also relates to sublinear time complexity in the context of cellular automata (although not to sublinear-time cellular automata themselves), is the prediction problem for sandpile cellular automata. These automata are defined on the basis of two-dimensional CAs and model a deterministic process in which particles (usually thought of as grains of sand) spread through space. The prediction problem asks whether, given a cell number y and an initial configuration for the sandpile, the cell with number y will reach a nonzero state at some point before a given time bound. Remarkably, the complexity of this prediction problem, which is at least two decades old, remains open for two-dimensional sandpiles. We essentially settle this question for a new variant of sandpiles called fungal sandpiles, which was proposed by Goles et al. (Phys. Lett. A, 2020). Our result is particularly relevant because it provides novel insights and new techniques that could be highly relevant for solving the open problem in the general case
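
As an illustration of the basic model in the abstract above (identical cells on a line, each updated simultaneously by the same local rule applied to itself and its two neighbors), here is a minimal Python sketch. It is not taken from the thesis: the names `step`, `run`, and `xor_rule`, the fixed boundary state, and the choice of an XOR rule are all assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence

State = int
# A local rule maps (left neighbor, cell, right neighbor) to the cell's new state.
Rule = Callable[[State, State, State], State]


def step(cells: Sequence[State], rule: Rule, boundary: State = 0) -> List[State]:
    """Apply the same local rule to every cell simultaneously.

    Assumption: cells beyond either end are treated as a fixed `boundary` state.
    """
    padded = [boundary, *cells, boundary]
    # Every new state is computed from the OLD configuration, so the update is synchronous.
    return [rule(padded[i - 1], padded[i], padded[i + 1])
            for i in range(1, len(padded) - 1)]


def run(cells: Sequence[State], rule: Rule, steps: int, boundary: State = 0) -> List[State]:
    """Iterate `step` a fixed number of times and return the final configuration."""
    config = list(cells)
    for _ in range(steps):
        config = step(config, rule, boundary)
    return config


def xor_rule(left: State, centre: State, right: State) -> State:
    """Illustrative two-state rule: the new state is the XOR of the two neighbors."""
    return left ^ right


if __name__ == "__main__":
    print(step([0, 0, 1, 0, 0], xor_rule))
    print(run([0, 0, 1, 0, 0], xor_rule, steps=2))
```

In this sketch, information travels at most one cell per step, so after t steps a cell can only depend on the 2t + 1 input cells centered on it; this is the locality that makes running times below the input length restrictive.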

KITopen