    A foundation for synthesising programming language semantics

    Programming or scripting languages used in real-world systems are seldom designed with a formal semantics in mind from the outset. Therefore, the first step in developing well-founded analysis tools for these systems is to reverse-engineer a formal semantics, which can take months or years of effort. Could we automate this process, at least partially? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging, as found by Krishnamurthi, Lerner and Elberty. They propose automatically learning desugaring translation rules, mapping the language whose semantics we seek to a simplified, core version whose semantics are much easier to write. The present thesis contains an analysis of their challenge, as well as the first steps towards a solution. Scaling methods with the size of the language is very difficult due to state space explosion, so this thesis proposes an incremental approach to learning the translation rules. I present a formalisation that both clarifies the informal description of the challenge by Krishnamurthi et al. and re-formulates the problem, shifting the focus to the conditions for incremental learning. The central definition of the new formalisation is the desugaring extension problem, i.e. extending a set of established translation rules by synthesising new ones. In a synthesis algorithm, the choice of search space is important and non-trivial, as it needs to strike a good balance between expressiveness and efficiency. The rest of the thesis focuses on defining search spaces for translation rules via typing rules. Two prerequisites are required for comparing search spaces. The first is a series of benchmarks: a set of source and target languages equipped with intended translation rules between them. The second is an enumerative synthesis algorithm for efficiently enumerating typed programs. I show how algebraic enumeration techniques can be applied to enumerating well-typed translation rules, and discuss the properties expected from a type system to ensure that typed programs are efficiently enumerable. The thesis presents and empirically evaluates two search spaces. A baseline search space yields the first practical solution to the challenge. The second search space is based on a natural heuristic for translation rules: restricting variables so that each is used exactly once. I present a linear type system designed to efficiently enumerate translation rules in which this heuristic is enforced. Through informal analysis and empirical comparison to the baseline, I then show that using linear types can speed up the synthesis of translation rules by an order of magnitude.
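
    To make the learning target concrete, here is a minimal Haskell sketch of the kind of desugaring translation rules at stake: a toy surface language whose `let` and `&&` sugar is translated into a smaller core calculus. The languages and constructor names are illustrative assumptions, not the thesis's actual benchmarks; the synthesis problem is to discover rules like the `SLet` and `SAnd` cases automatically from examples.

```haskell
-- Toy surface language with syntactic sugar (SLet, SAnd).
data Surface
  = SVar String
  | SLam String Surface
  | SApp Surface Surface
  | SLet String Surface Surface   -- sugar
  | SIf  Surface Surface Surface
  | SAnd Surface Surface          -- sugar

-- Core language: no let, no conjunction.
data Core
  = CVar String
  | CLam String Core
  | CApp Core Core
  | CIf  Core Core Core
  | CBool Bool

-- Hand-written desugaring; the challenge is to synthesise rules
-- like the SLet and SAnd cases rather than write them by hand.
desugar :: Surface -> Core
desugar (SVar x)     = CVar x
desugar (SLam x b)   = CLam x (desugar b)
desugar (SApp f a)   = CApp (desugar f) (desugar a)
-- let x = e in b  ~~>  (\x -> b) e
desugar (SLet x e b) = CApp (CLam x (desugar b)) (desugar e)
desugar (SIf c t e)  = CIf (desugar c) (desugar t) (desugar e)
-- a && b  ~~>  if a then b else false
desugar (SAnd a b)   = CIf (desugar a) (desugar b) (CBool False)
```

    Note that in every rule each metavariable on the left (x, e, b, ...) occurs exactly once on the right-hand side; this is the linear-usage heuristic that the abstract's linear type system enforces on candidate rules.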

    Cyclic proof systems for modal fixpoint logics

    This thesis is about cyclic and ill-founded proof systems for modal fixpoint logics, with and without explicit fixpoint quantifiers. Cyclic and ill-founded proof theory allows proofs with infinite branches or paths, as long as they satisfy some correctness conditions ensuring the validity of the conclusion. In this dissertation we design a few cyclic and ill-founded systems: a cyclic one for the weak Grzegorczyk modal logic K4Grz, based on our explanation of the phenomenon of cyclic companionship; and ill-founded and cyclic ones for the full computation tree logic CTL* and the intuitionistic linear-time temporal logic iLTL. All systems are cut-free, and the cyclic ones for K4Grz and iLTL have fully finitary correctness conditions. Lastly, we use a cyclic system for the modal mu-calculus to obtain a proof of the uniform interpolation property for the logic which differs from the original, automata-based one.
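
    For readers unfamiliar with explicit fixpoint quantifiers, the two standard modal mu-calculus formulas below (textbook material, not specific to this thesis) show the least and greatest fixpoint quantifiers whose unfoldings the correctness conditions of cyclic proofs must control.

```latex
% mu = least fixpoint (finite unfolding), nu = greatest fixpoint
% (infinite unfolding). In a cyclic proof, an infinite branch that
% unfolds a nu-formula can be accepted, while mu-unfoldings must be
% well-founded -- this is the role of the correctness condition.
\[
  \mu X.\,(p \lor \Diamond X) \;\; \text{(some path eventually reaches } p\text{)}
  \qquad
  \nu X.\,(p \land \Box X) \;\; \text{(} p \text{ holds globally)}
\]
```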

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Canonical Algebraic Generators in Automata Learning

    Many methods for the verification of complex computer systems require the existence of a tractable mathematical abstraction of the system, often in the form of an automaton. In reality, however, such a model is hard to come up with, particularly by hand. Automata learning is a technique that can automatically infer an automaton model from a system by observing its behaviour. The majority of automata learning algorithms are based on the so-called L* algorithm. The acceptor learned by L* has an important property: it is canonical, in the sense that it is, up to isomorphism, the unique deterministic finite automaton of minimal size accepting a given regular language. Establishing a similar result for other classes of acceptors, often with side-effects, is of great practical importance. Non-deterministic finite automata, for instance, can be exponentially more succinct than deterministic ones, allowing verification to scale. Unfortunately, identifying a canonical size-minimal non-deterministic acceptor of a given regular language is in general not possible: a regular language can be accepted by two non-isomorphic non-deterministic finite automata of minimal size. In particular, it is thus unclear which of the automata should be targeted by a learning algorithm. In this thesis, we further explore the issue and identify (sub-)classes of acceptors that admit canonical size-minimal representatives. In more detail, the contributions of this thesis are threefold. First, we expand the automata (learning) theory of Guarded Kleene Algebra with Tests (GKAT), an efficiently decidable logic expressive enough to model simple imperative programs. In particular, we present GL*, an algorithm that learns the unique size-minimal GKAT automaton for a given deterministic language, and prove that GL* is more efficient than an existing variation of L*. We implement both algorithms in OCaml and compare them on example programs. Second, we present a category-theoretical framework based on generators, bialgebras, and distributive laws, which identifies, for a wide class of automata with side-effects in a monad, canonical target models for automata learning. Apart from recovering examples from the literature, we discover a new canonical acceptor of regular languages and present a unifying minimality result. Finally, we show that the construction underlying our framework is an instance of a more general theory. First, we see that deriving a minimal bialgebra from a minimal coalgebra can be realised by applying a monad on a category of subobjects with respect to an epi-mono factorisation system. Second, we explore the abstract theory of generators and bases for algebras over a monad: we discuss bases for bialgebras, the product of bases, generalise the representation theory of linear maps, and compare our ideas to a coalgebra-based approach.
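
    As a point of reference for the canonicity claim, here is a minimal Haskell sketch (illustrative, not from the thesis) of a DFA. By Myhill-Nerode, the states of the minimal DFA correspond exactly to the residual languages of the accepted language, so any two size-minimal DFAs for the same language are isomorphic; this is the property that fails for non-deterministic acceptors.

```haskell
-- Deterministic finite automaton over state type q.
data DFA q = DFA
  { start  :: q
  , delta  :: q -> Char -> q
  , accept :: q -> Bool
  }

-- Membership: run the word from the start state (this is what L*'s
-- membership queries observe about the black-box system).
accepts :: DFA q -> String -> Bool
accepts m = accept m . foldl (delta m) (start m)

-- Minimal DFA for "even number of 'a's": its two states are exactly
-- the Myhill-Nerode classes, hence it is unique up to isomorphism.
evenAs :: DFA Bool
evenAs = DFA { start  = True
             , delta  = \q c -> if c == 'a' then not q else q
             , accept = id }
```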

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Software doping – Theory and detection

    Software is doped if it contains a hidden functionality that is intentionally included by the manufacturer and is not in the interest of the user or society. This thesis complements this informal definition with a set of formal cleanness definitions that characterise the absence of software doping. These definitions reflect common expectations of clean software behaviour and are applicable to many types of software, from printers to cars to discriminatory AI systems. We use these definitions to propose white-box and black-box analysis techniques to detect software doping. In particular, we present a provably correct, model-based testing algorithm that is intertwined with a probabilistic-falsification-based test input selection technique. We identify the challenges that are specific to real-world software doping tests and analyses, and explain how to overcome them. The most prominent example of software doping in recent years is the Diesel Emissions Scandal. We demonstrate the strength of our cleanness definitions and analysis techniques by applying them to the emission cleaning systems of diesel cars. All our car-related research is unified in a Car Data Platform. The mobile app LolaDrives is one building block of this platform; it supports conducting real-driving emissions tests and gives the user feedback on how far a trip satisfies the driving conditions defined by official regulations (the EU Real Driving Emissions norm).
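
    The flavour of the black-box analyses can be conveyed by a small sketch. The Haskell fragment below is a deliberately simplified, hypothetical distance-based cleanness check (the thesis's formal definitions are richer and range over sequences of inputs and outputs): outputs for inputs close to a standard test input must stay close to the standard behaviour.

```haskell
type Input  = Double  -- e.g. a parameter of a driving profile
type Output = Double  -- e.g. a measured emissions value

-- Hypothetical thresholds: kappaI bounds the input distance within
-- which behaviour must be comparable, kappaO the tolerated output
-- deviation; `sys` is the black-box system under test.
cleanOn :: Double -> Double -> (Input -> Output) -> Input -> Input -> Bool
cleanOn kappaI kappaO sys std probe =
  abs (std - probe) > kappaI                -- probe too far to compare
    || abs (sys std - sys probe) <= kappaO  -- or behaviour stays close

-- A probabilistic-falsification loop would search for probe inputs
-- maximising |sys std - sys probe|, reporting a doping suspicion
-- whenever cleanOn returns False.
```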

    Rely-guarantee Reasoning about Concurrent Reactive Systems: The PiCore Framework, Languages Integration and Applications

    The rely-guarantee approach is a promising way to compositionally verify concurrent reactive systems (CRSs), e.g. concurrent operating systems, interrupt-driven control systems and business process systems. However, specifications using heterogeneous reaction patterns, different abstraction levels, and the complexity of real-world CRSs still challenge the rely-guarantee approach. This article proposes PiCore, a rely-guarantee reasoning framework for the formal specification and verification of CRSs. We design an event specification language supporting complex reaction structures, together with its rely-guarantee proof system, to detach the specification and logic of the reactive aspects of CRSs from event behaviours. PiCore parametrises the language and its rely-guarantee system for event behaviour using a rely-guarantee interface, and allows third-party languages to be integrated easily via rely-guarantee adapters. By this design, we have successfully integrated two existing languages and their rely-guarantee proof systems without any changes to their specifications and proofs. PiCore has been applied to two real-world case studies: the formal verification of concurrent memory management in Zephyr RTOS, and a verified translation from the standardised Business Process Execution Language (BPEL) to PiCore.
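
    For orientation, the judgement below shows rely-guarantee reasoning in its textbook form (the general recipe, not PiCore's concrete syntax): a component C is verified against a precondition, a rely R describing tolerated environment steps, a guarantee G bounding its own steps, and a postcondition, which is what makes parallel composition compositional.

```latex
% Textbook rely-guarantee judgement and parallel rule (not PiCore syntax):
% each component must tolerate the other's guarantee as extra rely.
\[
  \vdash C \;\mathsf{sat}\; (\mathit{pre},\, R,\, G,\, \mathit{post})
\]
\[
  \frac{\vdash C_1 \,\mathsf{sat}\, (p,\, R \cup G_2,\, G_1,\, q_1)
        \qquad
        \vdash C_2 \,\mathsf{sat}\, (p,\, R \cup G_1,\, G_2,\, q_2)}
       {\vdash C_1 \parallel C_2 \,\mathsf{sat}\, (p,\, R,\, G_1 \cup G_2,\, q_1 \wedge q_2)}
\]
```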

    Jornadas Nacionales de Investigación en Ciberseguridad: proceedings of the VIII Jornadas Nacionales de Investigación en Ciberseguridad, Vigo, 21-23 June 2023

    Jornadas Nacionales de Investigación en Ciberseguridad (8ª. 2023. Vigo)
    atlanTTic
    AMTEGA: Axencia para a modernización tecnolóxica de Galicia
    INCIBE: Instituto Nacional de Ciberseguridad

    Naval Postgraduate School Academic Catalog - February 2023
