8 research outputs found

    Perfect Is the Enemy of Good: Best-Effort Program Synthesis

    Get PDF
    Program synthesis promises to help software developers with everyday tasks by generating code snippets automatically from input-output examples and other high-level specifications. The conventional wisdom is that a synthesizer must always satisfy the specification exactly. We conjecture that this all-or-nothing paradigm stands in the way of adopting program synthesis as a developer tool: in practice, the user-written specification often contains errors or is simply too hard for the synthesizer to solve within a reasonable time; in these cases, the user is left with a single over-fitted result or, more often than not, no result at all. In this paper we propose a new program synthesis paradigm we call best-effort program synthesis, where the synthesizer returns a ranked list of partially-valid results, i.e. programs that satisfy some part of the specification. To support this paradigm, we develop best-effort enumeration, a new synthesis algorithm that extends a popular program enumeration technique with the ability to accumulate and return multiple partially-valid results with minimal overhead. We implement this algorithm in a tool called BESTER, and evaluate it on 79 synthesis benchmarks from the literature. Contrary to the conventional wisdom, our evaluation shows that BESTER returns useful results even when the specification is flawed or too hard: i) for all benchmarks with an error in the specification, the top three BESTER results contain the correct solution, and ii) for most hard benchmarks, the top three results contain non-trivial fragments of the correct solution. We also performed an exploratory user study, which confirms our intuition that partially-valid results are useful: the study shows that programmers use the output of the synthesizer for comprehension and often incorporate it into their solutions

    Saggitarius: A DSL for Specifying Grammatical Domains

    Full text link
    Common data types like dates, addresses, phone numbers and tables can have multiple textual representations, and many heavily-used languages, such as SQL, come in several dialects. These variations can cause data to be misinterpreted, leading to silent data corruption, failure of data processing systems, or even security vulnerabilities. Saggitarius is a new language and system designed to help programmers reason about the format of data, by describing grammatical domains -- that is, sets of context-free grammars that describe the many possible representations of a datatype. We describe the design of Saggitarius via example and provide a relational semantics. We show how Saggitarius may be used to analyze a data set: given example data, it uses an algorithm based on semi-ring parsing and MaxSAT to infer which grammar in a given domain best matches that data. We evaluate the effectiveness of the algorithm on a benchmark suite of 110 example problems, and we demonstrate that our system typically returns a satisfying grammar within a few seconds with only a small number of examples. We also delve deeper into a more extensive case study on using Saggitarius for CSV dialect detection. Despite being general-purpose, we find that Saggitarius offers comparable results to hand-tuned, specialized tools; in the case of CSV, it infers grammars for 84% of benchmarks within 60 seconds, and has comparable accuracy to custom-built dialect detection tools.Comment: OOPSLA 202

    Formal Foundations for Information-Preserving Model Synchronization Processes Based on Triple Graph Grammars

    Get PDF
    Zwischen verschiedenen Artefakten, die Informationen teilen, wieder Konsistenz herzustellen, nachdem eines von ihnen geĂ€ndert wurde, ist ein wichtiges Problem, das in verschiedenen Bereichen der Informatik auftaucht. Mit dieser Dissertation legen wir eine Lösung fĂŒr das grundlegende Modellsynchronisationsproblem vor. Bei diesem Problem ist ein Paar solcher Artefakte (Modelle) gegeben, von denen eines geĂ€ndert wurde; Aufgabe ist die Wiederherstellung der Konsistenz. Tripelgraphgrammatiken (TGGs) sind ein etablierter und geeigneter Formalismus, um dieses und verwandte Probleme anzugehen. Da sie auf der algebraischen Theorie der Graphtransformation und dem (Double-)Pushout Zugang zu Ersetzungssystemen basieren, sind sie besonders geeignet, um Lösungen zu entwickeln, deren Eigenschaften formal bewiesen werden können. Doch obwohl TGG-basierte AnsĂ€tze etabliert sind, leiden viele von ihnen unter dem Problem des Informationsverlustes. Wenn ein Modell geĂ€ndert wurde, können wĂ€hrend eines Synchronisationsprozesses Informationen verloren gehen, die nur im zweiten Modell vorliegen. Das liegt daran, dass solche Synchronisationsprozesse darauf zurĂŒckfallen Konsistenz dadurch wiederherzustellen, dass sie das geĂ€nderte Modell (bzw. große Teile von ihm) neu ĂŒbersetzen. Wir schlagen einen TGG-basierten Ansatz vor, der fortgeschrittene Features von TGGs unterstĂŒtzt (Attribute und negative Constraints), durchgĂ€ngig formalisiert ist, implementiert und inkrementell in dem Sinne ist, dass er den Informationsverlust im Vergleich mit vorherigen AnsĂ€tzen drastisch reduziert. Bisher gibt es keinen TGG-basierten Ansatz mit vergleichbaren Eigenschaften. Zentraler Beitrag dieser Dissertation ist es, diesen Ansatz formal auszuarbeiten und seine wesentlichen Eigenschaften, nĂ€mlich Korrektheit, VollstĂ€ndigkeit und Termination, zu beweisen. Die entscheidende neue Idee unseres Ansatzes ist es, Reparaturregeln anzuwenden. Dies sind spezielle Regeln, die es erlauben, Änderungen an einem Modell direkt zu propagieren anstatt auf NeuĂŒbersetzung zurĂŒckzugreifen. Um diese Reparaturregeln erstellen und anwenden zu können, entwickeln wir grundlegende BeitrĂ€ge zur Theorie der algebraischen Graphtransformation. ZunĂ€chst entwickeln wir eine neue Art der sequentiellen Komposition von Regeln. Im Gegensatz zur gewöhnlichen Komposition, die zu Regeln fĂŒhrt, die Elemente löschen und dann wieder neu erzeugen, können wir Regeln herleiten, die solche Elemente stattdessen bewahren. Technisch gesehen findet der Synchronisationsprozess, den wir entwickeln, außerdem in der Kategorie der partiellen Tripelgraphen statt und nicht in der der normalen Tripelgraphen. Daher mĂŒssen wir sicherstellen, dass die fĂŒr Double-Pushout-Ersetzungssysteme ausgearbeitete Theorie immer noch gĂŒltig ist. Dazu entwickeln wir eine (kategorientheoretische) Konstruktion neuer Kategorien aus gegebenen und zeigen, dass (i) diese Konstruktion die Axiome erhĂ€lt, die nötig sind, um die Theorie fĂŒr Double-Pushout-Ersetzungssysteme zu entwickeln, und (ii) partielle Tripelgraphen als eine solche Kategorie konstruiert werden können. Zusammen ermöglichen diese beiden grundsĂ€tzlichen BeitrĂ€ge es uns, unsere Lösung fĂŒr das grundlegende Modellsynchronisationsproblem vollstĂ€ndig formal auszuarbeiten und ihre zentralen Eigenschaften zu beweisen.Restoring consistency between different information-sharing artifacts after one of them has been changed is an important problem that arises in several areas of computer science. In this thesis, we provide a solution to the basic model synchronization problem. There, a pair of such artifacts (models), one of which has been changed, is given and consistency shall be restored. Triple graph grammars (TGGs) are an established and suitable formalism to address this and related problems. Being based on the algebraic theory of graph transformation and (double-)pushout rewriting, they are especially suited to develop solutions whose properties can be formally proven. Despite being established, many TGG-based solutions do not satisfactorily deal with the problem of information loss. When one model is changed, in the process of restoring consistency such solutions may lose information that is only present in the second model because the synchronization process resorts to restoring consistency by re-translating (large parts of) the updated model. We introduce a TGG-based approach that supports advanced features of TGGs (attributes and negative constraints), is comprehensively formalized, implemented, and is incremental in the sense that it drastically reduces the amount of information loss compared to former approaches. Up to now, a TGG-based approach with these characteristics is not available. The central contribution of this thesis is to formally develop that approach and to prove its essential properties, namely correctness, completeness, and termination. The crucial new idea in our approach is the use of repair rules, which are special rules that allow one to directly propagate changes from one model to the other instead of resorting to re-translation. To be able to construct and apply these repair rules, we contribute more fundamentally to the theory of algebraic graph transformation. First, we develop a new kind of sequential rule composition. Whereas the conventional composition of rules leads to rules that delete and re-create elements, we can compute rules that preserve such elements instead. Furthermore, technically the setting in which the synchronization process we develop takes place is the category of partial triple graphs and not the one of ordinary triple graphs. Hence, we have to ensure that the elaborate theory of double-pushout rewriting still applies. Therefore, we develop a (category-theoretic) construction of new categories from given ones and show that (i) this construction preserves the axioms that are necessary to develop the theory of double-pushout rewriting and (ii) partial triple graphs can be constructed as such a category. Together, those two more fundamental contributions enable us to develop our solution to the basic model synchronization problem in a fully formal manner and to prove its central properties

    Proceedings of the 22nd Conference on Formal Methods in Computer-Aided Design – FMCAD 2022

    Get PDF
    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing

    Proceedings of the 22nd Conference on Formal Methods in Computer-Aided Design – FMCAD 2022

    Get PDF
    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing

    Synthesizing symmetric lenses

    No full text
    corecore