
    Witness-based validation of verification results with applications to software-model checking

    In the scientific world, formal verification is an established engineering technique to ensure the correctness of hardware and software systems. Because formal verification is an arduous and error-prone endeavor, automated solutions are desirable, and researchers continue to develop new algorithms and optimize existing ones to push the boundaries of what can be verified automatically. These efforts do not go unnoticed by the industry. Hardware-circuit designs, flight-control systems, and operating-system drivers are just a few examples of systems where formal verification is already part of the quality-assurance repertoire. Nevertheless, the primary fields of application for formal verification are mainly those where errors carry a high risk of significant damage, either financial or physical, because the costs of formal verification are considered to be too high for most other projects, despite the fact that the research community has made vast advancements regarding the effectiveness and efficiency of formal verification techniques in the last decades. We present and address two potential reasons for this discrepancy that we identified in the field of automated formal software verification. (1) Even for experts in the field, it is often difficult to decide which of the multitude of available techniques is the most suitable solution they should recommend to solve a given verification problem. Moreover, even if a suitable solution is found for a given system, there is no guarantee that the solution is sustainable as the system evolves. Consequently, the cost of finding and maintaining a suitable approach for applying formal software verification to real-world systems is high. (2) Even assuming that a suitable and maintainable solution for applying formal software verification to a given system is found and verification results are obtained, developers of the system still require further guidance towards making practical use of these results, which often differ significantly from the results they obtain from classical quality-assurance techniques they are familiar with, such as testing. To mitigate the first issue, using the open-source software-verification framework CPAchecker, we investigate several popular formal software-verification techniques such as predicate abstraction, Impact, bounded model checking, k-induction, and PDR, and perform an extensive and rigorous experimental study to identify their strengths and weaknesses regarding their comparative effectiveness and efficiency when applied to a large and established benchmark set, to provide a basis for choosing the best technique for a given problem. To mitigate the second issue, we propose a concrete standard format for the representation and communication of verification results that raises the bar from plain "yes" or "no" answers to verification witnesses, which are valuable artifacts of the verification process that contain detailed information discovered during the analysis. We then use these verification witnesses for several applications: To increase the trust in verification results, we first develop several independent validators based on violation witnesses, i.e., verification witnesses that represent bugs detected by a verifier. We then extend our validators to also verify the verification results obtained from a successful verification, which are represented by correctness witnesses.
Lastly, we also develop an interactive web service to store and retrieve these verification witnesses, to provide online validation to quickly de-prioritize likely wrong results, and to graphically visualize the witnesses, as an example of how verification can be integrated into a development process. Since the introduction of our proposed standard format for verification witnesses, it has been adopted by over thirty different software verifiers, and our witness-based result-validation tools have become a core component in the scoring process of the International Competition on Software Verification.
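
    To make the idea of witness-based validation concrete, here is a minimal, hedged Python sketch: the "program" is a toy verification task, the violation witness is reduced to a set of input assumptions, and the validator simply replays the program under those assumptions to check whether the reported violation is reproduced. Real witnesses are exchanged in a GraphML-based format and validated by full program analyses; all names below are illustrative.

    def program(n):
        """Toy verification task: the assertion is violated for n == 3."""
        x = n * 2
        assert x != 6, "property violated"
        return x

    # A violation witness, reduced here to the assumptions that steer a
    # validator towards the reported bug.
    violation_witness = {"assumptions": {"n": 3}}

    def validate(witness):
        """Replay the program under the witness assumptions and confirm the
        violation independently of the verifier that reported it."""
        try:
            program(witness["assumptions"]["n"])
        except AssertionError:
            return True      # violation reproduced: result can be trusted more
        return False         # not reproduced: de-prioritize the result

    print(validate(violation_witness))  # True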

    Genetic Improvement of Software: a Comprehensive Survey

    Genetic improvement (GI) uses automated search to find improved versions of existing software. We present a comprehensive survey of this nascent field of research with a focus on the core papers in the area published between 1995 and 2015. We identified core publications including empirical studies, 96% of which use evolutionary algorithms (genetic programming in particular). Although we can trace the foundations of GI back to the origins of computer science itself, our analysis reveals a significant upsurge in activity since 2012. GI has resulted in dramatic performance improvements for a diverse set of properties such as execution time, energy and memory consumption, as well as results for fixing and extending existing system functionality. Moreover, we present examples of research work that lies on the boundary between GI and other areas, such as program transformation, approximate computing, and software repair, with the intention of encouraging further exchange of ideas between researchers in these fields.
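
    The search loop at the heart of genetic improvement can be illustrated with a deliberately tiny sketch (hedged: real GI systems mutate abstract syntax trees or source lines of full programs and typically use genetic programming rather than this simple hill climb; the statement list and tests below are invented for illustration).

    import random

    # "Program" = a list of statements; the middle one recomputes work redundantly.
    ORIGINAL = [
        "total = sum(xs)",
        "total = sum(sorted(xs))",   # redundant recomputation a mutation may remove
        "result = total * 2",
    ]

    def run(variant, xs):
        env = {"xs": xs}
        exec("\n".join(variant), env)          # execute the candidate program
        return env.get("result")

    def passes_tests(variant):
        try:
            return run(variant, [1, 2, 3]) == 12 and run(variant, []) == 0
        except Exception:
            return False

    def mutate(variant):
        v = list(variant)
        if random.random() < 0.5 and len(v) > 1:
            del v[random.randrange(len(v))]    # statement deletion
        else:
            i, j = random.sample(range(len(v)), 2)
            v[i], v[j] = v[j], v[i]            # statement swap
        return v

    random.seed(0)
    best = ORIGINAL
    for _ in range(200):                       # hill climb over candidate patches
        candidate = mutate(best)
        if passes_tests(candidate) and len(candidate) < len(best):
            best = candidate
    print(best)   # ideally one redundant statement shorter, same test behaviour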

    Solvers for Type Recovery and Decompilation of Binaries

    Reconstructing the meaning of a program from its binary is known as reverse engineering. Since reverse engineering is ultimately a search for meaning, there is growing interest in inferring a type (a meaning) for the elements of a binary in a consistent way. Currently there is no consensus on how best to achieve this, with the few existing approaches utilising ad-hoc techniques which lack any formal basis. Moreover, previous work does not answer (or even ask) the fundamental question of what it means for recovered types to be correct. This thesis demonstrates how solvers for Satisfiability Modulo Theories (SMT) and Constraint Handling Rules (CHR) can be leveraged to solve the type reconstruction problem. In particular, an approach based on a new SMT theory of rational tree constraints is developed and evaluated. The resulting solver, based on the reification mechanisms of Prolog, is shown to scale well, and leads to a reification-driven SMT framework that supports rapid implementation of SMT solvers for different theories in just a few hundred lines of code. The question of how to guarantee semantic relevance for reconstructed types is answered with a new and semantically founded approach that provides strong guarantees for the reconstructed types. Key to this approach is the derivation of a witness program in a type-safe high-level language alongside the reconstructed types. This witness has the same semantics as the binary, is type correct by construction, and it induces a (justifiable) type assignment on the binary. Moreover, the approach, implemented using CHR, yields a type-directed decompiler. Finally, to evaluate the flexibility of reification-based SMT solving, the SMT framework is instantiated with theories of general linear inequalities, integer difference problems, and octagons. The integer difference solver is shown to perform competitively with state-of-the-art SMT solvers. Two new algorithms for incremental closure of the octagonal domain are presented and proven correct. These are shown to be conceptually simple and to offer improved performance over existing algorithms. Although not directly related to reverse engineering, these results follow from the work on SMT solver construction.
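
    As background for the integer difference solver mentioned above, the classic satisfiability test for difference constraints views each constraint x - y <= c as a weighted graph edge and looks for a negative cycle. The sketch below shows that standard Bellman-Ford view (hedged: the thesis' reification-based SMT solver and the incremental octagon closure algorithms are considerably more involved than this baseline).

    def difference_sat(variables, constraints):
        """constraints: list of (x, y, c) meaning  x - y <= c."""
        dist = {v: 0 for v in variables}          # implicit source at distance 0

        def relax():
            changed = False
            for x, y, c in constraints:
                if dist[y] + c < dist[x]:         # edge y -> x with weight c
                    dist[x] = dist[y] + c
                    changed = True
            return changed

        for _ in range(len(variables)):           # |V| relaxation rounds suffice
            if not relax():
                return True                       # fixpoint reached: satisfiable
        return not relax()                        # still improving: negative cycle

    # Unsatisfiable: the three constraints sum to 0 <= -3, so this prints False.
    print(difference_sat({"x", "y", "z"},
                         [("x", "y", -1), ("y", "z", -1), ("z", "x", -1)]))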

    A seamless framework for formal reasoning on specifications : model derivation, verification and comparison

    While formal methods have been demonstrated to be favourable to the construction of reliable systems, they also present us with several limitations. Most of the efforts regarding formal reasoning are concerned with model correctness for critical systems, while other properties, including model validity, have seen little development, especially in the context of non-critical systems. We set out to advance model validation by relating a software model with the corresponding requirements it is intended to capture. This requires us to express both requirements and models in a common formal language, which in turn will enable not only model validation, but also model generation and comparison. We present a novel framework (TOMM) that integrates the formalization of class diagrams and requirements, along with a set of formal theories to validate, infer, and compare class models. We introduce SpeCNL, a controlled, domain-independent subset of English sentences, and a document structure named ConSpec. The combination of both allows us to express and formalize functional requirements related to class models. Our formal framework is accompanied by a proof-of-concept tool that integrates language- and image-processing libraries, as well as formal methods, to aid the usage and evaluation of our theories. In addition, we provide an implementation that performs partial extraction of relevant information from the graphical representations of class diagrams. Though different approaches to model validation exist, they assume the existence of formal specifications for the model to be checked. In contrast, our approach has been shown to deal with informal specifications and seamlessly validate, generate, and compare class models.
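
    As a purely conceptual illustration of what it means to validate a class model against a formalized requirement (hedged: TOMM formalizes both sides in formal theories and SpeCNL/ConSpec provide the controlled-English front end; the dictionary encoding and the requirement below are invented for this sketch):

    # A class model reduced to classes plus associations with multiplicities.
    class_model = {
        "classes": {"Customer", "Order"},
        "associations": [
            # (source class, target class, multiplicity at the target end)
            ("Order", "Customer", "1"),
        ],
    }

    def every_order_has_exactly_one_customer(model):
        """Formalization of the requirement 'Each Order is placed by exactly
        one Customer' as a predicate over the model."""
        return any(src == "Order" and dst == "Customer" and mult == "1"
                   for src, dst, mult in model["associations"])

    print(every_order_has_exactly_one_customer(class_model))  # True: model is valid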

    Towards Practical Predicate Analysis

    Software model checking is a successful technique for automated program verification. Several of the most widely used approaches for software model checking are based on solving first-order-logic formulas over predicates using SMT solvers, e.g., predicate abstraction, bounded model checking, k-induction, and lazy abstraction with interpolants. We define a configurable framework for predicate-based analyses that allows expressing each of these approaches. This unifying framework highlights the differences between the approaches, produces new insights, and facilitates research on further algorithms and their combinations, as witnessed by several research projects that have been conducted on top of this framework. In addition to this theoretical contribution, we provide a mature implementation of our framework in the software verifier that allows applying all of the mentioned approaches in practice. This implementation is used by other research groups, e.g., to find bugs in the Linux kernel, and has proven its competitiveness by winning gold medals in the International Competition on Software Verification. Tools and approaches for software model checking like our predicate analysis are typically evaluated using performance benchmarking on large sets of verification tasks. We have identified several pitfalls that can silently arise during benchmarking, and we have found that the benchmarking techniques and tools that are used by many researchers do not guarantee valid results in practice, but may produce arbitrarily large measurement errors. Furthermore, certain hardware characteristics can also have a nondeterministic influence on the measurements. In order to be able to properly evaluate our framework for software verification, we study the effects of these hardware characteristics and define a list of the most important requirements that need to be ensured for reliable benchmarking. As a solution, we present the open-source benchmarking framework BenchExec, which, in contrast to other benchmarking tools, fulfills all our requirements and aims at making reliable benchmarking easy. BenchExec has already been adopted by several research groups and the International Competition on Software Verification. Using the power of BenchExec, we conduct an experimental evaluation of our unifying framework for predicate analysis. We study the effect of varying the SMT solver and the way program semantics are encoded in formulas across several verification algorithms and find that these technical choices can significantly influence the results of experimental studies of verification approaches. This is valuable information both for researchers who study verification approaches and for users who apply them in practice. Our comprehensive study of 120 different configurations would not have been possible without our highly flexible and configurable unifying framework for predicate analysis and shows that the latter is a valuable base for conducting experiments. Furthermore, we show, using a comparison against top-ranking verifiers from the International Competition on Software Verification, that our implementation is highly competitive and can outperform the state of the art.
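
    To make one of the listed approaches concrete, the following is a minimal k-induction check (with k = 1) for a toy transition system, written against the z3-solver Python bindings (hedged: the framework described above works on real C programs; the transition system and property here are invented for illustration).

    from z3 import Int, Solver, Not, unsat

    def init(x):       return x == 0         # initial states
    def trans(x, xn):  return xn == x + 2    # transition relation
    def prop(x):       return x % 2 == 0     # safety property: x stays even

    x0, x1 = Int("x0"), Int("x1")

    base = Solver()                           # base case: no initial state violates P
    base.add(init(x0), Not(prop(x0)))

    step = Solver()                           # step case: P is preserved by one step
    step.add(prop(x0), trans(x0, x1), Not(prop(x1)))

    if base.check() == unsat and step.check() == unsat:
        print("property proven by 1-induction")
    else:
        print("not 1-inductive: increase k or strengthen the property")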

    Software test and evaluation study phase I and II : survey and analysis

    Issued as Final report, Project no. G-36-661 (continues G-36-636; includes A-2568).

    Structured parallelism discovery with hybrid static-dynamic analysis and evaluation technique

    Parallel computer architectures have dominated the computing landscape for the past two decades, a trend that is only expected to continue and intensify, with increasing specialization and heterogeneity. This creates huge pressure across the software stack to produce programming languages, libraries, frameworks and tools which will efficiently exploit the capabilities of parallel computers, not only for new software, but also for revitalizing existing sequential code. Automatic parallelization, despite decades of research, has had limited success in transforming sequential software to take advantage of efficient parallel execution. This thesis investigates three approaches that use commutativity analysis as the enabler for parallelization. This has the potential to overcome limitations of traditional techniques. We introduce the concept of liveness-based commutativity for sequential loops. We examine the use of a practical analysis utilizing liveness-based commutativity in a symbolic execution framework. Symbolic execution represents input values as groups of constraints, consequently deriving the output as a function of the input and enabling the identification of further program properties. We employ this feature to develop an analysis and discern commutativity properties between loop iterations. We study the application of this approach on loops taken from real-world programs in the OLDEN and NAS Parallel Benchmark (NPB) suites, and identify its limitations and related overheads. Informed by these findings, we develop Dynamic Commutativity Analysis (DCA), a new technique that leverages profiling information from program execution with specific input sets. Using profiling information, we track liveness information and detect loop commutativity by examining the code’s live-out values. We evaluate DCA against almost 1400 loops of the NPB suite, identifying 86% of them as parallelizable. Comparing our results against dependence-based methods, we match the detection efficacy of two dynamic approaches and outperform three static ones. Additionally, DCA is able to automatically detect parallelism in loops which iterate over Pointer-Linked Data Structures (PLDSs), taken from a wide range of benchmarks used in the literature, where all other techniques we considered failed. Parallelizing the discovered loops, our methodology achieves an average speedup of 3.6× across NPB (and up to 55×) and up to 36.9× for the PLDS-based loops on a 72-core host. We also demonstrate that our methodology, despite relying on specific input values for profiling each program, is able to correctly identify parallelism that is valid for all potential input sets. Lastly, we develop a methodology to utilize liveness-based commutativity, as implemented in DCA, to detect latent loop parallelism in the form of patterns. Our approach applies a series of transformations which subsequently enable multiple applications of DCA over the generated multi-loop code section and match its loop commutativity outcomes against the expected criteria for each pattern. Applying our methodology on sets of sequential loops, we are able to identify well-known parallel patterns (i.e., maps, reductions, and scans). This extends the scope of parallelism detection to loops, such as those performing scan operations, which cannot be determined as parallelizable by simply evaluating liveness-based commutativity conditions on their original form.
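
    The liveness-based commutativity idea can be illustrated with a small sketch (hedged: DCA instruments real program executions and tracks live-out memory locations; here the loop body, the state, and the live-outs are plain Python values): if executing the loop's iterations in different orders always yields the same live-out values for a given input, the iterations commute and the loop is a parallelization candidate.

    import copy, random

    def run_loop(state, order):
        """Run the loop body once per iteration, in the given order, and
        return the live-out values (the state observed after the loop)."""
        s = copy.deepcopy(state)
        for i in order:
            s["hist"][s["data"][i] % 4] += 1      # loop body: histogram update
        return s

    def commutes(state, n, trials=10):
        """Compare the sequential order against random permutations of iterations."""
        reference = run_loop(state, list(range(n)))
        for _ in range(trials):
            if run_loop(state, random.sample(range(n), n)) != reference:
                return False                      # some order changed the live-outs
        return True                               # evidence for this input, not a proof

    state = {"data": [7, 3, 5, 8, 2, 9], "hist": [0, 0, 0, 0]}
    print(commutes(state, len(state["data"])))    # True: iterations commute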

    JPL Quarterly Technical Review, Volume 2, No. 3

    Aerospace engineering, including materials, orbits and trajectories, power sources, satellite geodesy, and structural engineering.