    Thoughts about using Constraint Solvers in Action

    SMT solvers power many automated security analysis tools today. Nevertheless, smoothly integrating SMT solvers into programs remains a challenge that has led to several competing approaches. In this paper, we review the state of the art for interacting with constraint solvers. Based on the different ideas found in the literature, we deduce requirements for a constraint solving service that simplifies the integration challenge. We identify that evaluating some of the ideas behind these requirements empirically requires large-scale experiments. We show that the platform is capable of running such an experiment for the case of measuring the impact of seeds on solver runtime.

    Accelerating array constraints in symbolic execution

    Despite significant recent advances, the effectiveness of symbolic execution is limited when used to test complex, real-world software. One of the main scalability challenges is related to constraint solving: large applications and long exploration paths lead to complex constraints, often involving big arrays indexed by symbolic expressions. In this paper, we propose a set of semantics-preserving transformations for array operations that take advantage of contextual information collected during symbolic execution. Our transformations lead to simpler encodings and hence better performance in constraint solving. The results we obtain are encouraging: we show, through an extensive experimental analysis, that our transformations help to significantly improve the performance of symbolic execution in the presence of arrays. We also show that our transformations enable the analysis of new code, which would be otherwise out of reach for symbolic execution.
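
To give a feel for what a semantics-preserving rewrite of an array read can look like, the sketch below replaces the constraint `table[i] == v`, where `table` is a constant array and `i` is symbolic, with a disjunction of index ranges. This specific rewriting, and the helper names `index_ranges` and `holds`, are assumptions chosen for illustration; the abstract does not spell out the paper's actual transformations.

```python
from typing import List, Sequence, Tuple


def index_ranges(array: Sequence[int], value: int) -> List[Tuple[int, int]]:
    """Return inclusive index ranges (lo, hi) on which array[i] == value."""
    ranges: List[Tuple[int, int]] = []
    start = None  # first index of the run currently being scanned, if any
    for i, element in enumerate(array):
        if element == value:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i - 1))
            start = None
    if start is not None:
        ranges.append((start, len(array) - 1))
    return ranges


def holds(ranges: List[Tuple[int, int]], i: int) -> bool:
    """Evaluate the rewritten constraint: i lies in one of the ranges."""
    return any(lo <= i <= hi for lo, hi in ranges)


# Hypothetical constant lookup table read at a symbolic index i.
table = [0, 0, 0, 1, 1, 2, 1]
ranges = index_ranges(table, 1)

# The rewrite is semantics-preserving if both forms agree on every valid index.
equivalent = all(holds(ranges, i) == (table[i] == 1) for i in range(len(table)))
```

The rewritten constraint mentions only integer comparisons on `i`, so a solver would no longer need to reason about the array itself for this read.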

    Enhancing dynamic symbolic execution via loop summarisation, segmented memory and pending constraints

    Software has become ubiquitous and its impact is still increasing. The more software is created, the more bugs get introduced into it. With software’s increasing omnipresence, these bugs have a high probability of negative impact on everyday life. There are many efforts aimed at improving software correctness, among them symbolic execution, a program analysis technique that aims to systematically explore all program paths. In this thesis we present three techniques for enhancing symbolic execution. We first present a counterexample-guided inductive synthesis approach to summarise a class of loops, called memoryless loops, using standard library functions. Our approach can summarise two thirds of the memoryless loops we gathered from a set of open-source programs. These loop summaries can be used to: 1) enhance symbolic execution, 2) optimise native code and 3) refactor code. We then propose a technique that avoids expensive forking by using a segmented memory model. In this model, we split memory into segments using pointer alias analysis, so that each symbolic pointer refers to objects in a single segment. This results in a memory model where forking due to symbolic pointer dereferences is reduced. We evaluate our segmented memory model on benchmarks such as SQLite, m4 and make and observe significant decreases in execution time and memory usage. Finally, we present pending constraints, which can enhance the scalability of symbolic execution by aggressively prioritising execution paths that are already known to be feasible either via cached solver solutions or seeds. The execution of other paths is deferred until no paths are known to be feasible without using the constraint solver. We evaluate our technique on nine applications, including SQLite3, make and tcpdump, and show it can achieve higher coverage for both seeded and non-seeded exploration.
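
The scheduling idea behind pending constraints can be sketched in a few lines: a path is explored immediately if a seed or cached solution already satisfies its path condition, and is otherwise set aside. The code below is a simplified illustration under stated assumptions; the representation of constraints as Python predicates and the names `known_witness` and `schedule` are hypothetical and are not taken from the thesis.

```python
from typing import Callable, Dict, List, Optional, Tuple

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def known_witness(path: List[Constraint],
                  known: List[Assignment]) -> Optional[Assignment]:
    """Return a seed or cached solution satisfying every constraint, if any."""
    for assignment in known:
        if all(constraint(assignment) for constraint in path):
            return assignment
    return None


def schedule(paths: Dict[str, List[Constraint]],
             known: List[Assignment]) -> Tuple[List[str], List[str]]:
    """Split paths into those known feasible and those deferred as pending."""
    ready: List[str] = []
    pending: List[str] = []
    for name, constraints in paths.items():
        if known_witness(constraints, known) is not None:
            ready.append(name)    # feasible without calling the solver
        else:
            pending.append(name)  # would need the solver, so it is deferred
    return ready, pending


# One seed input and three hypothetical path conditions over x and y.
seeds = [{"x": 5, "y": 0}]
paths = {
    "then-branch": [lambda a: a["x"] > 3],
    "else-branch": [lambda a: a["x"] <= 3],
    "deep-branch": [lambda a: a["x"] > 3, lambda a: a["y"] == 0],
}
ready, pending = schedule(paths, seeds)
```

In this toy setting only `else-branch` ends up pending; a real engine would hand such paths to the constraint solver once no known-feasible path remains.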

    Técnicas de prueba avanzadas para la generación de casos de prueba

    Software testing is a crucial phase in software development, particularly in contexts such as critical systems, where even minor errors can have severe consequences. The advent of Industry 4.0 brings new challenges, with software present in almost all industrial systems. Overcoming technical limitations, as well as limited development times and budgets, is a major challenge that software testing faces nowadays. Such limitations can result in insufficient attention being paid to it. The Bay of Cadiz’s industrial sector is known for its world-leading technological projects, with facilities and staff fully committed to innovation. The close relationship between these companies and the University of Cadiz allows for a constant exchange between industry and academia. This PhD thesis aims to identify the most important elements of software testing in Industry 4.0, based on close industrial experience and the latest state-of-the-art work. This allows us to break down the software testing process in a context where large teams work on large-scale, changing projects with numerous dependencies. It also allows us to estimate the percentage benefit that a solution could provide to test engineers throughout the process. Our results indicate a need for non-commercial, flexible, and adaptable solutions for the automation of software testing, capable of meeting the constantly changing needs of industry projects. This work provides a comprehensive study on the industry’s needs and motivates the development of two new solutions using state-of-the-art technologies, which are rarely present in industrial work. These results include a tool, ASkeleTon, which implements a procedure for generating test harnesses based on the Abstract Syntax Tree (AST) and a study examining the ability of the Dynamic Symbolic Execution (DSE) testing technique to generate test data capable of detecting potential faults in software. 
This study leads to the creation of a novel family of testing techniques, called mutation-inspired symbolic execution (MISE), which combines DSE with mutation testing (MT) to produce test data capable of detecting more potential faults than DSE alone. The findings of this work can serve as a reference for future research on software testing in Industry 4.0. The solutions developed in this PhD thesis are able to automate essential tasks in software testing, resulting in significant potential benefits. These benefits are not limited to industry: the new family of testing techniques also represents a promising line of research for the scientific community, benefiting all software projects regardless of their field of application.

    Automated Approaches for Program Verification and Repair

    Formal methods techniques, such as verification, analysis, and synthesis, allow programmers to prove properties of their programs, or automatically derive programs from specifications. Making such techniques usable requires care: they must provide useful debugging information, be scalable, and enable automation. This dissertation presents automated analysis and synthesis techniques to ease the debugging of modular verification systems and allow easy access to constraint solvers from functional code. Further, it introduces machine-learning-based techniques to improve the scalability of off-the-shelf syntax-guided synthesis solvers and techniques to reduce the burden of network administrators writing and analyzing firewalls. We describe the design and implementation of a symbolic execution engine, G2, for non-strict functional languages such as Haskell. We extend G2 to both debug and automate the process of modular verification, and give Haskell programmers easy access to constraint solvers via a library named G2Q. Modular verifiers, such as LiquidHaskell, Dafny, and ESC/Java, allow programmers to write and prove specifications of their code. When a modular verifier fails to verify a program, it is not necessarily because of an actual bug in the program. This is because when verifying a function f, modular verifiers consider only the specification of a called function g, not the actual definition of g. Thus, a modular verifier may fail to prove a true specification of f if the specification of g is too weak. We present a technique, counterfactual symbolic execution, to aid in the debugging of modular verification failures. The approach uses symbolic execution to find concrete counterexamples, in the case of an actual inconsistency between a program and a specification; and abstract counterexamples, in the case that a function specification is too weak. 
Further, a technique based on a counterexample-guided inductive synthesis (CEGIS) loop is introduced to fully automate the process of modular verification, by using found counterexamples to automatically infer needed function specifications. The counterfactual symbolic execution and automated specification inference techniques are implemented in G2, and evaluated on existing LiquidHaskell errors and programs. We also leveraged G2 to build a library, G2Q, which allows writing constraint solving problems directly as Haskell code. Users of G2Q can embed specially marked Haskell constraints (Boolean expressions) into their normal Haskell code, while marking some of the variables in the constraint as symbolic. Then, at runtime, G2Q automatically derives values for the symbolic variables that satisfy the constraint, and returns those values to the outside code. Unlike other constraint solving solutions, such as directly calling an SMT solver, G2Q uses symbolic execution to unroll recursive function definitions, and guarantees that the use of G2Q constraints will preserve type correctness. We further consider the problem of synthesizing functions via a class of tools known as syntax-guided synthesis (SyGuS) solvers. We introduce a machine-learning-based technique to preprocess SyGuS problems, and reduce the space that the solver must search for a solution in. We demonstrate that the technique speeds up an existing SyGuS solver, CVC4, on a set of SyGuS solver benchmarks. Finally, we describe techniques to ease analysis and repair of firewalls. Firewalls are widely deployed to manage network security. However, firewall systems provide only a primitive interface, in which the specification is given as an ordered list of rules. This makes it hard to manually track and maintain the behavior of a firewall. 
We introduce a formal semantics for iptables firewall rules via a translation to first-order logic with uninterpreted functions and linear integer arithmetic, which allows encoding of firewalls into a decidable logic. We then describe techniques to automate the analysis and repair of firewalls using SMT solvers, based on user-provided specifications of the desired behavior. We evaluate this approach with real-world case studies collected from StackOverflow users.
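
The difficulty of reasoning about an ordered rule list is easy to reproduce in a toy model. The sketch below assumes first-match semantics and matches on a destination port only; the `Rule` class and `decide` function are hypothetical simplifications for illustration and are neither the dissertation's formal semantics nor a faithful model of iptables.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    port: Optional[int]  # None matches any destination port
    action: str          # "ACCEPT" or "DROP"


def decide(rules: List[Rule], port: int, default: str = "DROP") -> str:
    """Return the action of the first rule that matches the port."""
    for rule in rules:
        if rule.port is None or rule.port == port:
            return rule.action
    return default  # chain policy when no rule matches


# The same two rules in opposite orders behave differently for port 22.
chain = [Rule(22, "ACCEPT"), Rule(None, "DROP")]
reordered = list(reversed(chain))

ssh_original = decide(chain, 22)
ssh_reordered = decide(reordered, 22)
```

Because the meaning of each rule depends on every rule before it, a question such as "is port 22 ever accepted?" is a natural candidate for an automated solver rather than manual inspection.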

    Fundamental Approaches to Software Engineering

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2020, which took place in Dublin, Ireland, in April 2020, and was held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The total of 60 regular papers presented in these volumes was carefully reviewed and selected from 155 submissions. The papers are organized in topical sections as follows: Part I: Program verification; SAT and SMT; Timed and Dynamical Systems; Verifying Concurrent Systems; Probabilistic Systems; Model Checking and Reachability; and Timed and Probabilistic Systems. Part II: Bisimulation; Verification and Efficiency; Logic and Proof; Tools and Case Studies; Games and Automata; and SV-COMP 2020.

    Multi-solver Support in Symbolic Execution

    One of the main challenges of dynamic symbolic execution, an automated program analysis technique which has been successfully employed to test a variety of software, is constraint solving. A key decision in the design of a symbolic execution tool is the choice of a constraint solver. While different solvers have different strengths, for most queries, it is not possible to tell in advance which solver will perform better. In this paper, we argue that symbolic execution tools can, and should, make use of multiple constraint solvers. These solvers can be run competitively in parallel, with the symbolic execution engine using the result from the best-performing solver. We present empirical data obtained by running the symbolic execution engine KLEE on a set of real programs, and use it to highlight several important characteristics of the constraint solving queries generated during symbolic execution. In particular, we show the importance of constraint caching and counterexample values on the (relative) performance of KLEE configured to use different SMT solvers. We have implemented multi-solver support in KLEE, using the metaSMT framework, and explored how different state-of-the-art solvers compare on a large set of constraint-solving queries. We also report on our ongoing experience building a parallel portfolio solver in KLEE.
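
Running solvers competitively in parallel can be sketched as a small portfolio wrapper that submits the same query to every solver and returns whichever answer arrives first. The code below is a minimal illustration, not KLEE's or metaSMT's implementation: the solvers are hypothetical stand-ins built by `make_solver` that sleep and then return a fixed verdict, and the names `portfolio` and `make_solver` are assumptions.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Dict, Tuple

Solver = Callable[[str], str]


def make_solver(delay: float, answer: str) -> Solver:
    """Build a stand-in solver that 'thinks' for delay seconds."""
    def solve(query: str) -> str:
        time.sleep(delay)  # placeholder for real solving work
        return answer
    return solve


def portfolio(solvers: Dict[str, Solver], query: str) -> Tuple[str, str]:
    """Run all solvers on the query and return (winner name, its answer)."""
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = {pool.submit(fn, query): name for name, fn in solvers.items()}
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        first = next(iter(done))
        # Note: leaving the with-block still waits for the slower threads;
        # a real portfolio would terminate the losing solver processes.
        return futures[first], first.result()


solvers = {
    "slow-solver": make_solver(0.3, "sat"),
    "fast-solver": make_solver(0.01, "sat"),
}
winner, verdict = portfolio(solvers, "(assert (> x 0))")
print(winner, verdict)
```

Threads keep the sketch short; since external solvers usually run as separate processes, an actual portfolio would more likely manage subprocesses so that losing solvers can be stopped as soon as a winner reports.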