242 research outputs found

    A Survey of Symbolic Execution Techniques

    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate it if you could use the following BibTeX entry: http://goo.gl/Hf5Fvc
    Comment: This is the authors' pre-print copy.
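As a rough illustration of the idea described in this abstract (not code from the survey), the sketch below forks at every branch of a tiny hypothetical program, collects the path condition of each path, and asks a "solver" for concrete inputs that reach a given outcome. The program tree, the names `explore`, `solve`, and `find_inputs`, and the brute-force search standing in for a real constraint solver are all assumptions made for the example.

```python
from itertools import product

# Hypothetical program under analysis, encoded as a tree:
#   ("if", condition, then_branch, else_branch) or ("ret", label).
# Conditions are predicates over the symbolic inputs (x, y).
PROGRAM = (
    "if", lambda x, y: x > 10,
    ("if", lambda x, y: y == 2 * x,
     ("ret", "backdoor"),
     ("ret", "deny")),
    ("ret", "deny"),
)


def explore(node, path=()):
    """Fork at every branch and yield (path_condition, outcome) per path."""
    if node[0] == "ret":
        yield path, node[1]
        return
    _, cond, then_branch, else_branch = node
    yield from explore(then_branch, path + ((cond, True),))
    yield from explore(else_branch, path + ((cond, False),))


def solve(path, domain=range(-50, 51)):
    """Brute-force stand-in for a constraint solver over a small domain."""
    for x, y in product(domain, repeat=2):
        if all(cond(x, y) == expected for cond, expected in path):
            return (x, y)
    return None  # no witness found inside the searched domain


def find_inputs(program, outcome):
    """Return one concrete input per feasible path that ends in `outcome`."""
    witnesses = []
    for path, reached in explore(program):
        if reached != outcome:
            continue
        witness = solve(path)
        if witness is not None:
            witnesses.append(witness)
    return witnesses


backdoor_inputs = find_inputs(PROGRAM, "backdoor")
```

A real tool would replace the exhaustive search with an SMT solver and would have to cope with loops, memory, and path explosion, which this toy ignores.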

    Automated Software Transplantation

    Automated program repair has excited researchers for more than a decade, yet it has still to find full-scale deployment in industry. We report our experience with SAPFIX: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code. We have used SAPFIX at Facebook to repair 6 production systems, each consisting of tens of millions of lines of code, which are collectively used by hundreds of millions of people worldwide. In its first three months of operation, SAPFIX produced 55 repair candidates for 57 crashes reported to SAPFIX, of which 27 have been deemed correct by developers and 14 have been landed in production automatically by SAPFIX. SAPFIX has thus demonstrated the potential of the search-based repair research agenda by deploying, to hundreds of millions of users worldwide, software systems that have been automatically tested and repaired. Automated software transplantation (autotransplantation) is a form of automated software engineering in which we use search-based software engineering to automatically move a functionality of interest from a ‘donor’ program that implements it into a ‘host’ program that lacks it. Autotransplantation is a kind of automated program repair in which we repair the ‘host’ program by augmenting it with the missing functionality. Automated software transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system, potentially written in a different programming language. Being able to do so might greatly enhance software engineering practice while reducing costs. Automated software transplantation comes in two flavors: monolingual, when the languages of the host and donor programs are the same, and multilingual, when the languages differ.
This thesis introduces a theory of automated software transplantation, and two algorithms implemented in two tools that achieve this: µSCALPEL for monolingual software transplantation and τSCALPEL for multilingual software transplantation. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. We report encouraging results: in 14 of 17 monolingual transplantation experiments involving 6 donors and 4 hosts, all popular real-world systems, we successfully autotransplanted 6 new functionalities; and in 10 out of 10 multilingual transplantation experiments involving 10 donors and 10 hosts, all popular real-world systems written in 4 different programming languages, we successfully autotransplanted 10 new functionalities. That is, the transplants passed all the test suites that validate the behaviour of the new functionality and confirm that the initial program behaviour is preserved. Additionally, we have manually checked the behaviour exercised by the organ. Autotransplantation is also very useful: in just 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the VLC media player, a task that the developers of VLC have performed manually for the past 12 years. We autotransplanted call graph generation and indentation for C programs into Kate (a popular KDE-based text editor that many C developers use as an IDE), two features currently missing from Kate but requested by its users. Autotransplantation is also efficient: the total runtime across 15 monolingual transplants is five and a half hours; the total runtime across 10 multilingual transplants is 33 hours.

    Derailer: interactive security analysis for web applications

    Derailer is an interactive tool for finding security bugs in web applications. Using symbolic execution, it enumerates the ways in which application data might be exposed. The user is asked to examine these exposures and classify the conditions under which they occur as security-related or not; in so doing, the user effectively constructs a specification of the application's security policy. The tool then highlights exposures missing security checks, which tend to be security bugs. We have tested Derailer's scalability on several large open-source Ruby on Rails applications. We have also applied it to a large number of student projects (designed with different security policies in mind), exposing a variety of security bugs that eluded human reviewers.
    Funding: National Science Foundation (U.S.) (Grant 0707612)
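The workflow in this abstract (enumerate exposures, let the user label conditions, flag exposures with no security check) can be sketched roughly as follows. This is not Derailer's implementation: the `Exposure` record, the condition strings, and the rule "flag any exposure guarded by no security-related condition" are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    """One way a piece of application data can reach a page (hypothetical model)."""
    field: str             # the data being exposed
    page: str              # where it is exposed
    conditions: frozenset  # conditions under which the exposure occurs


def flag_unguarded(exposures, security_related):
    """Return exposures that are guarded by no condition labeled security-related."""
    return [e for e in exposures if not (e.conditions & security_related)]


exposures = [
    Exposure("note.body", "notes/show",
             frozenset({"note.owner == current_user"})),
    Exposure("note.body", "notes/search",
             frozenset({"query in note.body"})),
]

# The user's classification of conditions acts as the security policy.
security_related = frozenset({"note.owner == current_user"})

suspicious = flag_unguarded(exposures, security_related)
```

Here the search page exposes `note.body` without the ownership check that guards the same data elsewhere, so it is the one exposure reported for review.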

    A Survey on Automated Software Vulnerability Detection Using Machine Learning and Deep Learning

    Software vulnerability detection is critical in software security because it identifies potential bugs in software systems, enabling immediate remediation and mitigation measures to be implemented before they may be exploited. Automatic vulnerability identification is important because it can evaluate large codebases more efficiently than manual code auditing. Many Machine Learning (ML) and Deep Learning (DL) based models for detecting vulnerabilities in source code have been presented in recent years. However, a survey that summarises, classifies, and analyses the application of ML/DL models for vulnerability detection is missing. Without a comprehensive survey, it may be difficult to discover gaps in existing research and potential for future improvement. This could result in essential areas of research being overlooked or under-represented, leading to a skewed understanding of the state of the art in vulnerability detection. This work addresses that gap by presenting a systematic survey to characterize various features of ML/DL-based source code level software vulnerability detection approaches via five primary research questions (RQs). Specifically, our RQ1 examines the trend of publications that leverage ML/DL for vulnerability detection, including the evolution of research and the distribution of publication venues. RQ2 describes vulnerability datasets used by existing ML/DL-based models, including their sources, types, and representations, as well as analyses of the embedding techniques used by these approaches. RQ3 explores the model architectures and design assumptions of ML/DL-based vulnerability detection approaches. RQ4 summarises the type and frequency of vulnerabilities that are covered by existing studies. Lastly, RQ5 presents a list of current challenges to be researched and an outline of a potential research roadmap that highlights crucial opportunities for future work.

    Test generation for high coverage with abstraction refinement and coarsening (ARC)

    Testing is the main approach used in the software industry to expose failures. Producing thorough test suites is an expensive and error-prone task that can greatly benefit from automation. Two challenging problems in test automation are generating test input and evaluating the adequacy of test suites: the first amounts to producing a set of test cases that accurately represent the software behavior, the second requires defining appropriate metrics to evaluate the thoroughness of the testing activities. Structural testing addresses these problems by measuring the number of code elements that are executed by a test suite. The code elements that are not covered by any execution are natural candidates for generating further test cases, and the measured coverage rate can be used to estimate the thoroughness of the test suite. Several empirical studies show that test suites achieving high coverage rates exhibit a high failure detection ability. However, producing highly covering test suites automatically is hard, as certain code elements are executed only under complex conditions while others might not be reachable at all. In this thesis we propose Abstraction Refinement and Coarsening (ARC), a goal-oriented technique that combines static and dynamic software analysis to automatically generate test suites with high code coverage. At the core of our approach is an abstract program model that enables the synergistic application of the different analysis components. In ARC we integrate Dynamic Symbolic Execution (DSE) and abstraction refinement to precisely direct test generation towards the coverage goals and detect infeasible elements. ARC includes a novel coarsening algorithm for improved scalability. We implemented ARC-B, a prototype tool that analyses C programs and produces test suites that achieve high branch coverage.
Our experiments show that the approach effectively exploits the synergy between symbolic testing and reachability analysis, outperforming state-of-the-art test generation approaches. We evaluated ARC-B on industry-relevant software and exposed previously unknown failures in a safety-critical software component.
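To make the coverage-directed part of this abstract concrete, here is a minimal sketch of a dynamic symbolic execution loop that targets uncovered branches. It is not ARC or ARC-B: the program under test, the predicate names, and the brute-force `solve` (a stand-in for a constraint solver, with no abstraction refinement or coarsening) are all assumptions made for the example.

```python
def is_big(x):
    return x > 100


def is_magic(x):
    return x % 37 == 5


def program(x, trace):
    """Hypothetical program under test; logs (predicate, outcome) per branch."""
    def branch(pred):
        taken = pred(x)
        trace.append((pred, taken))
        return taken

    if branch(is_big):
        if branch(is_magic):
            return "rare"
        return "big"
    return "small"


def solve(path, domain=range(0, 1000)):
    """Brute-force stand-in for a constraint solver over a small domain."""
    for x in domain:
        if all(pred(x) == want for pred, want in path):
            return x
    return None  # treated as infeasible within the searched domain


def generate_tests(seed=0):
    """Run concretely, then flip uncovered branches to derive new inputs."""
    tests, covered, attempted = [], set(), set()
    worklist = [seed]
    while worklist:
        x = worklist.pop()
        trace = []
        program(x, trace)
        tests.append(x)
        covered.update((pred.__name__, taken) for pred, taken in trace)
        for i, (pred, taken) in enumerate(trace):
            target = (pred.__name__, not taken)
            if target in covered or target in attempted:
                continue
            attempted.add(target)
            # Keep the path prefix fixed and negate only the i-th branch.
            candidate = solve(trace[:i] + [(pred, not taken)])
            if candidate is not None:
                worklist.append(candidate)
    return tests, covered


tests, covered = generate_tests()
```

Starting from the seed input 0, the loop derives one further input per uncovered branch outcome until both outcomes of both branches have been exercised.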

    Advanced Testing Techniques for Test Case Generation (Técnicas de prueba avanzadas para la generación de casos de prueba)

    Software testing is a crucial phase in software development, particularly in contexts such as critical systems, where even minor errors can have severe consequences. The advent of Industry 4.0 brings new challenges, with software present in almost all industrial systems. Overcoming technical limitations, as well as limited development times and budgets, is a major challenge that software testing faces nowadays. Such limitations can result in insufficient attention being paid to it. The Bay of Cadiz’s industrial sector is known for its world-leading technological projects, with facilities and staff fully committed to innovation. The close relationship between these companies and the University of Cadiz allows for a constant exchange between industry and academia. This PhD thesis aims to identify the most important elements of software testing in Industry 4.0, based on close industrial experience and the latest state-of-the-art work. This allows us to break down the software testing process in a context where large teams work on large-scale, changing projects with numerous dependencies. It also allows us to estimate the percentage benefit that a solution could provide to test engineers throughout the process. Our results indicate a need for non-commercial, flexible, and adaptable solutions for the automation of software testing, capable of meeting the constantly changing needs of industry projects. This work provides a comprehensive study on the industry’s needs and motivates the development of two new solutions using state-of-the-art technologies, which are rarely present in industrial work. These results include a tool, ASkeleTon, which implements a procedure for generating test harnesses based on the Abstract Syntax Tree (AST) and a study examining the ability of the Dynamic Symbolic Execution (DSE) testing technique to generate test data capable of detecting potential faults in software. 
This study leads to the creation of a novel family of testing techniques, called mutation-inspired symbolic execution (MISE), which combines DSE with mutation testing (MT) to produce test data capable of detecting more potential faults than DSE alone. The findings of this work can serve as a reference for future research on software testing in Industry 4.0. The solutions developed in this PhD thesis are able to automate essential tasks in software testing, resulting in significant potential benefits. These benefits are not only for the industry: the creation of the new family of testing techniques also represents a promising line of research for the scientific community, benefiting all software projects regardless of their field of application.
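The motivation for combining DSE with mutation testing can be illustrated with a small sketch. It is not the MISE technique itself: the `original` and `mutant` functions, the chosen inputs, and the exhaustive search in `killing_input` (standing in for a solver query) are assumptions for the example. It shows a pair of inputs that covers every branch yet misses a boundary fault, and an input chosen specifically to tell the mutant apart from the original.

```python
def original(age):
    return "minor" if age < 18 else "adult"


def mutant(age):
    # Relational-operator mutant: `<` replaced by `<=`.
    return "minor" if age <= 18 else "adult"


def killing_input(orig, mut, domain):
    """Search for an input on which the mutant's output differs (it is 'killed')."""
    for value in domain:
        if orig(value) != mut(value):
            return value
    return None  # the mutant survives on the searched domain


# Inputs that cover both branches of `original` but sit far from the boundary.
coverage_only = [0, 50]
survives = all(original(a) == mutant(a) for a in coverage_only)

boundary_input = killing_input(original, mutant, range(0, 100))
```

The coverage-only inputs leave the mutant alive, while the boundary input distinguishes the two versions, which is the kind of test data a mutation-aware generator aims for.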

    Execution Synthesis: A Technique for Automating the Debugging of Software

    Debugging real systems is hard, requires deep knowledge of the target code, and is time-consuming. Bug reports rarely provide sufficient information for debugging, thus forcing developers to turn into detectives searching for an explanation of how the program could have arrived at the reported failure state. This thesis introduces execution synthesis, a technique for automating this detective work: given a program and a bug report, execution synthesis automatically produces an execution of the program that leads to the reported bug symptoms. Using a combination of static analysis and symbolic execution, the technique “synthesizes” a thread schedule and various required program inputs that cause the bug to manifest. The synthesized execution can be played back deterministically in a regular debugger, like gdb. This is particularly useful in debugging concurrency bugs, because it transforms otherwise non-deterministic bugs into bugs that can be deterministically observed in a debugger. Execution synthesis requires no runtime recording, and no program or hardware modifications, thus incurring no runtime overhead. This makes it practical for use in production systems. This thesis includes a theoretical analysis of execution synthesis as well as empirical evidence that execution synthesis is successful in starting from mere bug reports and reproducing on its own concurrency and memory safety bugs in real systems, taking on the order of minutes. This thesis also introduces reverse execution synthesis, an automated debugging technique that takes a coredump obtained after a failure and automatically computes the suffix of an execution that leads to that coredump. Reverse execution synthesis generates the necessary information to then play back this suffix in a debugger deterministically as many times as needed to complete the debugging process. 
Since it synthesizes an execution suffix instead of the entire execution, reverse execution synthesis is particularly well suited for arbitrarily long executions in which the failure and its root cause occur within a short time span, so developers can use a short execution suffix to debug the problem. The thesis also shows how execution synthesis can be combined with recording techniques in order to automatically classify data races and to efficiently debug deadlock bugs.

    Finding real bugs in big programs with incorrectness logic

    Incorrectness Logic (IL) has recently been advanced as a logical theory for compositionally proving the presence of bugs, dual to Hoare Logic, which is used to compositionally prove their absence. Though IL was motivated in large part by the aim of providing a logical foundation for bug-catching program analyses, it has remained an open question: is IL useful only retrospectively (to explain existing analyses), or can it actually be useful in developing new analyses which can catch real bugs in big programs? In this work, we develop Pulse-X, a new, automatic program analysis for catching memory errors, based on ISL, a recent synthesis of IL and separation logic. Using Pulse-X, we have found 15 new real bugs in OpenSSL, which we reported to the OpenSSL maintainers and which have since been fixed. In order not to be overwhelmed with potential but false error reports, we develop a compositional bug-reporting criterion based on a distinction between latent and manifest errors, which references the under-approximate ISL abstractions computed by Pulse-X, and we investigate the fix rate resulting from application of this criterion. Finally, to probe the potential practicality of our bug-finding method, we conduct a comparison to Infer, a widely used analyzer which has proven useful in industrial engineering practice.
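The duality mentioned in this abstract is usually stated as follows. The notation is an assumption for this note (it is not given in the abstract): post(C)p denotes the set of states reachable by running C from a state satisfying p, and ε is an exit condition such as ok or er.

```latex
% Hoare triple: the postcondition over-approximates the reachable states.
\{p\}\; C \;\{q\} \quad\Longleftrightarrow\quad \mathrm{post}(C)\,p \;\subseteq\; q

% Incorrectness triple: the postcondition under-approximates them, so every
% state in q is truly reachable and a reported error is not a false positive.
[p]\; C \;[\epsilon : q] \quad\Longleftrightarrow\quad \mathrm{post}(C, \epsilon)\,p \;\supseteq\; q
```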