
    A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes

    Memory corruption is a serious class of software vulnerabilities that must be detected and removed from applications before it is exploited and harms system users. Symbolic execution is a well-known method for analyzing programs and detecting various vulnerabilities, including memory corruption. Although the method is sound and complete in theory, it faces challenges such as path explosion when applied to real-world, complex programs. In this paper, we present a method for improving the efficiency of symbolic execution and detecting four classes of memory corruption vulnerabilities in executable code: heap-based buffer overflow, stack-based buffer overflow, use-after-free, and double-free. To avoid path explosion, we perform symbolic execution only on test units rather than on the whole program. In our method, test units are parts of the program's code that might contain vulnerable statements; they are identified statically based on the specifications of memory corruption vulnerabilities. Each test unit is then symbolically executed to compute the path and vulnerability constraints of each statement in the unit, which determine the conditions on the unit's input data for executing that statement or activating vulnerabilities in it, respectively. Solving these constraints yields input values for the test unit that execute the desired statements and reveal vulnerabilities in them. Finally, we use machine learning to approximate the correlation between system and unit input data, which lets us generate system inputs that enter the program, reach vulnerable instructions in the desired test unit, and reveal their vulnerabilities. The method is implemented as a plugin for the angr framework and evaluated on a group of benchmark programs. The experiments show its superiority over similar tools in both accuracy and performance.
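    As a rough illustration of the unit-level symbolic execution described above, the following sketch uses angr to execute a single unit symbolically and solve the accumulated path constraints for an input that reaches a flagged instruction. The binary name, addresses, and buffer size are hypothetical, and the sketch omits the paper's vulnerability constraints and machine-learning step.

```python
# Hypothetical sketch: symbolically execute one test unit with angr, rather
# than the whole binary, to find an input reaching a flagged instruction.
# The binary path, addresses, and buffer size below are made up.
import angr
import claripy

proj = angr.Project("./target_binary", auto_load_libs=False)

# Symbolic input for the unit (e.g., a buffer argument of its entry function).
unit_input = claripy.BVS("unit_input", 8 * 64)  # 64 symbolic bytes

# Start execution directly at the unit's entry point instead of main().
state = proj.factory.call_state(0x401200, angr.PointerWrapper(unit_input))

simgr = proj.factory.simulation_manager(state)
# Path constraints accumulate automatically; explore until the statement
# flagged as potentially vulnerable (e.g., an unchecked memcpy) is reached.
simgr.explore(find=0x401337)

if simgr.found:
    found = simgr.found[0]
    # Solving the accumulated constraints yields concrete unit input bytes
    # that drive execution to the flagged statement.
    concrete = found.solver.eval(unit_input, cast_to=bytes)
    print("unit input reaching target:", concrete)
```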

    Framework Synthesis for Symbolic Execution of Event-Driven Frameworks

    Symbolic execution is a powerful program analysis technique, but it is very challenging to apply to programs built on event-driven frameworks such as Android, mainly because the framework code itself is too complex to execute symbolically. The standard solution is to manually create a framework model that is simpler and more amenable to symbolic execution. However, developing and maintaining such a model by hand is difficult and error-prone. We claim that program synthesis can bring a high degree of automation to the process of framework modeling. To support this thesis, we present three pieces of work. First, we introduced SymDroid, a symbolic executor for Android. While Android apps are written in Java, they are compiled to the Dalvik bytecode format. Instead of analyzing an app's Java source, which may not be available, or decompiling from Dalvik back to Java, which requires significant engineering effort and introduces yet another source of potential bugs in an analysis, SymDroid works directly on Dalvik bytecode. Second, we introduced Pasket, a new system that takes a first step toward automatically generating Java framework models to support symbolic execution. Pasket takes as input the framework API and tutorial programs that exercise the framework. From these artifacts and its internal knowledge of design patterns, Pasket synthesizes an executable framework model by instantiating design patterns, such that the behavior of the synthesized model on the tutorial programs matches that of the original framework. Lastly, to scale program synthesis to framework models, we devised adaptive concretization, a novel program synthesis algorithm that combines the best of the two major synthesis strategies: symbolic search, i.e., using SAT or SMT solvers, and explicit search, e.g., stochastic enumeration of possible solutions. Adaptive concretization parallelizes multiple sub-synthesis problems by partially concretizing highly influential unknowns in the original synthesis problem. Thanks to adaptive concretization, Pasket can generate large-scale models of thousands of lines of code. In addition, we have used an Android model synthesized by Pasket and found that it is sufficient for SymDroid to execute a range of apps.
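    A hedged sketch of the adaptive concretization idea, not Pasket's actual implementation: one highly influential unknown is concretized, splitting the synthesis problem into sub-problems that are solved symbolically in parallel. The problem, unknowns, solve_symbolically, and influence objects are hypothetical stand-ins.

```python
# Illustrative sketch of adaptive concretization: concretize a highly
# influential unknown to split one hard symbolic synthesis problem into
# several easier ones, solved in parallel. solve_symbolically() and
# influence() stand in for a SAT/SMT-backed synthesizer and an influence
# estimate; none of this is Pasket's real API.
from concurrent.futures import ProcessPoolExecutor

def adaptive_concretize(problem, unknowns, solve_symbolically, influence):
    # Pick the unknown whose value most constrains the search space.
    pivot = max(unknowns, key=influence)
    # One sub-problem per concrete pivot value (the explicit-search half)...
    subproblems = [problem.fix(pivot, v) for v in pivot.domain]
    # ...each sub-problem still solved symbolically (the SAT/SMT half).
    with ProcessPoolExecutor() as pool:
        for solution in pool.map(solve_symbolically, subproblems):
            if solution is not None:
                return solution  # first successful sub-problem wins
    return None
```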

    A Semantic Testing Approach for Deep Neural Networks Using Bayesian Network Abstraction

    The studies presented in this thesis investigate the internal decision process of Deep Neural Networks (DNNs) and test their performance based on feature importance weights. Deep learning models have achieved state-of-the-art performance in a variety of machine learning tasks, which has led to their integration into safety-critical domains such as autonomous vehicles. The susceptibility of deep learning models to adversarial examples raises serious concerns about their application in safety-critical contexts. Most existing testing methodologies fail to consider the interactions between neurons and the semantic representations formed in the DNN during training. This thesis designs weight-based semantic testing metrics that model the internal behaviour of a DNN as a Bayesian network and the contribution of the hidden features to its decisions as importance weights, and that measure test data coverage according to the weights of the features. These approaches were followed to answer the main research question: "When testing the performance of these learning models, is measuring the coverage of the semantic aspects of deep neural networks, treating each internal component according to its contribution to the decision, a better measure of trustworthiness than relying on traditional structural unweighted measures?" This thesis makes three main contributions to the field of machine learning. First, it proposes a novel technique for estimating the importance of a neural network's latent features by abstracting its behaviour into a Bayesian Network (BN). The algorithm analyses the sensitivity of each extracted feature to distributional shifts by observing changes in the BN distribution. The experimental results showed that computing the distance between two BN probability distributions, one clean and one perturbed by interval shifts or adversarial attacks, can detect the distribution shift wherever it exists. The hidden features were assigned weight scores according to the computed sensitivity distances. Secondly, to further justify the contribution of each latent feature to the classification decision, the abstract scheme of the BN was extended to perform prediction. The BN's performance in predicting input classification labels was shown to be a decent approximation of the original DNN. Moreover, feature perturbation on the BN classifier demonstrated that each feature influences prediction accuracy differently, validating the presented feature importance assumption. Lastly, the developed feature importance measure was used to assess the extent to which a given test dataset exercises high-level features learned by the hidden layers of the DNN, prioritizing significant representations when generating new test inputs. The evaluation compared the initial and final coverage of the proposed weighting approach with normal BN-based feature coverage. The coverage experiments indicated that the proposed weight metrics achieve higher coverage than the original feature metrics while maintaining the effectiveness of finding adversarial samples during test case generation. Furthermore, the weight metrics guarantee that the achieved coverage includes the most crucial components, since the test generation algorithm is directed to synthesise new inputs targeting features with higher importance scores. Hence, this study furthers the evidence for DNNs' trustworthy behaviour.
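    To make the feature-weighting idea concrete, the following hedged sketch assigns each hidden feature a weight proportional to how far its discretized activation distribution shifts under perturbation, using KL divergence as the distance. This simplified per-feature histogram only loosely imitates the thesis's Bayesian network abstraction, and all function names are illustrative.

```python
# Hedged sketch: weight each hidden feature by the KL divergence between
# its discretized activation distribution on clean vs. perturbed inputs.
# A per-feature histogram is a crude stand-in for the BN abstraction.
import numpy as np

def histogram(acts, bins, eps=1e-9):
    # Discretize activations into a smoothed probability distribution.
    h, _ = np.histogram(acts, bins=bins, density=True)
    h = h + eps
    return h / h.sum()

def feature_importance(clean_acts, perturbed_acts, n_bins=10):
    """clean_acts, perturbed_acts: arrays of shape (n_samples, n_features)."""
    weights = []
    for f in range(clean_acts.shape[1]):
        lo = min(clean_acts[:, f].min(), perturbed_acts[:, f].min())
        hi = max(clean_acts[:, f].max(), perturbed_acts[:, f].max())
        bins = np.linspace(lo, hi, n_bins + 1)
        p = histogram(clean_acts[:, f], bins)
        q = histogram(perturbed_acts[:, f], bins)
        # Larger KL(p || q) means the feature is more shift-sensitive.
        weights.append(float(np.sum(p * np.log(p / q))))
    w = np.asarray(weights)
    return w / max(w.sum(), 1e-12)  # normalized importance scores
```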

    Search-based Unit Test Generation for Evolving Software

    Search-based software testing has been successfully applied to generate unit test cases for object-oriented software. Typically, in search-based test generation approaches, evolutionary search algorithms are guided by code coverage criteria, such as branch coverage, to generate tests for individual coverage objectives. Although this approach has been shown to be effective, fundamental open questions remain. In particular, which criteria should test generation use in order to produce the best test suites? Which evolutionary algorithms are more effective at generating test cases with high coverage? How can search-based unit test generation be scaled up to software projects consisting of large numbers of components that evolve and change frequently over time? As a result, the applicability of search-based test generation techniques in practice is still fundamentally limited. To answer these questions, we investigate the following improvements to search-based testing. First, we propose the simultaneous optimisation of several coverage criteria using an evolutionary algorithm, rather than optimising for individual criteria. We then perform an empirical evaluation of different evolutionary algorithms to understand the influence of each on the test optimisation problem. We further extend coverage-based test generation with a non-functional criterion to increase the likelihood of detecting faults and to help developers identify their locations. Finally, we propose several strategies and tools to efficiently apply search-based test generation techniques in large and evolving software projects. Our results show that, overall, the optimisation of several coverage criteria is efficient; there is indeed an evolutionary algorithm that clearly works better for the test generation problem than the others; the extended coverage-based test generation is effective at revealing and localising faults; and our proposed strategies, specifically designed to test entire software projects in a continuous way, improve efficiency and lead to higher code coverage. Consequently, the techniques and the toolset presented in this thesis, which supports all the contributions described here, bring search-based software testing one step closer to practical usage by equipping software engineers with the state of the art in automated test generation.
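    As a minimal illustration of optimising several coverage criteria at once with an evolutionary algorithm, the sketch below evolves a population of test suites against a fitness that sums normalized coverage across criteria. The coverage, mutate, crossover, and init_suite callables are hypothetical stand-ins; real tools such as EvoSuite use far richer suite representations and operators.

```python
# Minimal genetic-algorithm sketch: optimize several coverage criteria
# simultaneously over whole test suites. coverage() stands in for
# instrumented execution and is not a real tool's API.
import random

def fitness(suite, criteria, coverage):
    # Sum normalized coverage across all criteria (branch, line, ...).
    return sum(coverage(suite, c) for c in criteria)

def evolve(init_suite, mutate, crossover, criteria, coverage,
           pop_size=50, generations=100):
    population = [mutate(init_suite()) for _ in range(pop_size)]
    for _ in range(generations):
        # Rank suites by combined multi-criteria fitness.
        population.sort(key=lambda s: fitness(s, criteria, coverage),
                        reverse=True)
        elite = population[: pop_size // 2]
        # Refill the population with mutated offspring of elite parents.
        offspring = [mutate(crossover(random.choice(elite),
                                      random.choice(elite)))
                     for _ in range(pop_size - len(elite))]
        population = elite + offspring
    return max(population, key=lambda s: fitness(s, criteria, coverage))
```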