4,252 research outputs found

    A Model-Based AI-Driven Test Generation System

    Get PDF
    Achieving high software quality today involves manual analysis, test planning, documentation of testing strategy and test cases, and development of automated test scripts to support regression testing. This thesis is motivated by the opportunity to bridge the gap between current test automation and true test automation by investigating learning-based solutions to software testing. We present an approach that combines a trainable web component classifier, a test case description language, and a trainable test generation and execution system that can learn to generate new test cases. Training data was collected and hand-labeled across 7 systems, 95 web pages, and 17,360 elements. A total of 250 test flows were also manually hand-crafted for training purposes. Various machine learning algorithms were evaluated. Results showed that Random Forest classifiers performed well on several web component classification problems. In addition, Long Short-Term Memory neural networks were able to model and generate new valid test flows

    The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files

    Get PDF
    In many forensic investigations, questions linger regarding the identity of the authors of the software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done around analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult, time consuming, and in some cases, may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input into a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and provide predictions on who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods to compare prediction accuracies with prediction accuracies gathered from this new, dynamic analysis based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially in the case where the specimen file has been obfuscated

    Towards model structuring based on flow diagram decomposition

    Get PDF
    The key challenge of model transformations in model-driven development is in transforming higher-level abstract models into more concrete ones that can be used to generate implementation level models, including executable business process representations and program code. Many of the modelling languages (like UML Activity Diagrams or BPMN) use unstructured flow graphs to describe the operation sequence of a business process. If a structured language is chosen as the executable representation, it is difficult to compile the unstructured flows into structured statements. Even if a target language structure contains goto-like statements it is often simpler and more efficient to deal with programs that have structured control flow to make the executable representation more understandable. In this paper, we take a first step towards an implementation of existing decomposition methods using graph transformations, and we evaluate their effectiveness with a view to readability and essential complexity measures

    The 11th Conference of PhD Students in Computer Science

    Get PDF

    Addressing Complexity and Intelligence in Systems Dependability Evaluation

    Get PDF
    Engineering and computing systems are increasingly complex, intelligent, and open adaptive. When it comes to the dependability evaluation of such systems, there are certain challenges posed by the characteristics of “complexity” and “intelligence”. The first aspect of complexity is the dependability modelling of large systems with many interconnected components and dynamic behaviours such as Priority, Sequencing and Repairs. To address this, the thesis proposes a novel hierarchical solution to dynamic fault tree analysis using Semi-Markov Processes. A second aspect of complexity is the environmental conditions that may impact dependability and their modelling. For instance, weather and logistics can influence maintenance actions and hence dependability of an offshore wind farm. The thesis proposes a semi-Markov-based maintenance model called “Butterfly Maintenance Model (BMM)” to model this complexity and accommodate it in dependability evaluation. A third aspect of complexity is the open nature of system of systems like swarms of drones which makes complete design-time dependability analysis infeasible. To address this aspect, the thesis proposes a dynamic dependability evaluation method using Fault Trees and Markov-Models at runtime.The challenge of “intelligence” arises because Machine Learning (ML) components do not exhibit programmed behaviour; their behaviour is learned from data. However, in traditional dependability analysis, systems are assumed to be programmed or designed. When a system has learned from data, then a distributional shift of operational data from training data may cause ML to behave incorrectly, e.g., misclassify objects. To address this, a new approach called SafeML is developed that uses statistical distance measures for monitoring the performance of ML against such distributional shifts. The thesis develops the proposed models, and evaluates them on case studies, highlighting improvements to the state-of-the-art, limitations and future work
    • …
    corecore