2,939 research outputs found

    Target Directed Event Sequence Generation for Android Applications

    Full text link
    Testing is a commonly used approach to ensure the quality of software, of which model-based testing is a hot topic to test GUI programs such as Android applications (apps). Existing approaches mainly either dynamically construct a model that only contains the GUI information, or build a model in the view of code that may fail to describe the changes of GUI widgets during runtime. Besides, most of these models do not support back stack that is a particular mechanism of Android. Therefore, this paper proposes a model LATTE that is constructed dynamically with consideration of the view information in the widgets as well as the back stack, to describe the transition between GUI widgets. We also propose a label set to link the elements of the LATTE model to program snippets. The user can define a subset of the label set as a target for the testing requirements that need to cover some specific parts of the code. To avoid the state explosion problem during model construction, we introduce a definition "state similarity" to balance the model accuracy and analysis cost. Based on this model, a target directed test generation method is presented to generate event sequences to effectively cover the target. The experiments on several real-world apps indicate that the generated test cases based on LATTE can reach a high coverage, and with the model we can generate the event sequences to cover a given target with short event sequences

    On Improving (Non)Functional Testing

    Get PDF
    Software testing is commonly classified into two categories, nonfunctional testing and functional testing. The goal of nonfunctional testing is to test nonfunctional requirements, such as performance and reliability. Performance testing is one of the most important types of nonfunctional testing, one goal of which is to detect the phenomena that an Application Under Testing (AUT) exhibits unexpectedly worse performance (e.g., lower throughput) with some input data. During performance testing, a critical challenge is to understand the AUT’s behaviors with large numbers of combinations of input data and find the particular subset of inputs leading to performance bottlenecks. However, enumerating those particular inputs and identifying those bottlenecks are always laborious and intellectually intensive. In addition, for an evolving software system, some code changes may accidentally degrade performance between two software versions, it is even more challenging to find problematic changes (out of a large number of committed changes) may lead to performance regressions under certain test inputs. This dissertation presents a set of approaches to automatically find specific combinations of input data for exposing performance bottlenecks and further analyze execution traces to identify performance bottlenecks. In addition, this dissertation also provides an approach that automatically estimates the impact of code changes on performance degradation between two released software versions to identify the problematic ones likely leading to performance regressions. Functional testing is used to test the functional correctness of AUTs. Developers commonly write test suites for AUTs to test different functionalities and locate functional faults. During functional testing, developers rely on some strategies to order test cases to achieve certain objectives, such as exposing faults faster, which is known as Test Case Prioritization (TCP). TCP techniques are commonly classified into two categories, dynamic and static techniques. A set of empirical studies has been conducted to examine and understand different TCP techniques, but there is a clear gap in existing studies. No study has compared static techniques against dynamic techniques and comprehensively examined the impact of test granularity, program size, fault characteristics, and the similarities in terms of fault detection on TCP techniques. Thus, this dissertation presents an empirical study to thoroughly compare static and dynamic TCP techniques in terms of effectiveness, efficiency, and similarity of uncovered faults at different granularities on a large set of real-world programs, and further analyze the potential impact of program size and fault characteristics on TCP evaluation. Moreover, in the prior work, TCP techniques have been typically evaluated against synthetic software defects, called mutants. For this reason, it is currently unclear whether TCP performance on mutants would be representative of the performance achieved on real faults. to answer this fundamental question, this dissertation presents the first empirical study that investigates TCP performance when applied to both real-world faults and mutation faults for understanding the representativeness of mutants

    Sapienz: Multi-objective automated testing for android applications

    Get PDF
    We introduce Sapienz, an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising length, while simultaneously maximising coverage and fault revelation. Sapienz combines random fuzzing, systematic and search-based exploration, exploiting seeding and multi-level instrumentation. Sapienz significantly outperforms (with large effect size) both the state-of-the-art technique Dynodroid and the widely-used tool, Android Monkey, in 7/10 experiments for coverage, 7/10 for fault detection and 10/10 for fault-revealing sequence length. When applied to the top 1, 000 Google Play apps, Sapienz found 558 unique, previously unknown crashes. So far we have managed to make contact with the developers of 27 crashing apps. Of these, 14 have confirmed that the crashes are caused by real faults. Of those 14, six already have developer-confirmed fixes

    Deep Reinforcement Learning Driven Applications Testing

    Get PDF
    Applications have become indispensable in our lives, and ensuring their correctness is now a critical issue. Automatic system test case generation can significantly improve the testing process for these applications, which has recently motivated researchers to work on this problem, defining various approaches. However, most state-of-the-art approaches automatically generate test cases leveraging symbolic execution or random exploration techniques. This led to techniques that lose efficiency when dealing with an increasing number of program constraints and become inapplicable when conditions are too challenging to solve or even to formulate. This Ph.D. thesis proposes addressing current techniques' limitations by exploiting Deep Reinforcement Learning. Deep Reinforcement Learning (Deep RL) is a machine learning technique that does not require a labeled training set as input since the learning process is guided by the positive or negative reward experienced during the tentative execution of a task. Hence, it can be used to dynamically learn how to build a test suite based on the feedback obtained during past successful or unsuccessful attempts. This dissertation presents three novel techniques that exploit this intuition: ARES, RONIN, and IFRIT. Since functional testing and security testing are complementary, this Ph.D. thesis explores both testing techniques using the same approach for test cases generation. ARES is a Deep RL approach for functional testing of Android apps. RONIN addresses the issue of generating exploits for a subset of Android ICC vulnerabilities. Subsequently, to better expose the bugs discovered by previous techniques, this thesis presents IFRIT, a focused testing approach capable of increasing the number of test cases that can reach a specific target (i.e., a precise section or statement of an application) and their diversity. IFRIT has the ultimate goal of exposing faults affecting the given program point

    Deep Reinforcement Learning for Black-Box Testing of Android Apps

    Full text link
    The state space of Android apps is huge and its thorough exploration during testing remains a major challenge. In fact, the best exploration strategy is highly dependent on the features of the app under test. Reinforcement Learning (RL) is a machine learning technique that learns the optimal strategy to solve a task by trial and error, guided by positive or negative reward, rather than by explicit supervision. Deep RL is a recent extension of RL that takes advantage of the learning capabilities of neural networks. Such capabilities make Deep RL suitable for complex exploration spaces such as the one of Android apps. However, state of the art, publicly available tools only support basic, tabular RL. We have developed ARES, a Deep RL approach for black-box testing of Android apps. Experimental results show that it achieves higher coverage and fault revelation than the baselines, which include state of the art RL based tools, such as TimeMachine and Q-Testing. We also investigated qualitatively the reasons behind such performance and we have identified the key features of Android apps that make Deep RL particularly effective on them to be the presence of chained and blocking activities

    Enhancing Automated GUI Exploration Techniques for Android Mobile Applications

    Get PDF
    Mobile software applications ("apps") are used by billions of smartphone owners worldwide. The demand for quality to these apps has grown together with their spread. Therefore, effective techniques and tools are being requested to support developers in mobile app quality engineering activities. Automation tools can facilitate these activities since they can save humans from routine, time consuming and error prone manual tasks. Automated GUI exploration techniques are widely adopted by researchers and practitioners in the context of mobile apps for supporting critical engineering tasks such as reverse engineering, testing, and network traffic signature generation. These techniques iteratively exercise a running app by exploiting the information that the app exposes at runtime through its GUI to derive the set of input events to be fired. Although several automated GUI exploration techniques have been proposed in the literature, they suffer from some limitations that may hinder them from a thorough app exploration. This dissertation proposes two novel solutions that contribute to the literature in Software Engineering towards improving existing automated GUI exploration techniques for mobile software applications. The former is a fully automated GUI exploration technique that aims to detect issues tied to the app instances lifecycle, a mobile-specific feature that allows users to smoothly navigate through an app and switch between apps. In particular, this technique addresses the issues of crashes and GUI failures, that consists in the manifestation of unexpected GUI states. This work includes two exploratory studies that prove that GUI failures are a widespread problem in the context of mobile apps. The latter solution is a hybrid exploration technique that combines automated GUI exploration with capture and replay through machine learning. It exploits app-specific knowledge that only human users can provide in order to explore relevant parts of the application that can be reached only by firing complex sequences of input events on specific GUIs and by choosing specific input values. Both the techniques have been implemented in tools that target the Android Operating System, that is today the world’s most popular mobile operating system. The effectiveness of the proposed techniques is demonstrated through experimental evaluations performed on real mobile apps

    Deep Reinforcement Learning for Black-box Testing of Android Apps

    Get PDF
    The state space of Android apps is huge, and its thorough exploration during testing remains a significant challenge. The best exploration strategy is highly dependent on the features of the app under test. Reinforcement Learning (RL) is a machine learning technique that learns the optimal strategy to solve a task by trial and error, guided by positive or negative reward, rather than explicit supervision. Deep RL is a recent extension of RL that takes advantage of the learning capabilities of neural networks. Such capabilities make Deep RL suitable for complex exploration spaces such as one of Android apps. However, state-of-the-art, publicly available tools only support basic, Tabular RL. We have developed ARES, a Deep RL approach for black-box testing of Android apps. Experimental results show that it achieves higher coverage and fault revelation than the baselines, including state-of-the-art tools, such as TimeMachine and Q-Testing. We also investigated the reasons behind such performance qualitatively, and we have identified the key features of Android apps that make Deep RL particularly effective on them to be the presence of chained and blocking activities. Moreover, we have developed FATE to fine-tune the hyperparameters of Deep RL algorithms on simulated apps, since it is computationally expensive to carry it out on real apps

    Fundamental Approaches to Software Engineering

    Get PDF
    computer software maintenance; computer software selection and evaluation; formal logic; formal methods; formal specification; programming languages; semantics; software engineering; specifications; verificatio
    • …
    corecore