
    Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions

    Automated Graphical User Interface (GUI) testing plays a crucial role in ensuring app quality, especially as mobile applications have become an integral part of our daily lives. Despite the growing popularity of learning-based techniques in automated GUI testing, owing to their ability to generate human-like interactions, they still suffer from several limitations, such as low testing coverage, inadequate generalization capabilities, and heavy reliance on training data. Inspired by the success of Large Language Models (LLMs) like ChatGPT in natural language understanding and question answering, we formulate the mobile GUI testing problem as a Q&A task. We propose GPTDroid, which asks the LLM to chat with the mobile app: GUI page information is passed to the LLM to elicit testing scripts, the scripts are executed, and the app's feedback is passed back to the LLM, iterating the whole process. Within this framework, we also introduce a functionality-aware memory prompting mechanism that equips the LLM with the ability to retain testing knowledge across the whole process and conduct long-term, functionality-based reasoning to guide exploration. We evaluate it on 93 apps from Google Play and demonstrate that it outperforms the best baseline by 32% in activity coverage and detects 31% more bugs at a faster rate. Moreover, GPTDroid identified 53 new bugs on Google Play, of which 35 have been confirmed and fixed.
    Comment: Accepted by IEEE/ACM International Conference on Software Engineering 2024 (ICSE 2024). arXiv admin note: substantial text overlap with arXiv:2305.0943
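    The iterative loop the abstract describes can be pictured with a short sketch. This is a minimal illustration of the Q&A framing under stated assumptions, not GPTDroid's actual implementation; every helper name here (query_llm, get_gui_state, execute_script, FunctionalityMemory) is a hypothetical stand-in.

```python
# Minimal sketch of an LLM-driven GUI testing loop with a functionality-aware
# memory, in the spirit of the abstract. All helpers are hypothetical.

def query_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call to an LLM such as ChatGPT."""
    raise NotImplementedError

class FunctionalityMemory:
    """Running summary of explored functionalities (heavily simplified
    stand-in for the paper's memory prompting mechanism)."""
    def __init__(self):
        self.visited: list[str] = []

    def as_prompt(self) -> str:
        return "Functionalities tested so far: " + ", ".join(self.visited or ["none"])

    def record(self, functionality: str):
        if functionality not in self.visited:
            self.visited.append(functionality)

def test_app(get_gui_state, execute_script, max_steps: int = 50):
    memory = FunctionalityMemory()
    for _ in range(max_steps):
        gui = get_gui_state()                  # current GUI page information
        prompt = (
            f"{memory.as_prompt()}\n"
            f"Current GUI widgets: {gui}\n"
            "Reply with the next test action to try an untested functionality."
        )
        action = query_llm(prompt)             # the LLM answers the "Q&A" task
        feedback = execute_script(action)      # execute it, collect app feedback
        memory.record(action)
        if feedback.get("crash"):
            yield feedback                     # surface a candidate bug
```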

    LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities

    This paper investigates the application of large language models (LLMs) to mobile application test script generation. Test script generation is a vital component of software testing, enabling efficient and reliable automation of repetitive test tasks. However, existing generation approaches often encounter limitations, such as difficulties in accurately capturing and reproducing test scripts across diverse devices, platforms, and applications. These challenges arise from differences in screen sizes, input modalities, platform behaviors, API inconsistencies, and application architectures. Overcoming these limitations is crucial for achieving robust and comprehensive test automation. By leveraging the capabilities of LLMs, we aim to address these challenges and explore their potential as a versatile tool for test automation. We investigate how well LLMs adapt to diverse devices and systems while accurately capturing and generating test scripts. Additionally, we evaluate their cross-platform generation capabilities by assessing their ability to handle operating system variations and platform-specific behaviors. Furthermore, we explore the application of LLMs to cross-app migration, where they generate test scripts across different applications and software environments based on existing scripts. Throughout the investigation, we analyze their adaptability to various user interfaces, app architectures, and interaction patterns, ensuring accurate script generation and compatibility. The findings of this research contribute to the understanding of LLMs' capabilities in test automation. Ultimately, this research aims to enhance software testing practices, empowering app developers to achieve higher levels of software quality and development efficiency.
    Comment: Accepted by the 23rd IEEE International Conference on Software Quality, Reliability, and Security (QRS 2023)
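    The cross-platform migration scenario studied here can be sketched in a few lines: hand an existing script to an LLM together with source and target platforms. This is a hypothetical illustration, not the paper's method; the llm() call, the sample script, and migrate_script are all invented for the example.

```python
# Hypothetical sketch of LLM-based cross-platform test script migration.
# llm() is a placeholder for any chat-completion API (OpenAI, local model, ...).

def llm(prompt: str) -> str:
    raise NotImplementedError

ANDROID_SCRIPT = """
onView(withId(R.id.login_button)).perform(click());
onView(withId(R.id.username)).perform(typeText("alice"));
"""

def migrate_script(source: str, source_platform: str, target_platform: str) -> str:
    prompt = (
        "You are a mobile test automation expert.\n"
        f"Translate this {source_platform} test script to {target_platform}, "
        "preserving the tested behavior and adapting platform-specific APIs:\n"
        f"{source}"
    )
    return llm(prompt)

# Example: migrate an Espresso (Android) script to XCUITest (iOS).
# ios_script = migrate_script(ANDROID_SCRIPT, "Android Espresso", "iOS XCUITest")
```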

    Translation from Visual to Layout-based Android Test Cases: a Proof of Concept

    Layout-based and Visual GUI testing are two approaches for testing mobile GUIs, each with its own benefits and drawbacks. Previous research has presented approaches to translate Layout-based scripts to third-generation (Visual) scripts, but not vice versa. The objective of this work is to provide a proof of concept of the effectiveness of automatic translation from existing Visual test scripts to Layout-based test scripts. A tool architecture is presented and implemented in a tool capable of translating most third-generation (Visual) interactions with the GUI of an Android app into Layout-based instructions and oracles for the Espresso testing tool. We validate our approach on two test suites of our own creation, consisting of 30 test cases each. The measured success rate of the translation is 96.7% (58 working test cases out of 60 applications of the translator). The study provides support for the feasibility of a translation-based approach from Visual to Layout-based test cases. However, additional work is needed to make the approach applicable to real-world scenarios or larger open-source test suites.
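    The core translation idea can be illustrated with a sketch: resolve a Visual step (image match plus an action at screen coordinates) to the widget at those coordinates in the layout hierarchy, then emit the equivalent Espresso instruction. This is not the paper's tool; Widget, widget_at, and to_espresso are invented for illustration.

```python
# Illustrative sketch of Visual-to-Layout-based translation: map a click at
# screen coordinates onto a layout widget and emit an Espresso instruction.

from dataclasses import dataclass

@dataclass
class Widget:
    resource_id: str
    bounds: tuple[int, int, int, int]  # left, top, right, bottom in pixels

def widget_at(layout: list[Widget], x: int, y: int) -> Widget | None:
    """Find the widget whose on-screen bounds contain the click point."""
    for w in layout:
        left, top, right, bottom = w.bounds
        if left <= x <= right and top <= y <= bottom:
            return w
    return None

def to_espresso(layout: list[Widget], x: int, y: int, action: str = "click") -> str:
    w = widget_at(layout, x, y)
    if w is None:
        raise ValueError("no widget under the Visual step's coordinates")
    return f'onView(withId(R.id.{w.resource_id})).perform({action}());'

# Example: a Visual step that clicked at (540, 1200) might become
# onView(withId(R.id.submit)).perform(click());
```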

    Software Testing with Large Language Model: Survey, Landscape, and Vision

    Pre-trained large language models (LLMs) have recently emerged as a breakthrough technology in natural language processing and artificial intelligence, with the ability to handle large-scale datasets and exhibit remarkable performance across a wide range of tasks. Meanwhile, software testing is a crucial undertaking that serves as a cornerstone for ensuring the quality and reliability of software products. As the scope and complexity of software systems continue to grow, the need for more effective software testing techniques becomes increasingly urgent, making this an area ripe for innovative approaches such as the use of LLMs. This paper provides a comprehensive review of the utilization of LLMs in software testing. It analyzes 52 relevant studies that have used LLMs for software testing, from both the software testing and the LLM perspectives. The paper presents a detailed discussion of the software testing tasks for which LLMs are commonly used, among which test case preparation and program repair are the most representative. It also analyzes the commonly used LLMs, the types of prompt engineering employed, and the techniques that accompany these LLMs. Finally, it summarizes the key challenges and potential opportunities in this direction. This work can serve as a roadmap for future research in this area, highlighting potential avenues for exploration and identifying gaps in our current understanding of the use of LLMs in software testing.
    Comment: 20 pages, 11 figures

    Automated, Cost-effective, and Update-driven App Testing

    Apps' pervasive role in our society has led to the definition of test automation approaches to ensure their dependability. However, state-of-the-art approaches tend to generate large numbers of test inputs and are unlikely to achieve more than 50% method coverage. In this paper, we propose a strategy to achieve significantly higher coverage of the code affected by updates with a much smaller number of test inputs, thus alleviating the test oracle problem. More specifically, we present ATUA, a model-based approach that synthesizes app models with static analysis, integrates a dynamically refined state abstraction function, and combines complementary testing strategies, including (1) coverage of the model structure, (2) coverage of the app code, (3) random exploration, and (4) coverage of dependencies identified through information retrieval. Its model-based strategy enables ATUA to generate a small set of inputs that exercise only the code affected by the updates. In turn, this makes common test oracle solutions more cost-effective, as they tend to involve human effort. A large empirical evaluation, conducted with 72 app versions belonging to nine popular Android apps, has shown that ATUA is more effective and less effort-intensive than state-of-the-art approaches when testing app updates.
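    One way to picture the interleaving of complementary strategies described above is a selector that keeps targeting update-affected code until it is covered. This is a loose sketch under assumptions, not ATUA's algorithm; the strategy names mirror the abstract, while the coverage sets and selection policy are invented.

```python
# Hedged sketch of interleaving complementary testing strategies while
# update-affected code remains uncovered. Not ATUA's actual policy.

import random

STRATEGIES = [
    "model_structure_coverage",
    "app_code_coverage",
    "random_exploration",
    "ir_dependency_coverage",
]

def next_strategy(affected_methods: set[str], covered: set[str]) -> str:
    # Once every update-affected method is covered, fall back to random
    # exploration; otherwise alternate among the targeted strategies.
    if affected_methods <= covered:
        return "random_exploration"
    return random.choice([s for s in STRATEGIES if s != "random_exploration"])

# Example: next_strategy({"LoginActivity.onSubmit"}, set()) returns one of
# the three targeted strategies until onSubmit appears in the covered set.
```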

    Test Cases Evolution of Mobile Applications: Model Driven Approach

    Mobile application developers, given the large freedom afforded to them, focus on satisfying market requirements and pleasing consumers' desires. They are forced to be creative and productive in a short period of time. As a result, billions of powerful mobile applications are available every day. Therefore, every mobile application needs to change continually and evolve incrementally in order to survive and preserve its ranking among the top applications in the market. Mobile app testers bear a heavy responsibility: the intrinsically swift, agile change of mobile apps pushes them to be meticulous, to be aware that things can be different at any time, and to be prepared for unpredicted crashes. Consequently, generating or creating test cases from scratch and selecting the overridden or overloaded test cases each time is a tedious operation. In software development, the time allocated to testing and correcting defects is significant (regularly half the total time). This time can be reduced by introducing tools and adopting new testing methods. In the field of mobile development, new concerns should be taken into account; among the most important are the heterogeneity of execution environments and the fragmentation of terminals, which have different impacts on functionality, performance, and connectivity. This project studies the evolution of mobile applications and its impact on the evolution of test cases from their creation until their expiration stage. A detailed case study of a native open-source Android application is provided, describing many aspects of design, development, and testing, in addition to an analysis of the process of mobile app evolution. The project is based on a model-driven engineering approach in which the models are serialized using the standard XMI. It presents a protocol for the adaptation of test cases under certain restrictions.
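    Since the project serializes its models with XMI, a minimal sketch of what such serialization looks like may help. The TestCase metamodel below is invented for illustration; real MDE toolchains (e.g., EMF) define their own metamodels and emitters.

```python
# Minimal sketch of serializing a (hypothetical) test-case model to XMI
# using only the standard library.

import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"

def to_xmi(test_name: str, steps: list[str]) -> bytes:
    ET.register_namespace("xmi", XMI_NS)
    root = ET.Element(f"{{{XMI_NS}}}XMI", {f"{{{XMI_NS}}}version": "2.0"})
    tc = ET.SubElement(root, "TestCase", {"name": test_name})
    for s in steps:
        ET.SubElement(tc, "step", {"action": s})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)

# Example:
# print(to_xmi("login_smoke", ["launch", "typeText(username)", "click(login)"]).decode())
```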

    Mining Android Crash Fixes in the Absence of Issue- and Change-Tracking Systems

    Android apps are prone to crashes. This often arises from the misuse of Android framework APIs and is hard to debug, since the official Android documentation does not thoroughly discuss potential exceptions. Recently, the program repair community has also started to investigate the possibility of fixing crashes automatically. Current results, however, apply to limited example cases. In both scenarios, the main issue is the need for more example data to drive the fix process, given the high cost in time and effort of collecting and identifying fix examples. In this work we propose a scalable approach, CraftDroid, to mine crash fixes by leveraging a set of 28 thousand carefully reconstructed app lineages from app markets, without needing the app source code or issue reports. We developed a replicative testing approach that locates fixes among app versions that output different runtime logs under the exact same test inputs. Overall, we have mined 104 relevant crash fixes and further abstracted 17 fine-grained fix templates that are demonstrated to be effective for patching crashed APKs. Finally, we release ReCBench, a benchmark consisting of 200 crashed APKs and the crash replication scripts, which the community can explore for evaluating generated patches for crash-inducing bugs.
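    The replicative-testing idea lends itself to a short sketch: replay the exact same inputs over consecutive versions of an app lineage and flag the first version whose run no longer crashes as the fix candidate. This is a simplified illustration, not CraftDroid's pipeline; run_with_inputs and its log format are hypothetical stand-ins.

```python
# Simplified sketch of locating a crash fix within an app lineage by
# replaying identical inputs and comparing runtime logs across versions.

def find_fix_version(lineage: list[str], inputs: list[str], run_with_inputs) -> str | None:
    """lineage: APK paths ordered oldest to newest."""
    prev_crashed = False
    for apk in lineage:
        log = run_with_inputs(apk, inputs)    # same test inputs every run
        crashed = "FATAL EXCEPTION" in log    # Android crash marker in logcat
        if prev_crashed and not crashed:
            return apk                        # crash disappeared: fix candidate
        prev_crashed = crashed
    return None
```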