223 research outputs found
Enhancing Automated GUI Exploration Techniques for Android Mobile Applications
Mobile software applications ("apps") are used by billions of smartphone owners worldwide. The demand for quality to these apps has grown together with their spread. Therefore, effective techniques and tools are being requested to support developers in mobile app quality engineering activities.
Automation tools can facilitate these activities since they can save humans from routine, time consuming and error prone manual tasks. Automated GUI exploration techniques are widely adopted by researchers and practitioners in the context of mobile apps for supporting critical engineering tasks such as reverse engineering, testing, and network traffic signature generation. These techniques iteratively exercise a running app by exploiting the information that the app exposes at runtime through its GUI to derive the set of input events to be fired.
Although several automated GUI exploration techniques have been proposed in the literature, they suffer from some limitations that may hinder them from a thorough app exploration.
This dissertation proposes two novel solutions that contribute to the literature in Software Engineering towards improving existing automated GUI exploration techniques for mobile software applications.
The former is a fully automated GUI exploration technique that aims to detect issues tied to the app instances lifecycle, a mobile-specific feature that allows users to smoothly navigate through an app and switch between apps. In particular, this technique addresses the issues of crashes and GUI failures, that consists in the manifestation of unexpected GUI states. This work includes two exploratory studies that prove that GUI failures are a widespread problem in the context of mobile apps.
The latter solution is a hybrid exploration technique that combines automated GUI exploration with capture and replay through machine learning. It exploits app-specific knowledge that only human users can provide in order to explore relevant parts of the application that can be reached only by firing complex sequences of input events on specific GUIs and by choosing specific input values.
Both the techniques have been implemented in tools that target the Android Operating System, that is today the world’s most popular mobile operating system. The effectiveness of the proposed techniques is demonstrated through experimental evaluations performed on real mobile apps
Recommended from our members
Techniques for Efficient and Effective Mobile Testing
The booming mobile app market attracts a large number of developers. As a result, the competition is extremely tough. This fierce competition leads to high standards required for mobile apps, which mandates efficient and effective testing. Efficient testing requires little effort to use, while effective testing checks that the app under test behaves as expected. Manual testing is highly effective, but it is costly. Automatic testing should come to the rescue, but current automatic methods are either ineffective or inefficient. Methods using implicit specifications – for instance, “an app should not crash” for catching fail-stop errors – are ineffective because they cannot find semantic problems. Methods using explicit specifications such as test scripts are inefficient because they require huge developer effort to create and maintain specifications. In this thesis, we present our two approaches for solving these challenges. We first built the AppDoctor system which efficiently tests mobile apps. It quickly explores an app then slowly but accurately verifies the potential problems to identify bugs without introducing false positives. It uses dependencies discovered between actions to simplify its reports. Our second approach, implemented in the AppFlow system, leverages the ample opportunity of reusing test cases between apps to gain efficiency without losing effectiveness. It allows common UI elements to be used in test scripts then recognizes these UI elements in real apps using a machine learning approach. The system also allows tests to be specified in reusable pieces, and provides a system to synthesize complete test cases from these reusable pieces. It enables robust tests to be created and reused across apps in the same category. The combination of these two approaches enables a developer to quickly test an app on a great number of combinations of actions for fail-stop problems, and effortlessly and efficiently test the app on most common scenarios for semantic problems. This combination covers most of her test requirements and greatly reduces her burden in testing the app
LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities
This paper investigates the application of large language models (LLM) in the
domain of mobile application test script generation. Test script generation is
a vital component of software testing, enabling efficient and reliable
automation of repetitive test tasks. However, existing generation approaches
often encounter limitations, such as difficulties in accurately capturing and
reproducing test scripts across diverse devices, platforms, and applications.
These challenges arise due to differences in screen sizes, input modalities,
platform behaviors, API inconsistencies, and application architectures.
Overcoming these limitations is crucial for achieving robust and comprehensive
test automation.
By leveraging the capabilities of LLMs, we aim to address these challenges
and explore its potential as a versatile tool for test automation. We
investigate how well LLMs can adapt to diverse devices and systems while
accurately capturing and generating test scripts. Additionally, we evaluate its
cross-platform generation capabilities by assessing its ability to handle
operating system variations and platform-specific behaviors. Furthermore, we
explore the application of LLMs in cross-app migration, where it generates test
scripts across different applications and software environments based on
existing scripts.
Throughout the investigation, we analyze its adaptability to various user
interfaces, app architectures, and interaction patterns, ensuring accurate
script generation and compatibility. The findings of this research contribute
to the understanding of LLMs' capabilities in test automation. Ultimately, this
research aims to enhance software testing practices, empowering app developers
to achieve higher levels of software quality and development efficiency.Comment: Accepted by the 23rd IEEE International Conference on Software
Quality, Reliability, and Security (QRS 2023
Learning the language of apps
To explore the functionality of an app, automated test generators systematically identify and interact with its user interface (UI) elements. A key challenge is to synthesize inputs which effectively and efficiently cover app behavior. To do so, a test generator has to choose which elements to interact with but, which interactions to do on each element and which input values to type. In summary, to better test apps, a test generator should know the app's language, that is, the language of its graphical interactions and the language of its textual inputs. In this work, we show how a test generator can learn the language of apps and how this knowledge is modeled to create tests. We demonstrate how to learn the language of the graphical input prior to testing by combining machine learning and static analysis, and how to refine this knowledge during testing using reinforcement learning. In our experiments, statically learned models resulted in 50\% less ineffective actions an average increase in test (code) coverage of 19%, while refining these through reinforcement learning resulted in an additional test (code) coverage of up to 20%. We learn the language of textual inputs, by identifying the semantics of input fields in the UI and querying the web for real-world values. In our experiments, real-world values increase test (code) coverage ~10%; Finally, we show how to use context-free grammars to integrate both languages into a single representation (UI grammar), giving back control to the user. This representation can then be: mined from existing tests, associated to the app source code, and used to produce new tests. 82% test cases produced by fuzzing our UI grammar can reach a UI element within the app and 70% of them can reach a specific code location.Automatisierte Testgeneratoren identifizieren systematisch Elemente der Benutzeroberfläche und interagieren mit ihnen, um die Funktionalität einer App zu erkunden. Eine wichtige Herausforderung besteht darin, Eingaben zu synthetisieren, die das App-Verhalten effektiv und effizient abdecken. Dazu muss ein Testgenerator auswählen, mit welchen Elementen interagiert werden soll, welche Interaktionen jedoch für jedes Element ausgeführt werden sollen und welche Eingabewerte eingegeben werden sollen. Um Apps besser testen zu können, sollte ein Testgenerator die Sprache der App kennen, dh die Sprache ihrer grafischen Interaktionen und die Sprache ihrer Texteingaben. In dieser Arbeit zeigen wir, wie ein Testgenerator die Sprache von Apps lernen kann und wie dieses Wissen modelliert wird, um Tests zu erstellen. Wir zeigen, wie die Sprache der grafischen Eingabe lernen vor dem Testen durch maschinelles Lernen und statische Analyse kombiniert und wie dieses Wissen weiter verfeinern beim Testen Verstärkung Lernen verwenden. In unseren Experimenten führten statisch erlernte Modelle zu 50% weniger ineffektiven Aktionen, was einer durchschnittlichen Erhöhung der Testabdeckung (Code) von 19% entspricht, während die Verfeinerung dieser durch verstärkendes Lernen zu einer zusätzlichen Testabdeckung (Code) von bis zu 20% führte. Wir lernen die Sprache der Texteingaben, indem wir die Semantik der Eingabefelder in der Benutzeroberfläche identifizieren und das Web nach realen Werten abfragen. In unseren Experimenten erhöhen reale Werte die Testabdeckung (Code) um ca. 10%; Schließlich zeigen wir, wie kontextfreien Grammatiken verwenden beide Sprachen in einer einzigen Darstellung (UI Grammatik) zu integrieren, wieder die Kontrolle an den Benutzer zu geben. Diese Darstellung kann dann: aus vorhandenen Tests gewonnen, dem App-Quellcode zugeordnet und zur Erstellung neuer Tests verwendet werden. 82% Testfälle, die durch Fuzzing unserer UI-Grammatik erstellt wurden, können ein UI-Element in der App erreichen, und 70% von ihnen können einen bestimmten Code-Speicherort erreichen
Continuous, Evolutionary and Large-Scale: A New Perspective for Automated Mobile App Testing
Mobile app development involves a unique set of challenges including device
fragmentation and rapidly evolving platforms, making testing a difficult task.
The design space for a comprehensive mobile testing strategy includes features,
inputs, potential contextual app states, and large combinations of devices and
underlying platforms. Therefore, automated testing is an essential activity of
the development process. However, current state of the art of automated testing
tools for mobile apps poses limitations that has driven a preference for manual
testing in practice. As of today, there is no comprehensive automated solution
for mobile testing that overcomes fundamental issues such as automated oracles,
history awareness in test cases, or automated evolution of test cases.
In this perspective paper we survey the current state of the art in terms of
the frameworks, tools, and services available to developers to aid in mobile
testing, highlighting present shortcomings. Next, we provide commentary on
current key challenges that restrict the possibility of a comprehensive,
effective, and practical automated testing solution. Finally, we offer our
vision of a comprehensive mobile app testing framework, complete with research
agenda, that is succinctly summarized along three principles: Continuous,
Evolutionary and Large-scale (CEL).Comment: 12 pages, accepted to the Proceedings of 33rd IEEE International
Conference on Software Maintenance and Evolution (ICSME'17
Automating Software Development for Mobile Computing Platforms
Mobile devices such as smartphones and tablets have become ubiquitous in today\u27s computing landscape. These devices have ushered in entirely new populations of users, and mobile operating systems are now outpacing more traditional desktop systems in terms of market share. The applications that run on these mobile devices (often referred to as apps ) have become a primary means of computing for millions of users and, as such, have garnered immense developer interest. These apps allow for unique, personal software experiences through touch-based UIs and a complex assortment of sensors. However, designing and implementing high quality mobile apps can be a difficult process. This is primarily due to challenges unique to mobile development including change-prone APIs and platform fragmentation, just to name a few. in this dissertation we develop techniques that aid developers in overcoming these challenges by automating and improving current software design and testing practices for mobile apps. More specifically, we first introduce a technique, called Gvt, that improves the quality of graphical user interfaces (GUIs) for mobile apps by automatically detecting instances where a GUI was not implemented to its intended specifications. Gvt does this by constructing hierarchal models of mobile GUIs from metadata associated with both graphical mock-ups (i.e., created by designers using photo-editing software) and running instances of the GUI from the corresponding implementation. Second, we develop an approach that completely automates prototyping of GUIs for mobile apps. This approach, called ReDraw, is able to transform an image of a mobile app GUI into runnable code by detecting discrete GUI-components using computer vision techniques, classifying these components into proper functional categories (e.g., button, dropdown menu) using a Convolutional Neural Network (CNN), and assembling these components into realistic code. Finally, we design a novel approach for automated testing of mobile apps, called CrashScope, that explores a given android app using systematic input generation with the intrinsic goal of triggering crashes. The GUI-based input generation engine is driven by a combination of static and dynamic analyses that create a model of an app\u27s GUI and targets common, empirically derived root causes of crashes in android apps. We illustrate that the techniques presented in this dissertation represent significant advancements in mobile development processes through a series of empirical investigations, user studies, and industrial case studies that demonstrate the effectiveness of these approaches and the benefit they provide developers
Efficiency Matters: Speeding Up Automated Testing with GUI Rendering Inference
Due to the importance of Android app quality assurance, many automated GUI
testing tools have been developed. Although the test algorithms have been
improved, the impact of GUI rendering has been overlooked. On the one hand,
setting a long waiting time to execute events on fully rendered GUIs slows down
the testing process. On the other hand, setting a short waiting time will cause
the events to execute on partially rendered GUIs, which negatively affects the
testing effectiveness. An optimal waiting time should strike a balance between
effectiveness and efficiency. We propose AdaT, a lightweight image-based
approach to dynamically adjust the inter-event time based on GUI rendering
state. Given the real-time streaming on the GUI, AdaT presents a deep learning
model to infer the rendering state, and synchronizes with the testing tool to
schedule the next event when the GUI is fully rendered. The evaluations
demonstrate the accuracy, efficiency, and effectiveness of our approach. We
also integrate our approach with the existing automated testing tool to
demonstrate the usefulness of AdaT in covering more activities and executing
more events on fully rendered GUIs.Comment: Proceedings of the 45th International Conference on Software
Engineerin
Automated GUI Testing Techniques for Android Applications
Mobile devices are integral parts of our daily lives; a little computer in our pocket has became a faithful assistant both for work than for amusement. The availability of mobile applications (commonly referred as apps) has made more and more useful bringing these devices with us everyday. The number of such applications in these years has faced a tremendous growth due to the market attractiveness; according to Forbes, by 2017 more than 270 billion mobile applications will be downloaded worldwide. The quality of a mobile application is a major concern for developers, users and application stores. According to a survey conducted by SmartBear from October to December 2013 nearly 50% of consumers will delete a mobile app if they encounter a bug. So, testing mobile applications to prevent the occurrence of software exceptions in production can be considered one of the key factor in uencing its quality together with the market response. As today, in literature many techniques have been presented aiming at testing mobile applications. In particular, many of them have been presented in the context of GUI Testing. The research activity described in this thesis is focused on proposing novel techniques and tools in the field of Automated GUI Testing for Mobile Applications. In particular, the work is targeted to the Android Operating System, that currently is the dominating operating system in the mobile devices market, although the results can be generalized to other mobile platforms
Guiding Random Graphical and Natural User Interface Testing Through Domain Knowledge
Users have access to a diverse set of interfaces that can be used to interact with software. Tools exist for automatically generating test data for an application, but the data required by each user interface is complex. Generating realistic data similar to that of a user is difficult. The environment which an application is running inside may also limit the data available, or updates to an operating system can break support for tools that generate test data. Consequently, applications exist for which there are no automated methods of generating test data similar to that which a user would provide through real usage of a user interface. With no automated method of generating data, the cost of testing increases and there is an increased chance of bugs being released into production code. In this thesis, we investigate techniques which aim to mimic users, observing how stored user interactions can be split to generate data targeted at specific states of an application, or to generate different subareas of the data structure provided by a user interface. To reduce the cost of gathering and labelling graphical user interface data, we look at generating randomised screen shots of applications, which can be automatically labelled and used in the training stage of a machine learning model. These trained models could guide a randomised approach at generating tests, achieving a significantly higher branch coverage than an unguided random approach. However, for natural user interfaces, which allow interaction through body tracking, we could not learn such a model through generated data. We find that models derived from real user data can generate tests with a significantly higher branch coverage than a purely random tester for both natural and graphical user interfaces. Our approaches use no feedback from an application during test generation. Consequently, the models are “generating data in the dark”. Despite this, these models can still generate tests with a higher coverage than random testing, but there may be a benefit to inferring the current state of an application and using this to guide data generation
Session-Based Recommender Systems for Action Selection in GUI Test Generation
Test generation at the graphical user interface (GUI) level has proven to be
an effective method to reveal faults. When doing so, a test generator has to
repeatably decide what action to execute given the current state of the system
under test (SUT). This problem of action selection usually involves random
choice, which is often referred to as monkey testing. Some approaches leverage
other techniques to improve the overall effectiveness, but only a few try to
create human-like actions---or even entire action sequences. We have built a
novel session-based recommender system that can guide test generation. This
allows us to mimic past user behavior, reaching states that require complex
interactions. We present preliminary results from an empirical study, where we
use GitHub as the SUT. These results show that recommender systems appear to be
well-suited for action selection, and that the approach can significantly
contribute to the improvement of GUI-based test generation.Comment: 5 pages, 3 figures, to be published in ICSTW 202
- …