Maintenance of Automated Test Suites in Industry: An Empirical study on Visual GUI Testing
Context: Verification and validation (V&V) activities make up 20 to 50
percent of the total development costs of a software system in practice. Test
automation is proposed to lower these V&V costs but available research only
provides limited empirical data from industrial practice about the maintenance
costs of automated tests and what factors affect these costs. In particular,
these costs and factors are unknown for automated GUI-based testing.
Objective: This paper addresses this lack of knowledge through analysis of
the costs and factors associated with the maintenance of automated GUI-based
tests in industrial practice.
Method: An empirical study at two companies, Siemens and Saab, is reported
where interviews about, and empirical work with, Visual GUI Testing are
performed to acquire data about the technique's maintenance costs and
feasibility.
Results: 13 factors are observed that affect maintenance, e.g. tester
knowledge/experience and test case complexity. Further, statistical analysis
shows that developing new test scripts is costlier than maintenance but also
that frequent maintenance is less costly than infrequent, big bang maintenance.
In addition a cost model, based on previous work, is presented that estimates
the time to positive return on investment (ROI) of test automation compared to
manual testing.
Conclusions: It is concluded that test automation can lower overall software
development costs of a project whilst also having positive effects on software
quality. However, maintenance costs can still be considerable and the less time
a company currently spends on manual testing, the more time is required before
positive economic ROI is reached after automation.
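The relationship in the conclusion above (less manual testing time means a longer wait for positive ROI) can be illustrated with a simple linear break-even calculation. This is a minimal sketch and not the cost model from the paper: the function name, the parameters, and the assumption of constant per-cycle costs are all hypothetical simplifications.

```python
import math


def cycles_to_positive_roi(dev_cost_hours, maint_hours_per_cycle, manual_hours_per_cycle):
    """Return the first test cycle at which automation has cost less in total
    than manual testing, or None if automation never pays off.

    Assumed (hypothetical) linear model:
      automated cost after n cycles = dev_cost_hours + n * maint_hours_per_cycle
      manual cost after n cycles    = n * manual_hours_per_cycle
    """
    saving_per_cycle = manual_hours_per_cycle - maint_hours_per_cycle
    if saving_per_cycle <= 0:
        # Maintenance costs at least as much as manual testing: no break-even.
        return None
    # Smallest integer n with n * saving_per_cycle > dev_cost_hours.
    return math.floor(dev_cost_hours / saving_per_cycle) + 1


# Illustrative numbers only: same automation costs, different manual effort.
heavy_manual = cycles_to_positive_roi(400, 10, 50)
light_manual = cycles_to_positive_roi(400, 10, 30)
print(heavy_manual, light_manual)
```

With the same development and maintenance effort, the team that spends fewer hours per cycle on manual testing needs more cycles to recover the automation investment, which matches the direction of the abstract's conclusion.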
LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities
This paper investigates the application of large language models (LLM) in the
domain of mobile application test script generation. Test script generation is
a vital component of software testing, enabling efficient and reliable
automation of repetitive test tasks. However, existing generation approaches
often encounter limitations, such as difficulties in accurately capturing and
reproducing test scripts across diverse devices, platforms, and applications.
These challenges arise due to differences in screen sizes, input modalities,
platform behaviors, API inconsistencies, and application architectures.
Overcoming these limitations is crucial for achieving robust and comprehensive
test automation.
By leveraging the capabilities of LLMs, we aim to address these challenges
and explore their potential as a versatile tool for test automation. We
investigate how well LLMs can adapt to diverse devices and systems while
accurately capturing and generating test scripts. Additionally, we evaluate their
cross-platform generation capabilities by assessing their ability to handle
operating system variations and platform-specific behaviors. Furthermore, we
explore the application of LLMs in cross-app migration, where they generate test
scripts across different applications and software environments based on
existing scripts.
Throughout the investigation, we analyze their adaptability to various user
interfaces, app architectures, and interaction patterns, ensuring accurate
script generation and compatibility. The findings of this research contribute
to the understanding of LLMs' capabilities in test automation. Ultimately, this
research aims to enhance software testing practices, empowering app developers
to achieve higher levels of software quality and development efficiency.
Comment: Accepted by the 23rd IEEE International Conference on Software
Quality, Reliability, and Security (QRS 2023)
From Predicting Solar Activity to Forecasting Space Weather: Practical Examples of Research-to-Operations and Operations-to-Research
The successful transition of research to operations (R2O) and operations to
research (O2R) requires, above all, interaction between the two communities. We
explore the role that close interaction and ongoing communication played in the
successful fielding of three separate developments: an observation platform, a
numerical model, and a visualization and specification tool. Additionally, we
will examine how these three pieces came together to revolutionize
interplanetary coronal mass ejection (ICME) arrival forecasts. A discussion of
the importance of education and training in ensuring a positive outcome from
R2O activity follows. We describe efforts by the meteorological community to
make research results more accessible to forecasters and the applicability of
these efforts to the transfer of space-weather research. We end with a
forecaster "wish list" for R2O transitions. Ongoing, two-way communication
between the research and operations communities is the thread connecting it
all.
Comment: 18 pages, 3 figures, Solar Physics, in press
Using multi-locators to increase the robustness of web test cases
The main reason for the fragility of web test cases is the inability of web element locators to work correctly when the web page DOM evolves. Web element locators are used in web test cases to identify all the GUI objects to operate upon and eventually to retrieve web page content that is compared against some oracle in order to decide whether the test case has passed or not. Hence, web element locators play an extremely important role in web testing, and when a web element locator gets broken, developers have to spend substantial time and effort to repair it. While algorithms exist to produce robust web element locators to be used in web test scripts, no algorithm is perfect and different algorithms are exposed to different fragilities when the software evolves. Based on this observation, we propose a new type of locator, named multi-locator, which selects the best locator among a candidate set of locators produced by different algorithms. The selection is based on a voting procedure that assigns different voting weights to different locator generation algorithms. Experimental results obtained on six web applications, for which a subsequent release was available, show that the multi-locator is more robust than the single locators (about 30% fewer broken locators than the most robust kind of single locator) and that the execution overhead required by the multiple queries done with different locators is negligible (2-3% at most).
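The weighted voting idea described above can be sketched as follows. This is a hedged illustration and not the authors' implementation: the algorithm names, the weights, and the representation of a resolved element as a plain string identifier are all assumptions made for the example.

```python
from collections import defaultdict


def vote_for_element(resolved, weights):
    """Pick the element that receives the highest total voting weight.

    resolved: maps a locator-generation algorithm name to the element its
              locator currently resolves to, or None if the locator is broken.
    weights:  maps an algorithm name to its (hypothetical) voting weight.
    """
    tally = defaultdict(float)
    for algorithm, element in resolved.items():
        if element is not None:
            # Algorithms without a configured weight count as 1.0 (assumption).
            tally[element] += weights.get(algorithm, 1.0)
    if not tally:
        return None  # every candidate locator is broken
    return max(tally, key=tally.get)


# Hypothetical state after a DOM change: one locator broke, one drifted.
resolved = {
    "absolute_xpath": None,
    "id_based": "btn-submit",
    "robust_xpath": "btn-submit",
    "css": "btn-cancel",
}
weights = {"absolute_xpath": 0.2, "id_based": 0.5, "robust_xpath": 1.0, "css": 0.8}
print(vote_for_element(resolved, weights))
```

Because two independent locators still agree on the same element, their combined weight outvotes the single locator that drifted to a different element, which is the robustness benefit the abstract attributes to combining algorithms.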
Similarity-based Web Element Localization for Robust Test Automation
Non-robust (fragile) test execution is a commonly reported challenge in GUI-based test automation, despite much research and several proposed solutions. A test script needs to be resilient to (minor) changes in the tested application but, at the same time, fail when detecting potential issues that require investigation. Test script fragility is a multi-faceted problem. However, one crucial challenge is how to reliably identify and locate the correct target web elements when the website evolves between releases or otherwise fail and report an issue. This article proposes and evaluates a novel approach called similarity-based web element localization (Similo), which leverages information from multiple web element locator parameters to identify a target element using a weighted similarity score. This experimental study compares Similo to a baseline approach for web element localization. To get an extensive empirical basis, we target 48 of the most popular websites on the Internet in our evaluation. Robustness is considered by counting the number of web elements found in a recent website version compared to how many of these existed in an older version. Results of the experiment show that Similo outperforms the baseline; it failed to locate the correct target web element in 91 out of 801 considered cases (i.e., 11%) compared to 214 failed cases (i.e., 27%) for the baseline approach. The time efficiency of Similo was also considered, where the average time to locate a web element was determined to be 4 milliseconds. However, since the cost of web interactions (e.g., a click) is typically on the order of hundreds of milliseconds, the additional computational demands of Similo can be considered negligible. This study presents evidence that quantifying the similarity between multiple attributes of web elements when trying to locate them, as in our proposed Similo approach, is beneficial. 
With acceptable efficiency, Similo gives significantly higher effectiveness (i.e., robustness) than the baseline web element localization approach.
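A weighted similarity score of the kind described above can be sketched in a few lines. This is an illustrative simplification and not the published Similo algorithm: the attribute set, the weights, and the use of exact-match comparison for every attribute are assumptions chosen for the example.

```python
def similarity(target, candidate, weights):
    """Sum the weights of all attributes on which candidate matches target.

    target, candidate: dicts of element attributes (e.g. tag, id, text).
    weights:           dict mapping attribute name to a hypothetical weight.
    """
    score = 0.0
    for attribute, weight in weights.items():
        expected = target.get(attribute)
        if expected is not None and expected == candidate.get(attribute):
            score += weight
    return score


def locate(target, candidates, weights):
    """Return the candidate with the highest similarity score, or None."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: similarity(target, c, weights))


# Target recorded on the old version; the id changed in the new version.
target = {"tag": "button", "id": "login", "text": "Log in", "class": "btn primary"}
candidates = [
    {"tag": "button", "id": "signin", "text": "Log in", "class": "btn primary"},
    {"tag": "a", "id": "help", "text": "Help", "class": "link"},
]
weights = {"tag": 1.0, "id": 1.5, "text": 1.5, "class": 0.5}
best = locate(target, candidates, weights)
print(best["id"], similarity(target, best, weights))
```

A locator that relies on the `id` alone would break here, whereas the combined score still identifies the renamed button because its tag, text, and class are unchanged.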
Automated GUI performance testing
A significant body of prior work has devised approaches for automating the functional testing of interactive applications. However, little work exists for automatically testing their performance. Performance testing imposes additional requirements upon GUI test automation tools: the tools have to be able to replay complex interactive sessions, and they have to avoid perturbing the application's performance. We study the feasibility of using five Java GUI capture and replay tools for GUI performance test automation. Besides confirming the severity of the previously known GUI element identification problem, we also describe a related problem, the temporal synchronization problem, which is of increasing importance for GUI applications that use timer-driven activity. We find that most of the tools we study have severe limitations when used for recording and replaying realistic sessions of real-world Java applications and that all of them suffer from the temporal synchronization problem. However, we find that the most reliable tool, Pounder, causes only limited perturbation and thus can be used to automate performance testing. Based on an investigation of Pounder's approach, we further improve its robustness and reduce its perturbation. Finally, we demonstrate in a set of case studies that the conclusions about perceptible performance drawn from manual tests still hold when using automated tests driven by Pounder. Besides the significance of our findings to GUI performance testing, the results are also relevant to capture and replay-based functional GUI test automation approaches.
Creating GUI testing tools using accessibility technologies
Abstract Since manual black-box testing of GUI-based APplications (GAPs
Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design
The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interface.