Testing GUI-based Software with Undetermined Input Spaces
Most software applications feature a Graphical User Interface (GUI) front-end as the main, and often the only, method for the user to interact with the software. System-testing a software application requires it to be tested as a whole through the GUI. Testers need to generate sequences of GUI events (e.g., mouse clicks and menu selections) to exercise various behaviors of the application. Because the input space of a typical GUI (i.e., the space of all possible GUI events and their interactions) is often enormous, manual GUI testing is impractical. Model-based testing is a new approach that automatically and systematically generates a large number of test cases by leveraging a formal model representing the GUI input space. Unfortunately, modern applications often have a "context-sensitive reachability GUI," in which the GUI components are only reachable under particular state or environment constraints. Thus, it is challenging to determine the GUI input space and obtain a GUI model for automated GUI testing.
This research proposes new testing techniques to tackle the challenges in model-based GUI testing. The central thesis is this: GUI-based applications can be effectively and efficiently tested by systematically and incrementally leveraging the application runtime execution observations.
To explore the thesis, a novel model-based testing paradigm called Observer-Model-Exercise* (OME*) is developed. This paradigm relies on the opportunistic observations obtained during test case execution to incrementally explore the GUI input space and construct a GUI model for test case generation.
To evaluate OME*, an open-source automated model-based GUI testing framework called GUITAR is developed. An empirical study with 8 widely-used open-source applications demonstrated that the OME* approach is feasible. Compared to previous model-based testing approaches, OME* was able to increase the GUI input space discovered by as much as 1,044%. As a result, 34 new faults were detected in the subject applications.
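The incremental loop described in this abstract can be illustrated with a minimal sketch. It is not the GUITAR implementation: the toy application `TOY_APP`, the helper `execute`, and the function `ome_loop` are all hypothetical names introduced here, and the "GUI" is just a dictionary in which some events only become visible after other events have been performed.

```python
from collections import deque

# Hypothetical toy application: window state -> {event: next window state}.
# "tune" is only reachable after "open_prefs" then "enable_advanced",
# which mimics context-sensitive reachability.
TOY_APP = {
    "main": {"open_file": "file_dialog", "open_prefs": "prefs"},
    "file_dialog": {"cancel": "main", "ok": "main"},
    "prefs": {"enable_advanced": "prefs_advanced", "close": "main"},
    "prefs_advanced": {"tune": "prefs_advanced", "close": "main"},
}


def execute(test_case, start="main"):
    """Run an event sequence and return the (state, visible events) seen at each step."""
    state = start
    observations = [(state, frozenset(TOY_APP[state]))]
    for event in test_case:
        state = TOY_APP[state][event]
        observations.append((state, frozenset(TOY_APP[state])))
    return observations


def ome_loop(start="main"):
    """Observe, model, exercise: grow the model from what test runs reveal."""
    model = {}                # state -> events observed in that state
    pending = deque([()])     # test cases waiting to run; start with the empty one
    executed = []
    while pending:
        test = pending.popleft()
        executed.append(test)
        # Observe: record what each step of the run exposes.
        for i, (state, events) in enumerate(execute(test, start)):
            if state in model:
                continue
            # Model: add the newly discovered state and its events.
            model[state] = set(events)
            # Exercise: schedule one new test per newly discovered event,
            # reusing the prefix that reached this state.
            prefix = test[:i]
            for event in sorted(events):
                pending.append(prefix + (event,))
    return model, executed


if __name__ == "__main__":
    model, executed = ome_loop()
    for state, events in model.items():
        print(state, sorted(events))
    print(len(executed), "test cases executed")
```

In this sketch the model starts empty and every executed test case can enlarge it, so the events behind `enable_advanced` are found only because an earlier test happened to open the preferences window.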
Using Spec Explorer for Automatic Checking of Constraints in Software-Controlled Systems
In software engineering, several formal models and tools have been proposed for defining system requirements and constraints formally. Such formal definitions enable automatic checking and verification of those requirements and constraints. They also support automatic test case generation, execution and verification. In this paper, we demonstrate and evaluate the use of Spec Explorer from Microsoft for defining and checking example software-controlled systems such as cruise control. Such formal requirements can eventually be embedded in the developed system, or can help expose important elements to test, either in the testing stage or through use of the application. Keywords: Model-Based Testing, Spec Explorer, FSM Models, Software-Controlled Systems
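To make the idea of checking a constraint against a finite-state model concrete, here is a small sketch in Python. It does not use Spec Explorer or its modeling language; the cruise-control states, the action names, and the constraint "cruise may only be engaged while the engine is on" are assumptions chosen for illustration.

```python
from collections import deque

# State is (engine_on, cruise_on). All action names are hypothetical.
ACTIONS = ("start_engine", "stop_engine", "engage_cruise", "press_brake")


def step(state, action, buggy=False):
    """Return the next state, or None if the action is not enabled."""
    engine, cruise = state
    if action == "start_engine" and not engine:
        return (True, False)
    if action == "stop_engine" and engine:
        # The buggy variant forgets to disengage cruise when the engine stops.
        return (False, cruise if buggy else False)
    if action == "engage_cruise" and engine and not cruise:
        return (True, True)
    if action == "press_brake" and cruise:
        return (engine, False)
    return None


def invariant(state):
    """Constraint under check: cruise on implies engine on."""
    engine, cruise = state
    return engine or not cruise


def explore(buggy=False):
    """Breadth-first exploration; return a violating action trace, or None."""
    init = (False, False)
    seen = {init}
    queue = deque([(init, ())])
    while queue:
        state, trace = queue.popleft()
        if not invariant(state):
            return trace
        for action in ACTIONS:
            nxt = step(state, action, buggy)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + (action,)))
    return None


if __name__ == "__main__":
    print("correct model:", explore(buggy=False))
    print("buggy model:  ", explore(buggy=True))
```

A trace returned by `explore` doubles as a test case: replaying those actions against an implementation should never reach a state that breaks the constraint.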
Translating Video Recordings of Mobile App Usages into Replayable Scenarios
Screen recordings of mobile applications are easy to obtain and capture a
wealth of information pertinent to software developers (e.g., bugs or feature
requests), making them a popular mechanism for crowdsourced app feedback. Thus,
these videos are becoming a common artifact that developers must manage. In
light of unique mobile development constraints, including swift release cycles
and rapidly evolving platforms, automated techniques for analyzing all types of
rich software artifacts provide benefit to mobile developers. Unfortunately,
automatically analyzing screen recordings presents serious challenges, due to
their graphical nature, compared to other types of (textual) artifacts. To
address these challenges, this paper introduces V2S, a lightweight, automated
approach for translating video recordings of Android app usages into replayable
scenarios. V2S is based primarily on computer vision techniques and adapts
recent solutions for object detection and image classification to detect and
classify user actions captured in a video, and convert these into a replayable
test scenario. We performed an extensive evaluation of V2S involving 175 videos
depicting 3,534 GUI-based actions collected from users exercising features and
reproducing bugs from over 80 popular Android apps. Our results illustrate that
V2S can accurately replay scenarios from screen recordings, and is capable of
reproducing 89% of our collected videos with minimal overhead. A case
study with three industrial partners illustrates the potential usefulness of
V2S from the viewpoint of developers. Comment: In proceedings of the 42nd International Conference on Software
Engineering (ICSE'20), 13 pages
UI-Design driven model-based testing
Testing interactive systems is notoriously difficult. Not only do we need to ensure that the functionality of the developed system is correct with respect to the requirements and specifications, we also need to ensure that the user interface to the system is correct (enables a user to access the functionality correctly) and is usable. These different requirements of interactive system testing are not easily combined within a single testing strategy. We investigate the use of models of interactive systems, which have been derived from design artefacts, as the basis for generating tests for an implemented system. We give a model-based method for testing interactive systems which has low overhead in terms of the models required and which enables testing of UI and system functionality from the perspective of user interaction.
Overcoming Language Dichotomies: Toward Effective Program Comprehension for Mobile App Development
Mobile devices and platforms have become an established target for modern
software developers due to performant hardware and a large and growing user
base numbering in the billions. Despite their popularity, the software
development process for mobile apps comes with a set of unique, domain-specific
challenges rooted in program comprehension. Many of these challenges stem from
developer difficulties in reasoning about different representations of a
program, a phenomenon we define as a "language dichotomy". In this paper, we
reflect upon the various language dichotomies that contribute to open problems
in program comprehension and development for mobile apps. Furthermore, to help
guide the research community towards effective solutions for these problems, we
provide a roadmap of directions for future work. Comment: Invited Keynote Paper for the 26th IEEE/ACM International Conference
on Program Comprehension (ICPC'18)
Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology
and framework for efficient and effective real-time malware detection,
leveraging the best of conventional machine learning (ML) and deep learning
(DL) algorithms. In PROPEDEUTICA, all software processes in the system start
execution subjected to a conventional ML detector for fast classification. If a
piece of software receives a borderline classification, it is subjected to
further analysis via more performance expensive and more accurate DL methods,
via our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays
to the execution of software subjected to deep learning analysis as a way to
"buy time" for DL analysis and to rate-limit the impact of possible malware in
the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and
877 commonly used benign software samples from various categories for the
Windows OS. Our results show that the false positive rate for conventional ML
methods can reach 20%, and for modern DL methods it is usually below 6%.
However, the classification time for DL can be 100X longer than conventional ML
methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional
ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the
percentage of software subjected to DL analysis was approximately 40% on
average. Further, the application of delays in software subjected to ML reduced
the detection time by approximately 10%. Finally, we found and discussed a
discrepancy between the detection accuracy offline (analysis after all traces
are collected) and on-the-fly (analysis in tandem with trace collection). Our
insights show that conventional ML and modern DL-based malware detectors in
isolation cannot meet the needs of efficient and effective malware detection:
high accuracy, low false positive rate, and short classification time. Comment: 17 pages, 7 figures
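The two-stage scheme described in this abstract (a fast detector first, with borderline cases escalated to a slower and more accurate one) can be sketched as follows. This is only an illustration of the cascade idea, not the PROPEDEUTICA or DEEPMALWARE implementation: the function `cascade_classify`, the thresholds `low` and `high`, and the toy scoring functions are all hypothetical.

```python
def cascade_classify(sample, fast_score, slow_score, low=0.3, high=0.7):
    """Classify with the fast detector; escalate only borderline scores.

    fast_score and slow_score return a malware probability in [0, 1].
    The thresholds are illustrative assumptions, not values from the paper.
    Returns (label, stage), where stage names the detector that decided.
    """
    p = fast_score(sample)
    if p < low:
        return "benign", "ML"
    if p > high:
        return "malware", "ML"
    # Borderline: pay for the slower, more accurate detector.
    q = slow_score(sample)
    return ("malware" if q >= 0.5 else "benign"), "DL"


if __name__ == "__main__":
    # Toy stand-ins for the detectors: each sample carries fixed scores.
    samples = {
        "calc.exe": {"fast": 0.05, "slow": 0.02},
        "dropper.exe": {"fast": 0.95, "slow": 0.99},
        "packed_tool.exe": {"fast": 0.50, "slow": 0.10},
        "stealthy.exe": {"fast": 0.45, "slow": 0.90},
    }

    def fast(name):
        return samples[name]["fast"]

    def slow(name):
        return samples[name]["slow"]

    for name in samples:
        print(name, cascade_classify(name, fast, slow))
```

Widening the gap between `low` and `high` sends more samples to the slow stage, trading classification time for accuracy; the abstract reports that roughly 40% of software was escalated on average in the authors' evaluation.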