An Approach to Software Testing of Machine Learning Applications
Some machine learning applications are intended to learn properties of data sets where the correct answers are not already known to human users. It is challenging to test such ML software because there is no reliable test oracle. We describe a software testing approach aimed at addressing this problem, and present our findings from testing implementations of two different ML ranking algorithms: Support Vector Machines and MartiRank.
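The abstract does not spell out the approach itself; as a minimal sketch of one oracle-free check in this spirit (not the paper's actual harness), a deterministic ranker must rank a permuted input consistently. The `rank` helper and the linear scorer below are illustrative stand-ins:

```python
# Sketch of an oracle-free check for a ranking implementation: with no
# ground-truth ranking available, we can still assert that permuting the
# input rows permutes the output ranks correspondingly. score_fn stands
# in for the ranker under test (e.g. an SVM decision function); the
# linear scorer at the bottom is purely illustrative.
import numpy as np

def rank(score_fn, X):
    # Higher score = better rank; rank 0 is best.
    order = np.argsort(-score_fn(X), kind="stable")
    ranks = np.empty(len(X), dtype=int)
    ranks[order] = np.arange(len(X))
    return ranks

def check_permutation_consistency(score_fn, X, seed=0):
    perm = np.random.default_rng(seed).permutation(len(X))
    original = rank(score_fn, X)
    permuted = rank(score_fn, X[perm])
    # X[perm][j] is the original item perm[j], so its rank in the
    # permuted run must equal that item's original rank.
    assert np.array_equal(permuted, original[perm]), "ranking not permutation-consistent"

w = np.array([0.5, -1.2, 2.0])
X = np.random.default_rng(1).random((50, 3))
check_permutation_consistency(lambda M: M @ w, X)
print("permutation consistency holds")
```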
Identifying Implementation Bugs in Machine Learning based Image Classifiers using Metamorphic Testing
We have recently witnessed tremendous success of Machine Learning (ML) in practical applications. Computer vision, speech recognition and language translation have all seen near human-level performance. We expect that, in the near future, most business applications will have some form of ML. However, testing such applications is extremely challenging and would be very expensive if we follow today's methodologies. In this work, we present an articulation of the challenges in testing ML-based applications. We then present our solution approach, based on the concept of Metamorphic Testing, which aims to identify implementation bugs in ML-based image classifiers. We have developed metamorphic relations for an application based on Support Vector Machine and a Deep Learning based application. Empirical validation showed that our approach was able to catch 71% of the implementation bugs in the ML applications.
Comment: Published at the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2018).
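The specific relations are not reproduced in the abstract; as a minimal sketch of what such a metamorphic relation looks like (scikit-learn's SVC on the digits dataset is an assumed stand-in for the classifiers under test), permuting the order of the training data should leave a deterministic classifier's predictions unchanged:

```python
# Sketch of a metamorphic relation (MR) for an image classifier:
# permuting the order of the training examples must not change a
# deterministic classifier's predictions. scikit-learn's SVC on the
# digits dataset is an assumed stand-in for the classifiers under test.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, y_train, X_test = X[:1500], y[:1500], X[1500:]

# Source execution: train on the data as given.
pred_a = SVC(kernel="linear").fit(X_train, y_train).predict(X_test)

# Follow-up execution: train on a permuted copy of the same data.
perm = np.random.default_rng(0).permutation(len(X_train))
pred_b = SVC(kernel="linear").fit(X_train[perm], y_train[perm]).predict(X_test)

# A disagreement points to an implementation bug, not a modeling choice.
# (In practice a small tolerance may be needed for iterative solvers.)
assert np.array_equal(pred_a, pred_b), "MR violated: training order changed output"
```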
Testing and Validating Machine Learning Classifiers by Metamorphic Testing
Machine Learning algorithms have provided important core functionality to support solutions in many scientific computing applications - such as computational biology, computational linguistics, and others. However, it is difficult to test such applications because often there is no "test oracle" to indicate what the correct output should be for arbitrary input. To help address the quality of scientific computing software, in this paper we present a technique for testing the implementations of machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called "metamorphic testing", which has been shown to be effective in such cases. Also presented is a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method is highly effective at killing mutants, and that observing the expected cross-validation result alone is not sufficient to test for the correctness of a supervised classification program. Metamorphic testing is strongly recommended as a complementary approach. Finally, we discuss how our findings can be used in other areas of computational science and engineering.
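One classic relation for classifiers of this kind, consistence with affine transformation, can be sketched as follows; scikit-learn's kNN on the Iris data is an assumed stand-in for an implementation under test, not the paper's subject:

```python
# Sketch of the classic "consistence with affine transformation" MR:
# mapping every attribute x -> k*x + b (k > 0) in both training and test
# data preserves Euclidean neighbour ordering, so a correct kNN must
# produce identical predictions. scikit-learn's kNN on Iris is an
# assumed stand-in for the implementation under test.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, y_train, X_test = X[:120], y[:120], X[120:]

knn = KNeighborsClassifier(n_neighbors=5)
original = knn.fit(X_train, y_train).predict(X_test)

k, b = 3.0, 7.0  # arbitrary positive scale and offset
transformed = knn.fit(k * X_train + b, y_train).predict(k * X_test + b)

# Any disagreement is a defect in the implementation, not in the model.
assert np.array_equal(original, transformed), "affine-transformation MR violated"
```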
Improving the Quality of Computational Science Software by Using Metamorphic Relations to Test Machine Learning Applications
Many applications in the field of scientific computing - such as computational biology, computational linguistics, and others - depend on Machine Learning algorithms to provide important core functionality to support solutions in their particular problem domains. However, it is difficult to test such applications because often there is no 'test oracle' to indicate what the correct output should be for arbitrary input. To help address the quality of scientific computing software, in this paper we present a technique for testing the implementations of machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called 'metamorphic testing', which has been shown to be effective in such cases. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas of computational science and engineering.
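To illustrate the kind of pitfall a metamorphic relation can expose, here is a sketch with a deliberately seeded bug (an off-by-one in a hand-rolled nearest-neighbour distance); the bug, data, and harness are all illustrative, not from the study:

```python
# Sketch of how a metamorphic relation exposes a typical implementation
# pitfall: an off-by-one in a hand-rolled nearest-neighbour distance that
# silently drops the last feature. Bug, data, and harness are illustrative.
import numpy as np

def buggy_distance(a, b):
    # BUG: range(len(a) - 1) ignores the final feature.
    return sum((a[i] - b[i]) ** 2 for i in range(len(a) - 1))

def nn_predict(X_train, y_train, x):
    nearest = min(range(len(X_train)), key=lambda i: buggy_distance(X_train[i], x))
    return y_train[nearest]

rng = np.random.default_rng(0)
X_train = rng.random((30, 4))
y_train = rng.integers(0, 2, 30)

# MR: permuting the feature columns consistently in training and test
# data must not change a nearest-neighbour prediction.
violations = 0
for _ in range(100):
    x = rng.random(4)
    perm = rng.permutation(4)
    if nn_predict(X_train, y_train, x) != nn_predict(X_train[:, perm], y_train, x[perm]):
        violations += 1
print(f"{violations}/100 follow-up cases violated the MR")  # nonzero exposes the bug
```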
Teaching agents to learn: from user study to implementation
Graphical user interfaces have helped center computer use on viewing and editing rather than on programming. Yet the need for end-user programming continues to grow. Software developers have responded to the demand with a barrage of customizable applications and operating systems. But the learning curve associated with a high level of customizability - even in GUI-based operating systems - often prevents users from easily modifying their software. Ironically, the question has become, "What is the easiest way for end users to program?" Perhaps the best way to customize a program, given current interface and software design, is for users to annotate tasks - verbally or via the keyboard - as they are executing them. Experiments have shown that users can "teach" a computer most easily by demonstrating a desired behavior. But the teaching approach raises new questions about how the system, as a learning machine, will correlate, generalize, and disambiguate a user's instructions.
To understand how best to create a system that can learn, the authors conducted an experiment in which users attempted to train an intelligent agent to edit a bibliography. Armed with the results of these experiments, the authors implemented an interactive machine learning system, which they call the Configurable Instructible Machine Architecture (Cima). Designed to acquire behavior concepts from a few examples, Cima keeps users informed and allows them to influence the course of learning. Programming by demonstration reduces boring, repetitive work. Perhaps the most important lesson the authors learned is the value of involving users in the design process. By testing and critiquing their design ideas, users keep the designers focused on their objective: agents that make computer-based work more productive and more enjoyable.
Extracting Tasks from Customize Portal using Natural Language Processing
In software documentation, product knowledge and software requirements are very important for improving product quality. During the maintenance stage, developers cannot feasibly read the whole documentation of a large corpus; they need to absorb the software documentation (covering development, design, testing, etc.) in a short period of time. Important documents can be recorded in the software documentation, yet a gap remains between the information developers want and what the documentation provides. To solve this problem, we describe an approach for extracting relevant tasks that heuristically matches the structure of the documentation across three phases (documentation, development, and testing). Our main idea is that tasks are extracted automatically from the software documentation, freeing developers to easily get the required data through a customizable portal using the WordNet library and machine learning techniques. Task categories can then be generated from existing applications using natural language processing. Our approach uses the WordNet library to identify relevant tasks by calculating the frequency of each word, which allows developers to discover how words are used in a piece of software.
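The abstract names WordNet and word frequencies but no concrete API; a minimal sketch of that idea, assuming NLTK's WordNet interface, might look like:

```python
# Sketch of the WordNet + word-frequency idea: keep only tokens that
# WordNet recognises as verbs (task-like words such as "install" or
# "configure"), then rank them by frequency across the documentation.
# NLTK is an assumed toolkit; the paper does not name its API.
import re
from collections import Counter

import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

def candidate_task_terms(doc_text, top_n=10):
    tokens = re.findall(r"[a-z]+", doc_text.lower())
    # A token is "task-like" if WordNet lists at least one verb sense.
    task_like = [t for t in tokens if wn.synsets(t, pos=wn.VERB)]
    return Counter(task_like).most_common(top_n)

doc = "Install the package, configure the portal, then run the tests."
print(candidate_task_terms(doc))
# e.g. [('install', 1), ('configure', 1), ('run', 1), ...]
```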
Empirical Evaluation of Approaches to Testing Applications without Test Oracles
Software testing of applications in fields like scientific computing, simulation, machine learning, etc. is particularly challenging because many applications in these domains have no reliable "test oracle" to indicate whether the program's output is correct when given arbitrary input. A common approach to testing such applications has been to use a "pseudo-oracle", in which multiple independently-developed implementations of an algorithm process an input and the results are compared. Other approaches include the use of program invariants, formal specification languages, trace and log file analysis, and metamorphic testing. In this paper, we present the results of two empirical studies in which we compare the effectiveness of some of these approaches, including metamorphic testing, pseudo-oracles, and runtime assertion checking. We also analyze the results in terms of the software development process, and discuss suggestions for practitioners and researchers who need to test software without a test oracle.
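As a minimal sketch of the pseudo-oracle idea (the routines below are illustrative, not the study's subjects): two independently developed implementations of the same statistic are run on identical input, and any disagreement beyond tolerance flags a defect in at least one of them:

```python
# Sketch of a pseudo-oracle: two independently developed implementations
# of the same statistic are run on identical input and compared. Both
# variance routines here are illustrative, not from the study.
import math
import random

def variance_two_pass(xs):
    # Textbook two-pass algorithm: mean first, then squared deviations.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def variance_one_pass(xs):
    # Naive one-pass formula E[X^2] - E[X]^2 (numerically fragile).
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2

random.seed(0)
data = [random.gauss(1e6, 1.0) for _ in range(10_000)]  # large mean on purpose

a, b = variance_two_pass(data), variance_one_pass(data)
if math.isclose(a, b, rel_tol=1e-6):
    print("pseudo-oracle: implementations agree")
else:
    # With no ground-truth oracle, disagreement flags a defect in at least
    # one version -- here, catastrophic cancellation in the one-pass code.
    print(f"pseudo-oracle: disagreement ({a} vs {b}), investigate both")
```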
The scenario coevolution paradigm: adaptive quality assurance for adaptive systems
Systems are becoming increasingly adaptive, using techniques like machine learning to enhance their behavior on their own rather than only through human developers programming them. We analyze the impact the advent of these new techniques has on the discipline of rigorous software engineering, especially on the issue of quality assurance. To this end, we provide a general description of the processes related to machine learning and embed them into a formal framework for the analysis of adaptivity, recognizing that testing an adaptive system requires a new approach to adaptive testing. We introduce scenario coevolution as a design pattern describing how system and test can work as antagonists in the process of software evolution. While the general pattern applies to large-scale processes (including human developers further augmenting the system), we show all techniques on a smaller-scale example of an agent navigating a simple smart factory. We point out new aspects in software engineering for adaptive systems that may be tackled naturally using scenario coevolution. This work is a substantially extended take on Gabor et al. (International Symposium on Leveraging Applications of Formal Methods, Springer, pp 137–154, 2018).
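The paper's formal framework is not reproduced in the abstract; a toy sketch of the antagonistic loop, with the one-dimensional "skill" domain and all names assumed purely for illustration, might look like:

```python
# Toy sketch of scenario coevolution: the system adapts to pass the
# current test scenarios while the scenario pool adapts to keep
# challenging the system. The one-dimensional "skill" domain and all
# names are assumptions for illustration only.
import random

random.seed(1)

def passes(skill, scenario):
    return skill >= scenario  # a scenario is just a required skill level

skill = 0.1                                            # the adaptive system
scenarios = [random.random() * 0.3 for _ in range(5)]  # the antagonist suite

for generation in range(20):
    # System step: adapt toward the hardest currently failing scenario.
    failures = [s for s in scenarios if not passes(skill, s)]
    if failures:
        skill += 0.5 * (max(failures) - skill)
    # Test step: retire the easiest scenario and evolve a harder variant,
    # keeping the suite an effective antagonist.
    scenarios.remove(min(scenarios))
    scenarios.append(min(1.0, max(scenarios) + random.uniform(0.0, 0.1)))

print(f"final skill {skill:.2f}, hardest scenario {max(scenarios):.2f}")
```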
Jamming Detection and Classification in OFDM-based UAVs via Feature- and Spectrogram-tailored Machine Learning
In this paper, a machine learning (ML) approach is proposed to detect and classify jamming attacks against orthogonal frequency division multiplexing (OFDM) receivers, with applications to unmanned aerial vehicles (UAVs). Using software-defined radio (SDR), four types of jamming attacks, namely barrage, protocol-aware, single-tone, and successive-pulse, are launched and investigated. Each type is qualitatively evaluated considering jamming range, launch complexity, and attack severity. Then, a systematic testing procedure is established by placing an SDR in the vicinity of a UAV (i.e., drone) to extract radiometric features before and after a jamming attack is launched. Numeric features that include signal-to-noise ratio (SNR), energy threshold, and key OFDM parameters are used to develop a feature-based classification model via conventional ML algorithms. Furthermore, spectrogram images collected following the same testing procedure are exploited to build a spectrogram-based classification model via state-of-the-art deep learning algorithms (i.e., convolutional neural networks). The performance of both types of algorithms is analyzed quantitatively with metrics including detection and false-alarm rates. Results show that the spectrogram-based model classifies jamming with an accuracy of 99.79% and a false-alarm rate of 0.03%, compared to 92.20% and 1.35%, respectively, for the feature-based counterpart.
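As a hedged sketch of the feature-based branch (the real radiometric features and dataset are not public here, so the feature distributions below are invented stand-ins for SNR and energy measurements):

```python
# Sketch of the feature-based branch: radiometric features (synthetic
# stand-ins for SNR and an energy measure) feed a conventional ML
# classifier over the four jamming classes plus "no jamming". Feature
# distributions are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
classes = ["none", "barrage", "protocol-aware", "single-tone", "successive-pulse"]

def sample(label, n=200):
    snr = rng.normal(20 - 4 * label, 2.0, n)        # jamming lowers SNR
    energy = rng.normal(1.0 + 0.5 * label, 0.2, n)  # and raises energy
    return np.column_stack([snr, energy]), np.full(n, label)

X, y = map(np.concatenate, zip(*[sample(i) for i in range(len(classes))]))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), target_names=classes))
```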