Parameterizing Random Test Data According to Equivalence Classes
We are concerned with the problem of detecting bugs in machine learning applications. In the absence of sufficient real-world data, creating suitably large data sets for testing can be a difficult task. Random testing is one solution, but may have limited effectiveness in cases in which a reliable test oracle does not exist, as is the case for the machine learning applications of interest. To address this problem, we have developed an approach to creating data sets called "parameterized random data generation". Our data generation framework allows us to isolate or combine different equivalence classes as desired, and then randomly generate large data sets using the properties of those equivalence classes as parameters. This allows us to take advantage of randomness but still have control over test case selection at the system testing level. We present our findings from using the approach to test two different machine learning ranking applications.
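A minimal sketch of the idea in Python; the equivalence classes, boundaries, and function names below are illustrative assumptions, not the authors' implementation:

```python
import random

# Hypothetical equivalence classes for a numeric feature, each defined by
# a sampler that only produces values satisfying that class's property.
# The class names and boundaries are illustrative assumptions.
EQUIVALENCE_CLASSES = {
    "negative":  lambda rng: rng.uniform(-1000.0, -0.001),
    "zero":      lambda rng: 0.0,
    "small_pos": lambda rng: rng.uniform(0.001, 1.0),
    "large_pos": lambda rng: rng.uniform(1.0, 1000.0),
}

def generate_dataset(classes, n_rows, n_features, seed=None):
    """Randomly generate a large data set whose values are drawn only
    from the selected equivalence classes (isolated or combined)."""
    rng = random.Random(seed)
    samplers = [EQUIVALENCE_CLASSES[c] for c in classes]
    return [
        [rng.choice(samplers)(rng) for _ in range(n_features)]
        for _ in range(n_rows)
    ]

# Isolate one class, or combine several, while keeping generation random.
only_negatives = generate_dataset(["negative"], n_rows=100_000, n_features=20, seed=42)
mixed = generate_dataset(["zero", "large_pos"], n_rows=100_000, n_features=20, seed=42)
```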
An Approach to Software Testing of Machine Learning Applications
Some machine learning applications are intended to learn properties of data sets where the correct answers are not already known to human users. It is challenging to test such ML software, because there is no reliable test oracle. We describe a software testing approach aimed at addressing this problem. We present our findings from testing implementations of two different ML ranking algorithms: Support Vector Machines and MartiRank.
Properties of Machine Learning Applications for Use in Metamorphic Testing
It is challenging to test machine learning (ML) applications, which are intended to learn properties of data sets where the correct answers are not already known. In the absence of a test oracle, one approach to testing these applications is to use metamorphic testing, in which properties of the application are exploited to define transformation functions on the input, such that the new output will be unchanged or can easily be predicted based on the original output; if the output is not as expected, then a defect must exist in the application. Here, we seek to enumerate and classify the metamorphic properties of some machine learning algorithms, and demonstrate how these can be applied to reveal defects in the applications of interest. In addition to the results of our testing, we present a set of properties that can be used to define these metamorphic relationships so that metamorphic testing can be used as a general approach to testing machine learning applications.
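As an illustration, a permutative property of the kind classified here can be checked directly. The toy classifier and test below are a hedged sketch, not code from the paper:

```python
import random

def nearest_centroid_predict(train, labels, x):
    """Toy nearest-centroid classifier: predict the label whose class
    centroid is closest to x (ties are ignored for simplicity)."""
    centroids = {}
    for row, label in zip(train, labels):
        centroids.setdefault(label, []).append(row)
    def sq_dist(rows):
        centroid = [sum(col) / len(col) for col in zip(*rows)]
        return sum((a - b) ** 2 for a, b in zip(centroid, x))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl]))

def test_permutation_property(train, labels, x, seed=0):
    """Metamorphic relation: permuting the training examples must not
    change the prediction (assuming exact arithmetic; floating-point
    summation order is itself a pitfall this kind of test can expose).
    A violated relation indicates a defect in the implementation."""
    original = nearest_centroid_predict(train, labels, x)
    paired = list(zip(train, labels))
    random.Random(seed).shuffle(paired)
    shuffled_train, shuffled_labels = zip(*paired)
    followup = nearest_centroid_predict(list(shuffled_train), list(shuffled_labels), x)
    assert followup == original, "metamorphic relation violated: possible defect"
```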
Improving the Quality of Computational Science Software by Using Metamorphic Relations to Test Machine Learning Applications
Many applications in the field of scientific computing - such as computational biology, computational linguistics, and others - depend on Machine Learning algorithms to provide important core functionality to support solutions in the particular problem domains. However, it is difficult to test such applications because often there is no 'test oracle' to indicate what the correct output should be for arbitrary input. To help address the quality of scientific computing software, in this paper we present a technique for testing the implementations of machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called 'metamorphic testing', which has been shown to be effective in such cases. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas of computational science and engineering.
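For instance, a multiplicative metamorphic relation for a distance-based classifier can be tested as follows. This is a toy sketch under the assumption of a Euclidean k-NN classifier, not the case-study framework itself:

```python
def knn_predict(train, labels, x, k=3):
    """Toy k-nearest-neighbour classifier: Euclidean distance, majority vote."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), lbl)
        for row, lbl in zip(train, labels)
    )
    top = [lbl for _, lbl in dists[:k]]
    return max(set(top), key=top.count)

def test_multiplicative_property(train, labels, x, c=2.5):
    """Metamorphic relation: multiplying every feature value by a positive
    constant scales all Euclidean distances uniformly, so the predicted
    class must not change (assuming exact arithmetic and no distance ties)."""
    original = knn_predict(train, labels, x)
    scaled_train = [[c * v for v in row] for row in train]
    scaled_x = [c * v for v in x]
    followup = knn_predict(scaled_train, labels, scaled_x)
    assert followup == original, "metamorphic relation violated: possible defect"
```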
On testing effectiveness of metamorphic relations: A case study
One fundamental challenge for software testing is the oracle problem: either there does not exist a mechanism (called an oracle) to verify the test output for every possible program input, or it is very expensive, if not impossible, to apply the oracle. Metamorphic testing is an innovative approach to the oracle problem. In metamorphic testing, metamorphic relations are derived from the innate characteristics of the software under test. These relations can help to generate test data and to verify the correctness of the test results without the need for an oracle. The effectiveness of metamorphic relations can play a significant role in the testing process. It has been argued that metamorphic relations that cause different software execution behaviors should have high fault-detection ability. In this paper, we conduct a case study to analyze the relationship between the execution behavior and the fault-detection effectiveness of metamorphic relations. Some code coverage criteria are used to reflect the execution behavior. We show that there is a certain degree of correlation between the code coverage achieved by a metamorphic relation and its fault-detection effectiveness.
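One crude way to reflect execution behavior in the spirit of the study is to record the lines exercised while each metamorphic relation's test runs. This is an illustrative Python sketch; the paper itself uses established code coverage criteria:

```python
import sys

def lines_executed(fn, *args, **kwargs):
    """Crude per-relation coverage measure: record the (filename, lineno)
    pairs executed while a metamorphic test runs."""
    covered = set()
    def tracer(frame, event, arg):
        if event == "line":
            covered.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer
    sys.settrace(tracer)
    try:
        fn(*args, **kwargs)
    finally:
        sys.settrace(None)
    return covered

# Compare the coverage achieved by two metamorphic relations on the same
# software under test; the study asks whether such coverage differences
# correlate with fault-detection effectiveness.
# cov_mr1 = lines_executed(test_permutation_property, train, labels, x)
# cov_mr2 = lines_executed(test_multiplicative_property, train, labels, x)
# print(len(cov_mr1 - cov_mr2), "lines exercised only by the first relation")
```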
Deux: Autonomic Testing System for Operating System Upgrades
Operating system upgrades and patches sometimes break applications that worked fine on the older version. We present an autonomic approach to testing OS updates while minimizing downtime, usable without local regression suites or IT expertise. Deux utilizes a dual-layer virtual machine architecture, with lightweight application process checkpoint and resume across OS versions, enabling simultaneous execution of the same applications on both OS versions in different VMs. Inputs provided by ordinary users to the production old version are also fed to the new version. The old OS acts as a pseudo-oracle for the update, and application state is automatically re-cloned to continue testing after any output discrepancies (intercepted at the system call level), all transparently to users. If all differences are deemed inconsequential, then the VM roles are switched with the application state already in place. Our empirical evaluation with both LAMP and standalone applications demonstrates Deux's efficiency and effectiveness.
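The pseudo-oracle comparison at the heart of this approach can be illustrated, in much-simplified form, by a differential harness. The commands below are placeholders; Deux itself intercepts at the system-call level inside paired VMs:

```python
import subprocess

def pseudo_oracle_check(cmd_old, cmd_new, user_input):
    """Simplified illustration of the pseudo-oracle idea: feed the same
    user input to the application under the old and new environments and
    flag any output discrepancy. cmd_old/cmd_new are placeholder argument
    lists for launching the application in each environment."""
    old = subprocess.run(cmd_old, input=user_input, capture_output=True, text=True)
    new = subprocess.run(cmd_new, input=user_input, capture_output=True, text=True)
    if (old.stdout, old.returncode) != (new.stdout, new.returncode):
        return False  # discrepancy: inspect it, then re-clone state and continue
    return True
```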
A Framework for Quality Assurance of Machine Learning Applications
Some machine learning applications are intended to learn properties of data sets where the correct answers are not already known to human users. It is challenging to test and debug such ML software, because there is no reliable test oracle. We describe a framework and a collection of tools aimed at assisting with this problem. We present our findings from using the testing framework with three implementations of an ML ranking algorithm (all of which had bugs).
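One way to read the setup with three implementations is as differential testing, where independent implementations of the same algorithm act as pseudo-oracles for one another. A hedged sketch, with hypothetical implementation names:

```python
from itertools import combinations

def cross_check(implementations, dataset):
    """With no reliable oracle, compare N independent implementations of
    the same ranking algorithm pairwise on the same input: any pair that
    disagrees exposes a bug in at least one of them."""
    disagreements = []
    for (name_a, rank_a), (name_b, rank_b) in combinations(implementations, 2):
        if rank_a(dataset) != rank_b(dataset):
            disagreements.append((name_a, name_b))
    return disagreements

# Hypothetical usage with three independent ranking implementations:
# issues = cross_check([("impl_a", rank_a), ("impl_b", rank_b), ("impl_c", rank_c)], data)
```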
Using JML Runtime Assertion Checking to Automate Metamorphic Testing in Applications without Test Oracles
It is challenging to test applications and functions for which the correct output for arbitrary input cannot be known in advance, e.g., some computational science or machine learning applications. In the absence of a test oracle, one approach to testing these applications is to use metamorphic testing: existing test case input is modified to produce new test cases in such a manner that, when given the new input, the application should produce an output that can easily be computed based on the original output. That is, if input x produces output f(x), then we create input x' such that we can predict f(x') based on f(x); if the application or function does not produce the expected output, then a defect must exist, and either f(x) or f(x') (or both) is wrong. By using metamorphic testing, we are able to provide built-in 'pseudo-oracles' for these so-called 'non-testable programs' that have no test oracles. In this paper, we describe an approach in which a function's metamorphic properties are specified using an extension to the Java Modeling Language (JML), a behavioral interface specification language that is used to support the 'design by contract' paradigm in Java applications. Our implementation, called Corduroy, pre-processes these specifications and generates test code that can be executed using JML runtime assertion checking, ensuring that the specifications hold during program execution. In addition to presenting our approach and implementation, we also describe our findings from case studies in which we apply our technique to applications without test oracles.
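A rough Python analogue of the runtime check (Corduroy itself pre-processes JML-style specifications on Java methods; the decorator below only mimics the idea):

```python
import functools

def metamorphic(transform_input, relate_outputs):
    """Python analogue of a runtime-checked metamorphic specification:
    on every call, run f on the original and the transformed input and
    assert that the declared relation between the outputs holds."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(x):
            fx = f(x)
            fx_prime = f(transform_input(x))
            # If the relation fails, a defect exists: f(x) or f(x') is wrong.
            assert relate_outputs(fx, fx_prime), "metamorphic spec violated"
            return fx
        return wrapper
    return decorator

# Example spec: doubling every element of the input should double the sum.
@metamorphic(transform_input=lambda xs: [2 * v for v in xs],
             relate_outputs=lambda fx, fxp: fxp == 2 * fx)
def total(xs):
    return sum(xs)

total([1, 2, 3])  # the check runs transparently; raises if the relation fails
```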