    METTLE: a METamorphic testing approach to assessing and validating unsupervised machine LEarning systems

    Unsupervised machine learning is the training of an artificial intelligence system using information that is neither classified nor labeled, with a view to modeling the underlying structure or distribution in a dataset. Since unsupervised machine learning systems are widely used in many real-world applications, assessing the appropriateness of these systems and validating their implementations with respect to individual users' requirements and specific application scenarios/contexts are indisputably two important tasks. Such assessment and validation tasks, however, are fairly challenging due to the absence of a priori knowledge of the data. In view of this challenge, we develop a METamorphic Testing approach to assessing and validating unsupervised machine LEarning systems, abbreviated as METTLE. Our approach provides a new way to unveil the (possibly latent) characteristics of various machine learning systems, by explicitly considering the specific expectations and requirements of these systems from individual users' perspectives. To support METTLE, we have further formulated 11 generic metamorphic relations (MRs), covering users' generally expected characteristics that should be possessed by machine learning systems. To demonstrate the viability and effectiveness of METTLE, we have performed an experiment involving six commonly used clustering systems. Our experiment has shown that, guided by user-defined MR-based adequacy criteria, end users are able to assess, validate, and select appropriate clustering systems in accordance with their own specific needs. Our investigation has also yielded an insightful understanding and interpretation of the behavior of the machine learning systems from an end-user software engineering perspective, rather than from the perspective of a designer or implementor, who normally adopts a theoretical approach.
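
    To make the idea concrete, here is a minimal sketch (not code from the paper) of one MR of the kind METTLE formulates: permuting the order of the input samples should not change the resulting partition. scikit-learn's KMeans stands in for an arbitrary clustering system under assessment, and the dataset and parameters are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    # Source input: a synthetic dataset with clear cluster structure.
    X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

    # Follow-up input: the same samples in a shuffled order.
    rng = np.random.default_rng(0)
    perm = rng.permutation(len(X))

    source = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    followup = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X[perm])

    # Map the follow-up labels back to the original sample order, then
    # compare the two partitions up to a relabeling of clusters: an
    # Adjusted Rand Index of 1.0 means the MR is satisfied.
    ari = adjusted_rand_score(source, followup[np.argsort(perm)])
    print(f"ARI = {ari:.3f} -> MR {'satisfied' if ari == 1.0 else 'violated'}")

    A lower score on less well-separated data would expose the system's sensitivity to input order, which is exactly the kind of latent characteristic such an MR is meant to surface.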

    Using Machine Learning to Generate Test Oracles: A Systematic Literature Review

    Machine learning may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in this emerging field. Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and - most commonly - expected output oracles. Almost all studies employ a supervised or semi-supervised approach, trained on labeled system executions or code metadata - including neural networks, support vector machines, adaptive boosting, and decision trees. Oracles are evaluated using the mutation score, correct classifications, accuracy, and ROC. Work to date shows great promise, but there are significant open challenges regarding the requirements imposed on training data, the complexity of modeled functions, the ML algorithms employed - and how they are applied - the benchmarks used by researchers, and the replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field. Comment: Pre-print. Article accepted to the 1st International Workshop on Test Oracles at ESEC/FSE 202
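
    As a hedged illustration of the most common pattern the review identifies (a supervised model trained on labeled executions, serving as an expected-output oracle), the sketch below trains a decision tree on input/output pairs from a hypothetical trusted reference implementation and uses it to issue pass/fail verdicts. All names, data, and tolerances are assumptions, not drawn from any surveyed study.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Hypothetical trusted reference implementation supplying labeled executions.
    def reference_program(x):
        return 3.0 * x[0] - x[1] ** 2

    rng = np.random.default_rng(0)
    inputs = rng.uniform(-10.0, 10.0, size=(5000, 2))
    expected = np.array([reference_program(x) for x in inputs])

    # The trained model acts as a learned expected-output oracle.
    oracle = DecisionTreeRegressor(max_depth=14, random_state=0).fit(inputs, expected)

    def verdict(x, observed, tol=2.0):
        """Pass/fail verdict: flag outputs that deviate from the oracle's
        prediction by more than a tolerance chosen to exceed model error."""
        predicted = oracle.predict(np.asarray(x, dtype=float).reshape(1, -1))[0]
        return abs(observed - predicted) <= tol

    print(verdict([2.0, 1.0], 5.0))    # correct output   -> expected True
    print(verdict([2.0, 1.0], -7.0))   # injected fault   -> expected False

    The tolerance reflects the review's observation that such oracles are only as trustworthy as the training data and model accuracy allow: a tolerance tighter than the model's own error produces false alarms.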

    The Integration of Machine Learning into Automated Test Generation: A Systematic Mapping Study

    Context: Machine learning (ML) may enable effective automated test generation. Objective: We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges. Methods: We perform a systematic mapping on a sample of 102 publications. Results: ML generates input for system, GUI, unit, performance, and combinatorial testing, or improves the performance of existing generation methods. ML is also used to generate test verdict, property-based, and expected output oracles. Supervised learning - often based on neural networks - and reinforcement learning - often based on Q-learning - are common, and some publications also employ unsupervised or semi-supervised learning. (Semi-/Un-)Supervised approaches are evaluated using both traditional testing metrics and ML-related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. Conclusion: Work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed - and how they are applied - benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field. Comment: Under submission to the Software Testing, Verification, and Reliability journal. (arXiv admin note: text overlap with arXiv:2107.00906 - this is an earlier study that this study extends.)
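
    As one concrete (and purely illustrative) instance of the reinforcement-learning pattern described above, the sketch below uses tabular Q-learning with a reward tied to reaching a target state to learn a test-input sequence for a toy system under test. The state machine, action names, and hyperparameters are assumptions chosen for the example.

    import random
    from collections import defaultdict

    random.seed(0)

    # Toy system under test: a state machine whose target state 3 is only
    # reachable via the action sequence A -> B -> C.
    def sut_step(state, action):
        transitions = {(0, "A"): 1, (1, "B"): 2, (2, "C"): 3}
        return transitions.get((state, action), 0)  # anything else resets

    actions = ["A", "B", "C"]
    Q = defaultdict(float)                  # tabular Q-values: (state, action) -> value
    alpha, gamma, epsilon = 0.5, 0.9, 0.3   # learning rate, discount, exploration rate

    for episode in range(2000):
        state = 0
        for _ in range(5):                  # bounded test-sequence length
            # Epsilon-greedy choice of the next test input.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            nxt = sut_step(state, action)
            reward = 10.0 if nxt == 3 else 0.0  # reward = reaching the target
            best_next = max(Q[(nxt, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = nxt
            if state == 3:
                break

    # Extract the learned test sequence greedily from the Q-table.
    state, sequence = 0, []
    while state != 3 and len(sequence) < 5:
        action = max(actions, key=lambda a: Q[(state, a)])
        sequence.append(action)
        state = sut_step(state, action)
    print("generated test sequence:", sequence)  # expected: ['A', 'B', 'C']

    Note how the evaluation here is tied to the reward function (reaching the target state) rather than to an ML metric like accuracy, matching the mapping study's observation about how reinforcement-learning approaches are assessed.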

    Metamorphic exploration of an unsupervised clustering program

    Machine learning has become increasingly popular and widely used across various industry domains. The presence of the oracle problem, however, makes it difficult to ensure the quality of this kind of software. Furthermore, the popularity of machine learning and its applications has attracted many users who are not experts in the field. In this paper, we report on the use of a recently introduced method called metamorphic exploration, in which we propose a set of hypothesized metamorphic relations for an unsupervised clustering program in Weka, to enhance understanding of the system and support its better use.
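
    A minimal sketch of what such metamorphic exploration can look like in practice, assuming scikit-learn's KMeans as a stand-in for the Weka clustering program and a handful of hypothesized MRs chosen purely for illustration:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    X, _ = make_blobs(n_samples=150, centers=3, random_state=1)
    base = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X)

    # Hypothesized MRs: transformations a user might expect to leave the
    # clustering unchanged. Exploration means observing which ones hold.
    mrs = {
        "shift all features by +100": X + 100.0,
        "scale all features by 10":   X * 10.0,
        "scale one feature by 100":   X * np.array([100.0, 1.0]),
    }
    for name, X_t in mrs.items():
        labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X_t)
        print(f"{name:27s} ARI vs. original = {adjusted_rand_score(base, labels):.2f}")

    Unlike metamorphic testing, exploration does not issue pass/fail verdicts: a violated relation (here, most plausibly the single-feature scaling) is treated as a lesson about the system, such as k-means' sensitivity to feature scales and the consequent need for normalization.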