Search CORE

104,603 research outputs found

Empirical Evaluation of Test Coverage for Functional Programs

Author: Cheng Yufeng
Hao Dan
Wang Meng
Xiong Yingfei
Zhang Lu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

The correlation between test coverage and test effectiveness is important to justify the use of coverage in practice. Existing results on imperative programs mostly show that test coverage predicates effectiveness. However, since functional programs are usually structurally different from imperative ones, it is unclear whether the same result may be derived and coverage can be used as a prediction of effectiveness on functional programs. In this paper we report the first empirical study on the correlation between test coverage and test effectiveness on functional programs. We consider four types of coverage: as input coverages, statement/branch coverage and expression coverage, and as oracle coverages, count of assertions and checked coverage. We also consider two types of effectiveness: raw effectiveness and normalized effectiveness. Our results are twofold. (1) In general the findings on imperative programs still hold on functional programs, warranting the use of coverage in practice. (2) On specific coverage criteria, the results may be unexpected or different from the imperative ones, calling for further studies on functional programs

Crossref

Kent Academic Repository

Explore Bristol Research

Recommended from our members

An empirical investigation into the impact of refactoring on regression testing

Author: Rachatasumrit Napol
Publication venue
Publication date: 04/05/2012
Field of study

It is widely believed that refactoring improves software quality and developer’s productivity by making it easier to maintain and understand software systems. On the other hand, some believe that refactoring has the risk of functionality regression and increased testing cost. This paper investigates the impact of refactoring edits on regression tests using the version history of Java open source projects: (1) Are there adequate regression tests for refactoring in practice? (2) How many of existing regression tests are relevant to refactoring edits and thus need to be re-run for the new version? (3) What proportion of failure-inducing changes are relevant to refactorings? By using a refactoring reconstruction analysis and a change impact analysis in tandem, we investigate the relationship between the types and locations of refactoring edits identified by RefFinder and the affecting changes and affected tests identified by the FaultTracer change impact analysis. The results on three open source projects, JMeter, XMLSecurity, and ANT, show that only 22% of refactored methods and fields are tested by existing regression tests. While refactorings only constitutes 8% of atomic changes, 38% of affected tests are relevant to refactorings. Furthermore, refactorings are involved in almost a half of failed test cases. These results call for new automated regression test augmentation and selection techniques for validating refactoring edits.Electrical and Computer Engineerin

Texas ScholarWorks

Improving Function Coverage with Munch: A Hybrid Fuzzing and Directed Symbolic Execution Approach

Author: Hutzelmann Thomas
Ognawala Saahil
Pretschner Alexander
Psallida Eirini
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/12/2017
Field of study

Fuzzing and symbolic execution are popular techniques for finding vulnerabilities and generating test-cases for programs. Fuzzing, a blackbox method that mutates seed input values, is generally incapable of generating diverse inputs that exercise all paths in the program. Due to the path-explosion problem and dependence on SMT solvers, symbolic execution may also not achieve high path coverage. A hybrid technique involving fuzzing and symbolic execution may achieve better function coverage than fuzzing or symbolic execution alone. In this paper, we present Munch, an open source framework implementing two hybrid techniques based on fuzzing and symbolic execution. We empirically show using nine large open-source programs that overall, Munch achieves higher (in-depth) function coverage than symbolic execution or fuzzing alone. Using metrics based on total analyses time and number of queries issued to the SMT solver, we also show that Munch is more efficient at achieving better function coverage.Comment: To appear at 33rd ACM/SIGAPP Symposium On Applied Computing (SAC). To be held from 9th to 13th April, 201

arXiv.org e-Print Archive

Crossref

Measuring Coverage of Prolog Programs Using Mutation Testing

Author: AJ Offutt
CH Papadimitriou
EY Shapiro
J Wielemaker
JW Lloyd
KL Clark
RS Sangwan
S Krings
Y Jia
Publication venue
Publication date: 23/08/2018
Field of study

Testing is an important aspect in professional software development, both to avoid and identify bugs as well as to increase maintainability. However, increasing the number of tests beyond a reasonable amount hinders development progress. To decide on the completeness of a test suite, many approaches to assert test coverage have been suggested. Yet, frameworks for logic programs remain scarce. In this paper, we introduce a framework for Prolog programs measuring test coverage using mutations. We elaborate the main ideas of mutation testing and transfer them to logic programs. To do so, we discuss the usefulness of different mutations in the context of Prolog and empirically evaluate them in a new mutation testing framework on different examples.Comment: 16 pages, Accepted for presentation in WFLP 201

arXiv.org e-Print Archive

Crossref

SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry

Author: Miao Weikai
Pu Geguang
Su Ting
Wu Ke
Yan Yichen
Yao Yinbo
Zhang Chengyu
Zhou Hanru
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/06/2018
Field of study

In this paper, we aim at the automated unit coverage-based testing for embedded software. To achieve the goal, by analyzing the industrial requirements and our previous work on automated unit testing tool CAUT, we rebuild a new tool, SmartUnit, to solve the engineering requirements that take place in our partner companies. SmartUnit is a dynamic symbolic execution implementation, which supports statement, branch, boundary value and MC/DC coverage. SmartUnit has been used to test more than one million lines of code in real projects. For confidentiality motives, we select three in-house real projects for the empirical evaluations. We also carry out our evaluations on two open source database projects, SQLite and PostgreSQL, to test the scalability of our tool since the scale of the embedded software project is mostly not large, 5K-50K lines of code on average. From our experimental results, in general, more than 90% of functions in commercial embedded software achieve 100% statement, branch, MC/DC coverage, more than 80% of functions in SQLite achieve 100% MC/DC coverage, and more than 60% of functions in PostgreSQL achieve 100% MC/DC coverage. Moreover, SmartUnit is able to find the runtime exceptions at the unit testing level. We also have reported exceptions like array index out of bounds and divided-by-zero in SQLite. Furthermore, we analyze the reasons of low coverage in automated unit testing in our setting and give a survey on the situation of manual unit testing with respect to automated unit testing in industry.Comment: In Proceedings of 40th International Conference on Software Engineering: Software Engineering in Practice Track, Gothenburg, Sweden, May 27-June 3, 2018 (ICSE-SEIP '18), 10 page

arXiv.org e-Print Archive

Crossref

An empirical investigation into branch coverage for C programs using CUTE and AUSTIN

Author: Baresel
Baresel
Baudry
Botella
Bottaci
Buehler
Burnim
Chakrabarti
Clark
Cohen
Csallner
Ferguson
Godefroid
Godefroid
Godefroid
Harman
Harman
Harman
Harman
Harman
Inkumsah
King
Kiran Lakhotia
Korel
Lakhotia
Majumdar
Mark Harman
McMinn
McMinn
Michael
Miller
Necula
Pargas
Phil McMinn
Sen
Sen
Tillmann
Tonella
Walcott
Wappler
Wegener
Wegener
Williams
Xie
Yoo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010
Field of study

Automated test data generation has remained a topic of considerable interest for several decades because it lies at the heart of attempts to automate the process of Software Testing. This paper reports the results of an empirical study using the dynamic symbolic-execution tool. CUTE, and a search based tool, AUSTIN on five non-trivial open source applications. The aim is to provide practitioners with an assessment of what can be achieved by existing techniques with little or no specialist knowledge and to provide researchers with baseline data against which to measure subsequent work. To achieve this, each tool is applied 'as is', with neither additional tuning nor supporting harnesses and with no adjustments applied to the subject programs under test. The mere fact that these tools can be applied 'out of the box' in this manner reflects the growing maturity of Automated test data generation. However, as might be expected, the study reveals opportunities for improvement and suggests ways to hybridize these two approaches that have hitherto been developed entirely independently. (C) 2010 Elsevier Inc. All rights reserved

CiteSeerX

Crossref

UCL Discovery

King's Research Portal

White Rose Research Online

DSpot: Test Amplification for Automatic Assessment of Computational Diversity

Author: Allier Simon
Baudry Benoit
Monperrus Martin
Rodriguez-Cancio Marcelino
Publication venue
Publication date: 09/06/2015
Field of study

Context: Computational diversity, i.e., the presence of a set of programs that all perform compatible services but that exhibit behavioral differences under certain conditions, is essential for fault tolerance and security. Objective: We aim at proposing an approach for automatically assessing the presence of computational diversity. In this work, computationally diverse variants are defined as (i) sharing the same API, (ii) behaving the same according to an input-output based specification (a test-suite) and (iii) exhibiting observable differences when they run outside the specified input space. Method: Our technique relies on test amplification. We propose source code transformations on test cases to explore the input domain and systematically sense the observation domain. We quantify computational diversity as the dissimilarity between observations on inputs that are outside the specified domain. Results: We run our experiments on 472 variants of 7 classes from open-source, large and thoroughly tested Java classes. Our test amplification multiplies by ten the number of input points in the test suite and is effective at detecting software diversity. Conclusion: The key insights of this study are: the systematic exploration of the observable output space of a class provides new insights about its degree of encapsulation; the behavioral diversity that we observe originates from areas of the code that are characterized by their flexibility (caching, checking, formatting, etc.).Comment: 12 page

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Evolutionary improvement of programs

Author: Arcuri A.
Clark J.A.
White D.R.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2011
Field of study

Most applications of genetic programming (GP) involve the creation of an entirely new function, program or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task and often achieved in part by optimizing compilers. However, modern compilers are in general not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist: this is usually due to the limited local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate on eight example functions. We employ a coevolved population of test cases to encourage the preservation of the function's semantics. We exploit the original program both through seeding of the population in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ rigorous experimental method to provide interesting and practical insights to suggest how to address these issues

Enlighten

The Optimisation of Stochastic Grammars to Enable Cost-Effective Probabilistic Structural Testing

Author: Alexander Rob
Clark John Andrew
Hadley Mark Jason
Poulding Simon Marcus
Publication venue
Publication date: 01/01/2013
Field of study

The effectiveness of probabilistic structural testing depends on the characteristics of the probability distribution from which test inputs are sampled at random. Metaheuristic search has been shown to be a practical method of optimis- ing the characteristics of such distributions. However, the applicability of the existing search-based algorithm is lim- ited by the requirement that the software’s inputs must be a fixed number of numeric values. In this paper we relax this limitation by means of a new representation for the probability distribution. The repre- sentation is based on stochastic context-free grammars but incorporates two novel extensions: conditional production weights and the aggregation of terminal symbols represent- ing numeric values. We demonstrate that an algorithm which combines the new representation with hill-climbing search is able to effi- ciently derive probability distributions suitable for testing software with structurally-complex input domains

CiteSeerX

Crossref

White Rose Research Online