27,212 research outputs found
Empirical Evaluation of Test Coverage for Functional Programs
The correlation between test coverage and test effectiveness is important to justify the use of coverage in practice. Existing results on imperative programs mostly show that test coverage predicates effectiveness. However, since functional programs are usually structurally different from imperative ones, it is unclear whether the same result may be derived and coverage can be used as a prediction of effectiveness on functional programs. In this paper we report the first empirical study on the correlation between test coverage and test effectiveness on functional programs. We consider four types of coverage: as input coverages, statement/branch coverage and expression coverage, and as oracle coverages, count of assertions and checked coverage. We also consider two types of effectiveness: raw effectiveness and normalized effectiveness. Our results are twofold. (1) In general the findings on imperative programs still hold on functional programs, warranting the use of coverage in practice. (2) On specific coverage criteria, the results may be unexpected or different from the imperative ones, calling for further studies on functional programs
Evaluating Random Mutant Selection at Class-Level in Projects with Non-Adequate Test Suites
Mutation testing is a standard technique to evaluate the quality of a test
suite. Due to its computationally intensive nature, many approaches have been
proposed to make this technique feasible in real case scenarios. Among these
approaches, uniform random mutant selection has been demonstrated to be simple
and promising. However, works on this area analyze mutant samples at project
level mainly on projects with adequate test suites. In this paper, we fill this
lack of empirical validation by analyzing random mutant selection at class
level on projects with non-adequate test suites. First, we show that uniform
random mutant selection underachieves the expected results. Then, we propose a
new approach named weighted random mutant selection which generates more
representative mutant samples. Finally, we show that representative mutant
samples are larger for projects with high test adequacy.Comment: EASE 2016, Article 11 , 10 page
Is the Stack Distance Between Test Case and Method Correlated With Test Effectiveness?
Mutation testing is a means to assess the effectiveness of a test suite and
its outcome is considered more meaningful than code coverage metrics. However,
despite several optimizations, mutation testing requires a significant
computational effort and has not been widely adopted in industry. Therefore, we
study in this paper whether test effectiveness can be approximated using a more
light-weight approach. We hypothesize that a test case is more likely to detect
faults in methods that are close to the test case on the call stack than in
methods that the test case accesses indirectly through many other methods.
Based on this hypothesis, we propose the minimal stack distance between test
case and method as a new test measure, which expresses how close any test case
comes to a given method, and study its correlation with test effectiveness. We
conducted an empirical study with 21 open-source projects, which comprise in
total 1.8 million LOC, and show that a correlation exists between stack
distance and test effectiveness. The correlation reaches a strength up to 0.58.
We further show that a classifier using the minimal stack distance along with
additional easily computable measures can predict the mutation testing result
of a method with 92.9% precision and 93.4% recall. Hence, such a classifier can
be taken into consideration as a light-weight alternative to mutation testing
or as a preceding, less costly step to that.Comment: EASE 201
Will My Tests Tell Me If I Break This Code?
Automated tests play an important role in software evolution because they can
rapidly detect faults introduced during changes. In practice, code-coverage
metrics are often used as criteria to evaluate the effectiveness of test suites
with focus on regression faults. However, code coverage only expresses which
portion of a system has been executed by tests, but not how effective the tests
actually are in detecting regression faults. Our goal was to evaluate the
validity of code coverage as a measure for test effectiveness. To do so, we
conducted an empirical study in which we applied an extreme mutation testing
approach to analyze the tests of open-source projects written in Java. We
assessed the ratio of pseudo-tested methods (those tested in a way such that
faults would not be detected) to all covered methods and judged their impact on
the software project. The results show that the ratio of pseudo-tested methods
is acceptable for unit tests but not for system tests (that execute large
portions of the whole system). Therefore, we conclude that the coverage metric
is only a valid effectiveness indicator for unit tests.Comment: 7 pages, 3 figure
Generating Effective Test Suites for Model Transformations Using Classifying Terms
Generating sample models for testing a model transformation is no easy task. This paper explores the use of classifying terms and stratified sampling for developing richer test cases for model transformations. Classifying terms are used to define the equivalence classes that characterize the relevant subgroups for the test cases. From each equivalence class of object models, several representative models are chosen depending on the required sample size. We compare our
results with test suites developed using random sampling, and conclude that by using an ordered and stratified approach the coverage and effectiveness of the test suite can be significantly improved.Universidad de Málaga. Campus de Excelencia Internacional AndalucĂa Tech
- …