Test Case Prioritization (TCP) is an increasingly important regression
testing technique for reordering test cases according to a pre-defined goal,
particularly as agile practices gain adoption. To better understand these
techniques, we perform the first extensive study aimed at empirically
evaluating four static TCP techniques, comparing them with state-of-research
dynamic TCP techniques across several quality metrics. This study was performed
on 58 real-world Java programs encompassing 714 KLoC and yielded several
notable observations. First, our results across two effectiveness metrics (the
Average Percentage of Faults Detected, APFD, and the cost-cognizant APFDc)
illustrate that at test-class granularity, these metrics tend to correlate, but
this correlation does not hold at test-method granularity. Second, our analysis
shows that static techniques can be surprisingly effective, particularly when
measured by APFDc. Third, we found that TCP techniques tend to perform better
on larger programs, but that program size does not affect comparative
performance measures between techniques. Fourth, software evolution does not
significantly impact comparative performance results between TCP techniques.
Fifth, neither the number nor the type of mutants utilized dramatically impacts
measures of TCP effectiveness under typical experimental settings. Finally, our
similarity analysis illustrates that highly prioritized test cases tend to
uncover dissimilar faults.

Comment: Preprint of Accepted Paper to IEEE Transactions on Software
Engineering
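
The two effectiveness metrics named above can be illustrated with a minimal sketch. It uses the standard textbook definitions of APFD and APFDc, not code from the study. The function names, the test and fault identifiers, and the input layout (a prioritized list of tests plus a mapping from each test to the faults it detects) are assumptions made for illustration. The sketch also assumes that every fault of interest is detected by at least one test in the ordering.

```python
def first_detection_index(order, detects):
    """Map each fault to the 0-based index of the first test that detects it."""
    first = {}
    for idx, test in enumerate(order):
        for fault in detects.get(test, ()):
            first.setdefault(fault, idx)
    return first


def apfd(order, detects):
    """APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2 * n).

    n is the number of tests, m the number of detected faults, and TF_i the
    1-based position of the first test that reveals fault i.
    """
    n = len(order)
    first = first_detection_index(order, detects)
    m = len(first)  # assumption: m > 0, i.e. at least one fault is detected
    tf_sum = sum(idx + 1 for idx in first.values())
    return 1 - tf_sum / (n * m) + 1 / (2 * n)


def apfdc(order, detects, cost, severity):
    """Cost-cognizant APFD: weights tests by cost and faults by severity.

    For each fault, credit is the cost of all tests from the first detecting
    test to the end, minus half the cost of the detecting test itself.
    """
    first = first_detection_index(order, detects)
    total_cost = sum(cost[t] for t in order)
    total_severity = sum(severity[f] for f in first)
    credit = 0.0
    for fault, idx in first.items():
        remaining = sum(cost[t] for t in order[idx:])
        credit += severity[fault] * (remaining - 0.5 * cost[order[idx]])
    return credit / (total_cost * total_severity)


# Hypothetical example: four tests, two faults.
detects = {"t1": {"f1"}, "t3": {"f2"}}
order = ["t1", "t2", "t3", "t4"]
unit_cost = {t: 1.0 for t in order}
unit_severity = {"f1": 1.0, "f2": 1.0}
skewed_cost = {"t1": 10.0, "t2": 1.0, "t3": 1.0, "t4": 1.0}
```

With unit costs and severities, APFDc reduces to APFD. Under a skewed cost such as `skewed_cost`, the two values can diverge for the same ordering, which is one reason the two metrics need not agree.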