213 research outputs found
Refactoring Assertion Roulette and Duplicate Assert test smells: a controlled experiment
Test smells can reduce the developers' ability to interact with the test
code. Refactoring test code offers a safe strategy to handle test smells.
However, manual refactoring is not a trivial process; it is often tedious
and error-prone. This study aims to evaluate RAIDE, a tool for automatic
identification and refactoring of test smells. We present an empirical
assessment of RAIDE in which we analyzed its capability to refactor the
Assertion Roulette and Duplicate Assert test smells and compared the results
against both manual refactoring and a state-of-the-art approach. The results
show that RAIDE handles test smells faster and more intuitively than
combining an automated smell-detection tool with manual refactoring.
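RAIDE's exact transformations are not shown in the abstract; as an illustrative sketch only (the function and tests below are hypothetical, written in Python's unittest rather than the JUnit code RAIDE targets), the Assertion Roulette smell and one common refactoring look like this:

```python
import unittest

def parse_price(text):
    """Parse a price string like '$3.50' into cents."""
    return int(round(float(text.lstrip("$")) * 100))

class PriceTestSmelly(unittest.TestCase):
    def test_parse(self):
        # Assertion Roulette: several unexplained assertions in one test;
        # when one fails, it is unclear which scenario broke.
        self.assertEqual(parse_price("$3.50"), 350)
        self.assertEqual(parse_price("$0.99"), 99)
        self.assertEqual(parse_price("$10"), 1000)

class PriceTestRefactored(unittest.TestCase):
    # Refactored: one scenario per test, each with an explanatory message,
    # so a failure immediately identifies the broken case.
    def test_parse_dollars_and_cents(self):
        self.assertEqual(parse_price("$3.50"), 350,
                         "dollars+cents should convert to cents")

    def test_parse_sub_dollar(self):
        self.assertEqual(parse_price("$0.99"), 99,
                         "sub-dollar price should keep cents")

    def test_parse_whole_dollars(self):
        self.assertEqual(parse_price("$10"), 1000,
                         "whole-dollar price should scale by 100")
```

Splitting the test also removes the conditions under which a Duplicate Assert smell arises, since each assertion now lives in its own named test method.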
What the Smell? An Empirical Investigation on the Distribution and Severity of Test Smells in Open Source Android Applications
The widespread adoption of mobile devices, coupled with the ease of developing mobile-based applications (apps) has created a lucrative and competitive environment for app developers. Solely focusing on app functionality and time-to-market is not enough for developers to ensure the success of their app. Quality attributes exhibited by the app must also be a key focus point; not just at the onset of app development, but throughout its lifetime.
The impact analysis of bad programming practices, or code smells, in production code has been the focus of numerous studies in software maintenance. Similar to production code, unit tests are also susceptible to bad programming practices which can have a negative impact not only on the quality of the software system but also on maintenance activities. With the present corpus of studies on test smells primarily on traditional applications, there is a need to fill the void in understanding the deviation of testing guidelines in the mobile environment. Furthermore, there is a need to understand the degree to which test smells are prevalent in mobile apps and the impact of such smells on app maintenance. Hence, the purpose of this research is to: (1) extend the existing set of bad test-code practices by introducing new test smells, (2) provide the software engineering community with an open-source test smell detection tool, and (3) perform a large-scale empirical study on test smell occurrence, distribution, and impact on the maintenance of open-source Android apps.
Through multiple experiments, our findings indicate that most Android apps lack automated verification of their testing mechanisms. As for the apps with existing test suites, they exhibit test smells early in their lifetime, with varying degrees of co-occurrence among different smell types. Our exploration of the relationship between test smells and technical debt shows that test smells are a strong indicator of technical debt. Furthermore, we observed positive correlations between specific smell types and highly changed/buggy test files. Hence, this research demonstrates that test smells can be used as indicators for necessary preventive software maintenance of test suites.
On the Distribution of Test Smells in Open Source Android Applications: An Exploratory Study
The impact of bad programming practices, such as code smells, in production code has been the focus of numerous studies in software engineering. Like production code, unit tests are also affected by bad programming practices, which can have a negative impact on the quality and maintenance of a software system. While several studies have addressed code and test smells in desktop applications, there is little knowledge of test smells in the context of mobile applications. In this study, we extend the existing catalog of test smells by identifying and defining new smells, and we survey over 40 developers who confirm that our proposed smells are bad programming practices in test suites. Additionally, we perform an empirical study on the occurrence and distribution of the proposed smells in 656 open-source Android apps. Our findings show a widespread occurrence of test smells in apps. We also show that apps tend to exhibit test smells early in their lifetime, with varying degrees of co-occurrence among different smell types. This empirical study demonstrates that test smells can be used as an indicator for necessary preventive software maintenance of test suites.
Test Smell: A Parasitic Energy Consumer in Software Testing
Traditionally, energy efficiency research has focused on reducing energy
consumption at the hardware level and, more recently, in the design and coding
phases of the software development life cycle. However, software testing's
impact on energy consumption has received little attention from the research
community. Specifically, how test code design quality and test smells (i.e.,
sub-optimal design and bad practices in test code) impact energy consumption
has not been investigated yet. This study examined 12 Apache projects to
analyze the association between test smell and its effects on energy
consumption in software testing. We conducted a mixed-method empirical
analysis along two dimensions: software (data mining in Apache projects) and
developers' views (a survey of 62 software practitioners). Our findings show
that: 1) test smells are associated with energy consumption in software
testing; specifically, the smelly part of a test case consumes 10.92% more
energy than the non-smelly part; 2) certain test smells are more
energy-hungry than others; 3) refactored test cases tend to consume less
energy than their smelly counterparts; and 4) most developers lack knowledge
about test smells' impact on energy consumption. We conclude the paper with
several observations that can direct future research and development.
Automatic generation of smell-free unit tests
Master's thesis, Informatics Engineering, 2022, Universidade de Lisboa, Faculdade de Ciências.
Automated test generation tools (such as EvoSuite) typically aim to maximize code
coverage. However, they frequently disregard non-coverage aspects that can be relevant
for testers, such as the quality of the generated tests. Therefore, automatically generated
tests are often affected by a set of test-specific bad programming practices that may hinder
the quality of both test and production code, i.e., test smells. Given that other researchers
have successfully integrated non-coverage quality metrics into EvoSuite, we decided to
extend the EvoSuite tool such that the generated test code is smell-free. To this aim, we
compiled 54 test smells from several sources and selected 16 smells that are relevant to the
context of this work. We then augmented the tool with the respective test smell metrics
and investigated the diffusion of the selected smells and the distribution of the metrics.
Finally, we implemented an approach to optimize the test smell metrics as secondary
criteria. After establishing the optimal configuration to optimize as secondary criteria
(which we used throughout the remainder of the study), we conducted an empirical study
to assess whether the tests became significantly less smelly. Furthermore, we studied
how the proposed metrics affect the fault detection effectiveness, coverage, and size of
the generated tests. Our study revealed that the proposed approach reduces the overall
smelliness of the generated tests; in particular, the diffusion of the “Indirect Testing” and
“Unrelated Assertions” smells improved considerably. Moreover, our approach improved
the smelliness of the tests generated by EvoSuite without compromising the code coverage
or fault detection effectiveness. The size and length of the generated tests were also not
affected by the new secondary criteria.
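The thesis does not spell out its selection mechanics in the abstract; as a minimal sketch (not EvoSuite's actual API, with hypothetical candidate names and metric values), using a test smell metric as a secondary criterion behind coverage could look like this:

```python
# Sketch of secondary-criteria selection: coverage is the primary
# objective; a test-smell score breaks ties; length is a final tiebreak.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    coverage: float      # primary: higher is better
    smell_score: float   # secondary: lower is better (hypothetical metric)
    length: int          # tertiary: shorter tests preferred

def preference_key(c: Candidate):
    # Negate coverage so that min() prefers higher coverage first,
    # then lower smelliness, then shorter tests.
    return (-c.coverage, c.smell_score, c.length)

def select_best(candidates):
    return min(candidates, key=preference_key)

suite_a = Candidate("A", coverage=0.80, smell_score=3.0, length=12)
suite_b = Candidate("B", coverage=0.80, smell_score=1.0, length=15)
suite_c = Candidate("C", coverage=0.75, smell_score=0.0, length=5)

best = select_best([suite_a, suite_b, suite_c])
# Coverage ties between A and B; the lower smell score selects B,
# so the smell metric steers the search without sacrificing coverage.
```

The key property, consistent with the study's result, is that the secondary criterion only discriminates among candidates that are already equal on the primary objective, so coverage is never traded away for smell reduction.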
On the Effectiveness of Unit Tests in Test-driven Development
Background: Writing unit tests is one of the primary activities
in test-driven development. Yet, the existing reviews report few
evidence supporting or refuting the effect of this development approach
on test case quality. Lack of ability and skills of developers to
produce sufficiently good test cases are also reported as limitations
of applying test-driven development in industrial practice.
Objective: We investigate the impact of test-driven development
on the effectiveness of unit test cases compared to an incremental
test last development in an industrial context.
Method: We conducted an experiment in an industrial setting
with 24 professionals. Professionals followed the two development
approaches to implement the tasks. We measure unit test effectiveness
in terms of mutation score. We also measure branch and
method coverage of test suites to compare our results with the
literature.
Results: In terms of mutation score, we have found that the test
cases written for a test-driven development task have a higher
defect detection ability than test cases written for an incremental
test-last development task. Subjects wrote test cases that cover
more branches on a test-driven development task compared to the
other task. However, test cases written for an incremental test-last
development task cover more methods than those written for the
second task.
Conclusion: Our findings are different from previous studies
conducted at academic settings. Professionals were able to perform
more effective unit testing with test-driven development. Furthermore,
we observe that the coverage measures preferred in academic
studies reveal different aspects of a development approach. Our
results need to be validated in larger industrial contexts.
Funded by Istanbul Technical University Scientific Research Projects
(MGA-2017-40712) and the Academy of Finland (Decision No. 278354).
Detailed Overview of Software Smells
This document provides an overview of the literature on software smells, covering various dimensions of smells along with their corresponding references.
Test case quality: an empirical study on belief and evidence
Software testing is a mandatory activity in any serious software development
process, as bugs are a reality in software development. This raises the
question of quality: good tests are effective in finding bugs, but until a test
case actually finds a bug, its effectiveness remains unknown. Therefore,
determining what constitutes a good or bad test is necessary. This is not a
simple task, and there are a number of studies that identify different
characteristics of a good test case. A previous study evaluated 29 hypotheses
regarding what constitutes a good test case, but the findings are based on
developers' beliefs, which are subjective and biased. In this paper we
investigate eight of these hypotheses, through an extensive empirical study
based on open software repositories. Despite our best efforts, we were unable
to find evidence that supports these beliefs. This indicates that, although
these hypotheses represent good software engineering advice, following them
is not by itself enough to guarantee good testing code.