Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk
Background. Test resources are usually limited and therefore it is often not
possible to completely test an application before a release. To cope with the
problem of scarce resources, development teams can apply defect prediction to
identify fault-prone code regions. However, defect prediction tends to have low
precision in cross-project prediction scenarios.
Aims. We take an inverse view on defect prediction and aim to identify
methods that can be deferred when testing because they contain hardly any
faults due to their code being "trivial". We expect that characteristics of
such methods might be project-independent, so that our approach could improve
cross-project predictions.
Method. We compute code metrics and apply association rule mining to create
rules for identifying methods with low fault risk. We conduct an empirical
study to assess our approach with six Java open-source projects containing
precise fault data at the method level.
Results. Our results show that inverse defect prediction can identify approx.
32-44% of a project's methods as having a low fault risk; on average, they
are about six times less likely to contain a fault than other methods. In
cross-project predictions with larger, more diversified training sets,
identified methods are even eleven times less likely to contain a fault.
Conclusions. Inverse defect prediction supports the efficient allocation of
test resources by identifying methods that can be treated with less priority in
testing activities and is well applicable in cross-project prediction
scenarios.
Comment: Submitted to PeerJ C
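The Method step above (code metrics plus association rule mining) can be illustrated with a minimal sketch. It is not the authors' implementation: the item names such as `sloc<=3`, the toy data, the thresholds, and the helper functions `mine_low_risk_rules` and `is_low_risk` are all hypothetical, and a brute-force enumeration stands in for whatever mining algorithm the study actually used.

```python
from itertools import combinations

# Hypothetical toy data: each method is a set of discretized metric items
# plus a flag telling whether a fault was ever recorded for it.
methods = [
    ({"sloc<=3", "no_loops", "no_branches"}, False),
    ({"sloc<=3", "no_loops", "no_branches"}, False),
    ({"sloc<=3", "no_loops"}, False),
    ({"sloc<=3", "no_branches"}, True),
    ({"no_loops"}, True),
    ({"sloc<=3", "no_loops", "no_branches"}, False),
    (set(), True),
    ({"no_branches"}, False),
]


def mine_low_risk_rules(methods, min_support=0.25, min_confidence=0.9, max_len=2):
    """Enumerate rules of the form {items} -> not faulty.

    support    = share of all methods that match the items and are fault-free
    confidence = share of matching methods that are fault-free
    """
    n = len(methods)
    all_items = sorted(set().union(*(items for items, _ in methods)))
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(all_items, k):
            wanted = set(antecedent)
            covered = [faulty for items, faulty in methods if wanted <= items]
            if not covered:
                continue
            clean = sum(1 for faulty in covered if not faulty)
            support = clean / n
            confidence = clean / len(covered)
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    return rules


def is_low_risk(items, rules):
    """A method counts as low fault risk if any mined rule matches it."""
    return any(set(antecedent) <= items for antecedent, _, _ in rules)


rules = mine_low_risk_rules(methods)
for antecedent, support, confidence in rules:
    print(antecedent, f"support={support:.3f}", f"confidence={confidence:.2f}")
```

On this toy data, single items are not reliable enough on their own, while two combinations pass both thresholds. This shows why rules over several metrics at once can isolate "trivial" methods more precisely than any one metric.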
Cascades and Dissipative Anomalies in Compressible Fluid Turbulence
We investigate dissipative anomalies in a turbulent fluid governed by the
compressible Navier-Stokes equation. We follow an exact approach pioneered by
Onsager, which we explain as a non-perturbative application of the principle of
renormalization-group invariance. In the limit of high Reynolds and Péclet
numbers, the flow realizations are found to be described as distributional or
"coarse-grained" solutions of the compressible Euler equations, with standard
conservation laws broken by turbulent anomalies. The anomalous dissipation of
kinetic energy is shown to be due not only to local cascade, but also to a
distinct mechanism called pressure-work defect. Irreversible heating in
stationary, planar shocks with an ideal-gas equation of state exemplifies the
second mechanism. Entropy conservation anomalies are also found to occur by two
mechanisms: an anomalous input of negative entropy (negentropy) by
pressure-work and a cascade of negentropy to small scales. We derive
"4/5th-law"-type expressions for the anomalies, which allow us to characterize
the singularities (structure-function scaling exponents) required to sustain
the cascades. We compare our approach with alternative theories and empirical
evidence. It is argued that the "Big Power-Law in the Sky" observed in electron
density scintillations in the interstellar medium is a manifestation of a
forward negentropy cascade, or an inverse cascade of usual thermodynamic
entropy.
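For orientation, the classical incompressible Kolmogorov 4/5th law, to which the "4/5th-law"-type expressions above allude, can be written as follows. This is the standard textbook relation only; the compressible expressions derived in the paper are not reproduced here, and the symbols are the conventional ones rather than the paper's notation.

```latex
% Classical incompressible Kolmogorov 4/5th law (reference point only).
\[
  \big\langle [\delta u_{\parallel}(r)]^{3} \big\rangle
  = -\tfrac{4}{5}\,\varepsilon\, r,
  \qquad \eta \ll r \ll L .
\]
% \delta u_\parallel(r): longitudinal velocity increment over separation r
% \varepsilon: mean kinetic-energy dissipation rate per unit mass
% \eta, L: dissipation (Kolmogorov) scale and energy-injection scale
```

Because the right-hand side is linear in r, a dissipation rate that stays nonzero in the inviscid limit requires velocity increments to scale no more smoothly than r^{1/3}. This is the kind of constraint on structure-function scaling exponents that the abstract refers to.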
The Co-Evolution of Test Maintenance and Code Maintenance through the lens of Fine-Grained Semantic Changes
Automatic testing is a widely adopted technique for improving software
quality. Software developers add, remove and update test methods and test
classes as part of the software development process as well as during the
evolution phase, following the initial release. In this work we conduct a large
scale study of 61 popular open source projects and report the relationships we
have established between test maintenance, production code maintenance, and
semantic changes (e.g., statement added, method removed, etc.) performed in
developers' commits.
We build predictive models, and show that the number of tests in a software
project can be well predicted by employing code maintenance profiles (i.e., how
many commits were performed in each of the maintenance activities: corrective,
perfective, adaptive). Our findings also reveal that more often than not,
developers perform code fixes without performing complementary test maintenance
in the same commit (e.g., update an existing test or add a new one). When
developers do perform test maintenance, it is likely to be affected by the
semantic changes they perform as part of their commit.
Our work is based on studying 61 popular open source projects, comprising
over 240,000 commits consisting of over 16,000,000 semantic change type
instances, performed by over 4,000 software engineers.
Comment: postprint, ICSME 201
Continuous Defect Prediction: The Idea and a Related Dataset
We would like to present the idea of our Continuous Defect Prediction (CDP)
research and a related dataset that we created and share. Our dataset is
currently a set of more than 11 million data rows, representing files involved
in Continuous Integration (CI) builds, that synthesize the results of CI builds
with data we mine from software repositories. Our dataset embraces 1265
software projects, 30,022 distinct commit authors and several software process
metrics that in earlier research appeared to be useful in software defect
prediction. In this particular dataset we use TravisTorrent as the source of CI
data. TravisTorrent synthesizes commit-level information from the Travis CI
server and GitHub open-source project repositories. We extend this data to a
file change level and calculate the software process metrics that may be used,
for example, as features to predict risky software changes that could break the
build if committed to a repository with CI enabled.
Comment: Lech Madeyski and Marcin Kawalerowicz. "Continuous Defect Prediction:
The Idea and a Related Dataset" In: 14th International Conference on Mining
Software Repositories (MSR'17). Buenos Aires. 2017, pp. 515-518. doi:
10.1109/MSR.2017.46. URL:
http://madeyski.e-informatyka.pl/download/MadeyskiKawalerowiczMSR17.pd
A DEEP ENSEMBLE LEARNING METHOD FOR EFFORT-AWARE JUST-IN-TIME DEFECT PREDICTION
Nowadays, logistics for the transportation and distribution of merchandise is a key element in increasing the competitiveness of companies. However, choosing alternative routes outside the planned routes causes logistics companies to provide a poor-quality service, with units that endanger the proper delivery of merchandise and negatively affect how the supply chain works. This paper aims to develop a module for processing, analyzing, and deploying satellite information oriented to pattern analysis, in order to find anomalies in operators' paths by implementing the TODS algorithm and thereby support decision making. The experimental results show that the algorithm detects abnormal routes optimally, using historical data as a base.