The Search for the Laws of Automatic Random Testing
Can one estimate the number of remaining faults in a software system? A
credible estimation technique would be immensely useful to project managers as
well as customers. It would also be of theoretical interest, as a general law
of software engineering. We investigate possible answers in the context of
automated random testing, a method that is increasingly accepted as an
effective way to discover faults. Our experimental results, derived from
best-fit analysis of a variety of mathematical functions, based on a large
number of automated tests of library code equipped with automated oracles in
the form of contracts, suggest a poly-logarithmic law. Although further
confirmation remains necessary on different code bases and testing techniques,
we argue that understanding the laws of testing may bring significant benefits
for estimating the number of detectable faults and comparing different projects
and practices. Comment: 20 pages
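As a rough illustration of what a best-fit analysis against a poly-logarithmic form can look like, the sketch below fits F(t) = a * ln(t)^b to cumulative fault counts by linear regression in log space. The functional form, the name `fit_polylog`, and the synthetic data are assumptions made for illustration; the abstract does not state the exact law or the fitting procedure used.

```python
import math

def fit_polylog(times, faults):
    """Fit faults ~ a * ln(t)**b by least squares on
    ln(faults) = ln(a) + b * ln(ln(t)). Requires t > 1 and faults > 0."""
    xs = [math.log(math.log(t)) for t in times]
    ys = [math.log(f) for f in faults]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope = exponent of the log term
    a = math.exp(mean_y - b * mean_x)   # intercept = ln(a)
    return a, b

# Synthetic data generated from a = 2, b = 3 (illustrative only, not measured).
times = [10, 100, 1000, 10000, 100000]
faults = [2.0 * math.log(t) ** 3 for t in times]
a, b = fit_polylog(times, faults)
```

On real fault counts the fit would not be exact, and competing candidate functions would be compared by their residuals.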
Stateful Testing: Finding More Errors in Code and Contracts
Automated random testing has been shown to be an effective approach to finding
faults but still faces a major unsolved issue: how to generate test inputs
diverse enough to find many faults and find them quickly. Stateful testing, the
automated testing technique introduced in this article, generates new test
cases that improve an existing test suite. The generated test cases are
designed to violate the dynamically inferred contracts (invariants)
characterizing the existing test suite. As a consequence, they are in a good
position to detect new errors, and also to improve the accuracy of the inferred
contracts by discovering those that are unsound. Experiments on 13 data
structure classes totalling over 28,000 lines of code demonstrate the
effectiveness of stateful testing in improving over the results of long
sessions of random testing: stateful testing found 68.4% new errors and
improved the accuracy of automatically inferred contracts to over 99%, with
just a 7% time overhead. Comment: 11 pages, 3 figures
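The idea of generating tests that violate inferred invariants can be sketched in a few lines. The range template, the function names, and the observed values below are all hypothetical; the abstract only says that the contracts are inferred dynamically from the existing test suite, not which templates are used.

```python
def infer_range_invariant(observed):
    """Infer a simple invariant lo <= x <= hi from observed values."""
    return min(observed), max(observed)

def violating_candidates(invariant):
    """Propose values just outside the inferred range."""
    lo, hi = invariant
    return [lo - 1, hi + 1]

# Container sizes seen while running an existing test suite (made-up data).
observed_sizes = [1, 2, 3, 5, 8]
inv = infer_range_invariant(observed_sizes)
new_targets = violating_candidates(inv)
```

A test that reaches one of the new target states either exposes an error or shows that the inferred invariant was too narrow, which matches the two benefits the abstract describes.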
Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package
We introduce the pyunicorn (Pythonic unified complex network and
recurrence analysis toolbox) open source software package for applying and
combining modern methods of data analysis and modeling from complex network
theory and nonlinear time series analysis. pyunicorn is a fully
object-oriented and easily parallelizable package written in the language
Python. It allows for the construction of functional networks such as climate
networks in climatology or functional brain networks in neuroscience
representing the structure of statistical interrelationships in large data sets
of time series and, subsequently, investigating this structure using advanced
methods of complex network theory such as measures and models for spatial
networks, networks of interacting networks, node-weighted statistics or network
surrogates. Additionally, pyunicorn provides insights into the
nonlinear dynamics of complex systems as recorded in uni- and multivariate time
series from a non-traditional perspective by means of recurrence quantification
analysis (RQA), recurrence networks, visibility graphs and construction of
surrogate time series. The range of possible applications of the library is
outlined, drawing on several examples mainly from the field of climatology. Comment: 28 pages, 17 figures
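To make the recurrence-based part of the description concrete, here is a minimal plain-Python sketch of a recurrence matrix and the recurrence rate for a scalar series. It deliberately does not use the pyunicorn API; the function names, the threshold, and the sample series are assumptions, and no time-delay embedding is applied.

```python
def recurrence_matrix(series, eps):
    """R[i][j] = 1 when states i and j lie within eps of each other, else 0."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) <= eps else 0
             for j in range(n)] for i in range(n)]

def recurrence_rate(matrix):
    """Fraction of recurrent pairs, including the main diagonal."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)

# Made-up series that alternates between values near 0 and values near 1.
series = [0.0, 1.0, 0.1, 1.1, 0.05]
R = recurrence_matrix(series, eps=0.2)
rr = recurrence_rate(R)
```

Interpreting `R` as the adjacency matrix of a graph (after removing the diagonal) gives the kind of recurrence network the abstract mentions.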
A Bayesian framework for verification and recalibration of ensemble forecasts: How uncertain is NAO predictability?
Predictability estimates of ensemble prediction systems are uncertain due to
limited numbers of past forecasts and observations. To account for such
uncertainty, this paper proposes a Bayesian inferential framework that provides
a simple 6-parameter representation of ensemble forecasting systems and the
corresponding observations. The framework is probabilistic, and thus allows for
quantifying uncertainty in predictability measures such as correlation skill
and signal-to-noise ratios. It also provides a natural way to produce
recalibrated probabilistic predictions from uncalibrated ensemble forecasts.
The framework is used to address important questions concerning the skill of
winter hindcasts of the North Atlantic Oscillation for 1992-2011 issued by the
Met Office GloSea5 climate prediction system. Although there is much
uncertainty in the correlation between ensemble mean and observations, there is
strong evidence of skill: the 95% credible interval of the correlation
coefficient, [0.19, 0.68], excludes zero. There is also strong evidence
that the forecasts are not exchangeable with the observations: with over 99%
certainty, the signal-to-noise ratio of the forecasts is smaller than the
signal-to-noise ratio of the observations, which suggests that raw forecasts
should not be taken as representative scenarios of the observations. Forecast
recalibration is thus required, which can be coherently addressed within the
proposed framework. Comment: 36 pages, 10 figures
Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes
I argue that data becomes temporarily interesting by itself to some
self-improving, but computationally limited, subjective observer once he learns
to predict or compress the data in a better way, thus making it subjectively
simpler and more beautiful. Curiosity is the desire to create or discover more
non-random, non-arbitrary, regular data that is novel and surprising not in the
traditional sense of Boltzmann and Shannon but in the sense that it allows for
compression progress because its regularity was not yet known. This drive
maximizes interestingness, the first derivative of subjective beauty or
compressibility, that is, the steepness of the learning curve. It motivates
exploring infants, pure mathematicians, composers, artists, dancers, comedians,
yourself, and (since 1990) artificial systems. Comment: 35 pages, 3 figures, based on KES 2008 keynote and ALT 2007 / DS 2007 joint invited lecture
Objective assessment of region of interest-aware adaptive multimedia streaming quality
Adaptive multimedia streaming relies on controlled adjustment of content bitrate, and the consequent variation in video quality, to meet the bandwidth constraints of the communication link used to deliver content to the end user. The values of easy-to-measure network-related Quality of Service metrics have no direct relationship with the way moving images are perceived by the human viewer. Consequently, variations in the video stream bitrate are not clearly linked to similar variations in user-perceived quality. This is especially true when adaptation techniques based on the human visual system are employed. As research has shown, users are more interested in certain image regions of each video frame than in others. This paper presents the Region of Interest-based Adaptive Scheme (ROIAS), which adjusts the regions within each frame of the streamed multimedia content differently according to user interest in them. ROIAS is presented and discussed in terms of the adjustment algorithms employed and their impact on human-perceived video quality. Comparisons with existing approaches, including a constant-quality adaptation scheme across the whole frame area, are performed using two objective metrics that estimate user-perceived video quality.
Searching for invariants using genetic programming and mutation testing
Invariants are concise and useful descriptions of a program's behaviour. As most programs are not annotated with invariants, previous research has attempted to automatically generate them from source code. In this paper, we propose a new approach to invariant generation using search. We reuse the trace generation front-end of the existing tool Daikon and integrate it with genetic programming and a mutation testing tool. We demonstrate that our system can find the same invariants through search that Daikon produces via template instantiation, and we also find useful invariants that Daikon does not. We then present a method of ranking invariants such that we can identify those that are most interesting, through a novel application of program mutation.
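One plausible way to rank invariants with mutants is to prefer invariants that hold on the original program but are violated by many mutants. The sketch below implements that kill-count scoring; it is an assumption for illustration, since the abstract does not describe the actual ranking method, and all names and traces are made up.

```python
def holds(invariant, trace):
    """True when the invariant is satisfied by every state in the trace."""
    return all(invariant(state) for state in trace)

def rank_by_mutants(invariants, original_trace, mutant_traces):
    """Keep invariants that hold on the original trace and rank them by
    how many mutant traces violate them (more kills ranks higher)."""
    scored = []
    for name, inv in invariants.items():
        if not holds(inv, original_trace):
            continue  # not an invariant of the original program
        kills = sum(1 for t in mutant_traces if not holds(inv, t))
        scored.append((kills, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, kills) for kills, name in scored]

# States are dicts of variable values; every trace below is hypothetical.
original = [{"x": 1, "y": 2}, {"x": 3, "y": 6}]
mutants = [
    [{"x": 1, "y": 3}, {"x": 3, "y": 7}],    # mutant computes y = 2x + 1
    [{"x": 1, "y": -2}, {"x": 3, "y": -6}],  # mutant negates y
]
candidates = {
    "y == 2 * x": lambda s: s["y"] == 2 * s["x"],
    "y > 0": lambda s: s["y"] > 0,
    "x >= 0": lambda s: s["x"] >= 0,
    "y == x": lambda s: s["y"] == s["x"],
}
ranking = rank_by_mutants(candidates, original, mutants)
```

Here the precise invariant `y == 2 * x` is violated by both mutants and ranks first, while the weak invariant `x >= 0` survives every mutant and ranks last.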
REI: An integrated measure for software reusability
To capitalize upon the benefits of software reuse, an efficient selection among candidate reusable assets should be performed in terms of functional fitness and adaptability. The reusability of assets is usually measured through reusability indices. However, these do not capture all facets of reusability, such as structural characteristics, external quality attributes, and documentation. In this paper, we propose a reusability index (REI) as a synthesis of various software metrics and evaluate its ability to quantify reuse, based on the IEEE Standard on Software Metrics Validity. The proposed index is compared with existing ones through a case study on 80 reusable open-source assets. To illustrate the applicability of the proposed index, we performed a pilot study in which real-world reuse decisions were compared with decisions imposed by the use of metrics (including REI). The results of the study suggest that the proposed index has the highest predictive and discriminative power; it is the most consistent in ranking reusable assets and the most strongly correlated with their levels of reuse. The findings of the paper are discussed to understand the most important aspects in reusability assessment (interpretation of results), and interesting implications for research and practice are provided.
Automated Fixing of Programs with Contracts
This paper describes AutoFix, an automatic debugging technique that can fix
faults in general-purpose software. To provide high-quality fix suggestions and
to enable automation of the whole debugging process, AutoFix relies on the
presence of simple specification elements in the form of contracts (such as
pre- and postconditions). Using contracts enhances the precision of dynamic
analysis techniques for fault detection and localization, and for validating
fixes. The only required user input to the AutoFix supporting tool is then a
faulty program annotated with contracts; the tool produces a collection of
validated fixes for the fault ranked according to an estimate of their
suitability.
In an extensive experimental evaluation, we applied AutoFix to over 200
faults in four code bases of different maturity and quality (of implementation
and of contracts). AutoFix successfully fixed 42% of the faults, producing, in
the majority of cases, corrections of quality comparable to those competent
programmers would write; the computational resources used were modest, with an
average time per fix below 20 minutes on commodity hardware. These figures
compare favorably to the state of the art in automated program fixing, and
demonstrate that the AutoFix approach is successfully applicable to reduce the
debugging burden in real-world scenarios. Comment: Minor changes after proofreading