Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
A common and natural intuition among software testers is that test cases need
to differ if a software system is to be tested properly and its quality
ensured. Consequently, much research has gone into formulating distance
measures for how test cases, their inputs and/or their outputs differ. However,
common to these proposals is that they are data type specific and/or calculate
the diversity only between pairs of test inputs, traces or outputs.
We propose a new metric to measure the diversity of sets of tests: the test
set diameter (TSDm). It extends our earlier, pairwise test diversity metrics
based on recent advances in information theory regarding the calculation of the
normalized compression distance (NCD) for multisets. An advantage is that TSDm
can be applied regardless of data type and on any test-related information, not
only the test inputs. A downside is the increased computational time compared
to competing approaches.
Our experiments on four different systems show that the test set diameter can
help select test sets with higher structural and fault coverage than random
selection even when only applied to test inputs. This can enable early test
design and selection, prior to even having a software system to test, and
complement other types of test automation and analysis. We argue that this
quantification of test set diversity creates a number of opportunities to
better understand software quality and provides practical ways to increase it.
Comment: In submissio
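As a rough illustration of the idea behind a compression-based diversity measure for a whole set of tests, the sketch below computes one commonly cited multiset form of the normalized compression distance, NCD_1(X) = (C(X) - min C(x)) / max C(X \ {x}). It is a minimal sketch under stated assumptions, not the authors' TSDm implementation: zlib stands in for the compressor, the set is serialized by plain concatenation, and the function names `compressed_len` and `ncd1` are hypothetical.

```python
import zlib
from typing import Sequence


def compressed_len(data: bytes) -> int:
    """Compressed size in bytes; an assumed stand-in for the compressor C."""
    return len(zlib.compress(data, 9))


def ncd1(tests: Sequence[bytes]) -> float:
    """NCD_1 of a multiset X: (C(X) - min C(x)) / max C(X without x).

    The set is serialized by simple concatenation, which is an
    assumption of this sketch.
    """
    items = list(tests)
    if len(items) < 2:
        raise ValueError("need at least two test cases")
    whole = compressed_len(b"".join(items))
    smallest_single = min(compressed_len(t) for t in items)
    # Largest compressed size over all "leave one element out" subsets.
    largest_rest = max(
        compressed_len(b"".join(items[:i] + items[i + 1:]))
        for i in range(len(items))
    )
    return (whole - smallest_single) / largest_rest
```

Under these assumptions, a set of near-identical test inputs should score lower than a set of unrelated inputs, because the repeated content compresses away when the set is compressed as a whole.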
A Harmonic Extension Approach for Collaborative Ranking
We present a new perspective on graph-based methods for collaborative ranking
for recommender systems. Unlike user-based or item-based methods that compute a
weighted average of ratings given by the nearest neighbors, or low-rank
approximation methods using convex optimization and the nuclear norm, we
formulate matrix completion as a series of semi-supervised learning problems,
and propagate the known ratings to the missing ones on the user-user or
item-item graph globally. The semi-supervised learning problems are expressed
as Laplace-Beltrami equations on a manifold, or namely, harmonic extension, and
can be discretized by a point integral method. We show that our approach does
not impose a low-rank Euclidean subspace on the data points, but instead
minimizes the dimension of the underlying manifold. Our method, named LDM (low
dimensional manifold), turns out to be particularly effective in generating
rankings of items, showing decent computational efficiency and robust ranking
quality compared to state-of-the-art methods.
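The idea of propagating known ratings to missing ones as a harmonic extension can be illustrated on a small weighted graph. The sketch below is not the LDM method: the abstract discretizes the Laplace-Beltrami equation with a point integral method, whereas this example swaps in a plain graph Laplacian and solves it by repeated neighbor averaging (Gauss-Seidel sweeps). The function name `harmonic_extension`, the edge-dictionary format, and the iteration count are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def harmonic_extension(
    n_nodes: int,
    weights: Dict[Tuple[int, int], float],
    known: Dict[int, float],
    iters: int = 500,
) -> List[float]:
    """Fill in unknown node values so each equals the weighted mean of its neighbors.

    weights maps an undirected edge (i, j) to a positive similarity,
    with each edge listed once; known maps node index to a fixed value
    (for example, an observed rating).
    """
    neighbors: Dict[int, List[Tuple[int, float]]] = {i: [] for i in range(n_nodes)}
    for (i, j), w in weights.items():
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    # Start unknown nodes at the mean of the known values.
    start = sum(known.values()) / len(known)
    values = [known.get(i, start) for i in range(n_nodes)]

    for _ in range(iters):
        for i in range(n_nodes):
            if i in known or not neighbors[i]:
                continue  # known values stay fixed; isolated nodes keep the start value
            total = sum(w for _, w in neighbors[i])
            values[i] = sum(w * values[j] for j, w in neighbors[i]) / total
    return values
```

For instance, on a five-node path graph with unit weights and known values 1.0 and 5.0 at the two ends, the interior values converge to the linear interpolation 2.0, 3.0, 4.0.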
Testing conformance of a deterministic implementation against a non-deterministic stream X-machine
Stream X-machines are a formalisation of extended finite state machines that have been used to specify systems. One of the great benefits of using stream X-machines for the purpose of specification is the associated test generation technique, which produces a test that is guaranteed to determine correctness under certain design-for-test conditions. This test generation algorithm has recently been extended to the case where the specification is non-deterministic. However, the algorithms for testing from a non-deterministic stream X-machine currently have limitations: either they test for equivalence rather than conformance, or they restrict the source of non-determinism allowed in the specification. This paper introduces a new test generation algorithm that overcomes both of these limitations for situations where the implementation is known to be deterministic.
A practical guide and software for analysing pairwise comparison experiments
Most popular strategies to capture subjective judgments from humans involve
the construction of a unidimensional relative measurement scale, representing
order preferences or judgments about a set of objects or conditions. This
information is generally captured by means of direct scoring, either in the
form of a Likert or cardinal scale, or by comparative judgments in pairs or
sets. In this sense, the use of pairwise comparisons is becoming increasingly
popular because of the simplicity of this experimental procedure. However, this
strategy requires non-trivial data analysis to aggregate the comparison ranks
into a quality scale and analyse the results, in order to take full advantage
of the collected data. This paper explains the process of translating pairwise
comparison data into a measurement scale, discusses the benefits and
limitations of such scaling methods and introduces publicly available
software written in Matlab. We improve on existing scaling methods by introducing
outlier analysis, providing methods for computing confidence intervals and
statistical testing and introducing a prior, which reduces estimation error
when the number of observers is low. Most of our examples focus on image
quality assessment.
Comment: Code available at https://github.com/mantiuk/pwcm
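To make the step from pairwise comparison counts to a measurement scale concrete, here is a minimal sketch of the classical least-squares solution of Thurstone's Case V model. It is not the authors' Matlab software and omits the outlier analysis, confidence intervals, and prior mentioned above. The function name `thurstone_case_v`, the count-matrix layout, and the half-count smoothing used to avoid proportions of exactly 0 or 1 are assumptions of this example.

```python
from statistics import NormalDist
from typing import List, Sequence


def thurstone_case_v(wins: Sequence[Sequence[int]]) -> List[float]:
    """Scale values from a count matrix, wins[i][j] = times i was preferred over j.

    Each preference proportion is mapped through the inverse normal CDF,
    and a condition's score is the mean z-value across its row.
    """
    n = len(wins)
    inv_cdf = NormalDist().inv_cdf
    scores = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            if total == 0:
                continue  # never compared: treated as p = 0.5, i.e. z = 0
            # Half-count smoothing (an assumption) keeps p strictly inside (0, 1).
            p = (wins[i][j] + 0.5) / (total + 1.0)
            z_sum += inv_cdf(p)
        scores.append(z_sum / n)
    return scores
```

The resulting scores are relative: they sum to zero, and only differences between conditions are meaningful.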
Supervised estimation of Granger-based causality between time series
Brain effective connectivity aims to detect causal interactions between distinct brain units, and it is typically studied through the analysis of direct measurements of neural activity, e.g., magneto/electroencephalography (M/EEG) signals. The literature on methods for causal inference is vast. It includes model-based methods, in which a generative model of the data is assumed, and model-free methods, which directly infer causality from the probability distribution of the underlying stochastic process. Here, we first focus on the model-based methods developed from the Granger criterion of causality, which assumes an autoregressive model of the data. Second, we introduce a new perspective that looks at the problem in a way that is typical of the machine learning literature: we formulate the problem of causality detection as a supervised learning task by proposing a classification-based approach. A classifier is trained to identify causal interactions between time series for the chosen model, using a proposed feature space. In this paper, we compare this classification-based approach with the standard Geweke measure of causality in the time domain through a simulation study. To this end, we customized our approach to the case of a MAR model and designed a feature space that contains causality measures based on the idea of precedence and predictability in time. Two variations of the supervised method are proposed and compared to a standard Granger causal analysis method. The results of the simulations show that the supervised method outperforms the standard approach; in particular, it is more robust to noise. As evidence of the efficacy of the proposed method, we report the details of our submission to the causality detection competition of Biomag2014, where the proposed method reached 2nd place.
Moreover, as an empirical application, we applied the supervised approach to a dataset of neural recordings from rats, obtaining an important reduction in the false positive rate.
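The standard time-domain baseline mentioned above can be sketched for the simplest case. The example below computes a Geweke-style measure, the log ratio of residual variances of a restricted and a full autoregressive model, for two zero-mean series with a single lag. It illustrates only the baseline, not the supervised classifier or its feature space. The function names, the order-1 model, and the simulated coupling coefficients are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Tuple


def geweke_measure(source: List[float], target: List[float]) -> float:
    """ln(var_restricted / var_full) for 'source Granger-causes target', lag 1.

    Restricted model: target[t] = a * target[t-1] + e
    Full model:       target[t] = a * target[t-1] + b * source[t-1] + e
    Both series are assumed zero-mean, so no intercept is fitted.
    """
    ts = range(1, len(target))
    syy = sum(target[t - 1] ** 2 for t in ts)
    sxx = sum(source[t - 1] ** 2 for t in ts)
    sxy = sum(source[t - 1] * target[t - 1] for t in ts)
    ty = sum(target[t] * target[t - 1] for t in ts)
    tx = sum(target[t] * source[t - 1] for t in ts)

    # Restricted least-squares fit (one coefficient).
    a_r = ty / syy
    var_r = sum((target[t] - a_r * target[t - 1]) ** 2 for t in ts) / len(ts)

    # Full least-squares fit via the 2x2 normal equations.
    det = syy * sxx - sxy ** 2
    a_f = (ty * sxx - tx * sxy) / det
    b_f = (tx * syy - ty * sxy) / det
    var_f = sum(
        (target[t] - a_f * target[t - 1] - b_f * source[t - 1]) ** 2 for t in ts
    ) / len(ts)
    return math.log(var_r / var_f)


def simulate(n: int, seed: int) -> Tuple[List[float], List[float]]:
    """Toy data in which x drives y with a one-step delay (assumed coefficients)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0]
    for t in range(1, n):
        y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.gauss(0.0, 1.0))
    return x, y
```

On data from `simulate`, `geweke_measure(x, y)` should be clearly positive while `geweke_measure(y, x)` stays close to zero, reflecting the one-directional coupling built into the toy data.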
Differential expression analysis for multiple conditions
As high-throughput sequencing has become common practice, the cost of
sequencing large amounts of genetic data has been drastically reduced, leading
to much larger data sets for analysis. One important task is to identify
biological conditions that lead to unusually high or low expression of a
particular gene. Packages such as DESeq implement a simple method for testing
differential signal when exactly two biological conditions are possible. For
more than two conditions, pairwise testing is typically used. Here the DESeq
method is extended so that three or more biological conditions can be assessed
simultaneously. Because the computation time grows exponentially in the number
of conditions, a Monte Carlo approach provides a fast way to approximate the
p-values for the new test. The approach is studied on both simulated data and
a data set of C. jejuni, the bacterium responsible for most food poisoning
in the United States.
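The general idea of approximating a p-value by Monte Carlo when exact enumeration is too expensive can be sketched as follows. This is a generic label-permutation test with a simple between-group statistic, not the extended DESeq test described in the abstract. The statistic, the function names, and the default number of draws are assumptions made for illustration.

```python
import random
from statistics import mean
from typing import Dict, Hashable, List, Sequence


def between_group_spread(values: Sequence[float], labels: Sequence[Hashable]) -> float:
    """Sum over conditions of n_g * (group mean - overall mean)^2."""
    groups: Dict[Hashable, List[float]] = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    overall = mean(values)
    return sum(len(vs) * (mean(vs) - overall) ** 2 for vs in groups.values())


def monte_carlo_p_value(
    values: Sequence[float],
    labels: Sequence[Hashable],
    n_draws: int = 2000,
    seed: int = 0,
) -> float:
    """Estimate P(statistic >= observed) under random relabeling of conditions."""
    rng = random.Random(seed)
    observed = between_group_spread(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_draws):
        rng.shuffle(shuffled)
        if between_group_spread(values, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labeling itself, so the estimate is never 0.
    return (hits + 1) / (n_draws + 1)
```

With three conditions where one condition has markedly higher values, the estimated p-value should be small; when all conditions have identical group means, every relabeling is at least as extreme and the estimate is 1.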