
    PopArt: Ranked Testing Efficiency

    Too often, programmers are under pressure to maximize their confidence in the correctness of their code with a tight testing budget. Should they spend some of that budget on finding “interesting” inputs or spend their entire testing budget on test executions? Work on testing efficiency has explored two competing approaches to answer this question: systematic partition testing (ST), which defines a testing partition and tests its parts, and random testing (RT), which directly samples inputs with replacement. A consensus as to which is better when has yet to emerge. We present Probability Ordered Partition Testing (POPART), a new systematic partition-based testing strategy that visits the parts of a testing partition in decreasing probability order and in doing so leverages any non-uniformity over that partition. We show how to construct a homogeneous testing partition, a requirement for systematic testing, by using an executable oracle and the path partition. A program’s path partition is a naturally occurring testing partition that is usually skewed, for the simple reason that some paths execute more frequently than others. To confirm this conventional wisdom, we instrument programs from the Codeflaws repository and find that 80% of them have a skewed path probability distribution. We then compare POPART with RT to characterise the configuration space in which each is more efficient. We show that, when simulating Codeflaws, POPART outperforms RT after 100,000 executions. Our results reaffirm RT’s power for very small testing budgets, but also show that POPART should be preferred for any application requiring high (above 90%) probability-weighted coverage. In such cases, despite paying more for each test execution, we prove that POPART outperforms RT: it traverses parts whose cumulative probability bounds that of random testing, showing that sampling without replacement pays for itself, given a non-uniform probability distribution over a testing partition.
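
    As a rough illustration of the trade-off described above (not the authors' implementation; the part probabilities below are made up), probability-ordered visiting achieves, after k tests, a probability-weighted coverage equal to the sum of the k largest part probabilities, whereas random testing with replacement hits a part of probability p at least once with probability 1 - (1 - p)^n.

```python
# Toy comparison of probability-ordered partition testing (POPART-style)
# versus random testing (RT). The part probabilities below are made up.

def popart_coverage(probs, budget):
    """Probability-weighted coverage after visiting the `budget` most
    probable parts, i.e. sampling parts without replacement in
    decreasing probability order."""
    return sum(sorted(probs, reverse=True)[:budget])

def rt_expected_coverage(probs, budget):
    """Expected probability-weighted coverage of random testing:
    each part of probability p is hit at least once with
    probability 1 - (1 - p)**budget."""
    return sum(p * (1 - (1 - p) ** budget) for p in probs)

if __name__ == "__main__":
    # A skewed path-probability distribution (hypothetical).
    probs = [0.5, 0.25, 0.12, 0.06, 0.04, 0.02, 0.01]
    for n in (1, 3, 5, 7):
        print(n, round(popart_coverage(probs, n), 3),
              round(rt_expected_coverage(probs, n), 3))
```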

    Partition Information and its Transmission over Boolean Multi-Access Channels

    In this paper, we propose a novel partition reservation system to study partition information and its transmission over a noise-free Boolean multi-access channel. The objective of transmission is not message restoration, but to partition active users into distinct groups so that they can subsequently transmit their messages without collision. We first calculate (by mutual information) the amount of information needed for the partitioning without channel effects, and then propose two different coding schemes to obtain achievable transmission rates over the channel. The first is a brute-force method, where the codebook design is based on centralized source coding; the second uses random coding, where the codebook is generated randomly and optimal Bayesian decoding is employed to reconstruct the partition. Both methods shed light on the internal structure of the partition problem. A novel hypergraph formulation is proposed for the random coding scheme, which intuitively describes the information in terms of a strong coloring of a hypergraph induced by a sequence of channel operations and interactions between active users. An extended Fibonacci structure is found for a simple, but non-trivial, case with two active users. A comparison between these methods and group testing is conducted to demonstrate the uniqueness of our problem. (Comment: Submitted to IEEE Transactions on Information Theory; major revision.)
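
    As a minimal sketch of the channel model only (not the paper's brute-force or random-coding schemes; the codebook, user count, and block length below are hypothetical), the snippet simulates a noise-free Boolean (OR) multi-access channel and groups candidate pairs of active users by the output sequence they produce; pairs sharing an output sequence are indistinguishable to any decoder.

```python
# Toy noise-free Boolean multi-access channel: the slot output is the
# logical OR of the bits sent by the active users. Codebook and sizes
# are hypothetical; this only illustrates the channel model, not the
# paper's coding or decoding schemes.
import itertools
import random

random.seed(0)
N, T = 6, 4                      # users, channel uses
codebook = [[random.randint(0, 1) for _ in range(T)] for _ in range(N)]

def or_channel(active):
    """Output sequence when the users in `active` transmit their codewords."""
    return tuple(max(codebook[u][t] for u in active) for t in range(T))

# Group every candidate pair of active users by its output sequence.
# Pairs that share an output sequence look identical to the receiver,
# so no decoder could treat them differently.
by_output = {}
for pair in itertools.combinations(range(N), 2):
    by_output.setdefault(or_channel(pair), []).append(pair)

for out, pairs in by_output.items():
    print(out, pairs)
```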

    Proportional sampling strategy: A compendium and some insights

    There have been numerous studies on the effectiveness of partition and random testing. In particular, the proportional sampling (PS) strategy has been proved, under certain conditions, to be the only form of partition testing that outperforms random testing regardless of where the failure-causing inputs are. This paper provides an integrated synthesis and overview of our recent studies on the PS strategy and its related work. Through this synthesis, we offer a perspective that properly interprets the results obtained so far, and present some of the interesting issues involved and new insights obtained during the course of this research. © 2001 Elsevier Science Inc. All rights reserved.
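
    A minimal sketch of the proportional sampling idea, assuming hypothetical subdomain sizes and a total budget (this is not the formal setting of the cited studies): allocate test cases to each subdomain in proportion to its size, so that the sampling rate is uniform across the input domain.

```python
# Proportional sampling sketch: split a test budget across subdomains
# in proportion to their sizes (largest-remainder rounding).
# Subdomain sizes and the budget are hypothetical.

def proportional_allocation(sizes, budget):
    total = sum(sizes)
    raw = [budget * s / total for s in sizes]
    alloc = [int(r) for r in raw]
    # Hand out the remaining tests to the largest fractional parts.
    leftovers = sorted(range(len(sizes)), key=lambda i: raw[i] - alloc[i],
                       reverse=True)
    for i in leftovers[:budget - sum(alloc)]:
        alloc[i] += 1
    return alloc

if __name__ == "__main__":
    sizes = [5000, 3000, 1500, 500]   # subdomain sizes (made up)
    print(proportional_allocation(sizes, 100))  # -> [50, 30, 15, 5]
```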

    Consistent distribution-free K-sample and independence tests for univariate random variables

    A popular approach for testing whether two univariate random variables are statistically independent consists of partitioning the sample space into bins and evaluating a test statistic on the binned data. The partition size matters, and the optimal partition size is data dependent. While coarse partitions may be best for detecting simple relationships, a great gain in power can be achieved for complex relationships by considering finer partitions. We suggest novel consistent distribution-free tests that are based on summation or maximization aggregation of scores over all partitions of a fixed size. We show that our test statistics based on summation can serve as good estimators of the mutual information. Moreover, we suggest regularized tests that aggregate over all partition sizes, and prove that those are consistent too. We provide polynomial-time algorithms, which are critical for computing the suggested test statistics efficiently. We show that the power of the regularized tests is excellent compared to existing tests, and almost as powerful as the tests based on the optimal (yet unknown in practice) partition size, in simulations as well as on a real-data example. (Comment: arXiv admin note: substantial text overlap with arXiv:1308.155)
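
    As a rough sketch of the binning step only (the tests above aggregate such scores over all partitions of a fixed size, or over all sizes with regularization), the snippet below computes the plug-in mutual information of two samples on a single equal-frequency partition; the partition size and data are illustrative assumptions.

```python
# Plug-in mutual information of two univariate samples on one binned
# partition (equal-frequency bins). A single fixed partition only; the
# paper's statistics aggregate scores over many partitions.
import numpy as np

def binned_mutual_information(x, y, bins=4):
    # Equal-frequency bin edges for each variable.
    qs = np.linspace(0, 1, bins + 1)
    xb = np.digitize(x, np.quantile(x, qs)[1:-1])
    yb = np.digitize(y, np.quantile(y, qs)[1:-1])
    joint = np.zeros((bins, bins))
    for i, j in zip(xb, yb):
        joint[i, j] += 1
    joint /= joint.sum()
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / np.outer(px, py)[nz])).sum())

rng = np.random.default_rng(0)
x = rng.normal(size=500)
print(binned_mutual_information(x, x + rng.normal(size=500)))   # dependent
print(binned_mutual_information(x, rng.normal(size=500)))       # independent
```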

    Test case selection with and without replacement

    Previous theoretical studies on the effectiveness of partition testing and random testing have assumed that test cases are selected with replacement. Although this assumption is well known to be less realistic, it has still been used in theoretical work because it renders the analyses more tractable. This paper presents a theoretical investigation aimed at comparing testing effectiveness when test cases are selected with and without replacement, and at exploring the relationships between these two scenarios. We propose a new effectiveness metric for software testing, namely the expected number of distinct failures detected, to re-examine existing partition testing strategies.
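
    The proposed metric can be sketched in the simplest setting of uniform random selection from a finite domain (the domain size, number of failure-causing inputs, and budget below are hypothetical; this is not the paper's analysis of partition testing strategies): with replacement, each of the m failure-causing inputs in a domain of d inputs is detected with probability 1 - (1 - 1/d)^n; without replacement, with probability n/d.

```python
# Expected number of distinct failures detected by n uniformly random
# test cases drawn from a domain of d inputs containing m
# failure-causing inputs. Parameter values are hypothetical.

def expected_distinct_failures_with_replacement(d, m, n):
    return m * (1 - (1 - 1 / d) ** n)

def expected_distinct_failures_without_replacement(d, m, n):
    return m * n / d      # hypergeometric mean, valid for n <= d

d, m = 1000, 10
for n in (100, 500, 1000):
    print(n,
          round(expected_distinct_failures_with_replacement(d, m, n), 3),
          round(expected_distinct_failures_without_replacement(d, m, n), 3))
```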

    Asymptotic Error Free Partitioning over Noisy Boolean Multiaccess Channels

    In this paper, we consider the problem of partitioning active users in a manner that facilitates multi-access without collision. The setting is a noisy, synchronous, Boolean multi-access channel where $K$ active users (out of a total of $N$ users) seek access. A solution to the partition problem places each of the $N$ users in one of $K$ groups (or blocks) such that no two active nodes are in the same block. We consider a simple, but non-trivial and illustrative, case of $K=2$ active users and study the number of steps $T$ needed to solve the partition problem. By random coding and a suboptimal decoding scheme, we show that for any $T \geq (C_1 + \xi_1)\log N$, where $C_1$ and $\xi_1$ are positive constants (independent of $N$) and $\xi_1$ can be arbitrarily small, the partition problem can be solved with error probability $P_e^{(N)} \to 0$ for large $N$. Under the same scheme, we also bound $T$ from the other direction, establishing that, for any $T \leq (C_2 - \xi_2)\log N$, the error probability $P_e^{(N)} \to 1$ for large $N$; again, $C_2$ and $\xi_2$ are constants and $\xi_2$ can be arbitrarily small. These bounds on the number of steps are lower than the tight achievable lower bound $T \geq (C_g + \xi)\log N$ for group testing (in which all active users are identified, rather than just partitioned). Thus, partitioning may prove to be a more efficient approach to multi-access than group testing. (Comment: Submitted in June 2014 to IEEE Transactions on Information Theory; under review.)

    Finding and testing network communities by lumped Markov chains

    Identifying communities (or clusters), namely groups of nodes with comparatively strong internal connectivity, is a fundamental task for deeply understanding the structure and function of a network. Yet, there is a lack of formal criteria for defining communities and for testing their significance. We propose a sharp definition that is based on a significance threshold. By means of a lumped Markov chain model of a random walker, a quality measure called the "persistence probability" is associated with a cluster. The cluster is then defined as an "$\alpha$-community" if this probability is not smaller than $\alpha$. Consistently, a partition composed of $\alpha$-communities is an "$\alpha$-partition". These definitions turn out to be very effective for finding and testing communities. If a set of candidate partitions is available, setting the desired $\alpha$-level allows one to immediately select the $\alpha$-partition with the finest decomposition. Simultaneously, the persistence probabilities quantify the significance of each single community. Given its ability to individually assess the quality of each cluster, this approach can also disclose single well-defined communities even in networks which, overall, do not possess a definite clusterized structure.
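
    A minimal sketch of the persistence probability for an undirected graph (toy adjacency matrix and partition; not the authors' code): for a random walk with transition matrix $P = D^{-1}A$ and stationary distribution proportional to the node degrees, the persistence probability of a cluster reduces to the fraction of the cluster's total degree that stays inside the cluster.

```python
# Persistence probability of each cluster of a partition, for an
# undirected graph: the probability that a stationary random walker
# currently inside the cluster is still inside it after one step.
# The graph and partition below are toy examples.
import numpy as np

A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
partition = [[0, 1, 2], [3, 4, 5]]

def persistence_probabilities(A, partition):
    deg = A.sum(axis=1)
    out = []
    for cluster in partition:
        idx = np.ix_(cluster, cluster)
        # Edge weight staying inside the cluster over the cluster's degree.
        out.append(A[idx].sum() / deg[cluster].sum())
    return out

print(persistence_probabilities(A, partition))   # -> [0.857..., 0.857...]
```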

    Improved model identification for non-linear systems using a random subsampling and multifold modelling (RSMM) approach

    In non-linear system identification, the available observed data are conventionally partitioned into two parts: the training data, used for model identification, and the test data, used for model performance testing. This sort of 'hold-out' or 'split-sample' data partitioning method is convenient, and the associated model identification procedure is in general easy to implement. The model obtained from such a once-partitioned single training dataset, however, may lack robustness and generalise poorly to future unseen data, because the performance of the identified model can depend strongly on how the data partition is made. To overcome this drawback of the hold-out data partitioning method, this study presents a new random subsampling and multifold modelling (RSMM) approach to produce less biased, or preferably unbiased, models. The basic idea and the associated procedure are as follows. First, generate K training datasets (and also K validation datasets) using a K-fold random subsampling method. Second, detect significant model terms and identify a common model structure that fits all K datasets, using a newly proposed common model selection approach called the multiple orthogonal search algorithm. Finally, estimate and refine the model parameters for the identified common-structured model using a multifold parameter estimation method. The proposed method can produce robust models with better generalisation performance.
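
    The workflow can be sketched with a simplified stand-in (random subsampling into K train/validation splits, a naive cross-fold vote over candidate terms in place of the multiple orthogonal search algorithm, and fold-averaged least squares for the parameters; the data, K, and term dictionary are all hypothetical):

```python
# Simplified RSMM-style sketch: K random-subsampling train/validation
# splits, a naive common-structure vote over candidate terms, and
# parameter estimation averaged over the folds. This is NOT the
# multiple orthogonal search algorithm from the paper.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 300)
y = 0.8 * x - 0.5 * x**3 + 0.05 * rng.normal(size=300)   # toy system

# Candidate model terms (dictionary) evaluated on the input.
terms = {"x": x, "x^2": x**2, "x^3": x**3, "x^4": x**4}

K, n_train = 5, 200
keep_per_fold = []
for _ in range(K):
    train = rng.choice(len(x), size=n_train, replace=False)
    # Rank terms by absolute correlation with the output on this fold.
    corr = {name: abs(np.corrcoef(col[train], y[train])[0, 1])
            for name, col in terms.items()}
    keep_per_fold.append({n for n, _ in
                          sorted(corr.items(), key=lambda kv: -kv[1])[:2]})

# Common structure: terms selected in every fold.
common = set.intersection(*keep_per_fold)
print("common terms:", common)

# Least-squares parameters for the common-structured model, averaged
# over refits on each fold (multifold parameter estimation, simplified).
names = sorted(common)
Phi = np.column_stack([terms[n] for n in names])
coefs = []
for _ in range(K):
    train = rng.choice(len(x), size=n_train, replace=False)
    coefs.append(np.linalg.lstsq(Phi[train], y[train], rcond=None)[0])
print(dict(zip(names, np.mean(coefs, axis=0).round(3))))
```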