A Case Study in Matching Test and Proof Coverage
This paper studies the complementarity of test and deductive proof processes for Java programs specified in JML (the Java Modeling Language). The proof of a program may be long and difficult, especially when automatic provers give up. When a theorem is not proved automatically, there are two possibilities: either the theorem is correct but there is not enough information to complete the proof, or the theorem is incorrect. Testing techniques can be used to discriminate between these two alternatives. Here, we present experiments around the use of the JACK tool to prove Java programs annotated with JML assertions. When JACK fails to discharge proof obligations, we use a combinatorial testing tool, TOBIAS, to produce large test suites that exercise the unproved program parts. The key issue is to establish the relevance of the test suite with respect to the unproved proof obligations. Therefore, we use code coverage techniques: our approach takes advantage of the statement orientation of the JACK tool to compare the statements involved in the unproved proof obligations with the statements covered by the test suite. Finally, we build confidence in the test suites by evaluating their ability to kill program mutants. These techniques have been put into practice and are illustrated by a simple case study.
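The combinatorial generation step can be pictured as unfolding a test pattern into the cartesian product of its parameter domains. The sketch below is an illustrative toy, not TOBIAS's actual input language; the pattern and parameter names are hypothetical.

```python
from itertools import product

def unfold(pattern):
    """Unfold a test pattern (parameter -> domain of values) into the
    cartesian product of all concrete test cases, combinatorial-style."""
    keys = list(pattern)
    return [dict(zip(keys, combo)) for combo in product(*pattern.values())]

# Hypothetical pattern for a small case study: every combination of
# operation, argument value, and repetition count becomes one test case.
pattern = {"op": ["push", "pop"], "value": [0, 1, -1], "repeat": [1, 2]}
cases = unfold(pattern)
print(len(cases))  # 2 * 3 * 2 = 12 concrete test cases
```

Even tiny patterns unfold into large suites, which is why coverage analysis is needed to keep only the cases relevant to the unproved obligations.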
Objective Bayes and Conditional Frequentist Inference
Objective Bayesian methods have garnered considerable interest and support among statisticians,
particularly over the past two decades. It has often been ignored, however, that in
some cases the appropriate frequentist inference to match is a conditional one. We present
various methods for extending the probability matching prior (PMP) methods to conditional
settings. A method based on saddlepoint approximations is found to be the most
tractable and we demonstrate its use in the most common exact ancillary statistic models.
As part of this analysis, we give a proof of an exactness property of a particular PMP in
location-scale models. We use the proposed matching methods to investigate the relationships
between conditional and unconditional PMPs. A key component of our analysis is a
numerical study of the performance of probability matching priors from both a conditional
and unconditional perspective in exact ancillary models. In concluding remarks we
propose several routes for future research.
XRay: Enhancing the Web's Transparency with Differential Correlation
Today's Web services - such as Google, Amazon, and Facebook - leverage user
data for varied purposes, including personalizing recommendations, targeting
advertisements, and adjusting prices. At present, users have little insight
into how their data is being used. Hence, they cannot make informed choices
about the services they choose. To increase transparency, we developed XRay,
the first fine-grained, robust, and scalable personal data tracking system for
the Web. XRay predicts which data in an arbitrary Web account (such as emails,
searches, or viewed products) is being used to target which outputs (such as
ads, recommended products, or prices). XRay's core functions are service
agnostic and easy to instantiate for new services, and they can track data
within and across services. To make predictions independent of the audited
service, XRay relies on the following insight: by comparing outputs from
different accounts with similar, but not identical, subsets of data, one can
pinpoint targeting through correlation. We show both theoretically, and through
experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision
and recall by correlating data from a surprisingly small number of extra
accounts.
Comment: Extended version of a paper presented at the 23rd USENIX Security
Symposium (USENIX Security 14).
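The differential correlation insight can be illustrated on toy data: several shadow accounts each hold a different subset of the user's data items, and an item is flagged as an ad's likely target when its presence correlates strongly with the ad appearing. The accounts, items, and scoring rule below are all made up for illustration; they are not XRay's actual implementation.

```python
accounts = {                     # shadow account -> data items placed in it
    "A": {"flights", "shoes"},
    "B": {"flights", "books"},
    "C": {"shoes", "books"},
    "D": {"flights"},
}
saw_ad = {"A", "B", "D"}         # accounts where a given ad appeared

def targeting_score(item):
    """Difference in ad frequency between accounts with and without item."""
    has = [a for a, items in accounts.items() if item in items]
    lacks = [a for a in accounts if a not in has]
    p_has = sum(a in saw_ad for a in has) / len(has)
    p_lacks = sum(a in saw_ad for a in lacks) / len(lacks) if lacks else 0.0
    return p_has - p_lacks

items = set().union(*accounts.values())
best = max(items, key=targeting_score)
print(best)  # "flights": present in exactly the accounts that saw the ad
```

Because each item's presence pattern across accounts is distinct, a small number of extra accounts suffices to disambiguate the target, which is the intuition behind XRay's scalability result.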
Synthesizing Program Input Grammars
We present an algorithm for synthesizing a context-free grammar encoding the
language of valid program inputs from a set of input examples and blackbox
access to the program. Our algorithm addresses shortcomings of existing grammar
inference algorithms, which both severely overgeneralize and are prohibitively
slow. Our implementation, GLADE, leverages the grammar synthesized by our
algorithm to fuzz test programs with structured inputs. We show that GLADE
substantially increases the incremental coverage on valid inputs compared to
two baseline fuzzers.
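Once a context-free grammar of valid inputs has been synthesized, structured test inputs can be produced by random derivation from it. The grammar and derivation strategy below are an illustrative stand-in for this idea, not GLADE's synthesized output or its fuzzing loop.

```python
import random

# A toy grammar of arithmetic expressions (hypothetical, not GLADE output).
GRAMMAR = {
    "<expr>": [["<term>", "+", "<expr>"], ["<term>"]],
    "<term>": [["(", "<expr>", ")"], ["<num>"]],
    "<num>":  [["0"], ["1"], ["2"]],
}

def derive(symbol, rng, depth=0):
    """Randomly expand a nonterminal; past a depth limit, prefer the
    non-recursive alternatives so the derivation terminates."""
    if symbol not in GRAMMAR:
        return symbol                      # terminal: emit as-is
    rules = GRAMMAR[symbol]
    rule = rng.choice(rules[1:] if depth > 8 and len(rules) > 1 else rules)
    return "".join(derive(s, rng, depth + 1) for s in rule)

rng = random.Random(0)
samples = [derive("<expr>", rng) for _ in range(5)]
print(samples)  # syntactically valid expressions, e.g. "1+(2)"
```

Feeding such grammar-derived strings to the program exercises deep parsing and evaluation paths that random byte-level fuzzing rarely reaches, which is the source of the coverage gains reported for GLADE.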
BIGMAC : breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly.
Background: The problem of de-novo assembly for metagenomes using only long reads is gaining attention. We study whether post-processing metagenomic assemblies with the original input long reads can result in quality improvement. Previous approaches have focused on pre-processing reads and optimizing assemblers. BIGMAC takes an alternative perspective to focus on the post-processing step. Results: Using both the assembled contigs and original long reads as input, BIGMAC first breaks the contigs at potentially mis-assembled locations and subsequently scaffolds contigs. Our experiments on metagenomes assembled from long reads show that BIGMAC can improve assembly quality by reducing the number of mis-assemblies while maintaining or increasing N50 and N75. Moreover, BIGMAC shows the largest N75 to number of mis-assemblies ratio on all tested datasets when compared to other post-processing tools. Conclusions: BIGMAC demonstrates the effectiveness of the post-processing approach in improving the quality of metagenomic assemblies.
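The break step can be illustrated with a deliberately simplified rule: treat positions where aligned long-read coverage drops sharply as candidate mis-assembly breakpoints and split the contig there. This is an assumed toy heuristic for illustration, not BIGMAC's actual breakpoint detection.

```python
def break_contig(contig, coverage, min_cov=3):
    """Split `contig` wherever per-base long-read `coverage` falls below
    min_cov, dropping the low-coverage bases (toy mis-assembly heuristic)."""
    pieces, start = [], 0
    for i, c in enumerate(coverage):
        if c < min_cov:
            if i > start:
                pieces.append(contig[start:i])
            start = i + 1
    if start < len(contig):
        pieces.append(contig[start:])
    return pieces

# Coverage dips at positions 3 and 8 mark two candidate breakpoints.
print(break_contig("ACGTACGTAC", [5, 5, 5, 1, 4, 4, 4, 4, 0, 6]))
# -> ['ACG', 'ACGT', 'C']
```

After breaking, a scaffolding pass (not shown) would reconnect pieces whose joins the long reads consistently support, which is how the pipeline can reduce mis-assemblies while preserving contiguity metrics such as N50.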
Frequentist and Bayesian measures of confidence via multiscale bootstrap for testing three regions
A new computation method of frequentist p-values and Bayesian posterior
probabilities based on the bootstrap probability is discussed for the
multivariate normal model with unknown expectation parameter vector. The null
hypothesis is represented as an arbitrary-shaped region. We introduce new
parametric models for the scaling-law of bootstrap probability so that the
multiscale bootstrap method, which was designed for one-sided tests, can also
compute confidence measures of two-sided tests, extending applicability to a
wider class of hypotheses. Parameter estimation is improved by the two-step
multiscale bootstrap and also by including higher-order terms. Model selection
is important not only as a motivating application of our method, but also as an
essential ingredient in the method. A compromise between the frequentist and
Bayesian approaches is attempted by showing that the Bayesian posterior
probability with a noninformative prior can be interpreted as the frequentist
p-value of a "zero-sided" test.
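The extrapolation at the heart of the multiscale bootstrap can be sketched as follows: bootstrap probabilities BP observed at several scales sigma^2 are transformed to z-values, the simplest scaling-law model psi(sigma^2) = d + c*sigma^2 is fit by least squares, and a corrected p-value is read off by extrapolating to sigma^2 = -1. The bootstrap probabilities below are made-up numbers, and this sketch omits the paper's two-step estimation and higher-order terms.

```python
from statistics import NormalDist
import math

nd = NormalDist()
sigma2 = [0.5, 0.75, 1.0, 1.25, 1.5]          # bootstrap scales
bp = [0.30, 0.26, 0.22, 0.19, 0.17]           # made-up bootstrap probabilities

# psi(sigma2) = sqrt(sigma2) * Phi^{-1}(1 - BP), modeled as d + c*sigma2.
psi = [math.sqrt(s) * nd.inv_cdf(1.0 - b) for s, b in zip(sigma2, bp)]

# Ordinary least-squares fit of psi against sigma2 (slope c, intercept d).
n = len(sigma2)
sx, sy = sum(sigma2), sum(psi)
sxx = sum(s * s for s in sigma2)
sxy = sum(s * p for s, p in zip(sigma2, psi))
c = (n * sxy - sx * sy) / (n * sxx - sx * sx)
d = (sy - c * sx) / n

# Extrapolation to sigma2 = -1 gives the corrected (one-sided) p-value.
p_corrected = 1.0 - nd.cdf(d - c)
print(round(p_corrected, 3))
```

The ordinary bootstrap probability corresponds to reading the curve at sigma^2 = 1; extrapolating to the negative scale is what removes the curvature bias of the region's boundary.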
Pattern Matching for sets of segments
In this paper we present algorithms for a number of problems in geometric
pattern matching where the input consists of collections of segments in the
plane. Our work consists of two main parts. In the first, we address problems
and measures that relate to collections of orthogonal line segments in the
plane. Such collections arise naturally from problems in mapping buildings and
robot exploration.
We propose a new measure of segment similarity called a coverage measure, and
present efficient algorithms for maximising this measure between sets of
axis-parallel segments under translations. Our algorithms run in time
O(n^3 polylog n) in the general case, and in time O(n^2 polylog n) for the case
when all segments are horizontal. In addition, we show that when restricted to
translations that are only vertical, the Hausdorff distance between two sets of
horizontal segments can be computed in time roughly O(n^{3/2} polylog n). These
algorithms are significant improvements over the general algorithm of Chew et
al. In the
second part of this paper we address the problem of matching polygonal chains.
We study the well-known Fréchet distance, and present the first algorithm for
computing the Fréchet distance under general translations. Our methods also
yield algorithms for computing a generalization of the Fréchet distance, and we
also present a simple approximation algorithm for the Fréchet distance that
runs in time O(n^2 polylog n).
Comment: To appear in the 12th ACM Symposium on Discrete Algorithms, Jan 200
Exact Post Model Selection Inference for Marginal Screening
We develop a framework for post model selection inference, via marginal
screening, in linear regression. At the core of this framework is a result that
characterizes the exact distribution of linear functions of the response y,
conditional on the model being selected (``condition on selection" framework).
This allows us to construct valid confidence intervals and hypothesis tests for
regression coefficients that account for the selection procedure. In contrast
to recent work in high-dimensional statistics, our results are exact
(non-asymptotic) and require no eigenvalue-like assumptions on the design
matrix X. Furthermore, the computational cost of marginal regression,
constructing confidence intervals and hypothesis testing is negligible compared
to the cost of linear regression, thus making our methods particularly suitable
for extremely large datasets. Although we focus on marginal screening to
illustrate the applicability of the condition on selection framework, this
framework is much more broadly applicable. We show how to apply the proposed
framework to several other selection procedures including orthogonal matching
pursuit, non-negative least squares, and marginal screening+Lasso.
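The selection step itself is simple: marginal screening keeps the k features whose marginal covariance with the response is largest in absolute value. The sketch below shows only that selection event; the paper's actual contribution, exact inference conditional on this event, is not reproduced here, and the data are synthetic.

```python
import random

def marginal_screen(X, y, k):
    """Return the indices of the k features with the largest absolute
    marginal covariance with the response (toy marginal screening)."""
    n = len(X)
    ybar = sum(y) / n
    scores = []
    for j in range(len(X[0])):
        xbar = sum(row[j] for row in X) / n
        cov = sum((row[j] - xbar) * (yi - ybar) for row, yi in zip(X, y))
        scores.append((abs(cov), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])

# Synthetic data in which the response depends only on feature 3.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(100)]
y = [2.0 * row[3] + rng.gauss(0, 0.1) for row in X]
sel = marginal_screen(X, y, 2)
print(sel)  # feature 3 should be among the selected indices
```

Naively fitting a regression on `sel` and reporting classical intervals ignores that `sel` was chosen using y; conditioning on the selection event, as the paper does, restores exact coverage.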