
    A Case Study in Matching Test and Proof Coverage

    This paper studies the complementarity of test and deductive-proof processes for Java programs specified in JML (Java Modeling Language). The proof of a program may be long and difficult, especially when automatic provers give up. When a theorem is not automatically proved, there are two possibilities: either the theorem is correct but the prover lacks the information needed to complete the proof, or the theorem is incorrect. Testing techniques can be used to discriminate between these two alternatives. Here, we present experiments on using the JACK tool to prove Java programs annotated with JML assertions. When JACK fails to discharge proof obligations, we use a combinatorial testing tool, TOBIAS, to produce large test suites that exercise the unproved program parts. The key issue is to establish the relevance of the test suite with respect to the unproved proof obligations. We therefore use code-coverage techniques: our approach takes advantage of the statement orientation of the JACK tool to compare the statements involved in the unproved proof obligations with the statements covered by the test suite. Finally, we assess our confidence in the test suites by evaluating their ability to kill program mutants. These techniques have been put into practice and are illustrated on a simple case study.
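
    A minimal sketch of the coverage-matching idea described above (not the JACK/TOBIAS implementation): given the set of statements implicated in unproved proof obligations and the set of statements executed by a test suite, report how well the suite exercises the unproved program parts. All names and the example statement sets are hypothetical.

```python
def coverage_of_unproved(unproved_stmts: set[int], covered_stmts: set[int]) -> float:
    """Fraction of statements behind unproved proof obligations
    that the test suite actually executes."""
    if not unproved_stmts:
        return 1.0
    return len(unproved_stmts & covered_stmts) / len(unproved_stmts)

# Hypothetical example: two obligations remained unproved and trace back
# to statements {12, 13, 17}; the generated test suite hit {5, 12, 17, 20}.
unproved = {12, 13, 17}
covered = {5, 12, 17, 20}
print(f"unproved-statement coverage: {coverage_of_unproved(unproved, covered):.0%}")
print(f"statements never exercised: {sorted(unproved - covered)}")  # candidates for new tests
```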

    Objective Bayes and Conditional Frequentist Inference

    Objective Bayesian methods have garnered considerable interest and support among statisticians, particularly over the past two decades. It has often been overlooked, however, that in some cases the appropriate frequentist inference to match is a conditional one. We present various methods for extending probability matching prior (PMP) methods to conditional settings. A method based on saddlepoint approximations is found to be the most tractable, and we demonstrate its use in the most common exact ancillary statistic models. As part of this analysis, we give a proof of an exactness property of a particular PMP in location-scale models. We use the proposed matching methods to investigate the relationships between conditional and unconditional PMPs. A key component of our analysis is a numerical study of the performance of probability matching priors, from both a conditional and an unconditional perspective, in exact ancillary models. In concluding remarks we propose several routes for future research.
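
    A toy Monte Carlo illustration of probability matching (not the paper's saddlepoint construction): in the normal location model with a flat, noninformative prior, a one-sided posterior credible bound has exact frequentist coverage, which is the kind of exactness the abstract refers to in location models. The parameter values below are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta_true, sigma, n, alpha = 2.0, 1.0, 10, 0.05

hits, reps = 0, 20000
for _ in range(reps):
    x = rng.normal(theta_true, sigma, n)
    # Posterior under a flat prior: theta | x ~ N(xbar, sigma^2 / n),
    # so the (1 - alpha) posterior upper bound is xbar + z_{1-alpha} * sigma / sqrt(n).
    upper = x.mean() + norm.ppf(1 - alpha) * sigma / np.sqrt(n)
    hits += theta_true <= upper
print(f"frequentist coverage of the 95% posterior upper bound: {hits / reps:.3f}")  # ~0.95
```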

    XRay: Enhancing the Web's Transparency with Differential Correlation

    Today's Web services, such as Google, Amazon, and Facebook, leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used. Hence, they cannot make informed choices about the services they use. To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay's core functions are service-agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing outputs from different accounts with similar, but not identical, subsets of data, one can pinpoint targeting through correlation. We show both theoretically and through experiments on Gmail, Amazon, and YouTube that XRay achieves high precision and recall by correlating data from a surprisingly small number of extra accounts. Comment: Extended version of a paper presented at the 23rd USENIX Security Symposium (USENIX Security 14).
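
    A toy sketch of the differential-correlation insight (not XRay's actual algorithm or API): seed shadow accounts with overlapping subsets of inputs, record which accounts receive a given output, and score each input by how well its presence predicts that output. The `service` function below is a hypothetical stand-in for the audited service.

```python
from itertools import combinations

inputs = ["email_1", "email_2", "email_3", "email_4"]
# Hypothetical shadow accounts, each seeded with a 2-element subset of the inputs.
accounts = [set(c) for c in combinations(inputs, 2)]

def service(account: set[str]) -> bool:
    """Stand-in for the audited service: this ad targets email_3."""
    return "email_3" in account

observed = [service(a) for a in accounts]

scores = {}
for inp in inputs:
    # Count agreements between "input present" and "ad shown" across accounts.
    agree = sum((inp in a) == shown for a, shown in zip(accounts, observed))
    scores[inp] = agree / len(accounts)

print(max(scores, key=scores.get), scores)  # email_3 scores highest (perfect correlation)
```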

    Synthesizing Program Input Grammars

    We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and black-box access to the program. Our algorithm addresses shortcomings of existing grammar-inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz-test programs with structured inputs. We show that GLADE substantially increases incremental coverage on valid inputs compared to two baseline fuzzers.
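
    A minimal sketch of the fuzzing step (GLADE's grammar-synthesis algorithm itself is not shown): once a context-free grammar for valid inputs is available, one can sample random derivations and feed them to the program under test. The grammar below is a hypothetical example for nested arithmetic expressions.

```python
import random

# Hypothetical grammar: nonterminals map to lists of productions.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["num"], ["(", "expr", ")"]],
    "num":  [["0"], ["1"], ["42"]],
}

def sample(symbol: str, depth: int = 0) -> str:
    if symbol not in GRAMMAR:  # terminal symbol
        return symbol
    # Past a depth limit, always take the first (shortest) production
    # so that derivations terminate.
    rules = GRAMMAR[symbol][:1] if depth > 8 else GRAMMAR[symbol]
    return "".join(sample(s, depth + 1) for s in random.choice(rules))

for _ in range(5):
    print(sample("expr"))  # e.g. "(1+42)+0"
```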

    BIGMAC: breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly

    Background: The problem of de novo assembly of metagenomes using only long reads is gaining attention. We study whether post-processing metagenomic assemblies with the original input long reads can result in quality improvement. Previous approaches have focused on pre-processing reads and optimizing assemblers; BIGMAC takes an alternative perspective and focuses on the post-processing step. Results: Using both the assembled contigs and the original long reads as input, BIGMAC first breaks the contigs at potentially mis-assembled locations and subsequently scaffolds the contigs. Our experiments on metagenomes assembled from long reads show that BIGMAC can improve assembly quality by reducing the number of mis-assemblies while maintaining or increasing N50 and N75. Moreover, BIGMAC shows the largest ratio of N75 to number of mis-assemblies on all tested datasets when compared to other post-processing tools. Conclusions: BIGMAC demonstrates the effectiveness of the post-processing approach in improving the quality of metagenomic assemblies.
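
    A toy sketch of the "break" step (not BIGMAC's pipeline): split a contig at junctions where too few of the original long reads span the position, one common signal of a mis-assembly. The contig, support values, and threshold below are invented for illustration.

```python
def break_contig(contig: str, spanning_support: list[int], min_support: int = 3) -> list[str]:
    """spanning_support[i] = number of long reads spanning the junction
    between positions i and i+1 of the contig."""
    pieces, start = [], 0
    for i, support in enumerate(spanning_support):
        if support < min_support:        # suspected mis-assembly junction
            pieces.append(contig[start:i + 1])
            start = i + 1
    pieces.append(contig[start:])
    return [p for p in pieces if p]

# Hypothetical contig with weak read support after position 4.
print(break_contig("ACGTACGTAC", [9, 8, 9, 7, 1, 8, 9, 9, 8]))
# -> ['ACGTA', 'CGTAC']
```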

    Frequentist and Bayesian measures of confidence via multiscale bootstrap for testing three regions

    A new computation method for frequentist p-values and Bayesian posterior probabilities based on the bootstrap probability is discussed for the multivariate normal model with unknown expectation parameter vector. The null hypothesis is represented as an arbitrarily shaped region. We introduce new parametric models for the scaling law of the bootstrap probability so that the multiscale bootstrap method, which was designed for one-sided tests, can also compute confidence measures for two-sided tests, extending its applicability to a wider class of hypotheses. Parameter estimation is improved by the two-step multiscale bootstrap and by including higher-order terms. Model selection is important not only as a motivating application of our method but also as an essential ingredient in the method. A compromise between the frequentist and Bayesian views is attempted by showing that the Bayesian posterior probability with a noninformative prior can be interpreted as the frequentist p-value of a "zero-sided" test.
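
    A minimal sketch of the underlying multiscale-bootstrap idea for the classic one-sided case (the paper's new parametric scaling models and two-step estimation are not reproduced): bootstrap probabilities computed at several scales are converted to z-values, fitted to the two-term scaling law z(sigma) = d*sigma + c/sigma, and extrapolated to sigma^2 = -1 to give an approximately unbiased p-value, 1 - Phi(d - c). Data and scales below are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.normal(0.3, 1.0, 100)   # observed data; H0: mean <= 0
n = len(x)

scales = np.array([0.5, 0.71, 1.0, 1.41, 2.0])  # sigma values
z = []
for sigma in scales:
    m = max(int(round(n / sigma**2)), 2)        # resample size n' = n / sigma^2
    boot_means = rng.choice(x, size=(10000, m)).mean(axis=1)
    bp = np.clip((boot_means <= 0).mean(), 1e-6, 1 - 1e-6)  # bootstrap prob. of H0 region
    z.append(norm.ppf(1 - bp))

# Least-squares fit of z(sigma) = d*sigma + c/sigma, then extrapolate
# to sigma^2 = -1 for the approximately unbiased p-value.
A = np.column_stack([scales, 1 / scales])
d, c = np.linalg.lstsq(A, np.array(z), rcond=None)[0]
print(f"approximately unbiased p-value: {1 - norm.cdf(d - c):.4f}")
```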

    Pattern Matching for Sets of Segments

    In this paper we present algorithms for a number of problems in geometric pattern matching where the input consists of collections of segments in the plane. Our work has two main parts. In the first, we address problems and measures that relate to collections of orthogonal line segments in the plane. Such collections arise naturally from problems in mapping buildings and robot exploration. We propose a new measure of segment similarity called a coverage measure, and present efficient algorithms for maximising this measure between sets of axis-parallel segments under translations. Our algorithms run in time O(n^3 polylog n) in the general case, and in time O(n^2 polylog n) when all segments are horizontal. In addition, we show that when restricted to vertical translations only, the Hausdorff distance between two sets of horizontal segments can be computed in time roughly O(n^{3/2} polylog n). These algorithms significantly improve on the general algorithm of Chew et al., which takes time O(n^4 log^2 n). In the second part of the paper we address the problem of matching polygonal chains. We study the well-known Fréchet distance, and present the first algorithm for computing the Fréchet distance under general translations. Our methods also yield algorithms for computing a generalization of the Fréchet distance, and we present a simple approximation algorithm for the Fréchet distance that runs in time O(n^2 polylog n). Comment: To appear in the 12th ACM Symposium on Discrete Algorithms, Jan 2001.
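
    For context, a sketch of the *discrete* Fréchet distance between two polygonal chains, computed by a standard dynamic program; the paper's algorithms for the continuous distance under translations are considerably more involved. The example chains are invented.

```python
from math import dist
from functools import lru_cache

def discrete_frechet(P: list[tuple], Q: list[tuple]) -> float:
    """Minimum over all couplings of the maximum pointwise distance."""
    @lru_cache(maxsize=None)
    def c(i: int, j: int) -> float:
        d = dist(P[i], Q[j])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
    return c(len(P) - 1, len(Q) - 1)

P = [(0, 0), (1, 1), (2, 0)]
Q = [(0, 1), (1, 2), (2, 1)]
print(discrete_frechet(P, Q))  # 1.0: each vertex pairs with the one directly above it
```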

    Exact Post Model Selection Inference for Marginal Screening

    We develop a framework for post-model-selection inference, via marginal screening, in linear regression. At the core of this framework is a result that characterizes the exact distribution of linear functions of the response y, conditional on the model being selected (the "condition on selection" framework). This allows us to construct valid confidence intervals and hypothesis tests for regression coefficients that account for the selection procedure. In contrast to recent work in high-dimensional statistics, our results are exact (non-asymptotic) and require no eigenvalue-like assumptions on the design matrix X. Furthermore, the computational cost of marginal regression, constructing confidence intervals, and hypothesis testing is negligible compared to the cost of linear regression, making our methods particularly suitable for extremely large datasets. Although we focus on marginal screening to illustrate the applicability of the condition-on-selection framework, the framework is much more broadly applicable. We show how to apply it to several other selection procedures, including orthogonal matching pursuit, non-negative least squares, and marginal screening+Lasso.
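
    A toy simulation motivating the framework (not the paper's exact conditional test): under a global null, select the feature with the largest |X_j^T y| by marginal screening, then compute the naive two-sided p-value for its coefficient. Because selection is ignored, the naive test rejects far more often than its nominal level, which is precisely what conditioning on selection corrects. The dimensions and seed are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, p, reps = 100, 50, 2000
rejections = 0
for _ in range(reps):
    X = rng.normal(size=(n, p))
    X /= np.linalg.norm(X, axis=0)     # unit-norm columns
    y = rng.normal(size=n)             # global null: no signal at all
    j = np.argmax(np.abs(X.T @ y))     # marginal screening with k = 1
    z = X[:, j] @ y                    # would be N(0, 1) if j were fixed a priori
    rejections += 2 * norm.sf(abs(z)) < 0.05
print(f"naive false-rejection rate: {rejections / reps:.3f}")  # far above the nominal 0.05
```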