
    FlashProfile: A Framework for Synthesizing Data Profiles

    We address the problem of learning a syntactic profile for a collection of strings, i.e. a set of regex-like patterns that succinctly describe the syntactic variations in the strings. Real-world datasets, typically curated from multiple sources, often contain data in various syntactic formats. Thus, any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the different formats is infeasible in standard big-data scenarios. Prior techniques are restricted to a small set of pre-defined patterns (e.g. digits, letters, words, etc.), and provide no control over granularity of profiles. We define syntactic profiling as a problem of clustering strings based on syntactic similarity, followed by identifying patterns that succinctly describe each cluster. We present a technique for synthesizing such profiles over a given language of patterns, that also allows for interactive refinement by requesting a desired number of clusters. Using a state-of-the-art inductive synthesis framework, PROSE, we have implemented our technique as FlashProfile. Across 153 tasks over 75 large real datasets, we observe a median profiling time of only ∼0.7 s. Furthermore, we show that access to syntactic profiles may allow for more accurate synthesis of programs, i.e. using fewer examples, in programming-by-example (PBE) workflows such as FlashFill. Comment: 28 pages, SPLASH (OOPSLA) 201
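As a rough illustration of the idea of grouping strings by syntactic shape and describing each group with a pattern, here is a minimal sketch. It is not FlashProfile's algorithm and does not use PROSE: the token classes, the `signature` function, and the `profile` function are all hypothetical names chosen for this example, and the grouping is a plain exact-match on a coarse token signature rather than similarity-based clustering with a tunable number of clusters.

```python
import re
from collections import defaultdict

# Hypothetical, fixed token classes (assumption for this sketch only;
# FlashProfile synthesizes patterns over a configurable pattern language).
TOKEN_CLASSES = [
    ("Digit+", re.compile(r"[0-9]+")),
    ("Upper+", re.compile(r"[A-Z]+")),
    ("Lower+", re.compile(r"[a-z]+")),
    ("Space+", re.compile(r"\s+")),
]


def signature(s):
    """Map a string to a coarse regex-like pattern, e.g. "Digit+ '-' Digit+"."""
    parts = []
    i = 0
    while i < len(s):
        for name, rx in TOKEN_CLASSES:
            m = rx.match(s, i)  # try each token class at position i
            if m:
                parts.append(name)
                i = m.end()
                break
        else:
            # No class matched: keep the character itself as a literal token.
            parts.append(repr(s[i]))
            i += 1
    return " ".join(parts)


def profile(strings):
    """Group strings that share the same signature; keys act as patterns."""
    clusters = defaultdict(list)
    for s in strings:
        clusters[signature(s)].append(s)
    return dict(clusters)


if __name__ == "__main__":
    data = ["2018-01-05", "1999-12-31", "Jan 5", "N/A"]
    for pattern, members in profile(data).items():
        print(pattern, "<-", members)
```

A fixed signature like this gives exactly one granularity. The abstract's point is that a synthesized profile can be refined interactively, which this sketch does not attempt.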

    Domain-Type-Guided Refinement Selection Based on Sliced Path Prefixes

    Abstraction is a successful technique in software verification, and interpolation on infeasible error paths is a successful approach to automatically detect the right level of abstraction in counterexample-guided abstraction refinement. Because the interpolants have a significant influence on the quality of the abstraction, and thus, the effectiveness of the verification, an algorithm for deriving the best possible interpolants is desirable. We present an analysis-independent technique that makes it possible to extract several alternative sequences of interpolants from one given infeasible error path, if there are several reasons for infeasibility in the error path. We take as input the given infeasible error path and apply a slicing technique to obtain a set of error paths that are more abstract than the original error path but still infeasible, each for a different reason. The (more abstract) constraints of the new paths can be passed to a standard interpolation engine, in order to obtain a set of interpolant sequences, one for each new path. The analysis can then choose from this set of interpolant sequences and select the most appropriate, instead of being bound to the single interpolant sequence that the interpolation engine would normally return. For example, we can select based on domain types of variables in the interpolants, prefer to avoid loop counters, or compare with templates for potential loop invariants, and thus control what kind of information occurs in the abstraction of the program. We implemented the new algorithm in the open-source verification framework CPAchecker and show that our proof-technique-independent approach yields a significant improvement of the effectiveness and efficiency of the verification process. Comment: 10 pages, 5 figures, 1 table, 4 algorithm

    Time-sensitive Information Flow Control in Timed Event-B

    Protecting confidential data in today's computing environments is an important problem. Information flow control can help to avoid information leakage and violations introduced by executing software applications. In the software development cycle, it is important to handle security-related issues from the earliest specifications, at the abstract level. Mu [1] investigated the problem of preserving information flow security in Event-B specification models. A typed Event-B model was presented to enforce information flow security and to prevent direct flows introduced by the system. However, in practice, the timing behaviour of programs can also introduce a covert flow. The problem of run-time flow monitoring and control must also be addressed. This paper investigates information flow control in the Event-B specification language with timing constructs. We present a timed Event-B system by introducing timers and relevant time constraints into the system events. We suggest a time-sensitive flow security condition for timed Event-B systems, and present a type system that closes the covert channels of timing flows by ensuring the security condition. We then investigate how to refine timed events during stepwise refinement modelling so that the security condition is satisfied.

    On the Verified-by-Construction Approach
