8 research outputs found

    Context-Free Path Querying by Matrix Multiplication

    Graph data models are widely used in many areas, for example in bioinformatics and graph databases. In these areas, it is often necessary to process queries over large graphs. Some of the most common graph queries are navigational queries. The result of query evaluation is a set of implicit relations between nodes of the graph, i.e. paths in the graph. A natural way to specify these relations is to describe paths using formal grammars over the alphabet of edge labels. In this approach, the answer to a context-free path query is usually a set of triples (A, m, n) such that there is a path from node m to node n whose labeling is derived from a non-terminal A of the given context-free grammar. Queries of this type are evaluated using the relational query semantics. Another example of path query semantics is the single-path query semantics, which requires presenting a single path from node m to node n whose labeling is derived from a non-terminal A, for every triple (A, m, n) produced under the relational query semantics. There are a number of algorithms for query evaluation that use these semantics, but all of them perform poorly on large graphs. One of the most common techniques for efficient big data processing is to perform computations on a graphics processing unit (GPU), but these algorithms cannot exploit this technique efficiently. In this paper, we show how context-free path query evaluation under these query semantics can be reduced to the calculation of the matrix transitive closure. We also propose an algorithm for context-free path query evaluation that uses the relational query semantics and is based on matrix operations, which makes it possible to speed up computations by using a GPU.
    Comment: 9 pages, 11 figures, 2 tables
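The reduction described in this abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes the grammar is already in Chomsky normal form, uses plain Python lists as Boolean matrices instead of GPU kernels, and the names `bool_matmul`, `cfpq`, `term_rules` and `pair_rules` are hypothetical. One Boolean matrix is kept per non-terminal, and the rule A -> B C is applied as a matrix product until nothing changes.

```python
def bool_matmul(x, y):
    """Boolean product of two square matrices given as lists of lists."""
    n = len(x)
    return [[any(x[i][k] and y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cfpq(n, edges, term_rules, pair_rules):
    """Relational-semantics answer: all triples (A, m, n).

    n          -- number of graph nodes, numbered 0..n-1
    edges      -- iterable of (source, label, target)
    term_rules -- iterable of (A, a) for productions A -> a
    pair_rules -- iterable of (A, B, C) for productions A -> B C
    (All parameter names are assumptions made for this sketch.)
    """
    nonterminals = {a for a, _ in term_rules}
    for a, b, c in pair_rules:
        nonterminals |= {a, b, c}
    # One n x n Boolean matrix per non-terminal.
    t = {nt: [[False] * n for _ in range(n)] for nt in nonterminals}
    # Initialise from the edges using the terminal productions.
    for u, label, v in edges:
        for a, term in term_rules:
            if term == label:
                t[a][u][v] = True
    # Apply A -> B C as T[A] |= T[B] x T[C] until a fixpoint is reached.
    changed = True
    while changed:
        changed = False
        for a, b, c in pair_rules:
            prod = bool_matmul(t[b], t[c])
            for i in range(n):
                for j in range(n):
                    if prod[i][j] and not t[a][i][j]:
                        t[a][i][j] = True
                        changed = True
    return {(a, i, j) for a in t for i in range(n) for j in range(n)
            if t[a][i][j]}


# Example: the language a^k b^k (k >= 1) in normal form,
# S -> A B | A S1, S1 -> S B, A -> a, B -> b,
# evaluated on the chain 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4.
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
term_rules = [("A", "a"), ("B", "b")]
pair_rules = [("S", "A", "B"), ("S", "A", "S1"), ("S1", "S", "B")]
triples = cfpq(5, edges, term_rules, pair_rules)
s_triples = sorted(tr for tr in triples if tr[0] == "S")
print(s_triples)  # [('S', 0, 4), ('S', 1, 3)]
```

In this example the start symbol S relates node 1 to node 3 (path "ab") and node 0 to node 4 (path "aabb"). The inner products are the part that a matrix library or a GPU could take over.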

    Active Learning of Points-To Specifications

    When analyzing programs, large libraries pose significant challenges to static points-to analysis. A popular solution is to have a human analyst provide points-to specifications that summarize relevant behaviors of library code, which can substantially improve precision and handle missing code such as native code. We propose ATLAS, a tool that automatically infers points-to specifications. ATLAS synthesizes unit tests that exercise the library code, and then infers points-to specifications based on observations from these executions. ATLAS automatically infers specifications for the Java standard library, and produces better results for a client static information flow analysis on a benchmark of 46 Android apps than the existing handwritten specifications do.

    A Method for Classifying Information-Flow Analysis Alarms through User Interaction on Call-Return-Matched Paths (콜-리턴 짝이 맞는 경로에 대한 사용자 상호작용을 통해 정보흐름 분석 경보를 분류하는 방법)

    Thesis (Master's) -- Seoul National University Graduate School: Department of Computer Science and Engineering, 2017. 2. 이광근 (Kwangkeun Yi).
    This thesis proposes an efficient, user-interaction-based method for classifying the alarms reported by information-flow analyzers. The method helps the user classify alarms by soundly and efficiently finding the shortest call-return-matched function call path that satisfies the user's constraints. To demonstrate the practicality of the method, we designed and implemented the alarm classification system SHOVEL. Using SHOVEL together with the static analyzer SPARROW, we classified 360 alarms from a total of 44 open-source C programs. In this process, whether an alarm was true or false could be decided with few user interactions, 2.93 on average. In addition, alarm classification uncovered 48 bugs, three of which were assigned CVE numbers.
    Contents:
    1 Introduction: 1.1 Motivation; 1.2 Solution; 1.3 User Interaction (1.3.1 User Interaction Example); 1.4 Results; 1.5 Organization of the Thesis
    2 Representing Call-Return-Matched Function Call Paths: 2.1 Function Call Paths; 2.2 Definition of Call-Return-Matched Paths; 2.3 Efficient Representation of Call-Return-Matched Paths
    3 Searching for Call-Return-Matched Function Call Paths: 3.1 Algorithm; 3.2 Boolean-Formula Encoding of Call-Return-Matched Paths (3.2.1 The Boolean Encoding Function Φ; 3.2.2 Applying Φ)
    4 Experimental Results: 4.1 Experimental Setup; 4.2 Evaluation; 4.3 Real Vulnerabilities Found
    5 Limitations and Improvements: 5.1 Limitations of the Study; 5.2 Improvements
    6 Related Work
    7 Conclusion
    A Appendix
    References
    Abstract

    Lifestate: Event-Driven Protocols and Callback Control Flow

    Developing interactive applications (apps) against event-driven software frameworks such as Android is notoriously difficult. To create apps that behave as expected, developers must follow complex and often implicit asynchronous programming protocols. Such protocols intertwine the proper registering of callbacks to receive control from the framework with appropriate application-programming interface (API) calls that in turn affect the set of possible future callbacks. An app violates the protocol when, for example, it calls a particular API method in a state of the framework where such a call is invalid. What makes automated reasoning hard in this domain is largely what makes programming apps against such frameworks hard: the specification of the protocol is unclear, and the control flow is complex, asynchronous, and higher-order. In this paper, we tackle the problem of specifying and modeling event-driven application-programming protocols. In particular, we formalize a core meta-model that captures the dialogue between event-driven frameworks and application callbacks. Based on this meta-model, we define a language called lifestate that permits precise and formal descriptions of application-programming protocols and the callback control flow imposed by the event-driven framework. Lifestate unifies modeling what app callbacks can expect of the framework with specifying rules the app must respect when calling into the framework. In this way, we effectively combine lifecycle constraints and typestate rules. To evaluate the effectiveness of lifestate modeling, we provide a dynamic verification algorithm that takes as input a trace of execution of an app and a lifestate protocol specification, and produces either a trace witnessing a protocol violation or a proof that no such trace is realizable.

    On the Practice and Application of Context-Free Language Reachability

    The Context-Free Language Reachability (CFL-R) formalism relates to some of the most important computational problems facing researchers and industry practitioners. CFL-R is a generalisation of graph reachability and language recognition, such that pairs of nodes in a labelled graph are reachable if and only if there is a path between them whose labels, joined together in the order they were encountered, spell a word in a given context-free language. The formalism finds particular use as a vehicle for phrasing and reasoning about program analysis, since complex relationships within the data, logic or structure of computer programs are easily expressed and discovered in CFL-R. Unfortunately, the potential of CFL-R cannot be met by state-of-the-art solvers. Current algorithms have scalability and expressibility issues that prevent them from being used on large graph instances or complex grammars. This work outlines our efforts in understanding the practical concerns surrounding CFL-R, and applying this knowledge to improve the performance of CFL-R applications. We examine the major difficulties with solving CFL-R-based analyses at scale, via a case study of points-to analysis as a CFL-R problem. Points-to analysis is fundamentally important to many modern research and industry efforts, and is relevant to optimisation, bug-checking and security technologies. Our understanding of the scalability challenge motivates work in developing practical CFL-R techniques. We present improved evaluation algorithms and declarative optimisation techniques for CFL-R, capitalising on the simplicity of CFL-R to create fully automatic methodologies. The culmination of our work is a general-purpose and high-performance tool called Cauliflower, a solver-generator for CFL-R problems. We describe Cauliflower and evaluate its performance experimentally, showing significant improvement over alternative general techniques.
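The CFL-R definition in this abstract can be made concrete with a textbook-style worklist solver. This is a minimal sketch and is unrelated to how Cauliflower is implemented. It assumes a grammar normalised to rules with at most two right-hand-side symbols; the function `cfl_reach` and its parameter names are hypothetical, and joins scan all known facts for brevity where a real solver would index them.

```python
from collections import deque


def cfl_reach(nodes, edges, eps_rules, unary_rules, binary_rules):
    """Return all facts (label, u, v) derivable from the labelled graph.

    eps_rules    -- non-terminals A with A -> epsilon
    unary_rules  -- (A, X) for A -> X
    binary_rules -- (A, X, Y) for A -> X Y
    X and Y may be non-terminals or edge labels (terminals).
    """
    facts = set()
    work = deque()

    def add(fact):
        if fact not in facts:
            facts.add(fact)
            work.append(fact)

    for u, label, v in edges:
        add((label, u, v))
    for a in eps_rules:
        for v in nodes:
            add((a, v, v))  # the empty path derives A at every node

    while work:
        b, u, v = work.popleft()
        for a, x in unary_rules:
            if x == b:
                add((a, u, v))
        for a, x, y in binary_rules:
            if x == b:  # current fact is the left part: join on v
                for l, v2, w in list(facts):
                    if l == y and v2 == v:
                        add((a, u, w))
            if y == b:  # current fact is the right part: join on u
                for l, w, u2 in list(facts):
                    if l == x and u2 == u:
                        add((a, w, v))
    return facts


# Example: balanced brackets, M -> epsilon | M M | open M close,
# normalised as M -> open X and X -> M close, on the chain
# 0 -open-> 1 -open-> 2 -close-> 3 -close-> 4.
nodes = range(5)
edges = [(0, "open", 1), (1, "open", 2), (2, "close", 3), (3, "close", 4)]
facts = cfl_reach(
    nodes,
    edges,
    eps_rules=["M"],
    unary_rules=[],
    binary_rules=[("M", "M", "M"), ("M", "open", "X"), ("X", "M", "close")],
)
matched = sorted((u, v) for l, u, v in facts if l == "M" and u != v)
print(matched)  # [(0, 4), (1, 3)]
```

Here the node pairs (1, 3) and (0, 4) are M-reachable because the labels along those paths spell balanced bracket strings. A pair such as (0, 3) is not, even though node 3 is reachable from node 0 in the plain graph sense.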

    Specification Inference Using Context-Free Language Reachability

    We present a framework for computing context-free language reachability properties when parts of the program are missing. Our framework infers candidate specifications for missing program pieces that are needed for verifying a property of interest, and presents these specifications to a human auditor for validation. We have implemented this framework for a taint analysis of Android apps that relies on specifications for Android library methods. In an extensive experimental study on 179 apps, our tool performs verification with only a small number of queries to a human auditor.
