614 research outputs found

    Static Analysis for ECMAScript String Manipulation Programs

    Get PDF
    In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a great challenge for static analysis of these languages. A key aspect of any dynamic language program is the multiple usage of strings, since they can be implicitly converted to another type value, transformed by string-to-code primitives or used to access an object-property. Unfortunately, string analyses for dynamic languages still lack precision and do not take into account some important string features. In this scenario, more precise string analyses become a necessity. The goal of this paper is to place a first step for precisely handling dynamic language string features. In particular, we propose a new abstract domain approximating strings as finite state automata and an abstract interpretation-based static analysis for the most common string manipulating operations provided by the ECMAScript specification. The proposed analysis comes with a prototype static analyzer implementation for an imperative string manipulating language, allowing us to show and evaluate the improved precision of the proposed analysis

    Static analysis for ECMAscript string manipulation programs

    Get PDF
    In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a great challenge for static analysis of these languages. A key aspect of any dynamic language program is the multiple usage of strings, since they can be implicitly converted to another type value, transformed by string-to-code primitives or used to access an object-property. Unfortunately, string analyses for dynamic languages still lack precision and do not take into account some important string features. In this scenario, more precise string analyses become a necessity. The goal of this paper is to place a first step for precisely handling dynamic language string features. In particular, we propose a new abstract domain approximating strings as finite state automata and an abstract interpretation-based static analysis for the most common string manipulating operations provided by the ECMAScript specification. The proposed analysis comes with a prototype static analyzer implementation for an imperative string manipulating language, allowing us to show and evaluate the improved precision of the proposed analysis

    Ontology learning for the semantic deep web

    Get PDF
    Ontologies could play an important role in assisting users in their search for Web pages. This dissertation considers the problem of constructing natural ontologies that support users in their Web search efforts and increase the number of relevant Web pages that are returned. To achieve this goal, this thesis suggests combining the Deep Web information, which consists of dynamically generated Web pages and cannot be indexed by the existing automated Web crawlers, with ontologies, resulting in the Semantic Deep Web. The Deep Web information is exploited in three different ways: extracting attributes from the Deep Web data sources automatically, generating domain ontologies from the Deep Web automatically, and extracting instances from the Deep Web to enhance the domain ontologies. Several algorithms for the above mentioned tasks are presented. Lxperimeiital results suggest that the proposed methods assist users with finding more relevant Web sites. Another contribution of this dissertation includes developing a methodology to evaluate existing general purpose ontologies using the Web as a corpus. The quality of ontologies (QoO) is quantified by analyzing existing ontologies to get numeric measures of how natural their concepts and their relationships are. This methodology was first applied to several major, popular ontologies, such as WordNet, OpenCyc and the UMLS. Subsequently the domain ontologies developed in this research were evaluated from the naturalness perspective

    Garbage-Free Abstract Interpretation Through Abstract Reference Counting

    Get PDF
    Abstract garbage collection is the application of garbage collection to an abstract interpreter. Existing work has shown that abstract garbage collection can improve both the interpreter\u27s precision and performance. Current approaches rely on heuristics to decide when to apply abstract garbage collection. Garbage will build up and impact precision and performance when the collection is applied infrequently, while too frequent applications will bring about their own performance overhead. A balance between these tradeoffs is often difficult to strike. We propose a new approach to cope with the buildup of garbage in the results of an abstract interpreter. Our approach is able to eliminate all garbage, therefore obtaining the maximum precision and performance benefits of abstract garbage collection. At the same time, our approach does not require frequent heap traversals, and therefore adds little to the interpreters\u27s running time. The core of our approach uses reference counting to detect and eliminate garbage as soon as it arises. However, reference counting cannot deal with cycles, and we show that cycles are much more common in an abstract interpreter than in its concrete counterpart. To alleviate this problem, our approach detects cycles and employs reference counting at the level of strongly connected components. While this technique in general works for any system that uses reference counting, we argue that it works particularly well for an abstract interpreter. In fact, we show formally that for the continuation store, where most of the cycles occur, the cycle detection technique only requires O(1) amortized operations per continuation push. We present our approach formally, and provide a proof-of-concept implementation in the Scala-AM framework. We empirically show our approach achieves both the optimal precision and significantly better performance compared to existing approaches to abstract garbage collection

    Tight polynomial worst-case bounds for loop programs

    Get PDF
    In 2008, Ben-Amram, Jones and Kristiansen showed that for a simple programming language - representing non-deterministic imperative programs with bounded loops, and arithmetics limited to addition and multiplication - it is possible to decide precisely whether a program has certain growth-rate properties, in particular whether a computed value, or the program's running time, has a polynomial growth rate. A natural and intriguing problem was to move from answering the decision problem to giving a quantitative result, namely, a tight polynomial upper bound. This paper shows how to obtain asymptotically-tight, multivariate, disjunctive polynomial bounds for this class of programs. This is a complete solution: whenever a polynomial bound exists it will be found. A pleasant surprise is that the algorithm is quite simple; but it relies on some subtle reasoning. An important ingredient in the proof is the forest factorization theorem, a strong structural result on homomorphisms into a finite monoid

    Learning Invariants using Decision Trees and Implication Counterexamples

    Get PDF
    Inductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first learning algorithms in this model with implication counter-examples that are based on scalable machine learning techniques. In particular, we extend decision tree learning algorithms, building new scalable and heuristic ways to construct small decision trees using statistical measures that account for implication counterexamples. We implement the learners and an appropriate teacher, and show that they are scalable, efficient and convergent in synthesizing adequate inductive invariants in a suite of more than 50 programs.Ope
    corecore