353 research outputs found

    Static and dynamic semantics of NoSQL languages

    Get PDF
    We present a calculus for processing semistructured data that spans differences of application area among several novel query languages, broadly categorized as "NoSQL". This calculus lets users define their own operators, capturing a wider range of data processing capabilities, whilst providing a typing precision so far typical only of primitive hard-coded operators. The type inference algorithm is based on semantic type checking, resulting in type information that is both precise, and flexible enough to handle structured and semistructured data. We illustrate the use of this calculus by encoding a large fragment of Jaql, including operations and iterators over JSON, embedded SQL expressions, and co-grouping, and show how the encoding directly yields a typing discipline for Jaql as it is, namely without the addition of any type definition or type annotation in the code

    RML: Runtime Monitoring Language

    Get PDF
    Runtime verification is a relatively new software verification technique that aims to prove the correctness of a specific run of a program, rather than statically verify the code. The program is instrumented in order to collect all the relevant information, and the resulting trace of events is inspected by a monitor that verifies its compliance with respect to a specification of the expected properties of the system under scrutiny. Many languages exist that can be used to formally express the expected behavior of a system, with different design choices and degrees of expressivity. This thesis presents RML, a specification language designed for runtime verification, with the goal of being completely modular and independent from the instrumentation and the kind of system being monitored. RML is highly expressive, and allows one to express complex, parametric, non-context-free properties concisely. RML is compiled down to TC, a lower level calculus, which is fully formalized with a deterministic, rewriting-based semantics. In order to evaluate the approach, an open source implementation has been developed, and several examples with Node.js programs have been tested. Benchmarks show the ability of the monitors automatically generated from RML specifications to effectively and efficiently verify complex properties

    Efficient asymmetric inclusion of regular expressions with interleaving and counting for XML type-checking

    Get PDF
    The inclusion of Regular Expressions (REs) is the kernel of any type-checking algorithm for XML manipulation languages. XML applications would benefit from the extension of REs with interleaving and counting, but this is not feasible in general, since inclusion is EXPSPACE-complete for such extended REs. In Colazzo et al. (2009) [1] we introduced a notion of ?conflict-free REs?, which are extended REs with excellent complexity behaviour, including a polynomial inclusion algorithm [1] and linear membership (Ghelli et al., 2008 [2]). Conflict-free REs have interleaving and counting, but the complexity is tamed by the ?conflict-free? limitations, which have been found to be satisfied by the vast majority of the content models published on the Web.However, a type-checking algorithm needs to compare machine-generated subtypes against human-defined supertypes. The conflict-free restriction, while quite harmless for the human-defined supertype, is far too restrictive for the subtype. We show here that the PTIME inclusion algorithm can be actually extended to deal with totally unrestricted REs with counting and interleaving in the subtype position, provided that the supertype is conflict-free.This is exactly the expressive power that we need in order to use subtyping inside type-checking algorithms, and the cost of this generalized algorithm is only quadratic, which is as good as the best algorithm we have for the symmetric case (see [1]). The result is extremely surprising, since we had previously found that symmetric inclusion becomes NP-hard as soon as the candidate subtype is enriched with binary intersection, a generalization that looked much more innocent than what we achieve here

    Abelian Primitive Words

    Full text link
    We investigate Abelian primitive words, which are words that are not Abelian powers. We show that unlike classical primitive words, the set of Abelian primitive words is not context-free. We can determine whether a word is Abelian primitive in linear time. Also different from classical primitive words, we find that a word may have more than one Abelian root. We also consider enumeration problems and the relation to the theory of codes

    A decidable weakening of Compass Logic based on cone-shaped cardinal directions

    Get PDF
    We introduce a modal logic, called Cone Logic, whose formulas describe properties of points in the plane and spatial relationships between them. Points are labelled by proposition letters and spatial relations are induced by the four cone-shaped cardinal directions. Cone Logic can be seen as a weakening of Venema's Compass Logic. We prove that, unlike Compass Logic and other projection-based spatial logics, its satisfiability problem is decidable (precisely, PSPACE-complete). We also show that it is expressive enough to capture meaningful interval temporal logics - in particular, the interval temporal logic of Allen's relations "Begins", "During", and "Later", and their transposes

    On some modifications and applications of the post correspondence problem

    Get PDF
    The Post Correspondence Problem was introduced by Emil Post in 1946. The problem considers pairs of lists of sequences of symbols, or words, where each word has its place on the list determined by its index. The Post Correspondence Problem asks does there exist a sequence of indices so that, when we write the words in the order of the sequence as single words from both lists, the two resulting words are equal. Post proved the problem to be undecidable, that is, no algorithm deciding it can exist. A variety of restrictions and modifications have been introduced to the original formulation of the problem, that have then been shown to be either decidable or undecidable. Both the original Post Correspondence Problem and its modifications have been widely used in proving other decision problems undecidable. In this thesis we consider some modifications of the Post Correspondence Problem as well as some applications of it in undecidability proofs. We consider a modification for sequences of indices that are infinite to two directions. We also consider a modification to the original Post Correspondence Problem where instead of the words being equal for a sequence of indices, we take two sequences that are conjugates of each other. Two words are conjugates if we can write one word by taking the other and moving some part of that word from the end to the beginning. Both modifications are shown to be undecidable. We also use the Post Correspondence Problem and its modification for injective morphisms in proving two problems from formal language theory to be undecidable; the first problem is on special shuffling of words and the second problem on fixed points of rational functions

    Computing Possible and Certain Answers over Order-Incomplete Data

    Full text link
    This paper studies the complexity of query evaluation for databases whose relations are partially ordered; the problem commonly arises when combining or transforming ordered data from multiple sources. We focus on queries in a useful fragment of SQL, namely positive relational algebra with aggregates, whose bag semantics we extend to the partially ordered setting. Our semantics leads to the study of two main computational problems: the possibility and certainty of query answers. We show that these problems are respectively NP-complete and coNP-complete, but identify tractable cases depending on the query operators or input partial orders. We further introduce a duplicate elimination operator and study its effect on the complexity results.Comment: 55 pages, 56 references. Extended journal version of arXiv:1707.07222. Up to the stylesheet, page/environment numbering, and possible minor publisher-induced changes, this is the exact content of the journal paper that will appear in Theoretical Computer Scienc

    On the Complexity of Free Word Orders

    Get PDF
    International audienceWe propose some extensions of mildly context-sensitive for- malisms whose aim is to model free word orders in natural languages. We give a detailed analysis of the complexity of the formalisms we propose

    On the membership problem for pattern languages and related topics

    Get PDF
    In this thesis, we investigate the complexity of the membership problem for pattern languages. A pattern is a string over the union of the alphabets A and X, where X := {x_1, x_2, x_3, ...} is a countable set of variables and A is a finite alphabet containing terminals (e.g., A := {a, b, c, d}). Every pattern, e.g., p := x_1 x_2 a b x_2 b x_1 c x_2, describes a pattern language, i.e., the set of all words that can be obtained by uniformly substituting the variables in the pattern by arbitrary strings over A. Hence, u := cacaaabaabcaccaa is a word of the pattern language of p, since substituting cac for x_1 and aa for x_2 yields u. On the other hand, there is no way to obtain the word u' := bbbababbacaaba by substituting the occurrences of x_1 and x_2 in p by words over A. The problem to decide for a given pattern q and a given word w whether or not w is in the pattern language of q is called the membership problem for pattern languages. Consequently, (p, u) is a positive instance and (p, u') is a negative instance of the membership problem for pattern languages. For the unrestricted case, i.e., for arbitrary patterns and words, the membership problem is NP-complete. In this thesis, we identify classes of patterns for which the membership problem can be solved efficiently. Our first main result in this regard is that the variable distance, i.e., the maximum number of different variables that separate two consecutive occurrences of the same variable, substantially contributes to the complexity of the membership problem for pattern languages. More precisely, for every class of patterns with a bounded variable distance the membership problem can be solved efficiently. The second main result is that the same holds for every class of patterns with a bounded scope coincidence degree, where the scope coincidence degree is the maximum number of intervals that cover a common position in the pattern, where each interval is given by the leftmost and rightmost occurrence of a variable in the pattern. The proof of our first main result is based on automata theory. More precisely, we introduce a new automata model that is used as an algorithmic framework in order to show that the membership problem for pattern languages can be solved in time that is exponential only in the variable distance of the corresponding pattern. We then take a closer look at this automata model and subject it to a sound theoretical analysis. The second main result is obtained in a completely different way. We encode patterns and words as relational structures and we then reduce the membership problem for pattern languages to the homomorphism problem of relational structures, which allows us to exploit the concept of the treewidth. This approach turns out be successful, and we show that it has potential to identify further classes of patterns with a polynomial time membership problem. Furthermore, we take a closer look at two aspects of pattern languages that are indirectly related to the membership problem. Firstly, we investigate the phenomenon that patterns can describe regular or context-free languages in an unexpected way, which implies that their membership problem can be solved efficiently. In this regard, we present several sufficient conditions and necessary conditions for the regularity and context-freeness of pattern languages. Secondly, we compare pattern languages with languages given by so-called extended regular expressions with backreferences (REGEX). The membership problem for REGEX languages is very important in practice and since REGEX are similar to pattern languages, it might be possible to improve algorithms for the membership problem for REGEX languages by investigating their relationship to patterns. In this regard, we investigate how patterns can be extended in order to describe large classes of REGEX languages
    • …
    corecore