411 research outputs found

    A Trichotomy for Regular Trail Queries

    Get PDF
    Regular path queries (RPQs) are an essential component of graph query languages. Such queries consider a regular expression r and a directed edge-labeled graph G and search for paths in G for which the sequence of labels is in the language of r. In order to avoid having to consider infinitely many paths, some database engines restrict such paths to be trails, that is, they only consider paths without repeated edges. In this paper we consider the evaluation problem for RPQs under trail semantics, in the case where the expression is fixed. We show that, in this setting, there exists a trichotomy. More precisely, the complexity of RPQ evaluation divides the regular languages into the finite languages, the class T_tract (for which the problem is tractable), and the rest. Interestingly, the tractable class in the trichotomy is larger than for the trichotomy for simple paths, discovered by Bagan et al. [Bagan et al., 2013]. In addition to this trichotomy result, we also study characterizations of the tractable class, its expressivity, the recognition problem, closure properties, and show how the decision problem can be extended to the enumeration problem, which is relevant to practice

    An Analytical Study of Large SPARQL Query Logs

    Full text link
    With the adoption of RDF as the data model for Linked Data and the Semantic Web, query specification from end- users has become more and more common in SPARQL end- points. In this paper, we conduct an in-depth analytical study of the queries formulated by end-users and harvested from large and up-to-date query logs from a wide variety of RDF data sources. As opposed to previous studies, ours is the first assessment on a voluminous query corpus, span- ning over several years and covering many representative SPARQL endpoints. Apart from the syntactical structure of the queries, that exhibits already interesting results on this generalized corpus, we drill deeper in the structural char- acteristics related to the graph- and hypergraph represen- tation of queries. We outline the most common shapes of queries when visually displayed as pseudographs, and char- acterize their (hyper-)tree width. Moreover, we analyze the evolution of queries over time, by introducing the novel con- cept of a streak, i.e., a sequence of queries that appear as subsequent modifications of a seed query. Our study offers several fresh insights on the already rich query features of real SPARQL queries formulated by real users, and brings us to draw a number of conclusions and pinpoint future di- rections for SPARQL query evaluation, query optimization, tuning, and benchmarking

    Evaluation and Enumeration Problems for Regular Path Queries

    Get PDF
    Regular path queries (RPQs) are a central component of graph databases. We investigate decision- and enumeration problems concerning the evaluation of RPQs under several semantics that have recently been considered: arbitrary paths, shortest paths, and simple paths. Whereas arbitrary and shortest paths can be enumerated in polynomial delay, the situation is much more intricate for simple paths. For instance, already the question if a given graph contains a simple path of a certain length has cases with highly non-trivial solutions and cases that are long-standing open problems. We study RPQ evaluation for simple paths from a parameterized complexity perspective and define a class of simple transitive expressions that is prominent in practice and for which we can prove a dichotomy for the evaluation problem. We observe that, even though simple path semantics is intractable for RPQs in general, it is feasible for the vast majority of RPQs that are used in practice. At the heart of our study on simple paths is a result of independent interest: the two disjoint paths problem in directed graphs is W[1]-hard if parameterized by the length of one of the two paths

    Separability by Short Subsequences and Subwords

    Get PDF
    The separability problem for regular languages asks, given two regular languages I and E, whether there exists a language S that separates the two, that is, includes I but contains nothing from E. Typically, S comes from a simple, less expressive class of languages than I and E. In general, a simple separator SS can be seen as an approximation of I or as an explanation of how I and E are different. In a database context, separators can be used for explaining the result of regular path queries or for finding explanations for the difference between paths in a graph database, that is, how paths from given nodes u_1 to v_1 are different from those from u_2 to v_2. We study the complexity of separability of regular languages by combinations of subsequences or subwords of a given length k. The rationale is that the parameter k can be used to influence the size and simplicity of the separator. The emphasis of our study is on tracing the tractability of the problem

    Frontiers of tractability for typechecking simple XML transformations

    Get PDF
    AbstractTypechecking consists of statically verifying whether the output of an XML transformation is always conform to an output type for documents satisfying a given input type. We focus on complete algorithms which always produce the correct answer. We consider top–down XML transformations incorporating XPath expressions and abstract document types by grammars and tree automata. By restricting schema languages and transformations, we identify several practical settings for which typechecking can be done in polynomial time. Moreover, the resulting framework provides a rather complete picture as we show that most scenarios cannot be enlarged without rendering the typechecking problem intractable. So, the present research sheds light on when to use fast complete algorithms and when to reside to sound but incomplete ones

    On the complexity of typechecking top-down XML transformations

    Get PDF
    AbstractWe investigate the typechecking problem for XML transformations: statically verifying that every answer to a transformation conforms to a given output schema, for inputs satisfying a given input schema. As typechecking quickly turns undecidable for query languages capable of testing equality of data values, we return to the limited framework where we abstract XML documents as labeled ordered trees. We focus on simple top-down recursive transformations motivated by XSLT and structural recursion on trees. We parameterize the problem by several restrictions on the transformations (deleting, non-deleting, bounded width) and consider both tree automata and DTDs as input and output schemas. The complexity of the typechecking problems in this scenario ranges from PTIME to EXPTIME
    • …
    corecore