    A logic for document spanners

    Document spanners are a formal framework for information extraction that was introduced by Fagin, Kimelfeld, Reiss, and Vansummeren (PODS 2013, JACM 2015). One of the central models in this framework is the class of core spanners, which formalize the query language AQL that is used in IBM’s SystemT. As shown by Freydenberger and Holldack (ICDT 2016, ToCS 2018), there is a connection between core spanners and ECreg, the existential theory of concatenation with regular constraints. The present paper further develops this connection by defining SpLog, a fragment of ECreg that has the same expressive power as core spanners. This equivalence extends beyond expressive power, as we show the existence of polynomial-time conversions between SpLog and core spanners. Consequences and applications include an alternative way of defining relations for spanners, a pumping lemma for core spanners, and insights into the relative succinctness of various classes of spanner representations and their connection to graph querying languages. We also briefly discuss the connection between SpLog with negation and core spanners with a difference operator.

    Language Operations with Regular Expressions of Polynomial Size

    This work studies the extent to which regularity-preserving language operations affect the descriptional complexity of regular expressions. Some language operations are identified which are feasible for regular expressions in the sense that the result of the operation can be represented as a regular expression of size polynomial in that of the operands. We prove that taking language quotients, in particular the prefix and suffix closures, of a regular set can incur at most a quadratic blow-up in the required expression size. The circular shift operation can cause only a cubic increase in size, and at least a quadratic blow-up can be necessary in the worst case.