203,128 research outputs found

    Privacy-Preserving Regular Expression Matching using Nondeterministic Finite Automata

    Get PDF
    Motivated by the privacy requirements in network intrusion detection and DNS policy checking, we have developed a suite of protocols and algorithms for regular expression matching with enhanced privacy: - A new regular expression matching algorithm that is oblivious to the input strings, of which the complexity is only O(mn)O(mn) where mm and nn are the length of strings and the regular expression respectively. It is achieved by exploiting the structure of the Thompson nondeterministic automata. - A zero-knowledge proof of regular expression pattern matching in which a prover generates a proof to demonstrate that a public regular expression matches her input string without revealing the string itself. -Two secure-regex protocols that ensure the privacy of both the string and regular expression. The first protocol is based on the oblivious stack and reduces the complexity of the state-of-the-art from O(mn2)O(mn^2) to O(mnlogn)O(mn\log n). The second protocol relies on the oblivious transfer and performs better empirically when the size of regular expressions is smaller than 2122^{12}. We also evaluated our protocols in the context of encrypted DNS policy checking and intrusion detection and achieved 4.5X improvements over the state-of-the-art. These results also indicate the practicality of our approach in real-world applications

    A Trichotomy for Regular Simple Path Queries on Graphs

    Full text link
    Regular path queries (RPQs) select nodes connected by some path in a graph. The edge labels of such a path have to form a word that matches a given regular expression. We investigate the evaluation of RPQs with an additional constraint that prevents multiple traversals of the same nodes. Those regular simple path queries (RSPQs) find several applications in practice, yet they quickly become intractable, even for basic languages such as (aa)* or a*ba*. In this paper, we establish a comprehensive classification of regular languages with respect to the complexity of the corresponding regular simple path query problem. More precisely, we identify the fragment that is maximal in the following sense: regular simple path queries can be evaluated in polynomial time for every regular language L that belongs to this fragment and evaluation is NP-complete for languages outside this fragment. We thus fully characterize the frontier between tractability and intractability for RSPQs, and we refine our results to show the following trichotomy: Evaluations of RSPQs is either AC0, NL-complete or NP-complete in data complexity, depending on the regular language L. The fragment identified also admits a simple characterization in terms of regular expressions. Finally, we also discuss the complexity of the following decision problem: decide, given a language L, whether finding a regular simple path for L is tractable. We consider several alternative representations of L: DFAs, NFAs or regular expressions, and prove that this problem is NL-complete for the first representation and PSPACE-complete for the other two. As a conclusion we extend our results from edge-labeled graphs to vertex-labeled graphs and vertex-edge labeled graphs.Comment: 15 pages, conference submissio

    Answering SPARQL queries modulo RDF Schema with paths

    Get PDF
    SPARQL is the standard query language for RDF graphs. In its strict instantiation, it only offers querying according to the RDF semantics and would thus ignore the semantics of data expressed with respect to (RDF) schemas or (OWL) ontologies. Several extensions to SPARQL have been proposed to query RDF data modulo RDFS, i.e., interpreting the query with RDFS semantics and/or considering external ontologies. We introduce a general framework which allows for expressing query answering modulo a particular semantics in an homogeneous way. In this paper, we discuss extensions of SPARQL that use regular expressions to navigate RDF graphs and may be used to answer queries considering RDFS semantics. We also consider their embedding as extensions of SPARQL. These SPARQL extensions are interpreted within the proposed framework and their drawbacks are presented. In particular, we show that the PSPARQL query language, a strict extension of SPARQL offering transitive closure, allows for answering SPARQL queries modulo RDFS graphs with the same complexity as SPARQL through a simple transformation of the queries. We also consider languages which, in addition to paths, provide constraints. In particular, we present and compare nSPARQL and our proposal CPSPARQL. We show that CPSPARQL is expressive enough to answer full SPARQL queries modulo RDFS. Finally, we compare the expressiveness and complexity of both nSPARQL and the corresponding fragment of CPSPARQL, that we call cpSPARQL. We show that both languages have the same complexity through cpSPARQL, being a proper extension of SPARQL graph patterns, is more expressive than nSPARQL.Comment: RR-8394; alkhateeb2003

    Distributed PCP Theorems for Hardness of Approximation in P

    Get PDF
    We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment x{0,1}nx \in \{0,1\}^n to a CNF formula φ\varphi is shared between two parties, where Alice knows x1,,xn/2x_1, \dots, x_{n/2}, Bob knows xn/2+1,,xnx_{n/2+1},\dots,x_n, and both parties know φ\varphi. The goal is to have Alice and Bob jointly write a PCP that xx satisfies φ\varphi, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of xx. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of 2(logn)1o(1)2^{(\log n)^{1-o(1)}}; only (1+o(1))(1+o(1))-factor lower bounds (under SETH) were known before

    Capacity-achieving ensembles for the binary erasure channel with bounded complexity

    Full text link
    We present two sequences of ensembles of non-systematic irregular repeat-accumulate codes which asymptotically (as their block length tends to infinity) achieve capacity on the binary erasure channel (BEC) with bounded complexity per information bit. This is in contrast to all previous constructions of capacity-achieving sequences of ensembles whose complexity grows at least like the log of the inverse of the gap (in rate) to capacity. The new bounded complexity result is achieved by puncturing bits, and allowing in this way a sufficient number of state nodes in the Tanner graph representing the codes. We also derive an information-theoretic lower bound on the decoding complexity of randomly punctured codes on graphs. The bound holds for every memoryless binary-input output-symmetric channel and is refined for the BEC.Comment: 47 pages, 9 figures. Submitted to IEEE Transactions on Information Theor
    corecore