Privacy-Preserving Regular Expression Matching using Nondeterministic Finite Automata
Motivated by the privacy requirements in network intrusion detection and DNS policy checking, we have developed a suite of protocols and algorithms for regular expression matching with enhanced privacy:
- A new regular expression matching algorithm that is oblivious to the input strings, of which the complexity is only where and are the length of strings and the regular expression respectively. It is achieved by exploiting the structure of the Thompson nondeterministic automata.
- A zero-knowledge proof of regular expression pattern matching in which a prover generates a proof to demonstrate that a public regular expression matches her input string without revealing the string itself.
- Two secure-regex protocols that ensure the privacy of both the string and the regular expression. The first protocol is based on an oblivious stack and reduces the complexity of the state-of-the-art from to . The second protocol relies on oblivious transfer and performs better empirically when the size of regular expressions is smaller than .
We also evaluated our protocols in the context of encrypted DNS policy checking and intrusion detection and achieved 4.5X improvements over the state-of-the-art. These results also indicate the practicality of our approach in real-world applications.
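As a rough illustration of what "oblivious to the input strings" can mean for NFA-based matching, the sketch below simulates a small NFA so that every transition is visited for every input character, regardless of the input. It is not the protocol from the abstract and involves no cryptography. The hand-built, epsilon-free NFA for `(a|b)*abb` and the names `TRANSITIONS` and `nfa_match` are assumptions made for this example.

```python
# Hypothetical hand-built, epsilon-free NFA for the regex (a|b)*abb.
# Each entry is (source state, symbol, destination state).
TRANSITIONS = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2), (2, "b", 3)]
NUM_STATES = 4
START, ACCEPT = 0, 3


def nfa_match(text):
    """Return True if the whole of `text` is accepted by the NFA above."""
    active = [False] * NUM_STATES
    active[START] = True
    for ch in text:
        nxt = [False] * NUM_STATES
        # Every transition is evaluated for every character, so the sequence
        # of operations depends only on len(text), not on its content.
        for src, sym, dst in TRANSITIONS:
            nxt[dst] = nxt[dst] | (active[src] & (sym == ch))
        active = nxt
    return active[ACCEPT]
```

The work done is proportional to the string length times the number of transitions. A privacy-preserving version would have to evaluate the same fixed sequence of Boolean operations over hidden values, which this plain-Python sketch does not attempt.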
A Trichotomy for Regular Simple Path Queries on Graphs
Regular path queries (RPQs) select nodes connected by some path in a graph.
The edge labels of such a path have to form a word that matches a given regular
expression. We investigate the evaluation of RPQs with an additional constraint
that prevents multiple traversals of the same nodes. Those regular simple path
queries (RSPQs) find several applications in practice, yet they quickly become
intractable, even for basic languages such as (aa)* or a*ba*.
In this paper, we establish a comprehensive classification of regular
languages with respect to the complexity of the corresponding regular simple
path query problem. More precisely, we identify the fragment that is maximal in
the following sense: regular simple path queries can be evaluated in polynomial
time for every regular language L that belongs to this fragment and evaluation
is NP-complete for languages outside this fragment. We thus fully characterize
the frontier between tractability and intractability for RSPQs, and we refine
our results to show the following trichotomy: evaluation of RSPQs is either in
AC0, NL-complete or NP-complete in data complexity, depending on the regular
language L. The fragment identified also admits a simple characterization in
terms of regular expressions.
Finally, we also discuss the complexity of the following decision problem:
decide, given a language L, whether finding a regular simple path for L is
tractable. We consider several alternative representations of L: DFAs, NFAs or
regular expressions, and prove that this problem is NL-complete for the first
representation and PSPACE-complete for the other two. To conclude, we extend
our results from edge-labeled graphs to vertex-labeled graphs and vertex-edge
labeled graphs.
Comment: 15 pages, conference submission
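To make the simple-path restriction concrete, here is a minimal brute-force sketch, assuming the language is given as a DFA encoded as a dictionary. It explores simple paths by depth-first search and is exponential in the worst case, so it illustrates the problem definition only and not the paper's polynomial-time fragment. The function name `has_simple_path` and the graph encoding are hypothetical.

```python
def has_simple_path(edges, source, target, dfa, start, accepting):
    """Check for a simple path source -> target whose labels the DFA accepts.

    edges: list of (node, label, node) triples.
    dfa: dict mapping (state, label) -> next state; missing keys reject.
    """
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))

    def dfs(node, state, visited):
        # A simple path cannot revisit the target, so it must end here.
        if node == target:
            return state in accepting
        for label, nxt in adj.get(node, []):
            nxt_state = dfa.get((state, label))
            if nxt_state is None or nxt in visited:
                continue
            if dfs(nxt, nxt_state, visited | {nxt}):
                return True
        return False

    return dfs(source, start, {source})


# DFA for (aa)*: state 0 means an even number of a's read so far.
EVEN_A = {(0, "a"): 1, (1, "a"): 0}
```

With the edges `[(1, "a", 2), (2, "a", 2)]`, the walk 1, 2, 2 spells `aa` and would satisfy an ordinary RPQ from node 1 to node 2, but it repeats node 2, so `has_simple_path` reports no match for `(aa)*`.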
Answering SPARQL queries modulo RDF Schema with paths
SPARQL is the standard query language for RDF graphs. In its strict
instantiation, it only offers querying according to the RDF semantics and would
thus ignore the semantics of data expressed with respect to (RDF) schemas or
(OWL) ontologies. Several extensions to SPARQL have been proposed to query RDF
data modulo RDFS, i.e., interpreting the query with RDFS semantics and/or
considering external ontologies. We introduce a general framework which allows
for expressing query answering modulo a particular semantics in a homogeneous
way. In this paper, we discuss extensions of SPARQL that use regular
expressions to navigate RDF graphs and may be used to answer queries
considering RDFS semantics. We also consider their embedding as extensions of
SPARQL. These SPARQL extensions are interpreted within the proposed framework
and their drawbacks are presented. In particular, we show that the PSPARQL
query language, a strict extension of SPARQL offering transitive closure,
allows for answering SPARQL queries modulo RDFS graphs with the same complexity
as SPARQL through a simple transformation of the queries. We also consider
languages which, in addition to paths, provide constraints. In particular, we
present and compare nSPARQL and our proposal CPSPARQL. We show that CPSPARQL is
expressive enough to answer full SPARQL queries modulo RDFS. Finally, we
compare the expressiveness and complexity of both nSPARQL and the corresponding
fragment of CPSPARQL, which we call cpSPARQL. We show that both languages have
the same complexity, though cpSPARQL, being a proper extension of SPARQL graph
patterns, is more expressive than nSPARQL.
Comment: RR-8394; alkhateeb2003
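The idea of answering a query modulo RDFS by following paths can be illustrated with one simplified rule: finding instances of a class by combining `rdf:type` with the transitive closure of `rdfs:subClassOf`. In SPARQL 1.1 property-path syntax this is commonly written `?x rdf:type/rdfs:subClassOf* ex:Animal`. The sketch below is an assumption-laden illustration over an in-memory triple list. It covers only the subclass rule, not full RDFS, and is not the transformation defined in the paper. The function name `instances_of` and the `ex:` data are hypothetical.

```python
def instances_of(triples, cls):
    """Return subjects typed with `cls` or any transitive subclass of it."""
    # Collect cls and all its transitive subclasses (the subClassOf* part).
    classes = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o == current and s not in classes:
                classes.add(s)
                frontier.append(s)
    # Follow rdf:type into any of the collected classes.
    return {s for s, p, o in triples if p == "rdf:type" and o in classes}


TRIPLES = [
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:tom", "rdf:type", "ex:Cat"),
    ("ex:rex", "rdf:type", "ex:Animal"),
]
```

Under plain RDF semantics only `ex:rex` is typed `ex:Animal` in this data. The path-based lookup also returns `ex:tom`, whose membership follows from the subclass chain.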
Distributed PCP Theorems for Hardness of Approximation in P
We present a new distributed model of probabilistically checkable proofs
(PCP). A satisfying assignment to a CNF formula is
shared between two parties, where Alice knows , Bob knows
, and both parties know . The goal is to have
Alice and Bob jointly write a PCP that satisfies , while
exchanging little or no information. Unfortunately, this model as-is does not
allow for nontrivial query complexity. Instead, we focus on a non-deterministic
variant, where the players are helped by Merlin, a third party who knows all of
.
Using our framework, we obtain, for the first time, PCP-like reductions from
the Strong Exponential Time Hypothesis (SETH) to approximation problems in P.
In particular, under SETH we show that there are no truly-subquadratic
approximation algorithms for Bichromatic Maximum Inner Product over
{0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate
Regular Expression Matching, and Diameter in Product Metric. All our
inapproximability factors are nearly-tight. In particular, for the first two
problems we obtain nearly-polynomial factors of ; only
-factor lower bounds (under SETH) were known before.
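For reference, the exact version of Bichromatic Maximum Inner Product has an obvious quadratic-time baseline: compare every vector in one set with every vector in the other. The sketch below shows that baseline only, assuming vectors are given as equal-length tuples of 0s and 1s. The function name `max_inner_product` is hypothetical and nothing here comes from the paper's reductions.

```python
def max_inner_product(A, B):
    """Brute-force maximum inner product between {0,1}-vectors of A and B.

    Checks every pair (a, b) with a in A and b in B, so the number of
    inner products computed is len(A) * len(B).
    """
    return max(sum(x & y for x, y in zip(a, b)) for a in A for b in B)
```

The hardness result in the abstract concerns approximation algorithms that run substantially faster than this all-pairs scan.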
Capacity-achieving ensembles for the binary erasure channel with bounded complexity
We present two sequences of ensembles of non-systematic irregular
repeat-accumulate codes which asymptotically (as their block length tends to
infinity) achieve capacity on the binary erasure channel (BEC) with bounded
complexity per information bit. This is in contrast to all previous
constructions of capacity-achieving sequences of ensembles whose complexity
grows at least like the log of the inverse of the gap (in rate) to capacity.
The new bounded complexity result is achieved by puncturing bits, and allowing
in this way a sufficient number of state nodes in the Tanner graph representing
the codes. We also derive an information-theoretic lower bound on the decoding
complexity of randomly punctured codes on graphs. The bound holds for every
memoryless binary-input output-symmetric channel and is refined for the BEC.
Comment: 47 pages, 9 figures. Submitted to IEEE Transactions on Information Theory