GLL-based Context-Free Path Querying for Neo4j
We propose a GLL-based context-free path querying algorithm which handles
queries in Extended Backus-Naur Form (EBNF) using Recursive State Machines
(RSM). Utilization of EBNF allows one to combine traditional regular
expressions and mutually recursive patterns in constraints natively. The
proposed algorithm solves both the reachability-only and the all-paths problems
for the all-pairs and the multiple sources cases. The evaluation on real-world
graphs demonstrates that utilization of RSMs increases performance of query
evaluation. Being implemented as a stored procedure for Neo4j, our solution
demonstrates better performance than a similar solution for RedisGraph.
The performance of our solution on regular path queries is comparable with
the performance of the native Neo4j solution, and in some cases our solution requires
significantly less memory.
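To make the problem in the abstract above concrete, here is a minimal sketch of context-free path reachability on an edge-labeled graph. It is a plain worklist algorithm over a grammar in binary normal form, not the GLL/RSM algorithm the authors propose, and it has nothing to do with the Neo4j procedure itself. The function name, the rule encoding, and the example graph are all assumptions made for illustration.

```python
from collections import deque

def cfl_reachability(edges, terminal_rules, binary_rules, start):
    """Return all node pairs (u, v) joined by a path whose labels derive from `start`.

    edges:          iterable of (u, label, v)
    terminal_rules: label -> set of nonterminals N with a rule N -> label
    binary_rules:   (X, Y) -> set of nonterminals N with a rule N -> X Y
    Epsilon rules are not supported in this sketch.
    """
    facts = set()        # derived facts (nonterminal, u, v)
    worklist = deque()   # facts not yet combined with the others

    def add(fact):
        if fact not in facts:
            facts.add(fact)
            worklist.append(fact)

    # Seed with facts produced by single edges.
    for u, label, v in edges:
        for head in terminal_rules.get(label, ()):
            add((head, u, v))

    # Combine each new fact with every known fact on both sides.
    while worklist:
        sym, u, v = worklist.popleft()
        for other, x, y in list(facts):
            if x == v:  # sym(u, v) followed by other(v, y)
                for head in binary_rules.get((sym, other), ()):
                    add((head, u, y))
            if y == u:  # other(x, u) followed by sym(u, v)
                for head in binary_rules.get((other, sym), ()):
                    add((head, x, v))

    return {(u, v) for sym, u, v in facts if sym == start}

# Hypothetical example: S -> a S b | a b, rewritten in binary form.
terminal_rules = {"a": {"A"}, "b": {"B"}}
binary_rules = {("A", "B"): {"S"}, ("A", "S1"): {"S"}, ("S", "B"): {"S1"}}
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
pairs = cfl_reachability(edges, terminal_rules, binary_rules, "S")
print(sorted(pairs))
```

This sketch answers only the all-pairs reachability question. It rescans every known fact for each new one, so it favors clarity over speed.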
Efficient top K temporal spatial keyword search
Massive amounts of data that are geo-tagged and associated with text information are being generated at an unprecedented scale in many emerging applications such as location-based services and social networks. Due to their importance, a large body of work has focused on efficiently computing various spatial keyword queries. In this paper, we study the top-k temporal spatial keyword query, which considers three important constraints during the search: time, spatial proximity, and textual relevance. A novel index structure, namely the SSG-tree, is proposed to efficiently insert and delete spatio-temporal web objects at high rates. Based on the SSG-tree, an efficient algorithm is developed to support the top-k temporal spatial keyword query. We show via extensive experimentation with real spatial databases that our method achieves better performance than alternative techniques.
The NFA Acceptance Hypothesis: Non-Combinatorial and Dynamic Lower Bounds
We pose the fine-grained hardness hypothesis that the textbook algorithm for
the NFA Acceptance problem is optimal up to subpolynomial factors, even for
dense NFAs and fixed alphabets.
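For orientation, the textbook algorithm referred to above is usually understood as state-set simulation: keep the set of states reachable after each input symbol. The following is a minimal sketch under that reading, assuming an NFA without epsilon-moves. The function name, the transition encoding, and the example automaton are illustrative assumptions, not taken from the paper.

```python
def nfa_accepts(transitions, start, accepting, word):
    """Decide acceptance by tracking the set of states reachable so far.

    transitions: dict mapping (state, symbol) -> set of successor states
    The NFA is assumed to have no epsilon-moves.
    """
    current = {start}
    for symbol in word:
        nxt = set()
        for state in current:
            nxt |= transitions.get((state, symbol), set())
        current = nxt
        if not current:  # no state is reachable any more
            return False
    return bool(current & accepting)

# Hypothetical NFA over {a, b} accepting the strings that end in "ab".
delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
print(nfa_accepts(delta, 0, {2}, "aab"))
print(nfa_accepts(delta, 0, {2}, "aba"))
```

Each input symbol may touch every transition once, so the running time grows roughly with the word length times the number of transitions.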
We show that this barrier appears in many variations throughout the
algorithmic literature by introducing a framework of Colored Walk problems.
These yield fine-grained equivalent formulations of the NFA Acceptance problem
as problems concerning detection of a walk with a prescribed color
sequence in a given edge- or node-colored graph. For NFA Acceptance on sparse
NFAs (or equivalently, Colored Walk in sparse graphs), a tight lower bound
under the Strong Exponential Time Hypothesis has been rediscovered several
times in recent years. We show that our hardness hypothesis, which concerns
dense NFAs, has several interesting implications:
- It gives a tight lower bound for Context-Free Language Reachability. This
proves conditional optimality for the class of 2NPDA-complete problems,
explaining the cubic bottleneck of interprocedural program analysis.
- It gives a tight lower bound for the Word Break problem, stated in terms of
the string length and the total dictionary size.
- It implies the popular OMv hypothesis. Since the NFA acceptance problem is
a static (i.e., non-dynamic) problem, this provides a static reason for the
hardness of many dynamic problems.
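As a reminder of what the Word Break problem in the list above asks, namely whether a string can be split into a sequence of dictionary words, here is a minimal dynamic-programming sketch. It is the simple quadratic-style approach, not one of the faster algorithms whose optimality the hypothesis concerns. The function name and the examples are illustrative assumptions.

```python
def word_break(text, dictionary):
    """Return True if `text` can be split into a sequence of dictionary words."""
    words = set(dictionary)
    max_len = max((len(w) for w in words), default=0)

    # reachable[i] is True when text[:i] can be split into dictionary words.
    reachable = [False] * (len(text) + 1)
    reachable[0] = True  # the empty prefix needs no words

    for end in range(1, len(text) + 1):
        # Only start positions within max_len of `end` can give a dictionary word.
        for begin in range(max(0, end - max_len), end):
            if reachable[begin] and text[begin:end] in words:
                reachable[end] = True
                break
    return reachable[len(text)]

print(word_break("applepie", ["apple", "pie"]))
print(word_break("applepi", ["apple", "pie"]))
```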
Thus, a proof of the NFA Acceptance hypothesis would resolve several
interesting barriers. Conversely, a refutation of the NFA Acceptance hypothesis
may lead the way to attacking the current barriers observed for Context-Free
Language Reachability, the Word Break problem and the growing list of dynamic
problems proven hard under the OMv hypothesis.
Comment: 31 pages, Accepted at ITC
Histogram techniques for cost estimation in query optimization.
Yu Xiaohui.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 98-115).
Abstracts in English and Chinese.
Chapter 1 --- Introduction
Chapter 2 --- Related Work
Chapter 2.1 --- Query Optimization
Chapter 2.2 --- Query Rewriting
Chapter 2.2.1 --- Optimizing Multi-Block Queries
Chapter 2.2.2 --- Semantic Query Optimization
Chapter 2.2.3 --- Query Rewriting in Starburst
Chapter 2.3 --- Plan Generation
Chapter 2.3.1 --- Dynamic Programming Approach
Chapter 2.3.2 --- Join Query Processing
Chapter 2.3.3 --- Queries with Aggregates
Chapter 2.4 --- Statistics and Cost Estimation
Chapter 2.5 --- Histogram Techniques
Chapter 2.5.1 --- Definitions
Chapter 2.5.2 --- Trivial Histograms
Chapter 2.5.3 --- Heuristic-based Histograms
Chapter 2.5.4 --- V-Optimal Histograms
Chapter 2.5.5 --- Wavelet-based Histograms
Chapter 2.5.6 --- Multidimensional Histograms
Chapter 2.5.7 --- Global Histograms
Chapter 3 --- New Histogram Techniques
Chapter 3.1 --- Piecewise Linear Histograms
Chapter 3.1.1 --- Construction
Chapter 3.1.2 --- Usage
Chapter 3.1.3 --- Error Measures
Chapter 3.1.4 --- Experiments
Chapter 3.1.5 --- Conclusion
Chapter 3.2 --- A-Optimal Histograms
Chapter 3.2.1 --- A-Optimal(mean) Histograms
Chapter 3.2.2 --- A-Optimal(median) Histograms
Chapter 3.2.3 --- A-Optimal(median-cf) Histograms
Chapter 3.2.4 --- Experiments
Chapter 4 --- Global Histograms
Chapter 4.1 --- Wavelet-based Global Histograms
Chapter 4.1.1 --- Wavelet-based Global Histograms I
Chapter 4.1.2 --- Wavelet-based Global Histograms II
Chapter 4.2 --- Piecewise Linear Global Histograms
Chapter 4.3 --- A-Optimal Global Histograms
Chapter 4.3.1 --- Experiments
Chapter 5 --- Dynamic Maintenance
Chapter 5.1 --- Problem Definition
Chapter 5.2 --- Refining Bucket Coefficients
Chapter 5.3 --- Restructuring
Chapter 5.4 --- Experiments
Chapter 6 --- Conclusions
Bibliography