9 research outputs found
Approximate Query Service on Autonomous IoT Cameras
Elf is a runtime for an energy-constrained camera to continuously summarize
video scenes as approximate object counts. Elf's novelty centers on planning
the camera's count actions under energy constraint. (1) Elf explores the rich
action space spanned by the number of sample image frames and the choice of
per-frame object counters; it unifies errors from both sources into one single
bounded error. (2) To decide count actions at run time, Elf employs a
learning-based planner, jointly optimizing for past and future videos without
delaying result materialization. Tested with more than 1,000 hours of videos
and under realistic energy constraints, Elf continuously generates object
counts within only 11% of the true counts on average. Alongside the counts, Elf
presents narrow errors shown to be bounded and up to 3.4x smaller than
competitive baselines. At a higher level, Elf makes a case for advancing the
geographic frontier of video analytics.
Indexing Views to Route and Plan Queries in a Peer Data Management System
P2P computing has gained increasing attention lately, since it provides the means for realizing computing systems that scale to very large numbers of participating peers while ensuring high autonomy and fault tolerance. Peer Data Management Systems (PDMS) have been proposed to support sophisticated facilities for exchanging, querying and integrating (semi-)structured data hosted by peers. In this thesis, we are interested in routing and planning graph queries in a PDMS, where peers advertise their local bases using fragments of community RDF/S schemas (i.e., views). We introduce an original encoding for these fragments in order to efficiently check whether a peer view is subsumed by a query. We rely on this encoding to design an RDF/S view lookup service featuring a stateless and a stateful execution over a DHT-based P2P infrastructure. We design and implement a mechanism based on an interleaved execution of the routing and planning activities in order to distribute the processing of a query. Finally, we experimentally evaluate our system (a) to demonstrate its scalability for large P2P networks and arbitrary RDF/S schema fragments, (b) to estimate the number of routing hops required by the two versions of our lookup service and (c) to demonstrate the degree of distribution achieved by the interleaved query routing and planning. To the best of our knowledge, this is the first system offering the aforementioned functionality and performance.
Adaptive Compression for Fast Scans on String Columns
State-of-the-art OLAP systems tend to use columnar data representations,
as these are both suitable for analytics and amenable to compression.
Local dictionary value encoding has been shown to achieve high
compression rates for string columns while still allowing fast filtered
scans. In this paper, we argue that the effectiveness and efficiency of
local dictionary compression are limited by data repetition across file
blocks and by dictionary look-ups inside each block during filtered scan
execution. To address this problem, we introduce an adaptive compression
technique that is based on differential dictionaries and targets both
storage efficiency and query performance. The proposed scheme dramatically
reduces the need to store repeated values across different file
blocks and significantly accelerates read operations by reducing the
time needed for dictionary look-ups. A preliminary set of experiments
has given very promising results, showing that, in many cases, the
proposed new dictionary compression scheme is much more efficient than
existing techniques, occasionally by up to an order of magnitude.
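As a rough illustration of the baseline this abstract starts from, the sketch below shows plain per-block (local) dictionary encoding with a filtered scan. It is not the paper's differential-dictionary scheme; the function names `encode_block` and `scan_equals` and the sample data are assumptions made up for this example.

```python
from typing import List, Tuple

Block = Tuple[List[str], List[int]]  # (sorted dictionary, per-row integer codes)


def encode_block(values: List[str]) -> Block:
    """Dictionary-encode one file block of a string column (hypothetical helper)."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def scan_equals(block: Block, needle: str) -> List[int]:
    """Return the row positions inside the block whose value equals `needle`."""
    dictionary, codes = block
    # One dictionary look-up per block; the scan itself compares integers only.
    try:
        target = dictionary.index(needle)
    except ValueError:
        return []  # the value does not occur in this block
    return [row for row, code in enumerate(codes) if code == target]


# Two blocks of the same column; "de" and "us" end up in both dictionaries,
# which is the cross-block repetition the abstract refers to.
blocks = [encode_block(b) for b in (["de", "fr", "de", "us"], ["us", "de", "us", "us"])]
hits = [scan_equals(block, "de") for block in blocks]
```

Every block carries its own dictionary and needs its own look-up during a scan, which is where the abstract locates the storage and look-up overhead.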
Heuristics-based Query Optimisation for SPARQL
Query optimization in RDF stores is a challenging problem, as SPARQL queries typically contain many more joins than equivalent relational plans and hence lead to a large join order search space. In such cases, cost-based query optimization is often not possible. One practical reason for this is that statistics are typically missing in web-scale settings such as the Linked Open Datasets (LOD). The more profound reason is that, due to the absence of schematic structure in RDF, join-hit ratio estimation requires complicated forms of correlated join statistics, and currently there are no methods to identify the relevant correlations beforehand. For this reason, the use of good heuristics is essential in SPARQL query optimization, even when they are used only partially, alongside cost-based statistics (i.e., hybrid query optimization). In this paper we describe a set of useful heuristics for SPARQL query optimizers. We present these in the context of a new Heuristic SPARQL Planner (HSP) that is capable of exploiting the syntactic and the structural variations of the triple patterns in a SPARQL query in order to choose an execution plan without the need for any cost model. For this, we define the variable graph and we show a reduction of the SPARQL query optimization problem to the maximum weight independent set problem. We implemented our planner on top of the MonetDB open source column-store and evaluated its effectiveness against the state-of-the-art RDF-3X engine, as well as comparing the plan quality with a relational (SQL) equivalent of the benchmarks.
MonetDB/MonetDB Jul2017_release
This is the official mirror of the MonetDB Mercurial repository. Please note that we do not accept pull requests on GitHub. The regression test results can be found on the MonetDB Testweb: http://monetdb.cwi.nl/testweb/web/status.php. For contributions, please see: https://www.monetdb.org/Developer