Low-latency XPath Query Evaluation on Multi-Core Processors
XML and the XPath querying language have become ubiquitous data and querying standards used in many industrial settings and across the World-Wide Web. The high latency of XPath queries over large XML databases remains a problem for many applications. While this latency could be reduced by parallel execution, issues such as work partitioning, memory contention, and load imbalance may diminish the benefits of parallelization. We propose three parallel XPath query engines: Static Work Partitioning, Work Queue, and Producer-Consumer-Hybrid. All three engines attempt to solve the issue of load imbalance while minimizing sequential execution time and overhead. We analyze their performance on sets of synthetic and real-world datasets. Results obtained on two multi-core platforms show that while load-balancing is easily achieved for most synthetic datasets, real-world datasets prove more challenging. Nevertheless, our Producer-Consumer-Hybrid query engine achieves good results across the board (speedup up to 6.31 on an 8-core platform).
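The work-queue idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' engine: it assumes the query is a simple descendant search for one tag, treats each top-level subtree as one task, and uses Python's `ThreadPoolExecutor` as the shared queue. The names `match_subtree` and `parallel_descendants` are hypothetical, and Python threads will not show the multi-core speedups reported above.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def match_subtree(subtree, tag):
    """Sequentially collect every element named `tag` in one subtree (including its root)."""
    return list(subtree.iter(tag))


def parallel_descendants(root, tag, workers=4):
    """Evaluate the equivalent of //tag by handing subtrees to a pool of workers.

    Each child of the document root is one task. Idle workers pull the next
    task from the executor's internal queue, so a large subtree does not
    block the others (a crude form of dynamic load balancing).
    """
    tasks = list(root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in task order, which preserves document order.
        parts = pool.map(lambda t: match_subtree(t, tag), tasks)
        hits = [elem for part in parts for elem in part]
    if root.tag == tag:
        hits.insert(0, root)
    return hits


DOC = ET.fromstring("<r><a><item/><b><item/></b></a><item/><c/></r>")
RESULT = parallel_descendants(DOC, "item")
```

A static partitioning scheme would instead assign subtrees to workers up front, which is cheaper to set up but more exposed to load imbalance when subtree sizes vary.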
Optimizing Sensor Data Acquisition for Energy-Efficient Smartphone-based Continuous Event Processing
Cost-Optimal Execution of Boolean Query Trees with Shared Streams
The processing of queries expressed as trees of boolean operators applied to predicates on sensor data streams has several applications in mobile computing. Sensor data must be retrieved from the sensors, which incurs a cost, e.g., an energy expense that depletes the battery of a mobile query processing device. The objective is to determine the order in which predicates should be evaluated so as to shortcut part of the query evaluation and minimize the expected cost. This problem has been studied assuming that each data stream occurs at a single predicate. In this work we remove this assumption since it does not necessarily hold for real-world queries. Our main results are an optimal algorithm for single-level trees and a proof of NP-completeness for DNF trees. For DNF trees, however, we show that there is an optimal predicate evaluation order that corresponds to a depth-first traversal. This result provides inspiration for a class of heuristics. We show that one of these heuristics largely outperforms other sensible heuristics, including a heuristic proposed in previous work.
Cost-Optimal Execution of Trees of Boolean Operators with Shared Streams
The processing of queries expressed as trees of boolean operators applied to predicates on sensor data streams has several applications in mobile computing. Sensor data must be retrieved from the sensors to a query processing device, such as a smartphone, over one or more network interfaces. Retrieving a data item incurs a cost, e.g., an energy expense that depletes the smartphone's battery. Since the query tree contains boolean operators, part of the tree can be short-circuited depending on the retrieved sensor data. An interesting problem is to determine the order in which predicates should be evaluated so as to minimize the expected query processing cost. This problem has been studied in previous work assuming that each data stream occurs in a single predicate. In this work we remove this assumption since it does not necessarily hold for real-world queries. Our main results are an optimal algorithm for single-level trees and a proof of NP-completeness for DNF trees. For DNF trees, however, we show that there is an optimal predicate evaluation order that corresponds to a depth-first traversal. This result provides inspiration for a class of heuristics. We show that one of these heuristics largely outperforms other sensible heuristics, including the one heuristic proposed in previous work for our general version of the query processing problem.
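To make the ordering problem concrete, here is a small sketch for a single-level AND tree with shared streams. It is not the optimal algorithm from the paper: it simply enumerates all orders, which is only feasible for a handful of predicates. The cost model is an assumption made for illustration: each stream is paid for once, the first time a predicate needs it, and predicates are treated as independent with known probabilities of being true. All names and numbers (`PREDS`, `STREAM_COST`, `expected_cost`, `best_order`) are hypothetical.

```python
from itertools import permutations

# Hypothetical acquisition cost (e.g. energy units) per sensor stream.
STREAM_COST = {"A": 4.0, "B": 1.0, "C": 2.0}

# Hypothetical predicates: name -> (streams it reads, probability it is true).
# Streams may be shared: p1 and p2 both read stream A.
PREDS = {
    "p1": ({"A"}, 0.9),
    "p2": ({"A", "B"}, 0.5),
    "p3": ({"C"}, 0.2),
}


def expected_cost(order, preds, stream_cost):
    """Expected cost of evaluating an AND of predicates in the given order.

    Evaluation stops at the first false predicate (short-circuit), and a
    stream that was already acquired for an earlier predicate is free.
    """
    acquired = set()
    reach_prob = 1.0  # probability that evaluation reaches this predicate
    total = 0.0
    for name in order:
        streams, p_true = preds[name]
        new_streams = streams - acquired
        total += reach_prob * sum(stream_cost[s] for s in new_streams)
        acquired |= new_streams
        reach_prob *= p_true  # we continue only if this predicate is true
    return total


def best_order(preds, stream_cost):
    """Brute-force search over all evaluation orders (exponential in size)."""
    return min(permutations(preds),
               key=lambda order: expected_cost(order, preds, stream_cost))


BEST = best_order(PREDS, STREAM_COST)
BEST_COST = expected_cost(BEST, PREDS, STREAM_COST)
```

In this toy instance the cheap, rarely true predicate `p3` goes first, and `p1` precedes `p2` because once stream A has been acquired for `p1`, `p2` only needs to pay for stream B.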
Real time business performance monitoring and analysis using metric network
Monitoring and analyzing business performance in a continuous manner is nowadays crucial for enterprises to achieve operational excellence and to better align daily operations with long-term business strategies. To do so, performance measures need to be collected from daily operations and aggregated to construct higher-level Key Performance Indicators (KPIs) in nearly real time. We propose a system called metric network for enterprise-wide business performance monitoring and analysis. A metric network consists of metrics, metric repositories, aggregation agents, and knowledge agents. We describe in detail the generic procedure patterns of these metric network entities and their communication pattern. Our loosely coupled design makes it easy to enhance features by adding more metrics and agents. The proposed approach is examined using real metrics on a fictitious scenario.
Efficient Update of Indexes for Dynamically Changing Web Documents
The original publication is available at www.springerlink.com
Recent work on incremental crawling has enabled the indexed document collection of a
search engine to be more synchronized with the changing World Wide Web. However, this
synchronized collection is not immediately searchable, because the keyword index is rebuilt
from scratch less frequently than the collection can be refreshed. An inverted index is usually
used to index documents crawled from the web. Complete index rebuild at high frequency is
expensive. Previous work on incremental inverted index updates has been restricted to adding
and removing documents. Updating the inverted index for previously indexed documents that
have changed has not been addressed.
In this paper, we propose an efficient method to update the inverted index for previously
indexed documents whose contents have changed. Our method uses the idea of landmarks
together with the diff algorithm to significantly reduce the number of postings in the inverted
index that need to be updated. Our experiments verify that our landmark-diff method results
in significant savings in the number of update operations on the inverted index.
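The abstract does not spell out how landmarks work, so the following is only one plausible reading, sketched under stated assumptions: a document is cut into blocks, each block starts at a landmark, and a posting stores a word's position as (landmark id, offset within the block) rather than as an absolute position. A small side directory maps landmarks to absolute positions. The diff step that locates the edit is omitted here; the edit is applied directly. All function names are hypothetical.

```python
def build_blocks(words, block_size):
    """Cut a document into consecutive blocks; each block starts at a landmark."""
    return [words[i:i + block_size] for i in range(0, len(words), block_size)]


def absolute_postings(blocks):
    """Conventional postings: (word, absolute position in the document)."""
    words = [w for block in blocks for w in block]
    return {(w, pos) for pos, w in enumerate(words)}


def landmark_postings(blocks):
    """Landmark-relative postings: (word, landmark id, offset inside the block)."""
    return {(w, lid, off)
            for lid, block in enumerate(blocks)
            for off, w in enumerate(block)}


def landmark_directory(blocks):
    """Absolute start position of each landmark, kept outside the index."""
    starts, pos = [], 0
    for block in blocks:
        starts.append(pos)
        pos += len(block)
    return starts


OLD = build_blocks("a b c d e f g h i".split(), block_size=3)

# Edit: insert the word "x" at the very beginning of the document.
NEW = [["x"] + OLD[0]] + OLD[1:]

# Postings that must be removed or added after the edit, under each scheme.
ABS_UPDATES = len(absolute_postings(OLD) ^ absolute_postings(NEW))
LM_UPDATES = len(landmark_postings(OLD) ^ landmark_postings(NEW))
```

With absolute positions every posting after the insertion point shifts, whereas with landmark-relative positions only the postings of the edited block change; the later blocks are accounted for by updating the much smaller landmark directory.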
The Case for Cloud-Enabled Mobile Sensing Services
Singapore MOE Academic Research Fund Tier
Energy-efficient collaborative query processing framework for mobile sensing services
Ministry of Education, Singapore under its Academic Research Funding Tier