
52 research outputs found

Low-latency XPath Query Evaluation on Multi-Core Processors

Author: Casanova Henri
Karsin Benjamin
Lim Lipyeow
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2017

XML and the XPath querying language have become ubiquitous data and querying standards used in many industrial settings and across the World-Wide Web. The high latency of XPath queries over large XML databases remains a problem for many applications. While this latency could be reduced by parallel execution, issues such as work partitioning, memory contention, and load imbalance may diminish the benefits of parallelization. We propose three parallel XPath query engines: Static Work Partitioning, Work Queue, and Producer-Consumer-Hybrid. All three engines attempt to solve the issue of load imbalance while minimizing sequential execution time and overhead. We analyze their performance on sets of synthetic and real-world datasets. Results obtained on two multi-core platforms show that while load-balancing is easily achieved for most synthetic datasets, real-world datasets prove more challenging. Nevertheless, our Producer-Consumer-Hybrid query engine achieves good results across the board (speedup up to 6.31 on an 8-core platform).

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Optimizing Sensor Data Acquisition for Energy-Efficient Smartphone-based Continuous Event Processing

Author: LIM Lipyeow
MISRA Archan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2011

Crossref

Institutional Knowledge at Singapore Management University

Cost-Optimal Execution of Boolean Query Trees with Shared Streams

Author: Casanova Henri
Lim Lipyeow
Robert Yves
Vivien Frédéric
Zaidouni Dounia
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/05/2014

The processing of queries expressed as trees of boolean operators applied to predicates on sensor data streams has several applications in mobile computing. Sensor data must be retrieved from the sensors, which incurs a cost, e.g., an energy expense that depletes the battery of a mobile query processing device. The objective is to determine the order in which predicates should be evaluated so as to shortcut part of the query evaluation and minimize the expected cost. This problem has been studied assuming that each data stream occurs at a single predicate. In this work we remove this assumption since it does not necessarily hold for real-world queries. Our main results are an optimal algorithm for single-level trees and a proof of NP-completeness for DNF trees. For DNF trees, however, we show that there is an optimal predicate evaluation order that corresponds to a depth-first traversal. This result provides inspiration for a class of heuristics. We show that one of these heuristics largely outperforms other sensible heuristics, including a heuristic proposed in previous work.
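The cost model in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it brute-forces every evaluation order of a single-level AND tree, and it assumes that predicates are independent, that each predicate reads one item from one stream, and that a stream's cost is paid only the first time it is read. The names `expected_cost`, `best_order`, and the example streams are hypothetical.

```python
from itertools import permutations

def expected_cost(order, stream_cost):
    """Expected cost of evaluating an AND of predicates in the given order.

    Each predicate is a (stream, p_true) pair. Evaluation stops at the first
    false predicate (short-circuit), and a shared stream is paid for only once.
    Assumption: predicate outcomes are independent.
    """
    fetched = set()
    reach_prob = 1.0  # probability that evaluation reaches the current predicate
    total = 0.0
    for stream, p_true in order:
        if stream not in fetched:
            total += reach_prob * stream_cost[stream]
            fetched.add(stream)
        reach_prob *= p_true
    return total

def best_order(predicates, stream_cost):
    """Exhaustive search over orders; only practical for a handful of predicates."""
    return min(permutations(predicates),
               key=lambda order: expected_cost(order, stream_cost))

# Hypothetical example: two predicates share the expensive "gps" stream.
stream_cost = {"gps": 10.0, "accel": 1.0}
predicates = [("gps", 0.5), ("accel", 0.2), ("gps", 0.9)]
best = best_order(predicates, stream_cost)
```

In this example the cheap, rarely true `accel` predicate is evaluated first, so the expensive shared stream is read only when the query can still succeed.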

HAL-ENS-LYON

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Cost-Optimal Execution of Trees of Boolean Operators with Shared Streams

Author: Casanova Henri
Lim Lipyeow
Robert Yves
Vivien Frédéric
Zaidouni Dounia
Publication venue: HAL CCSD
Publication date: 01/01/2013

The processing of queries expressed as trees of boolean operators applied to predicates on sensor data streams has several applications in mobile computing. Sensor data must be retrieved from the sensors to a query processing device, such as a smartphone, over one or more network interfaces. Retrieving a data item incurs a cost, e.g., an energy expense that depletes the smartphone's battery. Since the query tree contains boolean operators, part of the tree can be short-circuited depending on the retrieved sensor data. An interesting problem is to determine the order in which predicates should be evaluated so as to minimize the expected query processing cost. This problem has been studied in previous work assuming that each data stream occurs in a single predicate. In this work we remove this assumption since it does not necessarily hold for real-world queries. Our main results are an optimal algorithm for single-level trees and a proof of NP-completeness for DNF trees. For DNF trees, however, we show that there is an optimal predicate evaluation order that corresponds to a depth-first traversal. This result provides inspiration for a class of heuristics. We show that one of these heuristics largely outperforms other sensible heuristics, including the one heuristic proposed in previous work for our general version of the query processing problem.

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Real time business performance monitoring and analysis using metric network

Author: Hui Lei
Lipyeow Lim
Pu Huang
Publication venue
Publication date: 01/01/2006

Continuously monitoring and analyzing business performance is crucial for enterprises to achieve operational excellence and to better align daily operations with long-term business strategies. To do so, performance measures need to be collected from daily operations and aggregated to construct higher-level Key Performance Indicators (KPIs) in nearly real time. We propose a system called metric network for enterprise-wide business performance monitoring and analysis. A metric network consists of metrics, metric repositories, aggregation agents, and knowledge agents. We describe in detail the generic procedure patterns of these metric network entities and their communication pattern. Our loosely coupled design makes it easy to enhance features by adding more metrics and agents. The proposed approach is examined using real metrics on a fictitious scenario.

CiteSeerX

Efficient Update of Indexes for Dynamically Changing Web Documents

Author: Agarwal Ramesh
Lim Lipyeow
Padmanabhan Sriram
Vitter Jeffrey Scott
Wang Min
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/03/2011

The original publication is available at www.springerlink.com.

Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchronized collection is not immediately searchable, because the keyword index is rebuilt from scratch less frequently than the collection can be refreshed. An inverted index is usually used to index documents crawled from the web. Complete index rebuild at high frequency is expensive. Previous work on incremental inverted index updates has been restricted to adding and removing documents. Updating the inverted index for previously indexed documents that have changed has not been addressed. In this paper, we propose an efficient method to update the inverted index for previously indexed documents whose contents have changed. Our method uses the idea of landmarks together with the diff algorithm to significantly reduce the number of postings in the inverted index that need to be updated. Our experiments verify that our landmark-diff method results in significant savings in the number of update operations on the inverted index.
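A rough sketch of why landmark-relative positions can reduce posting updates follows. It is an illustration under assumptions, not the paper's data structures: the class `LandmarkDoc`, the fixed block size, and the restriction that insertions fall on a landmark boundary are all hypothetical, and the edit position is supplied by hand where a diff would normally locate it.

```python
class LandmarkDoc:
    """Toy positional index for one document.

    Each posting is (landmark_id, offset) rather than an absolute position,
    so shifting text only requires updating the small landmark table.
    """

    def __init__(self, words, block=4):
        # Absolute start position of each landmark (one per block of words).
        self.landmark_pos = list(range(0, len(words), block))
        self.postings = {}
        for i, word in enumerate(words):
            self.postings.setdefault(word, []).append((i // block, i % block))

    def absolute(self, posting):
        """Resolve a (landmark_id, offset) posting to an absolute position."""
        landmark_id, offset = posting
        return self.landmark_pos[landmark_id] + offset

    def insert_words(self, at, new_words):
        """Insert new_words at absolute position `at`.

        Assumption: `at` coincides with an existing landmark position, so no
        block has to be split. Returns the number of landmark entries shifted.
        """
        shifted = 0
        for landmark_id, pos in enumerate(self.landmark_pos):
            if pos >= at:
                self.landmark_pos[landmark_id] = pos + len(new_words)
                shifted += 1
        # The inserted words get their own new landmark.
        new_id = len(self.landmark_pos)
        self.landmark_pos.append(at)
        for offset, word in enumerate(new_words):
            self.postings.setdefault(word, []).append((new_id, offset))
        return shifted

doc = LandmarkDoc("a b c d e f g h".split())
shifted = doc.insert_words(4, ["x", "y"])
```

After the insertion, the postings for `e` through `h` are untouched; only one landmark entry moves, yet their resolved positions are shifted by two.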

KU ScholarWorks

The Case for Cloud-Enabled Mobile Sensing Services

Author: BALAN Rajesh Krishna
LIM Lipyeow
MISRA Archan
SEN Sougata
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

Singapore MOE Academic Research Fund Tier

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Energy-efficient collaborative query processing framework for mobile sensing services

Author: LIM Lipyeow
MISRA Archan
MO Tianli
SATTLER Kai Uwe
YANG Jin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2013

Ministry of Education, Singapore under its Academic Research Funding Tier

Crossref

Institutional Knowledge at Singapore Management University