3 research outputs found

FleXPath: flexible structure and full-text querying for XML

Authors: Amer-Yahia Sihem; Lakshmanan Laks VS; Pandit Shashank
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2004

Querying XML data is a well-explored topic with powerful database-style query languages such as XPath and XQuery set to become W3C standards. An equally compelling paradigm for querying XML documents is full-text search on textual content. In this paper, we study fundamental challenges that arise when we try to integrate these two querying paradigms. While keyword search is based on approximate matching, XPath has exact match semantics. We address this mismatch by considering queries on structure as a "template", and looking for answers that best match this template and the full-text search. To achieve this, we provide an elegant definition of relaxation on structure and define primitive operators to span the space of relaxations. Query answering is now based on ranking potential answers on structural and full-text search conditions. We set out certain desirable principles for ranking schemes and propose natural ranking schemes that adhere to these principles. We develop efficient algorithms for answering top-K queries and discuss results from a comprehensive set of experiments that demonstrate the utility and scalability of the proposed framework and algorithms.

Dspace at IIT Bombay

On Efficient Approximate Queries over Machine Learning Models

Authors: Amer-Yahia Sihem; Ding Dujian; Lakshmanan Laks VS
Publication date: 13/06/2022

The question of answering queries over ML predictions has been gaining attention in the database community. This question is challenging because the cost of finding high quality answers corresponds to invoking an oracle such as a human expert or an expensive deep neural network model on every single item in the DB and then applying the query. We develop a novel unified framework for approximate query answering by leveraging a proxy to minimize the oracle usage of finding high quality answers for both Precision-Target (PT) and Recall-Target (RT) queries. Our framework uses a judicious combination of invoking the expensive oracle on data samples and applying the cheap proxy on the objects in the DB. It relies on two assumptions. Under the Proxy Quality assumption, proxy quality can be quantified in a probabilistic manner w.r.t. the oracle. This allows us to develop two algorithms: PQA that efficiently finds high quality answers with high probability and no oracle calls, and PQE, a heuristic extension that achieves empirically good performance with a small number of oracle calls. Alternatively, under the Core Set Closure assumption, we develop two algorithms: CSC that efficiently returns high quality answers with high probability and minimal oracle usage, and CSE, which extends it to more general settings. Our extensive experiments on five real-world datasets on both query types, PT and RT, demonstrate that our algorithms outperform the state-of-the-art and achieve high result quality with provable statistical guarantees.

Comment: Submitted to VLDB 2023, 16 pages, 10 figures; added formal claims for section

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

Timber: A native XML database

Authors: Jagadish Hosagrahar V; Al-Khalifa Shurug; Chapman Adriane; Lakshmanan Laks VS; Nierman Andrew; Paparizos Stelios; Patel Jignesh M; Srivastava Divesh; Wiwatwattana Nuwee; Wu Yuqing
Publication date: 01/01/2002

Southampton (e-Prints Soton)

Biblioteca Digital de la Comunidad de Madrid