
24,786 research outputs found

Answering Why-not Questions on Reverse Top-k Queries

Author: CHEN Gang
GAO Yunjun
LIU Qing
ZHENG Baihua
ZHOU Linlin
Publication venue: 'VLDB Endowment'
Publication date: 01/02/2015

Why-not questions, which seek clarification on tuples missing from query results, have recently received considerable attention from the database community. In this paper, we systematically explore why-not questions on reverse top-k queries, owing to their importance in multi-criteria decision making. Given an initial reverse top-k query and a missing (why-not) weighting vector set W_m that is absent from the query result, why-not questions on reverse top-k queries explain why W_m does not appear in the query result and suggest how to refine the initial query with minimum penalty so that W_m is included in the refined query result. We first formalize why-not questions on reverse top-k queries and reveal their semantics, and then propose a unified framework called WQRTQ to answer why-not questions on both monochromatic and bichromatic reverse top-k queries. Our framework offers three solutions, namely, (i) modifying a query point q, (ii) modifying a why-not weighting vector set W_m and a parameter k, and (iii) modifying q, W_m, and k simultaneously, to cater for different application scenarios. Extensive experimental evaluation using both real and synthetic data sets verifies the effectiveness and efficiency of the presented algorithms.
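To make the notion of a reverse top-k query in the abstract above concrete, here is a minimal brute-force sketch of the bichromatic case. It is not the WQRTQ framework. It assumes a linear scoring function in which smaller scores are better, and the names `score` and `bichromatic_reverse_top_k` as well as the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def score(w: Sequence[float], p: Sequence[float]) -> float:
    """Linear score of point p under weighting vector w (assumed: lower is better)."""
    return sum(wi * pi for wi, pi in zip(w, p))


def bichromatic_reverse_top_k(
    points: List[Point], weights: List[Point], q: Point, k: int
) -> List[Point]:
    """Return the weighting vectors under which q ranks among the top k.

    Brute force: for each w, count the data points that score strictly
    better than q; q is in the top k if fewer than k points beat it.
    """
    result = []
    for w in weights:
        q_score = score(w, q)
        better = sum(1 for p in points if score(w, p) < q_score)
        if better < k:
            result.append(w)
    return result


# Hypothetical data: three products, one query product, three user preferences.
points = [(1.0, 5.0), (4.0, 2.0), (3.0, 3.0)]
weights = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]
q = (2.0, 4.0)

with_k2 = bichromatic_reverse_top_k(points, weights, q, k=2)
with_k3 = bichromatic_reverse_top_k(points, weights, q, k=3)
```

In this toy data the vector (0.1, 0.9) is missing from the result for k = 2, which makes it a why-not vector; raising k to 3 brings it into the result, illustrating in the simplest way the idea of refining the parameter k.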

Crossref

Institutional Knowledge at Singapore Management University

Answering why-not and why questions on reverse top-k queries

Author: CHEN Gang
GAO Yunjun
LIU Qing
ZHENG Baihua
ZHOU Linlin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/09/2016

Crossref

Institutional Knowledge at Singapore Management University

SQL Query Completion for Data Exploration

Author: Guilly Marie Le
Petit Jean-Marc
Scuturici Vasile-Marian
Publication date: 07/02/2018

Within the big data tsunami, relational databases and SQL are still there and remain mandatory in most cases for accessing data. On the one hand, SQL is easy for non-specialists to use and makes it possible to identify pertinent initial data at the very beginning of the data exploration process. On the other hand, it is not always easy to formulate SQL queries: it is increasingly common to have several databases available for one application domain, some of them with hundreds of tables and/or attributes. Identifying the pertinent conditions to select the desired data, or even identifying relevant attributes, is far from trivial. To make it easier to write SQL queries, we propose the notion of SQL query completion: given a query, it suggests additional conditions to be added to its WHERE clause. This completion is semantic, as it relies on the data from the database, unlike current completion tools that are mostly syntactic. Since the process can be repeated over and over again, until the data analyst reaches her data of interest, SQL query completion facilitates the exploration of databases. SQL query completion has been implemented in a SQL editor on top of a database management system. For the evaluation, two questions need to be studied: first, does the completion speed up the writing of SQL queries? Second, is the completion easily adopted by users? To answer those questions, a thorough experiment has been conducted on 70 computer science students divided into two groups (one with the completion and the other without). The results are positive and very promising.

arXiv.org e-Print Archive

HAL

Hal-Diderot

ANSWERING WHY-NOT QUESTIONS ON REVERSE SKYLINE QUERIES OVER INCOMPLETE DATA

Author: Connery Tosca Yoel
Santoso Bagus Jati
Publication venue: 'Lembaga Penelitian dan Pengabdian kepada Masyarakat ITS'
Publication date: 12/03/2019

Recently, the development of preference-based queries has received considerable attention from researchers and data users. One of the most popular preference-based queries is the skyline query, which returns the subset of superior records that are not dominated by any other record. The reverse skyline query has emerged as an extension of the skyline query. It aims to find the query points whose skyline query results include a given record. Furthermore, data-oriented IT development requires scientists to be able to process data in all conditions. In the real world, multidimensional data are often incomplete, whether because of damage, loss, or privacy. To increase the usability of such data sets, this study discusses one of the problems in processing reverse skyline queries over incomplete data, namely the "why-not" problem. The solution considered for this "why-not" problem consists of advice and steps by which a query point that does not initially include an incomplete record in its result can later make that record part of the result. This study further discusses the dominance relationship between incomplete data, along with the solution to the problem. Moreover, performance evaluations are conducted to measure efficiency and effectiveness.
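The dominance relation behind the skyline query mentioned in the abstract above can be sketched as follows. This is a plain skyline over complete records only, not the reverse skyline or the incomplete-data handling studied in the paper. It assumes that smaller values are preferred in every dimension, and the names `dominates` and `skyline` and the sample records are hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumed: smaller values are better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(records: List[Record]) -> List[Record]:
    """Return the records that no other record dominates (brute force, O(n^2))."""
    return [
        r for r in records
        if not any(dominates(other, r) for other in records)
    ]


# Hypothetical two-dimensional records, e.g. (price, distance).
records = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
result = skyline(records)
```

Here (3.0, 3.0) is dominated by (2.0, 2.0) and therefore drops out, while the remaining three records are mutually incomparable and form the skyline.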

JUTI: Jurnal Ilmiah Teknologi Informasi

Towards Why-Not Spatial Keyword Top-k Queries: A Direction-Aware Approach

Author: Chen Lei
Jensen Christian S.
Li Yafei
Xu Jianliang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Crossref

VBN

YASK: A Why-Not Question Answering Engine for Spatial Keyword Query Services

Author: Chen Lei
Jensen Christian Søndergaard
Li Yafei
Xu Jianliang
Publication date: 01/09/2016

VBN

Crowdsourcing Multiple Choice Science Questions

Author: Gardner Matt
Liu Nelson F.
Welbl Johannes
Publication date: 01/01/2017

We present a novel method for obtaining high-quality, domain-targeted multiple choice questions from crowd workers. Generating these questions can be difficult without trading away originality, relevance or diversity in the answer options. Our method addresses these problems by leveraging a large corpus of domain-specific text and a small set of existing questions. It produces model suggestions for document selection and answer distractor choice which aid the human question generation process. With this method we have assembled SciQ, a dataset of 13.7K multiple choice science exam questions (Dataset available at http://allenai.org/data.html). We demonstrate that the method produces in-domain questions by providing an analysis of this new dataset and by showing that humans cannot distinguish the crowdsourced questions from original questions. When using SciQ as additional training data to existing questions, we observe accuracy improvements on real science exams. Comment: accepted for the Workshop on Noisy User-generated Text (W-NUT) 201

arXiv.org e-Print Archive

Crossref

SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting

Author: Cai Ling
Janowicz Krzysztof
Lao Ni
Mai Gengchen
Regalia Blake
Shi Meilin
Yan Bo
Zhu Rui
Publication venue: 'Wiley'
Publication date: 25/04/2020

Learning knowledge graph (KG) embeddings is an emerging technique for a variety of downstream tasks such as summarization, link prediction, information retrieval, and question answering. However, most existing KG embedding models neglect space and, therefore, do not perform well when applied to (geo)spatial data and tasks. Of the models that do consider space, most rely primarily on some notion of distance. These models suffer from higher computational complexity during training while still losing information beyond the relative distance between entities. In this work, we propose a location-aware KG embedding model called SE-KGE. It directly encodes spatial information such as point coordinates or bounding boxes of geographic entities into the KG embedding space. The resulting model is capable of handling different types of spatial reasoning. We also construct a geographic knowledge graph as well as a set of geographic query-answer pairs called DBGeo to evaluate the performance of SE-KGE in comparison to multiple baselines. Evaluation results show that SE-KGE outperforms these baselines on the DBGeo dataset on the geographic logic query answering task. This demonstrates the effectiveness of our spatially-explicit model and the importance of considering the scale of different geographic entities. Finally, we introduce a novel downstream task called spatial semantic lifting, which links an arbitrary location in the study area to entities in the KG via some relations. Evaluation on DBGeo shows that our model outperforms the baseline by a substantial margin. Comment: Accepted to Transactions in GI

arXiv.org e-Print Archive

Crossref

Explore Bristol Research