Efficient Methods for Aggregate Reverse Rank Queries

Author: CHEN Hanxiong
DONG Yuyang
FURUSE Kazutaka
KITAGAWA Hiroyuki
北川博之
古瀬一隆
陳漢雄
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 01/04/2018

Given two data sets of user preferences and product attributes, together with a set of query products, the aggregate reverse rank (ARR) query returns the top-k users who give the query products a better aggregate rank than other users do. ARR queries are designed to support product bundling in marketing. Manufacturers often bundle several products together to maximize benefits or to liquidate inventory. This naturally leads to an increase in data on users and products, so processing ARR queries efficiently becomes a significant problem. In this paper, we reveal two limitations of the state-of-the-art solution to the ARR query: (a) it is inefficient when the query set is widely dispersed, and (b) it has to process a large portion of the user data. To address these limitations, we develop a cluster-and-process method and a sophisticated indexing strategy. From a theoretical analysis and experimental comparisons, we conclude that our proposals have superior performance.
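To make the query definition concrete, here is a minimal brute-force sketch in Python. It is not the cluster-and-process method or the indexing strategy from the paper. It assumes a linear scoring function (weighted sum of attributes), that a smaller score is preferred, and that the aggregate rank is the sum of the individual ranks of the query products; the function names and the toy data are hypothetical.

```python
def score(weights, product):
    # Assumed linear scoring: weighted sum of product attributes.
    return sum(w * a for w, a in zip(weights, product))


def rank(weights, query, products):
    # Rank of `query` for one user: number of products the user scores
    # strictly better (assumption: smaller score is preferred).
    query_score = score(weights, query)
    return sum(1 for p in products if score(weights, p) < query_score)


def arr_query(users, products, query_set, k):
    # Aggregate rank (assumed here to be the sum of ranks over the bundle)
    # for every user, then keep the k users with the smallest value.
    totals = []
    for user_id, weights in users.items():
        total = sum(rank(weights, q, products) for q in query_set)
        totals.append((total, user_id))
    totals.sort()
    return [user_id for _, user_id in totals[:k]]


# Hypothetical toy data: two users, three products, a one-product bundle.
users = {"u1": (0.9, 0.1), "u2": (0.1, 0.9)}
products = [(1, 5), (3, 3), (5, 1)]
bundle = [(2, 4)]
best_users = arr_query(users, products, bundle, k=1)
```

This baseline scans every user against every product for every query product, which is the kind of cost that grows quickly with the size of the user and product data.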

Tsukuba Repository

Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

Author: Dillon Tharam S.
Feng Ling
Publication date: 01/01/2003

A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, whether a total sales amount of 1,000 items indicates a good or bad sales performance is still unclear. From the decision makers' point of view, the semantics that conveys the meaning of the data, rather than the raw numbers, is very important. In this paper, we explore the use of fuzzy technology to provide this semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses.
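As a rough illustration of how a numeric aggregate might be mapped to a linguistic term, the Python sketch below uses standard shoulder and triangular membership functions. It is not the three-layered model or the operators from the paper; the labels "low", "medium", "high" and the breakpoints 500, 1500 and 2500 are hypothetical values chosen only for the example.

```python
def left_shoulder(x, full, zero):
    # Membership 1.0 up to `full`, falling linearly to 0.0 at `zero`.
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def right_shoulder(x, zero, full):
    # Membership 0.0 up to `zero`, rising linearly to 1.0 at `full`.
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangular(x, left, peak, right):
    # Membership rises from `left` to 1.0 at `peak`, then falls to `right`.
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def describe_sales(total_items):
    # Degree to which a total sales amount belongs to each linguistic term
    # (breakpoints are illustrative assumptions, not taken from the paper).
    return {
        "low": left_shoulder(total_items, 500, 1500),
        "medium": triangular(total_items, 500, 1500, 2500),
        "high": right_shoulder(total_items, 1500, 2500),
    }


memberships = describe_sales(1000)
```

With these assumed breakpoints, a total of 1,000 items belongs partly to "low" and partly to "medium", which is the kind of qualitative reading a raw number alone does not give.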

CiteSeerX

University of Twente Research Information

Reverse k-Ranks Queries on Large Graphs

Author: Cheung DWL
Li H
Liu Y
Mamoulis N
Qian Y
Publication venue: Konstanz, Germany
Publication date: 01/01/2017

HKU Scholars Hub

Efficient All Top-k Computation - A Unified Solution for All Top-k, Reverse Top-k and Top-m Influential Queries

Author: Cheung DWL
Ge S
Mamoulis N
U LH
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

HKU Scholars Hub

SODA: Generating SQL for Business Users

Author: Blunschi Lukas
Jossen Claudio
Kossman Donald
Mori Magdalini
Stockinger Kurt
Publication date: 01/01/2012

The purpose of data warehouses is to enable business analysts to make better decisions. Over the years the technology has matured and data warehouses have become extremely successful. As a consequence, more and more data has been added to the data warehouses and their schemas have become increasingly complex. These systems still work well for generating pre-canned reports. However, with their current complexity, they tend to be a poor match for non-tech-savvy business analysts who need answers to ad-hoc queries that were not anticipated. This paper describes the design, implementation, and experience of the SODA system (Search over DAta Warehouse). SODA bridges the gap between the business needs of analysts and the technical complexity of current data warehouses. SODA enables a Google-like search experience for data warehouses by taking keyword queries of business users and automatically generating executable SQL. The key idea is to use a graph pattern matching algorithm that uses the metadata model of the data warehouse. Our results with real data from a global player in the financial services industry show that SODA produces queries with high precision and recall, and makes it much easier for business users to interactively explore highly complex data warehouses. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

Crossref

ZORA

Estimating Position Bias without Intrusive Interventions

Author: Agarwal Aman
Joachims Thorsten
Miroslav Dudík
O'Brien Maeve
Schnabel Tobias
Swaminathan Adith
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/12/2018

Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. While it was recently shown how counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias when observation propensities are known, it remains to show how to effectively estimate these propensities. In this paper, we propose the first method for producing consistent propensity estimates without manual relevance judgments, disruptive interventions, or restrictive relevance modeling assumptions. First, we show how to harvest a specific type of intervention data from historic feedback logs of multiple different ranking functions, and show that this data is sufficient for consistent propensity estimation in the position-based model. Second, we propose a new extremum estimator that makes effective use of this data. In an empirical evaluation, we find that the new estimator provides superior propensity estimates in two real-world systems: arXiv Full-text Search and Google Drive Search. Beyond these two points, we find that the method is robust to a wide range of settings in simulation studies.

arXiv.org e-Print Archive

Crossref

Visual exploration and retrieval of XML document collections with the generic system X2

Author: Felix Weigel
François Bry
Holger Meuss
Klaus U. Schulz
S Ceri
S Mizzaro
Simone Leonardi
T Catarci
T Schlieder
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2005

This article reports on the XML retrieval system X2, which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2, which distinguishes it from other visual query systems for XML, is that it supports various levels of detail in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.

Crossref

Open Access LMU