236 research outputs found

Efficient Computation of Group Skyline Queries on MapReduce

Author: Hsueh Sue-Chen
Lin Ming-Yen
Yang Chao-Wen
Publication venue: GSTF Journal on Computing (JoC)
Publication date: 10/08/2016

The skyline query is an important topic in database research and has been applied in diverse applications, including multi-criteria decision support systems. The response to a skyline query eliminates unnecessary tuples and returns only the results of interest to the user. A traditional skyline query picks out the outstanding tuples based on one-to-one record comparisons. Some modern applications ask, beyond single records, for superior combinations of records. For example, a fantasy basketball team is composed of 5 players, a fantasy baseball team of 9 players, and a hackathon team of several programmers. The group skyline considers all groups comprising several records and finds the non-dominated ones. Because of the high complexity, few studies have been conducted, and none has been presented in either distributed or parallel computing. This paper is the first study that solves the group skyline in the distributed MapReduce framework. We propose the MRGS algorithm to generate all the combinations, compute the winners at each local node, and find the answer globally. We further propose the MRIGS algorithm to relieve the bottleneck of MRGS caused by the unbalanced computing load of nodes. Finally, we propose the MRIGS-P algorithm to prune the impossible combinations and produce indexed and balanced MapReduce computation. Extensive experiments with NBA datasets show that MRIGS-P is 6 times faster than the MRGS algorithm.
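To make the group skyline idea concrete, here is a minimal single-machine sketch. It is not the MRGS, MRIGS, or MRIGS-P algorithm from the paper: it simply enumerates every k-record combination and keeps the non-dominated ones. Two assumptions are made for illustration only: a group is summarized by the per-attribute sum of its records, and larger attribute values are better. The names `group_skyline` and `players` are hypothetical.

```python
from itertools import combinations


def dominates(a, b):
    """True if vector a is at least as good as b in every attribute
    and strictly better in at least one (assumption: larger is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def group_skyline(records, k):
    """Brute-force group skyline over all k-record combinations.

    Assumption: a group's aggregate vector is the per-attribute sum of its
    members; the abstract does not say which aggregate the paper uses.
    """
    groups = [(g, tuple(map(sum, zip(*g)))) for g in combinations(records, k)]
    return [g for g, v in groups if not any(dominates(w, v) for _, w in groups)]


# Hypothetical two-attribute records, e.g. (points, rebounds) per player.
players = [(10, 2), (8, 5), (3, 9), (4, 4)]
for group in group_skyline(players, 2):
    print(group)
```

The number of combinations grows as C(n, k), which is the complexity the abstract refers to and the reason pruning and load balancing matter in a distributed setting.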

GSTF Digital Library (GSTF-DL): Open Journal Systems (Global Science and Technology Forum)

Privacy Aware Parallel Computation of Skyline Sets Queries from Distributed Databases

Author: Arefin Mohammad Shamsul
Morimoto Yasuhiko
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 10/02/2015

A skyline query finds objects that are not dominated by another object from a given set of objects. Skyline queries help us to filter unnecessary information efficiently and provide us clues for various decision-making tasks. However, we cannot use skyline queries in a privacy-aware environment, since we have to hide individuals' record values even though there is no ID information. Therefore, we considered skyline sets queries. The skyline set query returns skyline sets from all possible sets, each of which is composed of some objects in a database. With the growth of network infrastructure, data are stored in distributed databases. In this paper, we expand the idea to compute skyline sets queries in a parallel fashion from distributed databases without disclosing individual records to others. The proposed method utilizes an agent-based parallel computing framework that can efficiently compute skyline sets queries and can solve the privacy problems of skyline queries in a distributed environment. The computation of skyline sets is performed simultaneously in all databases, which increases parallelism and reduces the computation time.
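The dominance test in the first sentence of this abstract can be illustrated with a naive quadratic filter. This is a sketch of the plain skyline definition only, not the agent-based or privacy-preserving method the paper proposes. It assumes that larger values are better in every attribute; `skyline` and `hotels` are illustrative names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse in any attribute, strictly better in one
    (assumption: larger values are better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objects scored on two attributes.
hotels = [(3, 9), (5, 5), (9, 2), (4, 4), (2, 8)]
print(skyline(hotels))  # [(3, 9), (5, 5), (9, 2)]
```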

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

RRR: Rank-Regret Representative

Author: Asudeh Abolfazl
Das Gautam
Jagadish H. V.
Nazi Azade
Zhang Nan
Publication date: 01/01/2018

Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove "dominated" items and create a "representative" subset of the data set, comprising the "best items" in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be almost as big as the full data. A smaller representative can be found if we relax the requirement to include the best item for every possible user, and instead just limit the users' "regret". Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full data set, for any chosen ranking function. However, the score is often not a meaningful number and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the data set. In contrast, users do understand the notion of rank ordering. Therefore, alternatively, we consider the position of the items in the ranked list for defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function. This problem is NP-complete. We use the geometric interpretation of items to bound their ranks on ranges of functions and to utilize combinatorial geometry notions for developing effective and efficient approximation algorithms for the problem. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
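The notion of rank-regret can be made concrete with a small sketch. For one ranking function, the rank-regret of a subset is the best (smallest) rank that any of its items achieves in the full ranked list; the subset's overall rank-regret is the worst case over ranking functions. The code below assumes linear ranking functions and evaluates only a finite, hand-picked sample of weight vectors, so it yields a lower bound on the true value, which ranges over all possible functions. It is not the approximation algorithm from the paper, and all names are illustrative.

```python
def rank_regret(items, subset, weight_vectors):
    """Worst-case (over the sampled weight vectors) best rank of any subset item.

    Assumptions: ranking functions are linear (score = w . item), higher
    scores rank first, and every subset item also appears in `items`.
    """
    worst = 0
    for w in weight_vectors:
        def score(t, w=w):
            return sum(wi * ti for wi, ti in zip(w, t))

        ranked = sorted(items, key=score, reverse=True)
        best_rank = min(ranked.index(s) + 1 for s in subset)  # ranks are 1-based
        worst = max(worst, best_rank)
    return worst


# Hypothetical data: two extreme items, one balanced item, one weak item.
items = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7), (0.4, 0.4)]
weights = [(1, 0), (0, 1), (0.5, 0.5)]

print(rank_regret(items, [(0.7, 0.7)], weights))  # 2
print(rank_regret(items, [(1.0, 0.0)], weights))  # 4
```

In this toy data the single balanced item is never ranked worse than second under the sampled functions, so it is a size-one representative for k = 2, whereas either extreme item alone can fall to the bottom of the list.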

arXiv.org e-Print Archive

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Dynamic Skyline Computation with the Skyline Breaker Algorithm

Author: Köppl Dominik
Publication venue: Local Proceedings of the Workshop on Massive Data Algorithmics (MASSIVE), Wrocław, 2014

Given a sequential data input, we tackle parallel dynamic skyline computation of the read data by means of a spatial tree structure for indexing fine-grained feature vectors. For this purpose, we modified the Skyline Breaker algorithm that solves skyline computation with multiple local split decision trees concurrently. With this approach, we propose an algorithm for dynamic skyline computation that inherits the robustness against the dimension curse and different data distributions

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Efficient Algorithms for k-Regret Minimizing Sets

Author: Agarwal Pankaj K.
Kumar Nirman
Sintos Stavros
Suri Subhash
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 16th International Symposium on Experimental Algorithms (SEA 2017)
Publication date: 01/01/2017

A regret minimizing set Q is a small size representation of a much larger database P so that user queries executed on Q return answers whose scores are not much worse than those on the full dataset. In particular, a k-regret minimizing set has the property that the regret ratio between the score of the top-1 item in Q and the score of the top-k item in P is minimized, where the score of an item is the inner product of the item's attributes with a user's weight (preference) vector. The problem is challenging because we want to find a single representative set Q whose regret ratio is small with respect to all possible user weight vectors. We show that k-regret minimization is NP-Complete for all dimensions d ≥ 3, settling an open problem from Chester et al. [VLDB 2014]. Our main algorithmic contributions are two approximation algorithms, both with provable guarantees, one based on coresets and another based on hitting sets. We perform extensive experimental evaluation of our algorithms, using both real-world and synthetic data, and compare their performance against the solution proposed in [VLDB 14]. The results show that our algorithms are significantly faster and scalable to much larger sets than the greedy algorithm of Chester et al. for comparable quality answers.
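The regret ratio described here can be sketched for a single weight vector. The formula used below, max(0, (s_k(P) - s_1(Q)) / s_k(P)), where s_k(P) is the k-th best score in P and s_1(Q) is the best score in Q, is an assumed formalization of the abstract's wording, and it presumes positive scores. The code only evaluates a given Q on a sample of weight vectors; it is not the coreset or hitting-set algorithm from the paper, and all names are hypothetical.

```python
def k_regret_ratio(P, Q, w, k):
    """Regret ratio of subset Q against database P for one weight vector w.

    Assumed formula: max(0, (kth_best_in_P - best_in_Q) / kth_best_in_P),
    with score = inner product of w and the item, and kth_best_in_P > 0.
    """
    def score(t):
        return sum(wi * ti for wi, ti in zip(w, t))

    kth_best = sorted((score(p) for p in P), reverse=True)[k - 1]
    best_in_q = max(score(q) for q in Q)
    return max(0.0, (kth_best - best_in_q) / kth_best)


def max_k_regret_ratio(P, Q, weight_vectors, k):
    """Worst regret ratio over a finite sample of weight vectors.

    The true objective ranges over all weight vectors; a sample can only
    underestimate it.
    """
    return max(k_regret_ratio(P, Q, w, k) for w in weight_vectors)


# Hypothetical two-attribute database and a one-item representative.
P = [(10, 0), (0, 10), (6, 6), (2, 2)]
Q = [(6, 6)]
sample = [(1, 0), (0, 1), (1, 1)]

print(max_k_regret_ratio(P, Q, sample, k=1))
print(max_k_regret_ratio(P, Q, sample, k=2))
```

Raising k from 1 to 2 relaxes the target from the top item to the second-best item, which is why the same Q can have a positive regret ratio for k = 1 and zero for k = 2.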

arXiv.org e-Print Archive

University of Memphis Digital Commons

Dagstuhl Research Online Publication Server

Efficient All Top-k Computation - A Unified Solution for All Top-k, Reverse Top-k and Top-m Influential Queries

Author: Cheung DWL
Ge S
Mamoulis N
U LH
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

HKU Scholars Hub

ParetoPrep: Fast computation of Path Skylines Queries

Author: Jossé Gregor
Schubert Matthias
Shekelyan Michael
Publication date: 01/10/2014

Computing cost-optimal paths in network data is a very important task in many application areas like transportation networks, computer networks or social graphs. In many cases, the cost of an edge can be described by various cost criteria. For example, in a road network possible cost criteria are distance, time, ascent, energy consumption or toll fees. In such a multicriteria network, a route or path skyline query computes the set of all paths having Pareto-optimal costs, i.e. each result path is optimal for different user preferences. In this paper, we propose a new method for computing route skylines which significantly decreases processing time and memory consumption. Furthermore, our method does not rely on any precomputation or indexing method and thus, it is suitable for dynamically changing edge costs. Our experiments demonstrate that our method outperforms state-of-the-art approaches and allows highly efficient path skyline computation without any preprocessing. Comment: 12 pages, 9 figures, technical repor

arXiv.org e-Print Archive

CiteSeerX

Spatial skyline query problem in Euclidean and road-network spaces

Author: Mao Ruijia
Publication date: 28/08/2020

With the growth of data-intensive applications, along with the increase of both size and dimensionality of data, queries with advanced semantics have recently drawn researchers’ attention. The skyline query problem is one of them; it produces optimal results based on user preferences. In this thesis, we study the problem of spatial skyline query in the Euclidean and road network spaces. For a given data set P, we are required to compute the spatial skyline points of P with respect to an arbitrary query set Q. A point p ∈ P is a spatial skyline point if and only if, for any other data point r ∈ P, p is closer to at least one query point q ∈ Q as compared to r and has in the best case the same distance as r to the rest of the query points. We propose several efficient algorithms that outperform the existing algorithms.
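A brute-force sketch of the spatial skyline in Euclidean space follows. It uses the usual dominance formulation, which is assumed here to match the definition in the abstract: r spatially dominates p when r is no farther than p from every query point and strictly closer to at least one, and the spatial skyline consists of the points no other point dominates. The road-network case and the thesis's efficient algorithms are not covered, and all names are illustrative.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def spatially_dominates(r, p, Q):
    """r dominates p w.r.t. query points Q: never farther, strictly closer once."""
    dr = [dist(r, q) for q in Q]
    dp = [dist(p, q) for q in Q]
    return all(a <= b for a, b in zip(dr, dp)) and any(a < b for a, b in zip(dr, dp))


def spatial_skyline(P, Q):
    """Points of P that no other point of P spatially dominates (O(|P|^2 |Q|))."""
    return [p for p in P if not any(spatially_dominates(r, p, Q) for r in P)]


# Hypothetical data points and two query points on the x-axis.
Q = [(0, 0), (4, 0)]
P = [(1, 0), (3, 0), (2, 3), (2, 1), (10, 10)]
print(spatial_skyline(P, Q))  # [(1, 0), (3, 0), (2, 1)]
```

Here (2, 3) is dropped because (2, 1) is closer to both query points, while (2, 1) survives because each of (1, 0) and (3, 0) is farther than it from one of the two query points.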

Simon Fraser University Institutional Repository