5 research outputs found
A Rule-based Skyline Computation over a dynamic database
A skyline query, which relies on the notion of Pareto dominance, filters the data items of a database so that only those items that are not dominated by any other item are selected as skylines. However, databases are dynamic: their states and/or structures change throughout their lifetime to incorporate the latest information of database applications, and each change requires a new set of skylines to be derived. Blindly recomputing skylines on the new state/structure of a database is inefficient, as not all data items are affected by the changes. Hence, this paper proposes a rule-based approach to this issue, with the main aim of avoiding unnecessary skyline computations. A set of rules is defined based on the type of operation that changes the state/structure of a database, i.e. inserting, deleting or updating a data item(s), or adding or removing a dimension(s). In addition, the prominent dominance relationships identified when pairwise comparisons are performed are retained and then utilised in computing the new set of skylines. Several analyses have been conducted to evaluate the performance and prove the efficiency of the proposed solution.
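As an illustration of the Pareto dominance test that these abstracts build on, the following is a minimal sketch in Python. It assumes complete data and that smaller values are preferred in every dimension; the function names `dominates` and `skyline` and the sample `hotels` data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly better in one.

    Assumption: smaller values are preferred in all dimensions.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(items: List[Sequence[float]]) -> List[Sequence[float]]:
    """Naive skyline: keep every item that no other item dominates."""
    return [p for p in items if not any(dominates(q, p) for q in items)]


# Hypothetical data: (price, distance to the beach), both to be minimised.
hotels = [(50, 8), (60, 2), (80, 1), (70, 9), (90, 3)]
print(skyline(hotels))
```

This naive version compares every pair of items, which is exactly the cost that a rule-based or incremental approach tries to avoid after each change to the database.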
Dominance relationship-based skyline query framework over dynamic and incomplete database
Skyline queries, which rely on the notion of Pareto dominance, filter the data items of a database by keeping only those that are the best or most preferred, known as skylines, to meet the user's preferences. Skyline queries have been studied extensively and a significant number of skyline algorithms have been proposed, most of which attempt to resolve the optimisation problem of reducing the processing time of skyline computations. Today, the presence of incomplete data in a database is inevitable. In such a situation, skyline algorithms have to deal with several issues besides the optimisation problem. Missing values adversely affect the number of pairwise comparisons that need to be performed between the data items. Moreover, the transitivity property of skylines no longer holds. Cyclic dominance is another issue that needs to be tackled, as it yields empty skyline results. Furthermore, databases are dynamic in nature: their states change over time. These changes are necessary because databases must reflect the latest information of the applications, and they are normally achieved through data manipulation operations and data definition operations. The skylines derived before changes are made to the initial database are no longer valid in the new state of the database. Utilising the existing skyline algorithms would require rerunning them on the new state of the database. However, computing the skylines over the entire database after changes are made is inefficient, as not all data items are affected by the changes.
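The loss of transitivity and the cyclic dominance mentioned above can be reproduced with a small sketch. It assumes a dominance definition that is common for incomplete data, in which two items are compared only on the dimensions where both have a value, and that smaller values are preferred; the thesis may define dominance differently. The name `dominates_incomplete` and the three sample items are hypothetical.

```python
from typing import List, Optional, Sequence

Item = Sequence[Optional[float]]  # None marks a missing value


def dominates_incomplete(a: Item, b: Item) -> bool:
    """Compare a and b only on dimensions where both values are known.

    Assumption: smaller values are preferred.
    """
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return (
        bool(common)
        and all(x <= y for x, y in common)
        and any(x < y for x, y in common)
    )


a = (3, 2, None)
b = (None, 3, 1)
c = (2, None, 2)

# a dominates b (dimension 2), b dominates c (dimension 3),
# yet c dominates a (dimension 1): the relation is cyclic, not transitive.
items: List[Item] = [a, b, c]
naive_skyline = [
    p for p in items if not any(dominates_incomplete(q, p) for q in items)
]
print(naive_skyline)
```

Because every item is dominated by some other item, a naive "not dominated by anyone" filter returns nothing for this data, which is the empty-result problem the abstract refers to.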
To tackle the issues stated above, we propose a solution named DyIn-Skyline, which consists of three main phases: Phase I, processing skyline queries over the initial incomplete database; Phase II, processing skyline queries over a dynamic and incomplete database in which the state changes due to a data manipulation operation(s) (insert, delete or update a data item(s)); and Phase III, processing skyline queries over a dynamic and incomplete database in which the state changes due to a data definition operation(s) (add or remove a dimension(s)). A framework is proposed for each phase. The Phase I framework consists of three main components: Data Grouping Builder (DGB), Bucket Skyline Identifier (BSI), and Final Skyline Identifier (FSI). We have also introduced and designed three lists, namely Bucket Dominating (BDG), Bucket Dominated (BDD), and Domination History (DH), which keep track of the dominating data items, the dominated data items, and the dominance relationships, respectively; this information is utilised by Phase II and Phase III of the DyIn-Skyline solution. The Phase II framework consists of three components: Skyline-Insert Identifier (S-II), which derives a set of skylines after a data item(s) is inserted into a database; Skyline-Delete Identifier (S-DI), which derives a set of skylines after an existing data item(s) is deleted from a database; and Skyline-Update Identifier (S-UI), which derives a set of skylines after an existing data item(s) of a database is updated. The Phase III framework consists of two components: Skyline-Add Dimension Analyser (S-ADA), which derives a set of skylines after a new dimension(s) is added to a database, and Skyline-Remove Dimension Analyser (S-RDA), which derives a set of skylines after an existing dimension(s) is removed from a database.
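The Phase I pipeline described above (group the data, find skylines per bucket, then find the final skylines while recording dominance relationships) can be sketched roughly as follows. This is not the DyIn-Skyline implementation. It assumes that items are grouped by which dimensions are missing, that dominance is checked only on dimensions known in both items, and that smaller values are preferred; all function names and the sample data are hypothetical, and a single `history` list stands in for the three lists named in the abstract.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Item = Tuple[Optional[float], ...]  # None marks a missing value


def dominates(a: Item, b: Item) -> bool:
    """Dominance on commonly known dimensions (assumption: smaller is better)."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return (
        bool(common)
        and all(x <= y for x, y in common)
        and any(x < y for x, y in common)
    )


def group_by_missing_pattern(items: List[Item]) -> Dict[Tuple[bool, ...], List[Item]]:
    """Assumed grouping step: one bucket per pattern of known/missing dimensions."""
    buckets: Dict[Tuple[bool, ...], List[Item]] = defaultdict(list)
    for item in items:
        buckets[tuple(v is not None for v in item)].append(item)
    return dict(buckets)


def skyline_with_history(items: List[Item]) -> Tuple[List[Item], List[Tuple[Item, Item]]]:
    """Return (final skyline, recorded (dominator, dominated) pairs)."""
    history: List[Tuple[Item, Item]] = []
    candidates: List[Item] = []
    # Bucket step: items in one bucket share the same known dimensions.
    for bucket in group_by_missing_pattern(items).values():
        for p in bucket:
            dominator = next((q for q in bucket if dominates(q, p)), None)
            if dominator is None:
                candidates.append(p)
            else:
                history.append((dominator, p))
    # Final step: a candidate must be checked against all items, because
    # dominance across buckets is not transitive.
    final: List[Item] = []
    for p in candidates:
        dominator = next((q for q in items if dominates(q, p)), None)
        if dominator is None:
            final.append(p)
        else:
            history.append((dominator, p))
    return final, history


data: List[Item] = [(1, 2, None), (2, 3, None), (None, 1, 5), (None, 4, 6), (1, None, 5)]
final, history = skyline_with_history(data)
print(final)
print(history)
```

Keeping the recorded pairs is what lets later phases restrict their work to the items touched by an insert, delete, update, or dimension change instead of recomputing everything.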
Extensive experiments have been conducted to evaluate the performance and prove the efficiency of our proposed solution, DyIn-Skyline, in processing skyline queries over a dynamic and incomplete database. The performance results of DyIn-Skyline are compared with those of the existing works closest to this research, namely ISkyline, SIDS, and Incoskyline. In most cases, DyIn-Skyline shows steady performance and outperforms the previous works with regard to the number of pairwise comparisons and processing time. Unlike ISkyline, SIDS, and Incoskyline, which derive skylines over the entire new state of the database after changes are made, DyIn-Skyline avoids unnecessary skyline computations. It relies on the information saved in the Bucket Dominating (BDG), Bucket Dominated (BDD), and Domination History (DH) lists and focuses only on those data items that are affected by the changes.
Efficient skyline computation over an Incomplete database with changing states and structures
Skyline queries have been studied extensively and a significant number of skyline algorithms have been proposed, most of which attempt to resolve the optimisation problem of reducing the processing time of skyline computations. Because databases change their states and/or structures throughout their lifetime to reflect the latest information of the applications, the skyline set derived from the initial state of a database is no longer valid in the new state/structure of the database. The domination relationships between objects identified in the initial state might no longer hold in the new state. Nonetheless, computing the skylines over the entire new state/structure of the database is inefficient, as not all pairwise comparisons between the objects need to be performed. To tackle this issue, this paper proposes a solution, named 1Skyline, which aims at avoiding unnecessary skyline computations when a database changes its state and structure due to a data definition operation(s) (add or remove a dimension(s)). This is achieved by identifying and retaining the prominent dominance relationships when pairwise comparisons are performed; these are then utilised in computing a new skyline set. 1Skyline consists of two optimisation components: one that derives a new skyline set when a new dimension(s) is added to a database, and one that derives a new skyline set when an existing dimension(s) is removed from a database. To make our solution more useful, it is applied to a database with incomplete data. Extensive experiments have been conducted to evaluate the performance and prove the efficiency of our proposed solution.
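To illustrate why retained dominance relationships help when a dimension is added, the sketch below re-validates stored (dominator, dominated) pairs using only the new dimension. It is an assumption-laden illustration, not the paper's algorithm: it assumes dominance is evaluated on dimensions known in both items, that smaller values are preferred, and that items are identified by hypothetical string ids; `revalidate_after_add` is a made-up name.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (dominator id, dominated id), hypothetical ids


def revalidate_after_add(
    history: List[Pair], new_dim: Dict[str, Optional[float]]
) -> Tuple[List[Pair], List[Pair]]:
    """Split retained pairs into those that still hold and those that break.

    A retained pair (a, b) already satisfied dominance on the old dimensions,
    so after adding one dimension it still holds unless both values are known
    and a is worse than b there (assumption: smaller is better).
    """
    kept: List[Pair] = []
    broken: List[Pair] = []
    for a, b in history:
        va, vb = new_dim.get(a), new_dim.get(b)
        if va is None or vb is None or va <= vb:
            kept.append((a, b))
        else:
            broken.append((a, b))
    return kept, broken


history = [("p1", "p2"), ("p1", "p3"), ("p4", "p5")]
new_dim = {"p1": 3, "p2": 5, "p3": 1, "p4": None, "p5": 2}
kept, broken = revalidate_after_add(history, new_dim)
print(kept)
print(broken)
```

Each retained pair costs one comparison on the new dimension rather than a full pairwise comparison. The dominated items in broken pairs are the ones that would need re-examination; pairs that were never recorded are outside the scope of this sketch.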
Efficient computation of skyline queries over a dynamic and incomplete database
Skyline queries, which rely on the notion of Pareto dominance, filter the data items of a database by keeping only those that are the best or most preferred, known as skylines, to meet the user's preferences. Skyline queries have been studied extensively and a significant number of skyline algorithms have been proposed, most of which attempt to resolve the optimisation problem of reducing the processing time of skyline computations. Today, the presence of incomplete data in a database is inevitable. Furthermore, databases are dynamic in nature: their states change over time to reflect the latest information of the applications. The skylines derived before changes are made to the initial database are no longer valid in the new state of the database. Blindly examining the entire database to identify the new set of skylines is unwise, as not all data items are affected by the changes made to the database. Hence, in this paper we propose a solution, named DyIn-Skyline, which is capable of deriving skylines over a dynamic and incomplete database by exploiting only those data items that are affected by the changes. Several experiments have been conducted, and the results show that our proposed solution outperforms the previous works with regard to the number of pairwise comparisons and processing time.
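The idea of touching only the affected items can be sketched for the insert case. This is a simplified illustration rather than the DyIn-Skyline procedure: it assumes dominance on commonly known dimensions with smaller values preferred, and the names `dominates` and `insert_item` and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple

Item = Tuple[Optional[float], ...]  # None marks a missing value


def dominates(a: Item, b: Item) -> bool:
    """Dominance on commonly known dimensions (assumption: smaller is better)."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return (
        bool(common)
        and all(x <= y for x, y in common)
        and any(x < y for x, y in common)
    )


def insert_item(database: List[Item], current_skyline: List[Item], new_item: Item) -> List[Item]:
    """Update the skyline after one insert without rescanning existing pairs.

    Existing non-skyline items stay dominated, so only two checks are needed:
    which current skylines the new item dominates, and whether any stored
    item dominates the new item.
    """
    new_skyline = [s for s in current_skyline if not dominates(new_item, s)]
    # With incomplete data dominance is not transitive, so the new item is
    # checked against the whole database, not only against current skylines.
    if not any(dominates(q, new_item) for q in database):
        new_skyline.append(new_item)
    database.append(new_item)  # the database list is modified in place
    return new_skyline


db: List[Item] = [(2, 3, None), (None, 4, 6)]
sky: List[Item] = [(2, 3, None)]
sky = insert_item(db, sky, (1, 2, None))
print(sky)
```

The number of comparisons grows with the size of the database for one insert, instead of with the number of item pairs as in a full recomputation.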
Identifying skylines in dynamic incomplete database
In database systems today, finding the best results that meet the preferences of users is one of the most important issues. Skyline queries return the data items that are not dominated by any other item in a database. Most skyline operations assume that the database is complete, meaning that there are no missing values in any dimension. In reality, databases are often incomplete, especially multidimensional databases. Missing values have a negative effect on finding skyline points: they change the nature of the dominance relation, lead to cyclic dominance, and violate the transitivity property of skylines. This problem becomes more severe in a dynamic database, in which items are inserted, deleted or updated. Moreover, most of the works that handle incompleteness assume that the items are static. In this paper we propose a new approach that finds the most relevant data items meeting the user's preferences in dynamic incomplete databases.