Finding Top-k Dominance on Incomplete Big Data Using Map-Reduce Framework
Incomplete data is a major kind of multi-dimensional dataset in which values are missing at random positions across its dimensions. Retrieving information from such a dataset becomes very difficult as it grows large, and finding its top-k dominant values is particularly challenging. Some algorithms exist to improve this process, but most are efficient only on small incomplete datasets. One algorithm that makes top-k dominance (TKD) queries practical is the Bitmap Index Guided (BIG) algorithm. It strongly improves performance on incomplete data, but it was not designed to find top-k dominant values in incomplete big data and cannot do so in its original form. Several other algorithms have been proposed to answer TKD queries, such as the Skyband Based and Upper Bound Based algorithms, but their performance is also questionable. These earlier algorithms were among the first attempts to apply TKD queries to incomplete data; however, all of them either performed poorly or were not compatible with incomplete data. This thesis proposes the MapReduced Enhanced Bitmap Index Guided Algorithm (MRBIG) to address these issues. MRBIG uses the MapReduce framework to improve the performance of top-k dominance queries on huge incomplete datasets. The framework divides the work among several computing nodes that operate independently and simultaneously to find the result. This method finds the TKD query result up to two times faster than previously presented algorithms.
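As a rough illustration of the idea, the following Python sketch scores each object by how many others it dominates and sums partial scores computed per data chunk, in the spirit of a map step followed by a reduce step. It is not the MRBIG or BIG algorithm from the thesis. It assumes that smaller values are better, that `None` marks a missing value, and that dominance is tested only on dimensions observed in both objects; the function names, the chunk size, and the sample data are hypothetical.

```python
from collections import Counter

def dominates(a, b):
    """True if a dominates b on the dimensions observed in both.

    Assumption: smaller is better; None marks a missing value.
    """
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip dimensions that are missing in either object
        if x > y:
            return False
        if x < y:
            strictly_better = True
    return strictly_better

def map_partial_scores(candidates, chunk):
    """Map step: count how many objects of one chunk each candidate dominates."""
    partial = Counter()
    for i, p in enumerate(candidates):
        partial[i] = sum(dominates(p, q) for q in chunk)
    return partial

def top_k_dominating(points, k, chunk_size=2):
    """Reduce step: sum the partial scores and keep the k highest."""
    chunks = [points[s:s + chunk_size] for s in range(0, len(points), chunk_size)]
    total = Counter()
    for chunk in chunks:  # in a real deployment each chunk could go to its own worker
        total.update(map_partial_scores(points, chunk))
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]  # list of (object index, dominance score)

# Hypothetical sample data with missing values.
sample = [(1, None, 3), (2, 2, None), (None, 1, 1), (3, 3, 4)]
top2 = top_k_dominating(sample, k=2)
print(top2)
```

Here the chunks are processed in a plain loop; the point is only that partial dominance counts over disjoint chunks can be computed independently and then added together.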
SKYLINE QUERY BASED ON USER PREFERENCES IN CELLULAR ENVIRONMENTS
The recommendation system is an important tool for providing personalized suggestions to users about products or services. However, previous research on individual recommendation systems using skyline queries has not considered users' dynamic personal preferences. Therefore, this study aims to develop an individual recommendation model based on the user's current individual preferences and location in a mobile environment. We propose an RFM (Recency, Frequency, Monetary) score-based algorithm to predict the current individual preferences of users. This research uses the skyline query method to recommend local cuisine that aligns with those preferences. The attributes used in selecting suitable local cuisine are individual preference, price, and the distance between the user and the local cuisine seller. The proposed algorithm has been implemented in JALITA, a mobile-based Indonesian local cuisine recommendation system. The results show that the system effectively recommends local cuisine that matches the dynamic individual preferences and location of users, and that individual recommendations are available to mobile users anytime and anywhere. In this study, three skyline objects are generated as recommended local cuisine for the current individual preferences (U1) and user location (L1) of one example user (U1L1): soto betawi (C5) with an individual preference score of 0.96, Mie Aceh Daging Goreng (C4) with an individual preference score of 0.93, and Gado-gado betawi (C3) with an individual preference score of 0.98. Thus, this research contributes to the field of individual recommendation systems by considering dynamic user location and preferences.
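To make the skyline step concrete, here is a minimal Python sketch of a skyline query over the three attributes named in the abstract (preference, price, distance). It is only an illustration under stated assumptions, not the JALITA implementation: the `Dish` type, the function names, and all sample values are hypothetical, and the RFM-based preference score is taken as a given input because the abstract does not define how it is computed.

```python
from typing import NamedTuple

class Dish(NamedTuple):
    name: str
    preference: float   # assumed precomputed preference score, higher is better
    price: float        # lower is better
    distance_km: float  # lower is better

def dominates(a: Dish, b: Dish) -> bool:
    """a dominates b if it is no worse on every attribute and better on at least one."""
    no_worse = (a.preference >= b.preference
                and a.price <= b.price
                and a.distance_km <= b.distance_km)
    better = (a.preference > b.preference
              or a.price < b.price
              or a.distance_km < b.distance_km)
    return no_worse and better

def skyline(dishes):
    """Return the dishes that no other dish dominates."""
    return [d for d in dishes if not any(dominates(o, d) for o in dishes)]

# Hypothetical menu; the values are invented for illustration only.
menu = [
    Dish("A", 0.90, 30.0, 2.0),
    Dish("B", 0.80, 20.0, 1.0),
    Dish("C", 0.70, 25.0, 1.5),  # worse than B on all three attributes
    Dish("D", 0.95, 40.0, 3.0),
]
recommended = [d.name for d in skyline(menu)]
print(recommended)
```

Dish C drops out because B is preferred more, cheaper, and closer; the remaining dishes each win on at least one attribute, so none of them can be discarded without knowing how the user trades the attributes off.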
Distributed Indexing Schemes for k-Dominant Skyline Analytics on Uncertain Edge-IoT Data
Skyline queries typically search a Pareto-optimal set from a given data set
to solve the corresponding multiobjective optimization problem. As the number
of criteria increases, the skyline contains excessive data items, which yields a
meaningless result. To address this curse of dimensionality, we proposed a
k-dominant skyline in which the number of skyline members was reduced by
relaxing the restriction on the number of dimensions, considering the
uncertainty of data. Specifically, each data item was associated with a
probability of appearance, which represented the probability of becoming a
member of the k-dominant skyline. As data items appear continuously in data
streams, the corresponding k-dominant skyline may vary with time. Therefore, an
effective and rapid mechanism of updating the k-dominant skyline becomes
crucial. Herein, we proposed two time-efficient schemes, Middle Indexing (MI)
and All Indexing (AI), for k-dominant skyline in distributed edge-computing
environments, where irrelevant data items can be effectively excluded from the
computation to reduce the processing duration. Furthermore, the proposed schemes
were validated with extensive experimental simulations. The experimental
results demonstrated that the proposed MI and AI schemes reduced the
computation time by approximately 13% and 56%, respectively, compared with the
existing method.
Comment: 13 pages, 8 figures, 12 tables, to appear in IEEE Transactions on
Emerging Topics in Computin
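
The relaxation behind a k-dominant skyline can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not the MI or AI indexing schemes from the paper, and it leaves out the appearance probabilities and the streaming updates. It assumes smaller values are better and uses the common definition that p k-dominates q when p is no worse than q on at least k dimensions and strictly better on at least one of them; the function names and sample points are hypothetical.

```python
def k_dominates(p, q, k):
    """True if p is no worse than q on at least k dimensions and strictly better on one.

    Assumption: smaller is better. Any strictly better dimension is also a
    no-worse dimension, so it can always be included among the k chosen ones.
    """
    no_worse = sum(x <= y for x, y in zip(p, q))
    strictly_better = any(x < y for x, y in zip(p, q))
    return no_worse >= k and strictly_better

def k_dominant_skyline(points, k):
    """Return the points that no other point k-dominates."""
    return [p for p in points
            if not any(k_dominates(q, p, k) for q in points if q is not p)]

# Hypothetical 3-dimensional sample data.
pts = [(1, 5, 3), (2, 2, 2), (3, 1, 4), (4, 4, 1)]
full_skyline = k_dominant_skyline(pts, k=3)   # k = number of dimensions: ordinary skyline
relaxed = k_dominant_skyline(pts, k=2)        # relaxed dominance shrinks the result
print(len(full_skyline), relaxed)
```

With k equal to the number of dimensions every sample point survives, because each is best on some dimension or incomparable to the rest. Lowering k to 2 lets the balanced point (2, 2, 2) eliminate the other three, which is the reduction in skyline size that the abstract describes.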