Literature Review on Efficient Algorithms for Mining High Utility Itemsets from Transactional Databases

Author: Ketkee Kailas Gaikwad, Mininath Nighot
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2014

This paper presents a survey on finding itemsets with high utility. Many algorithms exist for this task, but they tend to produce a large number of candidate itemsets, which degrades mining performance in terms of execution time. We focus mainly on two algorithms, utility pattern growth (UP-Growth) and UP-Growth+, which mine high utility itemsets using effective methods for pruning candidate itemsets. Information about high utility itemsets is kept in a special data structure called the UP-Tree. This compact tree structure improves mining performance and avoids scanning the original database repeatedly: candidate itemsets are generated with only two scans of the database. The other proposed algorithm, UP-Growth+, reduces the number of candidates effectively. It also outperforms other algorithms in terms of runtime, especially when databases contain many long transactions. Utility-based data mining is a new research area concerned with all types of utility factors in data mining processes; it aims to integrate utility considerations into both predictive and descriptive data mining tasks. High utility itemset mining is a research area within utility-based descriptive data mining, used for finding the itemsets that contribute most to the total utility in a database.

International Journal on Recent and Innovation Trends in Computing and Communication
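
To make the notion of a high utility itemset in the abstract above concrete, here is a minimal Python sketch. It is a brute-force enumeration, not UP-Growth or UP-Growth+, and it builds no UP-Tree. The toy transactions, the unit profits, and all function names are hypothetical. The pruning step uses transaction-weighted utility (TWU), the standard upper bound in this literature: an itemset's utility can never exceed the total utility of the transactions that contain it.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit (external utility) of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1, "d": 4}


def transaction_utility(transaction):
    """Total utility of every item in one transaction."""
    return sum(qty * UNIT_PROFIT[item] for item, qty in transaction.items())


def utility(itemset):
    """Utility of an itemset: summed over transactions containing all its items."""
    total = 0
    for t in TRANSACTIONS:
        if all(item in t for item in itemset):
            total += sum(t[item] * UNIT_PROFIT[item] for item in itemset)
    return total


def twu(itemset):
    """Transaction-weighted utility: an upper bound on utility(itemset)
    and on the utility of every superset of itemset."""
    return sum(
        transaction_utility(t)
        for t in TRANSACTIONS
        if all(item in t for item in itemset)
    )


def mine_high_utility_itemsets(min_util):
    """Enumerate itemsets, skipping candidates whose TWU is below min_util."""
    items = sorted({item for t in TRANSACTIONS for item in t})
    # Items with low TWU cannot appear in any high utility itemset.
    promising = [item for item in items if twu((item,)) >= min_util]
    result = {}
    for size in range(1, len(promising) + 1):
        for candidate in combinations(promising, size):
            if twu(candidate) < min_util:
                continue  # pruned without computing the exact utility
            u = utility(candidate)
            if u >= min_util:
                result[candidate] = u
    return result


if __name__ == "__main__":
    print(mine_high_utility_itemsets(15))
```

In this toy data the itemset (a, c) has a higher utility than (c) alone, so utility is not anti-monotone the way support is. That is why candidate pruning relies on an upper bound such as TWU rather than on the utility itself.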

Efficient chain structure for high-utility sequential pattern mining

Author: Djenouri Youcef, Fournier-Viger Philippe, Li Yuanfa, Lin Jerry Chun-Wei, Zhang Ji
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

High-utility sequential pattern mining (HUSPM) is an emerging topic in data mining that considers both utility and sequence factors to derive the set of high-utility sequential patterns (HUSPs) from quantitative databases. Several works have reduced the computational cost through variants of pruning strategies. In this paper, we present an efficient sequence-utility (SU)-chain structure, which stores more relevant information to improve mining performance. With the SU-Chain structure, the existing pruning strategies can still be used to prune unpromising candidates early and obtain the qualifying HUSPs. Experiments compare the approach with state-of-the-art HUSPM algorithms, and the results show that the SU-Chain-based model outperforms the existing HUSPM algorithms in terms of runtime and number of candidates generated.

SINTEF Open

NORA - Norwegian Open Research Archives

University of Southern Queensland ePrints

Scalable Mining of High-Utility Sequential Patterns With Three-Tier MapReduce Model

Author: Djenouri Youcef, Li Yuanfa, Lin Jerry Chun-Wei, Srivastava Gautam, Yu Philip S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021

High-utility sequential pattern mining (HUSPM) has been a hot research topic in recent decades because it combines sequential and utility properties to reveal more information and knowledge than traditional frequent itemset mining or sequential pattern mining. Several HUSPM works have been presented, but most of them rely on main memory to speed up mining. This assumption is unrealistic in large-scale environments: in real industry, the collected data is very large and cannot fit into the main memory of a single machine. In this article, we first develop a parallel and distributed three-stage MapReduce model for mining high-utility sequential patterns from large-scale databases. Two properties are then developed to ensure the correctness and completeness of the patterns discovered in the developed framework. In addition, two data structures, called sidset and utility-linked list, are used in the framework to accelerate the computation for mining the required patterns. The results show that the designed model performs well on large-scale datasets in terms of runtime, memory, efficiency with respect to the number of distributed nodes, and scalability, compared to the serial HUSP-Span approach.

SINTEF Open

NORA - Norwegian Open Research Archives

A Survey of Sequential Pattern Based E-Commerce Recommendation Systems

Author: Ezeife Christie I., Karlapalepu Hemni
Publication venue: Scholarship at UWindsor
Publication date: 01/10/2023

E-commerce recommendation systems usually deal with massive customer sequential databases, such as historical purchase or click stream sequences. Recommendation systems’ accuracy can be improved if complex sequential patterns of user purchase behavior are learned by integrating sequential patterns of customer clicks and/or purchases into the user–item rating matrix input of collaborative filtering. This review focuses on algorithms of existing E-commerce recommendation systems that are sequential pattern-based. It provides a comprehensive and comparative performance analysis of these systems, exposing their methodologies, achievements, limitations, and potential for solving more important problems in this domain. The review shows that integrating sequential pattern mining of historical purchase and/or click sequences into a user–item matrix for collaborative filtering can (i) improve recommendation accuracy, (ii) reduce user–item rating data sparsity, (iii) increase the novelty rate of recommendations, and (iv) improve the scalability of recommendation systems

Scholarship at UWindsor