Improved PrefixSpan Algorithm for Efficient Processing of Large Data

Author: Pratik Saraf, Sheetal Rathi
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2015

PrefixSpan (Prefix-projected Sequential pattern mining) is a well-known algorithm for sequential data mining. It extracts sequential patterns using the pattern-growth method. The algorithm performs very well on small datasets, but as dataset size increases, the overall time needed to find the sequential patterns also increases, so its efficiency drops when processing large data. The cost of constructing the projected datasets is also high, which ultimately affects memory utilization. DOI: 10.17762/ijritcc2321-8169.15072

International Journal on Recent and Innovation Trends in Computing and Communication
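
The pattern-growth idea described in the abstract above can be illustrated with a minimal sketch. It assumes each sequence is a list of single items (no itemsets) and represents every projected database as (sequence index, offset) pairs rather than copied suffixes, the pseudo-projection idea that targets the projection cost the abstract mentions. The function name `prefixspan` and its parameters are hypothetical; this is not the improved algorithm proposed by the paper's authors.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for sequences of single items.

    Simplified illustration: elements are single items, not itemsets.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pairs, so suffixes
        # are referenced by position instead of being copied.
        counts = defaultdict(int)
        for seq_id, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_id][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the first occurrence of the item in each suffix.
            next_projected = []
            for seq_id, start in projected:
                seq = sequences[seq_id]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        next_projected.append((seq_id, pos + 1))
                        break
            grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    database = [list("abc"), list("ac"), list("bc")]
    for pattern, support in sorted(prefixspan(database, 2).items()):
        print("".join(pattern), support)
```

Storing offsets keeps memory proportional to the number of sequences at each recursion level, at the price of rescanning suffixes when counting items.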

Mining probabilistically frequent sequential patterns in large uncertain databases

Author: Ng Wilfred Siu Hung, Yan Da, Zhao Zhou
Publication date: 01/01/2014

Data uncertainty is inherent in many real-world applications such as environmental surveillance and mobile tracking. Mining sequential patterns from inaccurate data, such as those data arising from sensor readings and GPS trajectories, is important for discovering hidden knowledge in such applications. In this paper, we propose to measure pattern frequentness based on the possible world semantics. We establish two uncertain sequence data models abstracted from many real-life applications involving uncertain sequence data, and formulate the problem of mining probabilistically frequent sequential patterns (or p-FSPs) from data that conform to our models. However, the number of possible worlds is extremely large, which makes the mining prohibitively expensive. Inspired by the famous PrefixSpan algorithm, we develop two new algorithms, collectively called U-PrefixSpan, for p-FSP mining. U-PrefixSpan effectively avoids the problem of "possible worlds explosion", and when combined with our four pruning and validating methods, achieves even better performance. We also propose a fast validating method to further speed up our U-PrefixSpan algorithm. The efficiency and effectiveness of U-PrefixSpan are verified through extensive experiments on both real and synthetic datasets. Copyright © 2013 IEEE

Hong Kong University of Science and Technology Institutional Repository
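
To make the "possible worlds" notion from the abstract above concrete, the following sketch computes the probability that a pattern is frequent without enumerating worlds. It assumes, purely for illustration, that each sequence contains the pattern independently with a known probability; the function name `frequentness_probability` and its inputs are hypothetical, and this is not the U-PrefixSpan algorithm itself.

```python
def frequentness_probability(containment_probs, min_support):
    """Probability that at least min_support sequences contain a pattern.

    containment_probs[i] is the (assumed independent) probability that
    sequence i contains the pattern. A dynamic program over sequences
    replaces the enumeration of all 2**n possible worlds.
    """
    # dist[k] = probability that exactly k of the sequences processed
    # so far contain the pattern.
    dist = [1.0]
    for p in containment_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # this sequence lacks the pattern
            nxt[k + 1] += mass * p       # this sequence contains the pattern
        dist = nxt
    return sum(dist[min_support:])


if __name__ == "__main__":
    print(frequentness_probability([0.5, 0.5], 1))
```

A pattern would then be reported as probabilistically frequent when this value meets a user-chosen probability threshold; the dynamic program runs in time quadratic in the number of sequences.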

Advances in knowledge discovery and data mining Part II

Author: CAO Tru, CHEUNG David Wai-Lok, HO Tu-Bao, LIM Ee Peng, MOTODA Hiroshi, ZHOU Zhi-Hua
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II

Institutional Knowledge at Singapore Management University

HKU Scholars Hub