4 research outputs found

HUPSMT: AN EFFICIENT ALGORITHM FOR MINING HIGH UTILITY-PROBABILITY SEQUENCES IN UNCERTAIN DATABASES WITH MULTIPLE MINIMUM UTILITY THRESHOLDS

Author: Anh Tran Ngoc
Bac Le Hoai
Hai Duong Van
Tin Truong Chi
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 18/03/2019

The problem of high utility sequence mining (HUSM) in quantitative sequence databases (QSDBs) is more general than that of frequent sequence mining in sequence databases. An important limitation of HUSM is that a user-predefined minimum utility threshold is commonly used to decide if a sequence is high utility. However, this is not convincing in many real-life applications as sequences may have different importance. Another limitation of HUSM is that data in QSDBs are assumed to be precise. But in the real world, collected data, such as data from sensors, may be uncertain. Thus, this paper proposes a framework for mining high utility-probability sequences (HUPSs) in uncertain QSDBs (UQSDBs) with multiple minimum utility thresholds using a minimum utility. Two new width and depth pruning strategies are also introduced to eliminate low utility or low probability sequences and their extensions early, and to reduce sets of candidate items for extensions during the mining process. Based on these strategies, a novel efficient algorithm named HUPSMT is designed for discovering HUPSs. Finally, an experimental study conducted on both real-life and synthetic UQSDBs shows the performance of HUPSMT in terms of time and memory consumption.

Vietnam Academy of Science and Technology: Journals Online

OPR-Miner: Order-preserving rule mining for time series

Author: Fournier-Viger Philippe
Guo Lei
Li Yan
Wu Xindong
Wu Youxi
Zhao Xiaoqian
Zhu Xingquan
Publication date: 09/10/2022

Discovering frequent trends in time series is a critical task in data mining. Recently, order-preserving matching was proposed to find all occurrences of a pattern in a time series, where the pattern is a relative order (regarded as a trend) and an occurrence is a sub-time series whose relative order coincides with the pattern. Inspired by order-preserving matching, the existing order-preserving pattern (OPP) mining algorithm employs order-preserving matching to calculate the support, which leads to low efficiency. To address this deficiency, this paper proposes an algorithm called efficient frequent OPP miner (EFO-Miner) to find all frequent OPPs. EFO-Miner is composed of four parts: a pattern fusion strategy to generate candidate patterns, a matching process for the results of sub-patterns to calculate the support of super-patterns, a screening strategy to dynamically reduce the size of prefix and suffix arrays, and a pruning strategy to further dynamically prune candidate patterns. Moreover, this paper explores order-preserving rule (OPR) mining and proposes an algorithm called OPR-Miner to discover strong rules from all frequent OPPs using EFO-Miner. Experimental results verify that OPR-Miner gives better performance than other competitive algorithms. More importantly, clustering and classification experiments further validate that OPR-Miner achieves good performance.
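
To make the definition of order-preserving matching above concrete, here is a minimal Python sketch of the naive idea only: slide a window over the series, compute each window's relative order, and count the windows whose order equals the pattern. It is not EFO-Miner or OPR-Miner; the function names `relative_order`, `occurrences`, and `support` are hypothetical, and the sketch assumes that the values inside a window are distinct (ties would be broken by position here, which the abstract does not specify).

```python
from typing import List, Sequence, Tuple


def relative_order(values: Sequence[float]) -> Tuple[int, ...]:
    """Return the rank of each element within the window (1 = smallest).

    Assumption: values are distinct; ties are broken by position because
    Python's sort is stable.
    """
    by_value = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(by_value, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def occurrences(series: Sequence[float], pattern: Sequence[int]) -> List[int]:
    """Start indices of windows whose relative order equals the pattern."""
    m = len(pattern)
    target = tuple(pattern)
    return [
        start
        for start in range(len(series) - m + 1)
        if relative_order(series[start:start + m]) == target
    ]


def support(series: Sequence[float], pattern: Sequence[int]) -> int:
    """Naive support: the number of order-preserving occurrences."""
    return len(occurrences(series, pattern))


if __name__ == "__main__":
    series = [3, 1, 4, 2, 5, 9, 6]
    # Pattern (1, 3, 2): lowest value, then highest, then middle.
    print(occurrences(series, (1, 3, 2)))  # [1, 4]
    print(support(series, (1, 3, 2)))      # 2
```

Recomputing the ranks for every window in this way is the kind of direct matching that the abstract describes as inefficient for support counting, which is the cost EFO-Miner is said to reduce by reusing the matching results of sub-patterns.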

arXiv.org e-Print Archive

Efficient Vertical Mining of High Average-Utility Itemsets Based on Novel Upper-Bounds

Author: Bac Le
Hai Duong
Philippe Fournier-Viger
Tin Truong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref