Privacy Preserving Utility Mining: A Survey
In the big data era, collected data usually contain rich information and
hidden knowledge. Utility-oriented pattern mining and analytics have shown a
powerful ability to explore these ubiquitous data, which may be collected from
various fields and applications, such as market basket analysis, retail,
click-stream analysis, medical analysis, and bioinformatics. However, analysis
of these data with sensitive private information raises privacy concerns. To
achieve a better trade-off between utility maximization and privacy preservation,
Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent
years. In this paper, we provide a comprehensive overview of PPUM. We first
present the background of utility mining, privacy-preserving data mining and
PPUM, then introduce the related preliminaries and problem formulation of PPUM,
as well as some key evaluation criteria for PPUM. In particular, we present and
discuss the current state-of-the-art PPUM algorithms, as well as their
advantages and deficiencies in detail. Finally, we highlight and discuss some
technical challenges and open directions for future research on PPUM.
Comment: 2018 IEEE International Conference on Big Data, 10 pages
Towards Correlated Sequential Rules
The goal of high-utility sequential pattern mining (HUSPM) is to efficiently
discover profitable or useful sequential patterns in a large number of
sequences. However, simply being aware of utility-eligible patterns is
insufficient for making predictions. To compensate for this deficiency,
high-utility sequential rule mining (HUSRM) is designed to explore the
confidence or probability of predicting the occurrence of consequent
sequential patterns based on the appearance of premise sequential patterns. It
has numerous applications, such as product recommendation and weather
prediction. However, the existing algorithm, known as HUSRM, is limited to
extracting all eligible rules while neglecting the correlation between the
generated sequential rules. To address this issue, we propose a novel algorithm
called correlated high-utility sequential rule miner (CoUSR) to integrate the
concept of correlation into HUSRM. The proposed algorithm requires not only
that each rule be correlated but also that the patterns in the antecedent and
consequent of the high-utility sequential rule be correlated. The algorithm
adopts a utility-list structure to avoid multiple database scans. Additionally,
several pruning strategies are used to improve the algorithm's efficiency and
performance. Experiments on several real-world datasets demonstrate that
CoUSR is effective and efficient in terms of running time and memory
consumption.
Comment: Preprint. 7 figures, 6 tables
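The notion of rule confidence mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common convention for sequential rules in which a rule X -> Y holds in a sequence when every item of X appears before every item of Y, and confidence is the fraction of sequences containing X that also satisfy the rule. The function names and the toy database are hypothetical and are not taken from the CoUSR paper; utility values, correlation measures, and pruning are deliberately left out.

```python
from typing import List, Optional, Set

Sequence = List[Set[str]]  # a sequence is an ordered list of itemsets


def first_complete_index(seq: Sequence, items: Set[str]) -> Optional[int]:
    """Smallest index i such that all of `items` occur within seq[0..i]."""
    seen: Set[str] = set()
    for i, itemset in enumerate(seq):
        seen |= itemset & items
        if seen == items:
            return i
    return None


def last_start_index(seq: Sequence, items: Set[str]) -> Optional[int]:
    """Largest index j such that all of `items` occur within seq[j..]."""
    seen: Set[str] = set()
    for j in range(len(seq) - 1, -1, -1):
        seen |= seq[j] & items
        if seen == items:
            return j
    return None


def supports_rule(seq: Sequence, antecedent: Set[str], consequent: Set[str]) -> bool:
    """True if every antecedent item occurs strictly before every consequent item
    for some split point of the sequence (assumed rule semantics)."""
    i = first_complete_index(seq, antecedent)
    j = last_start_index(seq, consequent)
    return i is not None and j is not None and i < j


def rule_confidence(db: List[Sequence], antecedent: Set[str], consequent: Set[str]) -> float:
    """Sequences satisfying the rule divided by sequences containing the antecedent."""
    containing = [s for s in db if first_complete_index(s, antecedent) is not None]
    if not containing:
        return 0.0
    hits = sum(supports_rule(s, antecedent, consequent) for s in containing)
    return hits / len(containing)


# Hypothetical toy database of four sequences.
toy_db: List[Sequence] = [
    [{"a"}, {"b"}, {"c"}],   # a occurs before c: rule holds
    [{"c"}, {"a"}],          # c occurs before a: rule does not hold
    [{"a", "b"}, {"c"}],     # a occurs before c: rule holds
    [{"b"}, {"c"}],          # antecedent a is absent
]
confidence = rule_confidence(toy_db, {"a"}, {"c"})
```

Here three sequences contain the antecedent and two of them satisfy the rule, so the confidence is 2/3. A utility-aware miner would additionally track per-item utilities for each sequence, which this sketch omits.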