Towards Correlated Sequential Rules
The goal of high-utility sequential pattern mining (HUSPM) is to efficiently
discover profitable or useful sequential patterns in a large number of
sequences. However, simply being aware of utility-eligible patterns is
insufficient for making predictions. To compensate for this deficiency,
high-utility sequential rule mining (HUSRM) is designed to explore the
confidence or probability of predicting the occurrence of consequence
sequential patterns based on the appearance of premise sequential patterns. It
has numerous applications, such as product recommendation and weather
prediction. However, the existing algorithm, known as HUSRM, is limited to
extracting all eligible rules while neglecting the correlation between the
generated sequential rules. To address this issue, we propose a novel algorithm
called correlated high-utility sequential rule miner (CoUSR) to integrate the
concept of correlation into HUSRM. The proposed algorithm requires not only
that each rule be correlated but also that the patterns in the antecedent and
consequent of the high-utility sequential rule be correlated. The algorithm
adopts a utility-list structure to avoid multiple database scans. Additionally,
several pruning strategies are used to improve the algorithm's efficiency and
performance. Based on several real-world datasets, subsequent experiments
demonstrated that CoUSR is effective and efficient in terms of operation time
and memory consumption.
Comment: Preprint. 7 figures, 6 tables
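The two quantities this abstract builds on, the confidence and the utility of a sequential rule, can be illustrated with a small sketch. It is not the CoUSR algorithm: it has no utility-list structure, no pruning, and no correlation measure, since the abstract does not specify these. The toy database, the helper names (first_pos, rule_stats), and the assumption that an item occurs at most once per sequence are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# A sequence is a list of itemsets; each itemset maps item -> utility (e.g. profit).
Sequence = List[Dict[str, int]]

def first_pos(seq: Sequence, items: FrozenSet[str]) -> Optional[int]:
    """Index of the earliest itemset by which every item in `items` has appeared."""
    seen = set()
    for i, itemset in enumerate(seq):
        seen |= items.intersection(itemset)
        if seen == items:
            return i
    return None

def rule_stats(db: List[Sequence], x: FrozenSet[str], y: FrozenSet[str]) -> Tuple[float, int]:
    """Return (confidence, utility) of the rule x -> y over the database.

    A sequence supports the rule if all items of x occur strictly before
    all items of y. Assumes each item occurs at most once per sequence.
    """
    sup_x = sup_rule = utility = 0
    for seq in db:
        i = first_pos(seq, x)
        if i is None:
            continue
        sup_x += 1
        if first_pos(seq[i + 1:], y) is not None:
            sup_rule += 1
            utils = {item: u for itemset in seq for item, u in itemset.items()}
            utility += sum(utils[item] for item in x | y)
    confidence = sup_rule / sup_x if sup_x else 0.0
    return confidence, utility

# Hypothetical toy database of four sequences.
DB: List[Sequence] = [
    [{"a": 1}, {"b": 2, "c": 3}, {"d": 4}],
    [{"a": 2}, {"d": 1}, {"b": 5}],
    [{"b": 1}, {"a": 3}],
    [{"a": 1, "b": 2}, {"c": 4}],
]

conf, util = rule_stats(DB, frozenset({"a"}), frozenset({"b"}))
print(conf, util)
```

In this toy database the rule {a} -> {b} holds in the first two sequences only, because in the third b precedes a and in the fourth both items share one itemset. A miner in the HUSRM family would keep a rule only if both values meet user-set thresholds.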
Privacy Preserving Utility Mining: A Survey
In the big data era, the collected data usually contains rich information and
hidden knowledge. Utility-oriented pattern mining and analytics have shown a
powerful ability to explore these ubiquitous data, which may be collected from
various fields and applications, such as market basket analysis, retail,
click-stream analysis, medical analysis, and bioinformatics. However, analysis
of these data with sensitive private information raises privacy concerns. To
achieve a better trade-off between utility maximization and privacy preservation,
Privacy-Preserving Utility Mining (PPUM) has become a critical issue in recent
years. In this paper, we provide a comprehensive overview of PPUM. We first
present the background of utility mining, privacy-preserving data mining and
PPUM, then introduce the related preliminaries and problem formulation of PPUM,
as well as some key evaluation criteria for PPUM. In particular, we present and
discuss the current state-of-the-art PPUM algorithms, as well as their
advantages and deficiencies in detail. Finally, we highlight and discuss some
technical challenges and open directions for future research on PPUM.
Comment: 2018 IEEE International Conference on Big Data, 10 pages
Smartphone apps usage patterns as a predictor of perceived stress levels at workplace
The explosion in the number and diversity of smartphone apps has created
fertile ground for studying the behaviour of smartphone users. Patterns of app
usage, specifically the types of apps used and the duration of their use, are
influenced by the state of the user, and this information can be correlated
with the self-reported state of the users. The work in this paper aims to
understand patterns of app usage and to investigate the relationship of these
patterns with the perceived stress level within the workplace context. Our
results show that using a subject-centric behaviour model we can predict stress
levels based on smartphone app usage. The results we have achieved, an average
accuracy of 75% and a precision of 85.7%, can be used as an indicator of overall
stress levels in work environments and in turn inform stress reduction
organisational policies, especially when considering the interrelation between
stress and productivity of workers.
Use of a controlled experiment and computational models to measure the impact of sequential peer exposures on decision making
It is widely believed that one's peers influence product adoption behaviors.
This relationship has been linked to the number of signals a decision-maker
receives in a social network. However, it is unclear whether these principles
hold when the pattern in which the decision-maker receives these signals varies
and when peer influence is directed towards choices that are not optimal. To
investigate this, we manipulate social signal exposure in an online controlled experiment
using a game with human participants. Each participant in the game makes a
decision among choices with differing utilities. We observe the following: (1)
even in the presence of monetary risks and previously acquired knowledge of the
choices, decision-makers tend to deviate from the obvious optimal decision when
their peers make a similar decision, which we call the influence decision; (2)
when the quantity of social signals varies over time, the forwarding probability
of the influence decision, and therefore responsiveness to social influence,
does not necessarily correlate proportionally with the absolute quantity of
signals. To better understand how these rules of peer influence could be used
in modeling applications of real world diffusion and in networked environments,
we use our behavioral findings to simulate spreading dynamics in real world
case studies. We specifically try to see how cumulative influence plays out in
the presence of user uncertainty and measure its outcome on rumor diffusion,
which we model as an example of sub-optimal choice diffusion. Together, our
simulation results indicate that sequential peer effects from the influence
decision overcome individual uncertainty to guide faster rumor diffusion over
time. However, when the rate of diffusion is slow in the beginning, user
uncertainty can play a substantial role compared to peer influence in deciding
the adoption trajectory of a piece of questionable information.
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. Differential privacy has recently emerged
as the de facto standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques on differentially private discovery of
frequent itemsets cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantees of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
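One way to picture how pattern selection and differential privacy might fit into a single sampling framework is the exponential mechanism, sampled with a Metropolis-Hastings chain. The sketch below is an illustration under stated assumptions, not the authors' algorithm: the candidate patterns and their supports are a hypothetical, pre-enumerated dictionary (a real graph-pattern space is far too large to enumerate), the sensitivity of support is assumed to be 1, and a finite chain only approximates the target distribution, so the privacy guarantee of this sketch is likewise approximate.

```python
import math
import random
from typing import Dict, Optional

def exp_mech_probs(supports: Dict[str, int], epsilon: float,
                   sensitivity: float = 1.0) -> Dict[str, float]:
    """Exact exponential-mechanism distribution: P(p) ∝ exp(eps * support(p) / (2 * sensitivity))."""
    top = max(supports.values())  # subtract the max for numerical stability
    weights = {p: math.exp(epsilon * (s - top) / (2 * sensitivity))
               for p, s in supports.items()}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

def mh_sample(supports: Dict[str, int], epsilon: float, steps: int = 1000,
              sensitivity: float = 1.0, rng: Optional[random.Random] = None) -> str:
    """Metropolis-Hastings chain whose stationary distribution is exp_mech_probs."""
    rng = rng or random.Random()
    patterns = list(supports)
    current = rng.choice(patterns)
    for _ in range(steps):
        proposal = rng.choice(patterns)  # symmetric uniform proposal
        log_ratio = epsilon * (supports[proposal] - supports[current]) / (2 * sensitivity)
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
    return current

# Hypothetical candidate patterns with their supports in a graph database.
SUPPORTS = {"A-B": 10, "A-B-C": 8, "A-C": 3}

probs = exp_mech_probs(SUPPORTS, epsilon=1.0)
sample = mh_sample(SUPPORTS, epsilon=1.0, rng=random.Random(0))
print(probs, sample)
```

The chain never needs the normalising constant, only the ratio of weights between the current and the proposed pattern, which is what makes MCMC attractive when the candidate space cannot be enumerated.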