Free-rider Episode Screening via Dual Partition Model
One drawback of frequent episode mining is that an overwhelming number of
the discovered patterns are redundant. A free-rider episode, a typical
example, consists of a real pattern doped with additional noise events.
Because these embedded noise events may themselves have high support, a
free-rider episode can have support so abnormally high that a
frequency-based framework cannot filter it out. An effective technique for filtering
free-rider episodes is to use a partition model that divides an episode into two
consecutive subepisodes and compares the observed support of the episode with
its expected support under the assumption that the two subepisodes occur
independently. In this paper, we take more complex subepisodes into
consideration and develop a novel partition model named EDP for free-rider
episode filtering from a given set of episodes. It combines (1) a dual
partition strategy that divides an episode into an underlying real pattern and
potential noise events, and (2) a novel definition of the expected support of a
free-rider episode based on the proposed partition strategy. We can deem the
episode interesting if the observed support is substantially higher than the
expected support estimated by our model. Experiments on synthetic and
real-world datasets demonstrate that EDP can filter free-rider episodes
effectively compared with existing state-of-the-art methods.
Comment: The 23rd International Conference on Database Systems for Advanced
Applications (DASFAA 2018), 16 pages
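
The two-way partition baseline described in the abstract (not EDP itself, whose dual partition and expected-support definition are not spelled out here) can be illustrated with a minimal sketch. It rests on assumptions that are not from the paper: support is taken to be the fraction of fixed-width sliding windows that contain the episode as an ordered subsequence, and the names `contains_episode`, `support`, and `split_scores` are hypothetical.

```python
from typing import List, Sequence, Tuple


def contains_episode(window: Sequence[str], episode: Sequence[str]) -> bool:
    """Return True if `episode` occurs in `window` as an ordered subsequence."""
    it = iter(window)
    # `event in it` advances the iterator, so matches must appear in order.
    return all(event in it for event in episode)


def support(sequence: Sequence[str], episode: Sequence[str], width: int) -> float:
    """Assumed support: fraction of sliding windows of length `width`
    that contain `episode` (an illustrative definition, not the paper's)."""
    n_windows = len(sequence) - width + 1
    if n_windows <= 0:
        return 0.0
    hits = sum(
        contains_episode(sequence[i:i + width], episode)
        for i in range(n_windows)
    )
    return hits / n_windows


def split_scores(
    sequence: Sequence[str], episode: Sequence[str], width: int
) -> List[Tuple[int, float, float]]:
    """For every split of `episode` into two consecutive subepisodes, return
    (cut position, observed support, expected support under independence)."""
    observed = support(sequence, episode, width)
    scores = []
    for cut in range(1, len(episode)):
        left, right = episode[:cut], episode[cut:]
        # Independence assumption: expected support is the product.
        expected = support(sequence, left, width) * support(sequence, right, width)
        scores.append((cut, observed, expected))
    return scores
```

Under this sketch, an episode whose observed support is not substantially higher than the expected support for some split would be a candidate free-rider; how large a gap counts as "substantially higher" is left open here.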