A Subsequence Interleaving Model for Sequential Pattern Mining

Fowkes, Jaroslav; Sutton, Charles

journal article

oai:pure.ed.ac.uk:publications/66e722ba-b617-4c34-8cd2-ad041b21177e

Authors: Jaroslav Fowkes, Charles Sutton
Publication date: 13 August 2016

Abstract

Recent sequential pattern mining methods have used the minimum description length (MDL) principle to define an encoding scheme which describes an algorithm for mining the most compressing patterns in a database. We present a novel subsequence interleaving model based on a probabilistic model of the sequence database, which allows us to search for the most compressing set of patterns without designing a specific encoding scheme. Our proposed algorithm is able to efficiently mine the most relevant sequential patterns and rank them using an associated measure of interestingness. The efficient inference in our model is a direct result of our use of a structural expectation-maximization framework, in which the expectation-step takes the form of a submodular optimization problem subject to a coverage constraint. We show on both synthetic and real-world datasets that our model mines a set of sequential patterns with low spuriousness and redundancy, high interpretability and usefulness in real-world applications. Furthermore, we demonstrate that the quality of the patterns from our approach is comparable to, if not better than, existing state-of-the-art sequential pattern mining algorithms.

Last time updated on 19/04/2017

This paper was published in Edinburgh Research Explorer.

Licence: info:eu-repo/semantics/openAccess