Prefix-Projection Global Constraint for Sequential Pattern Mining
Sequential pattern mining under constraints is a challenging data mining
task. Many efficient ad hoc methods have been developed for mining sequential
patterns, but they all suffer from a lack of genericity. Recent works
have investigated Constraint Programming (CP) methods, but they are still not
effective because of their encoding. In this paper, we propose a global
constraint based on the projected databases principle that remedies this
drawback. Experiments show that our approach clearly outperforms CP approaches
and competes well with ad hoc methods on large datasets.
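The projected databases principle mentioned in this abstract can be illustrated with a small sketch. The code below is not the paper's global constraint; it is a plain PrefixSpan-style enumeration over sequences of single items, and the names `project`, `prefixspan` and `toy_db` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def project(database, item):
    """Keep, for each sequence containing `item`, the suffix after its first occurrence."""
    projected = []
    for seq in database:
        if item in seq:
            projected.append(seq[seq.index(item) + 1:])
    return projected


def prefixspan(database, min_support, prefix=()):
    """Enumerate frequent sequential patterns by recursive prefix projection."""
    # Count each item at most once per sequence of the (projected) database.
    counts = defaultdict(int)
    for seq in database:
        for item in set(seq):
            counts[item] += 1

    patterns = []
    for item in sorted(counts):
        if counts[item] >= min_support:
            new_prefix = prefix + (item,)
            patterns.append((new_prefix, counts[item]))
            # Only the projected database is scanned when extending the prefix.
            patterns.extend(
                prefixspan(project(database, item), min_support, new_prefix)
            )
    return patterns


# Hypothetical toy database of three sequences (illustration only).
toy_db = [list("abcb"), list("acb"), list("bca")]
patterns = prefixspan(toy_db, min_support=2)
```

Each recursive call works on a database that has already been cut down to the suffixes following the current prefix, which is the idea the abstract refers to; how the paper encodes this inside a CP solver is not shown here.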
A review of associative classification mining
Associative classification mining is a promising approach in data mining that utilizes the
association rule discovery techniques to construct classification systems, also known as
associative classifiers. In the last few years, a number of associative classification algorithms
have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. These algorithms
employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule
evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative
classification techniques with regard to the above criteria. Finally, future directions in associative
classification, such as incremental learning and mining low-quality data sets, are also
highlighted in this paper.
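As a rough illustration of the rule discovery, rule ranking and rule prediction steps named in this abstract, the sketch below mines class association rules from a toy table and classifies with the first matching rule. It does not reproduce CPAR, CMAR, MCAR or MMAC; the functions `mine_rules` and `predict`, the ranking order (confidence, then support, then shorter rule body) and the data in `toy_rows` are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, min_support, min_confidence, max_len=2):
    """Mine rules `body -> label` from rows of (set of items, class label)."""
    n = len(rows)
    body_counts = Counter()   # how many rows contain the body
    rule_counts = Counter()   # how many rows contain the body and carry the label
    for items, label in rows:
        for k in range(1, max_len + 1):
            for body in combinations(sorted(items), k):
                body_counts[body] += 1
                rule_counts[(body, label)] += 1

    rules = []
    for (body, label), count in rule_counts.items():
        support = count / n
        confidence = count / body_counts[body]
        if support >= min_support and confidence >= min_confidence:
            rules.append((body, label, support, confidence))

    # Assumed ranking: higher confidence, then higher support, then shorter body.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0], r[1]))
    return rules


def predict(rules, items, default):
    """Return the label of the highest-ranked rule whose body is contained in `items`."""
    for body, label, _support, _confidence in rules:
        if set(body) <= items:
            return label
    return default


# Hypothetical toy training data (illustration only).
toy_rows = [
    (frozenset({"sunny", "hot"}), "no"),
    (frozenset({"sunny", "mild"}), "no"),
    (frozenset({"rainy", "mild"}), "yes"),
    (frozenset({"rainy", "hot"}), "yes"),
    (frozenset({"sunny", "cool"}), "yes"),
]
rules = mine_rules(toy_rows, min_support=0.4, min_confidence=0.6)
```

Rule pruning and rule evaluation, which the survey also compares, are left out to keep the sketch short.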
Mining Frequent Neighborhood Patterns in Large Labeled Graphs
Over the years, frequent subgraphs have been an important kind of target
pattern in the pattern mining literature, where most works deal with
databases holding a number of graph transactions, e.g., chemical structures of
compounds. These methods rely heavily on the downward-closure property (DCP) of
the support measure to ensure an efficient pruning of the candidate patterns.
When switching to the emerging scenario of single-graph databases such as
Google Knowledge Graph and Facebook social graph, the traditional support
measure turns out to be trivial (either 0 or 1). However, to the best of our
knowledge, all attempts to redefine a single-graph support resulted in measures
that either lose DCP, or are no longer semantically intuitive.
This paper targets mining patterns in the single-graph setting. We resolve
the "DCP-intuitiveness" dilemma by shifting the mining target from frequent
subgraphs to frequent neighborhoods. A neighborhood is a specific topological
pattern where a vertex is embedded, and the pattern is frequent if it is shared
by a large portion (above a given threshold) of vertices. We show that the new
patterns not only maintain DCP, but also have semantics as significant as those of
subgraph patterns. Experiments on real-life datasets demonstrate the feasibility of
our algorithms on relatively large graphs, as well as the capability of mining
interesting knowledge that is not discovered in prior works.
Comment: 9 page
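The vertex-based support described in this abstract can be sketched on a toy graph. The example below is a heavy simplification and not the paper's algorithm: it assumes a pattern is just a set of labels that must all appear among a vertex's direct neighbors, whereas the paper's neighborhoods are more general topological patterns. The function `neighborhood_support` and the toy graph are hypothetical.

```python
def neighborhood_support(labels, edges, pattern):
    """Fraction of vertices whose direct neighbors cover every label in `pattern`."""
    neighbors = {v: set() for v in labels}
    for u, v in edges:
        # Treat the graph as undirected.
        neighbors[u].add(v)
        neighbors[v].add(u)

    matches = 0
    for v in labels:
        neighbor_labels = {labels[u] for u in neighbors[v]}
        if pattern <= neighbor_labels:
            matches += 1
    return matches / len(labels)


# Hypothetical single labeled graph (illustration only).
toy_labels = {1: "person", 2: "person", 3: "city", 4: "company", 5: "person"}
toy_edges = [(1, 3), (1, 4), (2, 3), (5, 4), (1, 2)]

small = neighborhood_support(toy_labels, toy_edges, {"city"})
large = neighborhood_support(toy_labels, toy_edges, {"city", "company"})
```

Because every vertex that matches the larger pattern also matches the smaller one, `large` can never exceed `small`. That is the downward-closure behaviour the abstract relies on, and the support is a fraction of vertices rather than a trivial 0 or 1.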