1,007 research outputs found
A Model-Based Frequency Constraint for Mining Associations from Transaction Data
Mining frequent itemsets is a popular method for finding associated items in
databases. For this method, support, the co-occurrence frequency of the items
which form an association, is used as the primary indicator of the
associations's significance. A single user-specified support threshold is used
to decided if associations should be further investigated. Support has some
known problems with rare items, favors shorter itemsets and sometimes produces
misleading associations.
In this paper we develop a novel model-based frequency constraint as an
alternative to a single, user-specified minimum support. The constraint
utilizes knowledge of the process generating transaction data by applying a
simple stochastic mixture model (the NB model) which allows for transaction
data's typically highly skewed item frequency distribution. A user-specified
precision threshold is used together with the model to find local frequency
thresholds for groups of itemsets. Based on the constraint we develop the
notion of NB-frequent itemsets and adapt a mining algorithm to find all
NB-frequent itemsets in a database. In experiments with publicly available
transaction databases we show that the new constraint provides improvements
over a single minimum support threshold and that the precision threshold is
more robust and easier to set and interpret by the user
A Mining-Based Compression Approach for Constraint Satisfaction Problems
In this paper, we propose an extension of our Mining for SAT framework to
Constraint satisfaction Problem (CSP). We consider n-ary extensional
constraints (table constraints). Our approach aims to reduce the size of the
CSP by exploiting the structure of the constraints graph and of its associated
microstructure. More precisely, we apply itemset mining techniques to search
for closed frequent itemsets on these two representation. Using Tseitin
extension, we rewrite the whole CSP to another compressed CSP equivalent with
respect to satisfiability. Our approach contrast with previous proposed
approach by Katsirelos and Walsh, as we do not change the structure of the
constraints.Comment: arXiv admin note: substantial text overlap with arXiv:1304.441
- …