1,174 research outputs found
A Novel Algorithm for Discovering Frequent Closures and Generators
The construction of many association rules requires computing both Frequent Closed Item Sets and Frequent Generator Item Sets (FCIS/FGIS). However, these two tasks are rarely combined. Most existing methods apply a level-wise breadth-first search, although depth-first search, depending on the characteristics of the data, often performs better. This paper therefore proposes the FCFG algorithm, which mines frequent closed itemsets and frequent generators together. FCFG first extracts frequent itemsets (FIs) by depth-first search. It then extracts the FCIS and FGIS from the FIs in a level-wise manner and associates the generators with their closures. FCFG also provides a generic technique for extending an arbitrary FI-miner algorithm to support the generation of minimal non-redundant association rules. Experimental results indicate that FCFG outperforms other level-wise methods in most cases.
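The filtering step described in this abstract (picking closed itemsets and generators out of the frequent itemsets, then linking each generator to its closure) can be sketched as follows. This is a minimal brute-force illustration, not the FCFG algorithm itself: the function names and the toy dataset are hypothetical, the itemsets are enumerated exhaustively instead of by depth-first search, and the empty itemset is ignored.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def closed_and_generators(fis):
    """Split frequent itemsets into closed itemsets and generators.

    Closed: no proper superset has the same support.
    Generator: no proper (non-empty) subset has the same support.
    """
    closed = {x for x in fis
              if not any(x < y and fis[y] == fis[x] for y in fis)}
    generators = {x for x in fis
                  if not any(y < x and fis[y] == fis[x] for y in fis)}
    return closed, generators


def closure_of(itemset, closed, fis):
    """Return the closed superset of `itemset` that has the same support."""
    candidates = [c for c in closed
                  if itemset <= c and fis[c] == fis[itemset]]
    return min(candidates, key=len)


# Hypothetical toy dataset: four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
fis = frequent_itemsets(transactions, min_support=2)
closed, generators = closed_and_generators(fis)
c_closure = closure_of(frozenset("c"), closed, fis)
```

In this toy dataset the itemset {c} is a generator whose closure is {a, c}, because every transaction containing c also contains a.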
A scalable mining of frequent quadratic concepts in d-folksonomies
Folksonomy mining is attracting the interest of the Web 2.0 community, since
folksonomies represent the core data of social resource sharing systems.
However, a review of the related work on mining folksonomies reveals that the
time stamp dimension has not been considered. For example, the large number of
works dedicated to mining tri-concepts from folksonomies did not take the time
dimension into account. In this paper, we consider a folksonomy commonly
composed of triples, and we add time as a new dimension. We motivate our
approach by highlighting its range of potential applications. Then, we present
the foundations for mining quadri-concepts, provide a formal definition of the
problem, and introduce a new efficient algorithm, called QUADRICONS, for its
solution, which allows for mining folksonomies in time, i.e., d-folksonomies.
We also introduce a new closure operator that splits the induced search space
into equivalence classes whose smallest elements are the quadri-minimal
generators. Experiments carried out on large-scale real-world datasets
highlight the good performance of our algorithm.
Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules
Association rules are among the most widely employed data analysis methods in
the field of Data Mining. An association rule is a form of partial implication
between two sets of binary variables. In the most common approach, association
rules are parameterized by a lower bound on their confidence, which is the
empirical conditional probability of their consequent given the antecedent,
and/or by some other parameter bounds such as "support" or deviation from
independence. We study here notions of redundancy among association rules from
a fundamental perspective. We see each transaction in a dataset as an
interpretation (or model) in the propositional logic sense, and consider
existing notions of redundancy, that is, of logical entailment, among
association rules, of the form "any dataset in which this first rule holds must
obey also that second rule, therefore the second is redundant". We discuss
several existing alternative definitions of redundancy between association
rules and provide new characterizations and relationships among them. We show
that the main alternatives we discuss correspond actually to just two variants,
which differ in the treatment of full-confidence implications. For each of
these two notions of redundancy, we provide a sound and complete deduction
calculus, and we show how to construct complete bases (that is,
axiomatizations) of absolutely minimum size in terms of the number of rules. We
explore finally an approach to redundancy with respect to several association
rules, and fully characterize its simplest case of two partial premises.
Comment: LMCS accepted paper
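The two parameters mentioned in this abstract, support and confidence, can be computed directly from a transaction dataset. The sketch below is only an illustration of those definitions (confidence as the empirical conditional probability of the consequent given the antecedent); the function names and the toy dataset are hypothetical and do not come from the paper.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Empirical conditional probability of `consequent` given `antecedent`."""
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        raise ValueError("antecedent does not occur in any transaction")
    return sum(1 for t in covered if consequent <= t) / len(covered)


# Hypothetical toy dataset: four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]

conf_a_b = confidence(frozenset("a"), frozenset("b"), transactions)  # partial rule
conf_c_a = confidence(frozenset("c"), frozenset("a"), transactions)  # full-confidence rule
supp_ab = support(frozenset("ab"), transactions)
```

Here the rule c → a holds with full confidence, while a → b is a partial implication, which is the distinction on which the two redundancy variants in the abstract differ.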
Closed Association Rules
In this paper we present a new basis for association rules called Closed Association Rules (CR). This basis contains all valid association rules that can be generated from frequent closed itemsets. CR is a lossless representation of all association rules. Regarding the number of rules, our basis lies between all association rules (AR) and minimal non-redundant association rules (MNR), filling a gap between them. The new basis provides a framework for some other bases, and we show that MNR is a subset of CR. Our experiments show that CR is a good alternative to all association rules. The number of generated rules can be much smaller, and nothing besides the frequent closed itemsets is required.
An Efficient Hybrid Algorithm for Mining Frequent Closures and Generators
Conference site: http://cla2008.inf.upol.cz/. The effective construction of many association rule bases requires the computation of both frequent closed and frequent generator itemsets (FCIs/FGs). However, these two tasks are rarely combined. Most of the existing solutions apply levelwise breadth-first traversal, though depth-first traversal, depending on data characteristics, is often superior. Hence, we address here a hybrid algorithm that combines the two different traversals. The proposed algorithm, Eclat-Z, extracts frequent itemsets (FIs) in a depth-first way. Then, the algorithm filters FCIs and FGs among FIs in a levelwise manner, and associates the generators to their closures. In Eclat-Z we present a generic technique for extending an arbitrary FI-miner algorithm in order to support the generation of minimal non-redundant association rules too. Experimental results indicate that Eclat-Z outperforms pure levelwise methods in most cases.
Computing Functional Dependencies with Pattern Structures
The treatment of many-valued data with FCA has been achieved by means of scaling. This method has some drawbacks, since the size of the resulting formal contexts usually depends on the number of different values that are present in a table, which can be very large.
Pattern structures have been proved to deal with many-valued data, offering a viable and sound alternative to scaling in order to represent and analyze sets of many-valued data with FCA.
Functional dependencies have already been dealt with in FCA using the binarization of a table, that is, creating a formal context out of a set of data. Unfortunately, although this method is standard and simple, it has an important drawback: the resulting context is quadratic in the number of objects w.r.t. the original set of data.
In this paper, we examine how we can extract the functional dependencies that hold in a set of data using pattern structures. This allows us to build an equivalent concept lattice while avoiding the binarization step, and thus comes with better concept representation and computation. Postprint (published version)
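The quadratic blow-up of the binarization approach mentioned in this abstract can be illustrated with a small sketch. It assumes the usual pairwise construction, in which each pair of rows becomes one object of the formal context and is described by the attributes on which the two rows agree; the function names and the example table are hypothetical, and the pattern-structure method of the paper is not implemented here.

```python
from itertools import combinations


def agreement_context(rows, attributes):
    """Binarize a table: one entry per pair of rows, holding the attributes
    on which the two rows agree. The number of entries is n * (n - 1) / 2."""
    return {
        (i, j): frozenset(a for a in attributes if rows[i][a] == rows[j][a])
        for i, j in combinations(range(len(rows)), 2)
    }


def fd_holds(lhs, rhs, context):
    """The dependency lhs -> rhs holds if every pair of rows that agrees
    on all of lhs also agrees on all of rhs."""
    return all(rhs <= agree for agree in context.values() if lhs <= agree)


# Hypothetical example table with three attributes.
attributes = ["city", "country", "zip"]
rows = [
    {"city": "Paris", "country": "FR", "zip": "75001"},
    {"city": "Paris", "country": "FR", "zip": "75002"},
    {"city": "Lyon", "country": "FR", "zip": "69001"},
    {"city": "Rome", "country": "IT", "zip": "00100"},
]

context = agreement_context(rows, attributes)
city_to_country = fd_holds(frozenset({"city"}), frozenset({"country"}), context)
country_to_city = fd_holds(frozenset({"country"}), frozenset({"city"}), context)
```

With four rows the binarized context already has six objects, and the count grows quadratically with the number of rows, which is the drawback the abstract sets out to avoid.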