1,174 research outputs found

    A Novel Algorithm for Discovering Frequent Closures and Generators

    Get PDF
    The Important construction of many association rules needs the calculation of Frequent Closed Item Sets and Frequent Generator Item Sets (FCIS/FGIS). However, these two odd jobs are joined very rarely. Most of the existing methods apply level wise Breadth-First search. Though the Depth-First search depends on different characteristics of data, it is often better than others. Hence, in this paper it is named as FCFG algorithm that combines the Frequent closed item sets and frequent generators. This proposed algorithm (FCFG) extracts frequent itemsets (FIs) in a Depth-First search method. Then this algorithm extracts FCIS and FGIS from FIs by a level wise approach. Then it associates the generators to their closures. In FCFG algorithm, a generic technique is extended from an arbitrary FI-miner algorithm in order to support the generation of minimal non-redundant association rules. Experimental results indicate that FCFG algorithm performs better when compared with other level wise methods in most of the cases

    A scalable mining of frequent quadratic concepts in d-folksonomies

    Full text link
    Folksonomy mining is grasping the interest of web 2.0 community since it represents the core data of social resource sharing systems. However, a scrutiny of the related works interested in mining folksonomies unveils that the time stamp dimension has not been considered. For example, the wealthy number of works dedicated to mining tri-concepts from folksonomies did not take into account time dimension. In this paper, we will consider a folksonomy commonly composed of triples and we shall consider the time as a new dimension. We motivate our approach by highlighting the battery of potential applications. Then, we present the foundations for mining quadri-concepts, provide a formal definition of the problem and introduce a new efficient algorithm, called QUADRICONS for its solution to allow for mining folksonomies in time, i.e., d-folksonomies. We also introduce a new closure operator that splits the induced search space into equivalence classes whose smallest elements are the quadri-minimal generators. Carried out experiments on large-scale real-world datasets highlight good performances of our algorithm

    Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

    Full text link
    Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss correspond actually to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. We explore finally an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises.Comment: LMCS accepted pape

    Closed Association Rules

    Get PDF
    In this paper we present a new basis for association rules called Closed Association Rules (CR). This basis contains all valid association rules that can be generated from frequent closed itemsets. CR is a lossless representation of all association rules. Regarding the number of rules, our basis is between all association rules (AR) and minimal non-redundant association rules (MNR), filling a gap between them. The new basis provides a framework for some other bases and we show that MNR is a subset of CR. Our experiments show that CR is a good alternative for all association rules. The number of generated rules can be much less, and beside frequent closed itemsets nothing else is required

    An Efficient Hybrid Algorithm for Mining Frequent Closures and Generators

    Get PDF
    Conference site: http://cla2008.inf.upol.cz/ .International audienceThe effective construction of many association rule bases requires the computation of both frequent closed and frequent generator itemsets (FCIs/FGs). However, these two tasks are rarely combined. Most of the existing solutions apply levelwise breadth-first traversal, though depth-first traversal, depending on data characteristics, is often superior. Hence, we address here a hybrid algorithm that combines the two different traversals. The proposed algorithm, Eclat-Z, extracts frequent itemsets (FIs) in a depth-first way. Then, the algorithm filters FCIs and FGs among FIs in a levelwise manner, and associates the generators to their closures. In Eclat-Z we present a generic technique for extending an arbitrary FI-miner algorithm in order to support the generation of minimal non-redundant association rules too. Experimental results indicate that Eclat-Z outperforms pure levelwise methods in most cases

    Computing Functional Dependencies with Pattern Structures

    Get PDF
    The treatment of many-valued data with FCA has been achieved by means of scaling. This method has some drawbacks, since the size of the resulting formal contexts depends usually on the number of di erent values that are present in a table, which can be very large. Pattern structures have been proved to deal with many-valued data, offering a viable and sound alternative to scaling in order to represent and analyze sets of many-valued data with FCA. Functional dependencies have already been dealt with FCA using the binarization of a table, that is, creating a formal context out of a set of data. Unfortunately, although this method is standard and simple, it has an important drawback, which is the fact that the resulting context is quadratic in number of objects w.r.t. the original set of data. In this paper, we examine how we can extract the functional dependencies that hold in a set of data using pattern structures. This allows to build an equivalent concept lattice avoiding the step of binarization, and thus comes with better concept representation and computation.Postprint (published version
    • …
    corecore