27 research outputs found

    The Coron System

    Coron is a domain- and platform-independent, multi-purpose data mining toolkit that incorporates a rich collection of data mining algorithms and also supports a number of auxiliary operations. To the best of our knowledge, no other data mining toolkit is designed specifically for itemset extraction and association rule generation as Coron is. Coron also provides support for preparing and filtering data, and for interpreting the extracted units of knowledge.

    Efficient Mining of Frequent Closures with Precedence Links and Associated Generators

    The effective construction of many association rule bases requires the computation of frequent closures, generators, and precedence links between closures. However, these tasks are rarely combined, and no scalable algorithm exists at present for their joint computation. We propose here a method that solves this challenging problem in two separate steps. First, we introduce a new algorithm called Touch for finding frequent closed itemsets (FCIs) and their generators (FGs). Touch applies depth-first traversal, and experimental results indicate that this algorithm is highly efficient and outperforms its levelwise competitors. Second, we propose another algorithm called Snow for efficiently extracting the precedence links from the output of Touch. To do so, we apply hypergraph theory. Snow is a generic algorithm that can be used with any FCI/FG-miner. The two algorithms, Touch and Snow, provide a complete solution for constructing iceberg lattices. Furthermore, due to their modular design, parts of the algorithms can also be used independently.
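
To make "precedence links between closures" concrete, here is a minimal sketch of what such links are. It is not Snow (which relies on hypergraph theory) but a naive quadratic check; the function name `precedence_links` and the sample itemsets are assumptions made for illustration only.

```python
def precedence_links(closed_itemsets):
    """Return (lower, upper) pairs where upper is an immediate successor of lower.

    Naive illustration only: `upper` covers `lower` when lower is a proper
    subset of upper and no other closed itemset lies strictly between them.
    """
    closed = [frozenset(c) for c in closed_itemsets]
    links = []
    for lower in closed:
        # All closed itemsets strictly above `lower` in the inclusion order.
        supersets = [c for c in closed if lower < c]
        for upper in supersets:
            # Keep `upper` only if nothing sits strictly between the two.
            if not any(mid < upper for mid in supersets):
                links.append((lower, upper))
    return links


# Hypothetical closed itemsets, used only to exercise the function.
sample_closed = [{"a"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
sample_links = precedence_links(sample_closed)
```

In this toy input, `{"a"}` is linked to `{"a", "b"}` and `{"a", "c"}` but not directly to `{"a", "b", "c"}`, because the two-item closures lie in between.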

    An Efficient Hybrid Algorithm for Mining Frequent Closures and Generators

    Conference site: http://cla2008.inf.upol.cz/. The effective construction of many association rule bases requires the computation of both frequent closed and frequent generator itemsets (FCIs/FGs). However, these two tasks are rarely combined. Most of the existing solutions apply levelwise breadth-first traversal, though depth-first traversal, depending on data characteristics, is often superior. Hence, we present here a hybrid algorithm that combines the two different traversals. The proposed algorithm, Eclat-Z, extracts frequent itemsets (FIs) in a depth-first way. Then, the algorithm filters FCIs and FGs among FIs in a levelwise manner, and associates the generators with their closures. With Eclat-Z we also present a generic technique for extending an arbitrary FI-miner algorithm so that it supports the generation of minimal non-redundant association rules. Experimental results indicate that Eclat-Z outperforms pure levelwise methods in most cases.
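
The levelwise filtering step described above can be sketched as follows. This is an illustrative reading of the abstract, not the Eclat-Z implementation: the function names and the sample input are assumptions, and the input is assumed to be a complete, downward-closed set of FIs with supports, including the empty itemset.

```python
from collections import defaultdict


def split_closed_and_generators(frequent):
    """Filter FCIs and FGs from a dict {frozenset: support}, level by level.

    An itemset is closed if no superset has the same support, and a generator
    if no subset has the same support. Comparing adjacent levels suffices
    because support is monotone along inclusion.
    """
    by_size = defaultdict(list)
    for itemset in frequent:
        by_size[len(itemset)].append(itemset)
    closed, generators = set(frequent), set(frequent)
    for size in sorted(by_size):
        for itemset in by_size[size]:
            for larger in by_size.get(size + 1, []):
                if itemset < larger and frequent[itemset] == frequent[larger]:
                    closed.discard(itemset)     # an equal-support superset exists
                    generators.discard(larger)  # an equal-support subset exists
    return closed, generators


def associate_generators(frequent, closed, generators):
    """Map each closed itemset to the generators sharing its support."""
    links = defaultdict(list)
    for gen in generators:
        for clo in closed:
            if gen <= clo and frequent[gen] == frequent[clo]:
                links[clo].append(gen)
    return dict(links)


# Hypothetical FIs (with supports) for four transactions: abc, ab, ac, a.
fs = frozenset
sample_fis = {fs(): 4, fs("a"): 4, fs("b"): 2, fs("c"): 2, fs("ab"): 2, fs("ac"): 2}
sample_closed, sample_generators = split_closed_and_generators(sample_fis)
sample_links = associate_generators(sample_fis, sample_closed, sample_generators)
```

On this toy input the closed itemsets are `{a}`, `{a, b}` and `{a, c}`, and their generators are the empty set, `{b}` and `{c}` respectively.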

    Why and How Knowledge Discovery Can Be Useful for Solving Problems with CBR

    In this talk, we discuss and illustrate the links between knowledge discovery in databases (KDD), knowledge representation and reasoning (KRR), and case-based reasoning (CBR). KDD techniques, especially those based on Formal Concept Analysis (FCA), are well formalized and allow the design of concept lattices from binary and complex data. These concept lattices provide a realistic basis for knowledge base organization and ontology engineering. More generally, they can be used for representing knowledge and reasoning in knowledge systems and CBR systems as well.

    International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI 2013, Beijing, China, August 4 2013)

    This second edition of the FCA4AI workshop (the first edition was associated with the ECAI 2012 Conference, see http://www.fca4ai.hse.ru/) shows again that there are many AI researchers interested in FCA. Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept lattice and a system of dependencies (implications) which can be used for many AI needs, e.g. knowledge processing involving learning, knowledge discovery, knowledge representation and reasoning, ontology engineering, as well as information retrieval and text processing. Thus, there exist many natural links between FCA and AI. Accordingly, the focus in this workshop was on how FCA can support AI activities (knowledge processing) and how FCA can be extended in order to help AI researchers solve new and complex problems in their domains.
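
As a small illustration of what "building a concept lattice" starts from, the sketch below enumerates the formal concepts (extent, intent) of a binary context by closing the object rows under intersection. It is a naive textbook-style procedure, not a method from the workshop; the function name `formal_concepts` and the toy context are assumptions.

```python
def formal_concepts(context):
    """Enumerate all (extent, intent) pairs of a binary context.

    `context` maps each object to the set of attributes it has. Every intent
    is an intersection of object rows (or the full attribute set), so the
    intents are collected by intersecting incrementally, row by row.
    """
    attributes = frozenset().union(*context.values())
    intents = {attributes}
    for row in context.values():
        intents |= {intent & frozenset(row) for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is the set of objects that have every attribute in the intent.
        extent = frozenset(obj for obj, row in context.items() if intent <= row)
        concepts.append((extent, intent))
    return concepts


# Hypothetical toy context, used only for illustration.
toy_context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}
toy_concepts = formal_concepts(toy_context)
```

Ordering the resulting concepts by inclusion of their extents gives the concept lattice; this sketch stops at enumeration and does not compute the order.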

    Context-based Grouping and Recommendation in MANETs

    We propose in this chapter a context grouping mechanism for context distribution over MANETs. Context distribution is becoming a key aspect for successful context-aware applications in mobile and ubiquitous computing environments. Such applications need, for adaptation purposes, context information that is acquired by multiple context sensors distributed over the environment. Nevertheless, applications are not interested in all available context information. Context distribution mechanisms have to cope with the dynamicity that characterizes MANETs and also prevent context information from being delivered to nodes (and applications) that are not interested in it. Our grouping mechanism organizes the distribution of context information in groups whose definition is context based: each context group is defined based on a criteria set (e.g. the shared location and interest) and has a dissemination set, which controls the information that can be shared in the group. We propose a personalized and dynamic way of defining and joining groups by providing a lattice-based classification and recommendation mechanism that analyzes the interrelations between groups and users, and recommends new groups to users based on their interests and preferences.

    Finding frequent closed itemsets with an extended version of the Eclat algorithm

    Apriori is the best-known algorithm for finding frequent itemsets (FIs) in a dataset. For generating interesting association rules, we also need the so-called frequent closed itemsets (FCIs), which form a subset of FIs. Apriori has a simple extension called Apriori-Close that can filter FCIs among FIs. However, it is known that vertical itemset mining algorithms outperform the Apriori-like levelwise algorithms. Eclat is another well-known vertical miner that can produce the same output as Apriori, i.e. it also finds the FIs in a dataset. Here we propose an extension of Eclat, called Eclat-Close, that can filter FCIs among FIs. This way Eclat-Close can be used as an alternative to Apriori-Close. Experimental results show that Eclat-Close performs much better than Apriori-Close, especially on dense, highly correlated datasets.
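
The idea of vertical mining followed by a closedness filter can be sketched as below. This is a minimal illustration, not the Eclat-Close algorithm itself: the Eclat part uses plain tidset intersection, the filter is a naive post-processing pass, and the function names and sample transactions are assumptions.

```python
def eclat(transactions, min_support):
    """Mine frequent itemsets depth-first from a vertical (item -> tidset) layout.

    `min_support` is an absolute count. Returns {frozenset: support}.
    """
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect tidsets with the remaining items to grow the prefix.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    suffix.append((other, common))
            extend(itemset, suffix)

    initial = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), initial)
    return frequent


def filter_closed(frequent):
    """Keep only itemsets with no proper superset of equal support (naive check)."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(itemset < other and support == other_support
                   for other, other_support in frequent.items())
    }


# Hypothetical transactions, used only for illustration.
sample_transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
sample_fcis = filter_closed(eclat(sample_transactions, min_support=2))
```

With a minimum support of 2, the toy data yields five FIs, of which `{a}`, `{a, b}` and `{a, c}` are closed; `{b}` and `{c}` are dropped because `{a, b}` and `{a, c}` have the same support.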

    A Formal Concept Analysis Approach to Association Rule Mining: The QuICL Algorithms

    Association rule mining (ARM) is the task of identifying meaningful implication rules exhibited in a data set. Most research has focused on extracting frequent item (FI) sets and has thus fallen short of the overall ARM objective. The FI miners fail to identify the upper covers that are needed to generate a set of association rules whose size can be exploited by an end user. An alternative to FI mining can be found in formal concept analysis (FCA), a branch of applied mathematics. FCA derives a concept lattice whose concepts identify closed FI sets and whose connections identify the upper covers. However, most FCA algorithms construct a complete lattice and therefore include item sets that are not frequent. An iceberg lattice, on the other hand, is a concept lattice whose concepts contain only FI sets. Only three algorithms to construct an iceberg lattice were found in the literature. Given that an iceberg concept lattice provides an analysis tool to succinctly identify association rules, this study investigated additional algorithms to construct an iceberg concept lattice. This report presents the development and analysis of the Quick Iceberg Concept Lattice (QuICL) algorithms. These algorithms provide incremental construction of an iceberg lattice. QuICL uses recursion instead of iteration to navigate the lattice and establish connections, thereby eliminating costly processing incurred by past algorithms. The QuICL algorithms were evaluated against leading FI miners and FCA construction algorithms using benchmarks cited in the literature. Results demonstrate that QuICL provides performance on the order of FI miners yet additionally derives the upper covers. QuICL, when combined with known algorithms to extract a basis of association rules from a lattice, offers a best known ARM solution. Beyond this, the QuICL algorithms have proved to be very efficient, providing order-of-magnitude gains over other incremental lattice construction algorithms. For example, on the Mushroom data set, QuICL completes in less than 3 seconds, whereas past algorithms exceed 200 seconds. On T10I4D100k, QuICL completes in less than 120 seconds, whereas past algorithms approach 10,000 seconds. QuICL is proved to be the best known all-around incremental lattice construction algorithm. Runtime complexity is shown to be O(l d i), where l is the cardinality of the lattice, d is the average degree of the lattice, and i is a mean function on the frequent item extents.