The Coron System
Coron is a domain- and platform-independent, multi-purpose data mining
toolkit that not only incorporates a rich collection of data mining
algorithms but also supports a number of auxiliary operations. To the best of
our knowledge, no other data mining toolkit is designed specifically for itemset
extraction and association rule generation in the way Coron is.
Coron also provides support for preparing and filtering data, and for
interpreting the extracted units of knowledge.
Efficient Mining of Frequent Closures with Precedence Links and Associated Generators
The effective construction of many association rule bases requires the computation of frequent closures, generators, and precedence links between closures. However, these tasks are rarely combined, and no scalable algorithm exists at present for their joint computation. We propose here a method that solves this challenging problem in two separate steps. First, we introduce a new algorithm called Touch for finding frequent closed itemsets (FCIs) and their generators (FGs). Touch applies depth-first traversal, and experimental results indicate that this algorithm is highly efficient and outperforms its levelwise competitors. Second, we propose another algorithm called Snow for efficiently extracting the precedence links from the output of Touch. To do so, we apply hypergraph theory. Snow is a generic algorithm that can be used with any FCI/FG-miner. The two algorithms, Touch and Snow, provide a complete solution for constructing iceberg lattices. Furthermore, due to their modular design, parts of the algorithms can also be used independently.
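To make the terms concrete, here is a minimal brute-force sketch of what frequent closed itemsets and their generators are. It is not Touch: it simply enumerates every itemset and applies the definitions, so it is only practical on toy data. The dataset and all function names are hypothetical and chosen for illustration.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, not from the paper).
TRANSACTIONS = [frozenset(t) for t in ("acd", "bce", "abce", "be", "abce")]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Largest itemset shared by all transactions that contain `itemset`.

    Assumes at least one transaction contains `itemset`.
    """
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering)


def closures_with_generators(transactions, min_supp):
    """Map each frequent closed itemset to its frequent generators.

    Assumes min_supp >= 1. Exponential in the number of items.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            x = frozenset(combo)
            supp = support(x, transactions)
            if supp < min_supp:
                continue
            # x is a generator if removing any single item strictly
            # increases the support (no proper subset has equal support).
            if all(support(x - {item}, transactions) > supp for item in x):
                result.setdefault(closure(x, transactions), []).append(x)
    return result


if __name__ == "__main__":
    for closed, gens in closures_with_generators(TRANSACTIONS, 3).items():
        print(sorted(closed), [sorted(g) for g in gens])
```

Checking only the subsets obtained by removing one item is enough, because support can only grow when items are removed.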
An Efficient Hybrid Algorithm for Mining Frequent Closures and Generators
Conference site: http://cla2008.inf.upol.cz/. The effective construction of many association rule bases requires the computation of both frequent closed and frequent generator itemsets (FCIs/FGs). However, these two tasks are rarely combined. Most of the existing solutions apply levelwise breadth-first traversal, though depth-first traversal, depending on data characteristics, is often superior. Hence, we propose here a hybrid algorithm that combines the two traversals. The proposed algorithm, Eclat-Z, extracts frequent itemsets (FIs) in a depth-first way. Then, the algorithm filters FCIs and FGs among the FIs in a levelwise manner, and associates the generators with their closures. In Eclat-Z we present a generic technique for extending an arbitrary FI-miner algorithm so that it also supports the generation of minimal non-redundant association rules. Experimental results indicate that Eclat-Z outperforms pure levelwise methods in most cases.
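The levelwise filtering step can be sketched as follows. This is an illustrative reading of the abstract, not the Eclat-Z implementation: it assumes some FI miner has already produced a complete table mapping each frequent itemset to its support, and all names and the sample table are hypothetical.

```python
from collections import defaultdict


def filter_closed_and_generators(fi_support):
    """Split a complete table of frequent itemsets into FCIs and FGs.

    fi_support maps frozenset -> support. Levels are visited by itemset
    size; each itemset is compared with its subsets one item smaller.
    """
    by_size = defaultdict(list)
    for itemset in fi_support:
        by_size[len(itemset)].append(itemset)

    closed = set(fi_support)
    generators = set(fi_support)
    for size in sorted(by_size):
        for x in by_size[size]:
            for item in x:
                sub = x - {item}
                if sub in fi_support and fi_support[sub] == fi_support[x]:
                    closed.discard(sub)    # sub has an equal-support superset
                    generators.discard(x)  # x has an equal-support subset
    return closed, generators


def associate(closed, generators, fi_support):
    """Attach each generator to its closure: the closed superset
    that has the same support."""
    links = defaultdict(list)
    for g in generators:
        for c in closed:
            if g <= c and fi_support[g] == fi_support[c]:
                links[c].append(g)
    return dict(links)


# Hypothetical FI table (minimum support 3 on a five-transaction toy dataset).
fs = frozenset
FI_SUPPORT = {
    fs(): 5, fs("a"): 3, fs("b"): 4, fs("c"): 4, fs("e"): 4,
    fs("ac"): 3, fs("bc"): 3, fs("be"): 4, fs("ce"): 3, fs("bce"): 3,
}

if __name__ == "__main__":
    closed, gens = filter_closed_and_generators(FI_SUPPORT)
    for c, gs in associate(closed, gens, FI_SUPPORT).items():
        print(sorted(c), [sorted(g) for g in gs])
```

Because the filter only needs itemsets and supports, it can in principle sit behind any FI miner, which is the generic aspect the abstract points to.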
Why and How Knowledge Discovery Can Be Useful for Solving Problems with CBR
In this talk, we discuss and illustrate the links between knowledge discovery in databases (KDD), knowledge representation and reasoning (KRR), and case-based reasoning (CBR). KDD techniques, especially those based on Formal Concept Analysis (FCA), are well formalized and allow the design of concept lattices from binary and complex data. These concept lattices provide a realistic basis for knowledge base organization and ontology engineering. More generally, they can be used for representing knowledge and reasoning in knowledge systems and CBR systems as well.
International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI 2013, Beijing, China, August 4 2013)
This second edition of the FCA4AI workshop (the first edition was associated with the ECAI 2012 Conference, see http://www.fca4ai.hse.ru/) shows again that there are many AI researchers interested in FCA. Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept lattice and a system of dependencies (implications) which can be used for many AI needs, e.g. knowledge processing involving learning, knowledge discovery, knowledge representation and reasoning, ontology engineering, as well as information retrieval and text processing. Thus, there exist many natural links between FCA and AI. Accordingly, the focus in this workshop was on how FCA can support AI activities (knowledge processing) and how FCA can be extended in order to help AI researchers solve new and complex problems in their domains.
Context-based Grouping and Recommendation in MANETs
We propose in this chapter a context grouping mechanism for context distribution over MANETs. Context distribution is becoming a key aspect of successful context-aware applications in mobile and ubiquitous computing environments. Such applications need, for adaptation purposes, context information that is acquired by multiple context sensors distributed over the environment. Nevertheless, applications are not interested in all available context information. Context distribution mechanisms have to cope with the dynamicity that characterizes MANETs and also prevent context information from being delivered to nodes (and applications) that are not interested in it. Our grouping mechanism organizes the distribution of context information in groups whose definition is context based: each context group is defined by a set of criteria (e.g. the shared location and interest) and has a dissemination set, which controls the information that can be shared in the group. We propose a personalized and dynamic way of defining and joining groups by providing a lattice-based classification and recommendation mechanism that analyzes the interrelations between groups and users, and recommends new groups to users based on their interests and preferences.
Finding frequent closed itemsets with an extended version of the Eclat algorithm
Apriori is the most well-known algorithm for finding frequent itemsets
(FIs) in a dataset. For generating interesting association rules, we also need
the so-called frequent closed itemsets (FCIs) that form a subset of FIs. Apriori
has a simple extension called Apriori-Close that can filter FCIs among
FIs. However, it is known that vertical itemset mining algorithms outperform
the Apriori-like levelwise algorithms. Eclat is another well-known vertical
miner that can produce the same output as Apriori, i.e. it also finds the FIs
in a dataset. Here we propose an extension of Eclat, called Eclat-Close, that
can filter FCIs among FIs. This way, Eclat-Close can be used as an alternative
to Apriori-Close. Experimental results show that Eclat-Close performs much
better than Apriori-Close, especially on dense, highly correlated datasets.
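As a rough illustration of vertical mining, the sketch below runs a plain depth-first Eclat over transaction-id sets (tidsets) and then keeps, for each distinct tidset, the largest itemset that has it. That post-filter is an assumption made for this example and is not necessarily how Eclat-Close works; the dataset and function names are hypothetical.

```python
# Toy transaction database (hypothetical data, not from the paper).
TRANSACTIONS = [set("acd"), set("bce"), set("abce"), set("be"), set("abce")]


def eclat(transactions, min_supp):
    """Depth-first vertical miner: returns {itemset: tidset} for all
    non-empty frequent itemsets."""
    vertical = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vertical.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset of prefix + item), all frequent
        for idx, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = tids
            narrower = []
            for other, other_tids in candidates[idx + 1:]:
                common = tids & other_tids  # tidset intersection
                if len(common) >= min_supp:
                    narrower.append((other, common))
            if narrower:
                extend(itemset, narrower)

    start = sorted(
        ((item, frozenset(tids)) for item, tids in vertical.items()
         if len(tids) >= min_supp),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


def closed_itemsets(frequent):
    """Keep the largest itemset per tidset; return {itemset: support}."""
    largest = {}
    for itemset, tids in frequent.items():
        best = largest.get(tids)
        if best is None or len(itemset) > len(best):
            largest[tids] = itemset
    return {itemset: len(tids) for tids, itemset in largest.items()}


if __name__ == "__main__":
    for itemset, supp in closed_itemsets(eclat(TRANSACTIONS, 3)).items():
        print(sorted(itemset), supp)
```

The filter relies on the fact that all itemsets sharing a tidset have the same closure, which is itself frequent and therefore appears in the miner's output.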
A Formal Concept Analysis Approach to Association Rule Mining: The QuICL Algorithms
Association rule mining (ARM) is the task of identifying meaningful implication rules exhibited in a data set. Most research has focused on extracting frequent item (FI) sets and has thus fallen short of the overall ARM objective. The FI miners fail to identify the upper covers that are needed to generate a set of association rules whose size can be exploited by an end user. An alternative to FI mining can be found in formal concept analysis (FCA), a branch of applied mathematics. FCA derives a concept lattice whose concepts identify closed FI sets and whose connections identify the upper covers. However, most FCA algorithms construct a complete lattice and therefore include item sets that are not frequent. An iceberg lattice, on the other hand, is a concept lattice whose concepts contain only FI sets. Only three algorithms to construct an iceberg lattice were found in the literature. Given that an iceberg concept lattice provides an analysis tool to succinctly identify association rules, this study investigated additional algorithms to construct an iceberg concept lattice. This report presents the development and analysis of the Quick Iceberg Concept Lattice (QuICL) algorithms. These algorithms provide incremental construction of an iceberg lattice. QuICL uses recursion instead of iteration to navigate the lattice and establish connections, thereby eliminating costly processing incurred by past algorithms. The QuICL algorithms were evaluated against leading FI miners and FCA construction algorithms using benchmarks cited in the literature. Results demonstrate that QuICL provides performance on the order of FI miners yet additionally derives the upper covers. QuICL, when combined with known algorithms to extract a basis of association rules from a lattice, offers a best-known ARM solution. Beyond this, the QuICL algorithms have proved to be very efficient, providing order-of-magnitude gains over other incremental lattice construction algorithms.
For example, on the Mushroom data set, QuICL completes in less than 3 seconds, whereas past algorithms exceed 200 seconds. On T10I4D100k, QuICL completes in less than 120 seconds, whereas past algorithms approach 10,000 seconds. QuICL is proved to be the best-known all-around incremental lattice construction algorithm. Runtime complexity is shown to be O(l d i), where l is the cardinality of the lattice, d is the average degree of the lattice, and i is a mean function on the frequent item extents.
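To show what the connections of an iceberg lattice are, the sketch below takes a set of frequent closed itemsets and links each one to its immediate closed supersets. It is a naive quadratic procedure, not QuICL, and it assumes the lattice is ordered by itemset inclusion (the orientation of "upper" is an assumption here). The input set and the function name are hypothetical.

```python
def cover_relation(closed_itemsets):
    """Map each closed itemset to its immediate closed supersets.

    d covers c when c is a proper subset of d and no other closed
    itemset lies strictly between them.
    """
    closed = list(closed_itemsets)
    covers = {c: [] for c in closed}
    for c in closed:
        supersets = [d for d in closed if c < d]  # proper supersets of c
        for d in supersets:
            if not any(c < m < d for m in supersets):
                covers[c].append(d)
    return covers


# Hypothetical frequent closed itemsets of a toy dataset.
CLOSED = [frozenset(s) for s in ("", "c", "ac", "be", "bce")]

if __name__ == "__main__":
    for node, successors in cover_relation(CLOSED).items():
        print(sorted(node), "->", [sorted(s) for s in successors])
```

Each pair produced this way corresponds to one edge of the lattice diagram, so the total number of pairs is the lattice cardinality times its average degree, the l and d of the complexity bound quoted above.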