681 research outputs found
Revisiting Numerical Pattern Mining with Formal Concept Analysis
In this paper, we investigate the problem of mining numerical data in the
framework of Formal Concept Analysis. The usual way is to use a scaling
procedure --transforming numerical attributes into binary ones-- leading either
to a loss of information or of efficiency, in particular w.r.t. the volume of
extracted patterns. By contrast, we propose to directly work on numerical data
in a more precise and efficient way, and we prove it. To this end, the notions of
closed patterns, generators and equivalence classes are revisited in the
numerical context. Moreover, two original algorithms are proposed and used in
an evaluation involving real-world data, showing the superiority of the
present approach.
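The trade-off attributed to scaling above can be illustrated with a minimal sketch. The threshold-based scaling below is only one simple scaling scheme chosen for illustration; the function name `scale`, the sample values and the thresholds are all assumptions and do not come from the paper.

```python
def scale(value, thresholds):
    """Binary attributes of the form '<=t' that hold for a numerical value.

    Illustrative threshold scaling only; not the procedure from the paper.
    """
    return frozenset(f"<={t}" for t in thresholds if value <= t)


values = [4, 5, 6, 9]          # hypothetical numerical attribute values

# Coarse scaling: few binary attributes, but 4 and 5 become indistinguishable.
coarse = {v: scale(v, [5, 8]) for v in values}

# Fine scaling: one threshold per observed value keeps every distinction,
# at the cost of as many binary attributes as there are distinct values.
fine = {v: scale(v, values) for v in values}

print(coarse[4] == coarse[5])            # information lost by coarse scaling
print(len(set(fine.values())))           # number of distinct fine descriptions
```

With coarse thresholds the values 4 and 5 receive the same binary description, while the fine variant separates them only by multiplying the number of binary attributes, which is the loss of information or of efficiency the abstract refers to.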
The Coron System
Coron is a domain- and platform-independent, multi-purpose data mining
toolkit, which not only incorporates a rich collection of data mining
algorithms, but also supports a number of auxiliary operations. To the best of
our knowledge, no other data mining toolkit is designed specifically for itemset
extraction and association rule generation as Coron is.
Coron also provides support for preparing and filtering data, and for
interpreting the extracted units of knowledge.
Characterization of order-like dependencies with formal concept analysis
Functional Dependencies (FDs) play a key role in many fields
of the relational database model, one of the most widely used database
systems. FDs have also been applied in data analysis, data quality, knowledge
discovery and the like, but in a very limited scope, because of their
fixed semantics. To overcome this limitation, many generalizations have
been defined to relax the crisp definition of FDs. FDs and a few of their
generalizations have been characterized with Formal Concept Analysis,
which reveals itself to be an interesting unified framework for characterizing
dependencies, that is, understanding and computing them in a
formal way. In this paper, we extend this work by taking into account
order-like dependencies. Such dependencies, well defined in the database
field, consider an ordering on the domain of each attribute, and not simply
an equality relation as with standard FDs.
Peer reviewed. Postprint (published version).
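The difference between an equality-based FD and a dependency based on an ordering can be sketched as follows. The order-based check uses one simple pointwise reading (if a tuple is below another on every left-hand attribute, it must be below it on every right-hand attribute); this reading, the function names and the sample table are assumptions made for illustration, not the definitions used in the paper.

```python
from itertools import product


def holds_fd(rows, lhs, rhs):
    """Standard FD lhs -> rhs: agreeing on lhs implies agreeing on rhs."""
    return all(
        any(r[a] != s[a] for a in lhs) or all(r[b] == s[b] for b in rhs)
        for r, s in product(rows, repeat=2)
    )


def holds_order_dep(rows, lhs, rhs):
    """Assumed pointwise order dependency: r <= s on lhs implies r <= s on rhs."""
    return all(
        any(not (r[a] <= s[a]) for a in lhs) or all(r[b] <= s[b] for b in rhs)
        for r, s in product(rows, repeat=2)
    )


# Hypothetical table: equal levels always share a salary, but a higher
# level does not always come with a higher salary.
rows = [
    {"level": 1, "salary": 100},
    {"level": 2, "salary": 150},
    {"level": 2, "salary": 150},
    {"level": 3, "salary": 120},
]

print(holds_fd(rows, ["level"], ["salary"]))
print(holds_order_dep(rows, ["level"], ["salary"]))
```

In this table the FD holds while the order-based dependency fails, because the last tuple has a higher level but a lower salary than the tuples at level 2.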
Mining Biclusters of Similar Values with Triadic Concept Analysis
Biclustering numerical data became a popular data-mining task in the
early 2000s, especially for analysing gene expression data. A bicluster
reflects a strong association between a subset of objects and a subset of
attributes in a numerical object/attribute data table. So-called biclusters of
similar values can be thought of as maximal sub-tables with close values. Only a few
methods address a complete, correct and non-redundant enumeration of such
patterns, which is a well-known intractable problem, and no formal framework
exists. In this paper, we introduce important links between biclustering and
formal concept analysis. More specifically, we show for the first time that Triadic
Concept Analysis (TCA) provides a nice mathematical framework for
biclustering. Interestingly, existing TCA algorithms, which usually apply to
binary data, can be used (directly or with slight modifications) after a
preprocessing step for extracting maximal biclusters of similar values.
Comment: Concept Lattices and their Applications (CLA) (2011)
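The notion of a maximal sub-table with close values can be made concrete with a small sketch. Here "close" is assumed to mean that the largest and smallest values of the sub-table differ by at most a threshold `theta`; this criterion, the function names and the sample table are illustrative assumptions, and the brute-force check below is not the TCA-based method of the paper.

```python
def is_similar(table, rows, cols, theta):
    """True if all values of the sub-table lie within a range of theta."""
    vals = [table[i][j] for i in rows for j in cols]
    return max(vals) - min(vals) <= theta


def is_maximal(table, rows, cols, theta):
    """True if the sub-table is similar and no row or column can be added."""
    if not is_similar(table, rows, cols, theta):
        return False
    other_rows = set(range(len(table))) - rows
    other_cols = set(range(len(table[0]))) - cols
    if any(is_similar(table, rows | {i}, cols, theta) for i in other_rows):
        return False
    if any(is_similar(table, rows, cols | {j}, theta) for j in other_cols):
        return False
    return True


# Hypothetical object/attribute table (rows are objects, columns attributes).
table = [
    [1, 2, 2, 1, 6],
    [2, 1, 1, 0, 6],
    [2, 2, 1, 7, 6],
    [8, 9, 2, 6, 7],
]

print(is_maximal(table, {0, 1, 2}, {0, 1, 2}, theta=1))
print(is_maximal(table, {0, 1}, {0, 1}, theta=1))
```

The first sub-table cannot be extended without exceeding the threshold, whereas the second one is similar but not maximal, since the third row can still be added.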
A Note on Classification-Based Reasoning and Semi-Structured Objects
Conference with proceedings, not peer-reviewed; international audience. In this talk, we present work in progress on the representation and manipulation of semi-structured data in an object-based representation environment. This research work is carried out in the field of knowledge representation and reasoning in order to build intelligent systems (according to artificial intelligence standards).
FCA and Knowledge Discovery (Tutorial)
International audience. In this tutorial we will introduce and discuss how FCA and two main extensions, namely Pattern Structures and Relational Concept Analysis (RCA), can be used for knowledge discovery purposes, especially in pattern and rule mining, in data and knowledge processing, data analysis, and classification. Indeed, FCA is aimed at building a concept lattice starting from a binary table where objects are in rows and attributes in columns. But FCA can deal with more complex data. Pattern Structures make it possible to consider objects with descriptions based on numbers, intervals, sequences, trees and general graphs. RCA was introduced for taking into account relational data and especially relations between objects. These two extensions rely on adapted FCA algorithms and can be efficiently used in real-world applications for knowledge discovery, e.g. text mining and ontology engineering, information retrieval and recommendation, analysis of sequences based on stability, semantic web and classification of Linked Open Data, biclustering, and functional dependencies.
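How formal concepts arise from a binary table can be sketched with the two standard derivation operators of FCA. The tiny context, the names `intent`, `extent` and `concepts`, and the naive enumeration over all object subsets are illustrative assumptions; practical FCA algorithms avoid this exponential enumeration.

```python
from itertools import combinations

# Hypothetical binary table: each object is mapped to the attributes it has.
context = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}
attributes = {"a", "b", "c"}


def intent(objs):
    """Attributes shared by all the given objects."""
    shared = set(attributes)
    for o in objs:
        shared &= context[o]
    return shared


def extent(attrs):
    """Objects that have all the given attributes."""
    return {o for o, desc in context.items() if attrs <= desc}


def concepts():
    """Naively collect every (extent, intent) pair closed under both operators."""
    found = set()
    objs = sorted(context)
    for k in range(len(objs) + 1):
        for subset in combinations(objs, k):
            b = intent(subset)
            found.add((frozenset(extent(b)), frozenset(b)))
    return found


for ext, itt in sorted(concepts(), key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```

Ordering the resulting pairs by inclusion of their extents gives the concept lattice of the table.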
Hiérarchisation des règles d'association en fouille de textes (Hierarchical Organization of Association Rules in Text Mining)
Extraction of association rules is widely used as a data mining method.
However, one of the limitations of this approach comes from the large number of
extracted rules and the difficulty for a human expert of dealing with all
of these rules. We propose to solve this problem by structuring the set of
rules into a hierarchy. The expert can then explore the rules, moving
from one rule to a more general one by going up the hierarchy
or, conversely, to a more specific one. Rules are structured at two
levels. The global level aims at building a hierarchy from the set of
extracted rules. To this end, we define a first type of rule subsumption relying on Galois
lattices. The second level consists of a local and more detailed analysis of
each rule. It generates, for a given rule, a set of generalization rules
structured into a local hierarchy. This leads to the definition of a second
type of subsumption, which comes from inductive logic programming
and integrates a terminological model.
Classification problems in object-based representation systems
Conference with proceedings and peer review. Classification is a process that consists of two dual operations: generating a set of classes and then classifying given objects into the created classes. The class generation may be understood as a learning process and object classification as a problem-solving process. The goal of this position paper is to introduce and to make precise the notion of a classification problem in object-based representation systems, e.g. a query against a class hierarchy, to define a subsumption relation between classification problems, and to analyze the way a classification problem can be solved with respect to a class hierarchy.
Une introduction aux logiques de descriptions (An Introduction to Description Logics)
This research report presents an elementary introduction to description logics, which form a family of knowledge representation languages. Description logics make it possible to represent the knowledge of a reference domain using concepts (classes of individuals), roles (relations between classes) and individuals. A semantics is associated with concepts, roles and individuals by means of an interpretation. Concepts and roles are organized into hierarchies on which the classification and instantiation processes operate; these processes are the basis of terminological reasoning.