Discovery of the D-basis in binary tables based on hypergraph dualization
Discovery of (strong) association rules, or implications, is an important
task in data management, and it finds application in artificial intelligence,
data mining and the semantic web. We introduce a novel approach
for the discovery of a specific set of implications, called the D-basis, that provides
a representation for a reduced binary table, based on the structure of
its Galois lattice. At the core of the method are the D-relation defined in
the lattice theory framework, and the hypergraph dualization algorithm that
allows us to effectively produce the set of transversals for a given Sperner hypergraph.
The latter algorithm, first developed by specialists from Rutgers
Center for Operations Research, has already found numerous applications in
solving optimization problems in database theory, artificial intelligence and
game theory. One application of the method is for analysis of gene expression
data related to a particular phenotypic variable, and some initial testing is
done for the data provided by the University of Hawaii Cancer Center.
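To make the dualization step concrete, here is a minimal sketch of what "the set of transversals" of a hypergraph means. It is a brute-force enumeration for tiny inputs only, not the dualization algorithm the abstract refers to. The function name `minimal_transversals` and the example hypergraph are assumptions made for illustration.

```python
from itertools import combinations


def minimal_transversals(vertices, edges):
    """Brute-force sketch: list the minimal transversals (hitting sets).

    A transversal is a vertex set that meets every edge; it is minimal
    if no proper subset is also a transversal. Exponential in the number
    of vertices, so this is only meant for toy examples.
    """
    edges = [frozenset(e) for e in edges]
    found = []
    # Candidates are visited by increasing size, so a candidate that
    # contains an earlier result cannot be minimal and is skipped.
    for size in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), size):
            c = frozenset(cand)
            if any(t <= c for t in found):
                continue
            if all(c & e for e in edges):
                found.append(c)
    return found


# Hypothetical Sperner hypergraph: no edge contains another.
H = [{1, 2}, {2, 3}, {3, 4}]
for t in minimal_transversals({1, 2, 3, 4}, H):
    print(sorted(t))
```

On this example the minimal transversals are {1, 3}, {2, 3} and {2, 4}; taking the transversals of that family again gives back the original edges, which is the duality the term "dualization" refers to.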
On the complexity of enumerating pseudo-intents
We investigate whether the pseudo-intents of a given formal context can efficiently be enumerated. We show that they cannot be enumerated in a specified lexicographic order with polynomial delay unless P=NP. Furthermore we show that if the restriction on the order of enumeration is removed, then the problem becomes at least as hard as enumerating minimal transversals of a given hypergraph. We introduce the notion of minimal pseudo-intents and show that recognizing minimal pseudo-intents is polynomial. Despite their less complicated nature, surprisingly it turns out that minimal pseudo-intents cannot be enumerated in output-polynomial time unless P=NP.
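For readers unfamiliar with the term, the following sketch applies the standard recursive definition of a pseudo-intent: a set P is a pseudo-intent if it is not closed and the closure of every pseudo-intent properly contained in P is itself contained in P. It is an exponential brute-force illustration, not an enumeration method from the paper. The names `closure` and `pseudo_intents` and the three-object context are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, context, all_attrs):
    """Return attrs'': the attributes shared by all objects that have attrs.

    `context` is a list of object intents (sets of attributes). If no
    object has all of `attrs`, the closure is the full attribute set.
    """
    result = set(all_attrs)
    for row in context:
        if attrs <= row:
            result &= row
    return frozenset(result)


def pseudo_intents(context, all_attrs):
    """Brute-force sketch: list the pseudo-intents of a small context."""
    found = []
    # Visit candidates by increasing size, so every pseudo-intent that is
    # a proper subset of the candidate has already been found.
    for size in range(len(all_attrs) + 1):
        for cand in combinations(sorted(all_attrs), size):
            p = frozenset(cand)
            if closure(p, context, all_attrs) == p:
                continue  # closed sets are intents, not pseudo-intents
            if all(closure(q, context, all_attrs) <= p
                   for q in found if q < p):
                found.append(p)
    return found


# Hypothetical context: one set of attributes per object.
attributes = {"a", "b", "c"}
objects = [frozenset("ab"), frozenset("a"), frozenset("c")]
for p in pseudo_intents(objects, attributes):
    print(sorted(p), "->", sorted(closure(p, objects, attributes) - p))
```

In this toy context the pseudo-intents are {b} and {a, c}, which correspond to the implications b -> a and a, c -> b.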
On the Usability of Probably Approximately Correct Implication Bases
We revisit the notion of probably approximately correct implication bases
from the literature and present a first formulation in the language of formal
concept analysis, with the goal to investigate whether such bases represent a
suitable substitute for exact implication bases in practical use-cases. To this
end, we quantitatively examine the behavior of probably approximately correct
implication bases on artificial and real-world data sets and compare their
precision and recall with respect to their corresponding exact implication
bases. Using a small example, we also provide qualitative insight that
implications from probably approximately correct bases can still represent
meaningful knowledge from a given data set.
Comment: 17 pages, 8 figures; typos added, corrected x-label on graph