443 research outputs found
Visualizing and Interacting with Concept Hierarchies
Concept Hierarchies and Formal Concept Analysis are theoretically well
grounded and largely experimented methods. They rely on line diagrams called
Galois lattices for visualizing and analysing object-attribute sets. Galois
lattices are visually seducing and conceptually rich for experts. However they
present important drawbacks due to their concept oriented overall structure:
analysing what they show is difficult for non experts, navigation is
cumbersome, interaction is poor, and scalability is a deep bottleneck for
visual interpretation even for experts. In this paper we introduce semantic
probes as a means to overcome many of these problems and extend usability and
application possibilities of traditional FCA visualization methods. Semantic
probes are visual user centred objects which extract and organize reduced
Galois sub-hierarchies. They are simpler, clearer, and they provide a better
navigation support through a rich set of interaction possibilities. Since probe
driven sub-hierarchies are limited to users focus, scalability is under control
and interpretation is facilitated. After some successful experiments, several
applications are being developed with the remaining problem of finding a
compromise between simplicity and conceptual expressivity
Computing iceberg concept lattices with Titanic
International audienceWe introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: Computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows an important gain in efficiency, especially for weakly correlated data
A conceptual approach to gene expression analysis enhanced by visual analytics
The analysis of gene expression data is a complex task for biologists wishing to understand the role of genes in the formation of diseases such as cancer. Biologists need greater support when trying to discover, and comprehend, new relationships within their data. In this paper, we describe an approach to the analysis of gene expression data where overlapping groupings are generated by Formal Concept Analysis and interactively analyzed in a tool called CUBIST. The CUBIST workflow involves querying a semantic database and converting the result into a formal context, which can be simplified to make it manageable, before it is visualized as a concept lattice and associated charts
Can FCA-based Recommender System Suggest a Proper Classifier?
The paper briefly introduces multiple classifier systems and describes a new
algorithm, which improves classification accuracy by means of recommendation of
a proper algorithm to an object classification. This recommendation is done
assuming that a classifier is likely to predict the label of the object
correctly if it has correctly classified its neighbors. The process of
assigning a classifier to each object is based on Formal Concept Analysis. We
explain the idea of the algorithm with a toy example and describe our first
experiments with real-world datasets.Comment: 10 pages, 1 figure, 4 tables, ECAI 2014, workshop "What FCA can do
for "Artifficial Intelligence
Multiway pruning for efficient iceberg cubing
Effective pruning is essential for efficient iceberg cube computation. Previous studies have focused on exclusive pruning: regions of a search space that do not satisfy some condition are excluded from computation. In this paper we propose inclusive and anti-pruning. With inclusive pruning, necessary conditions that solutions must satisfy are identified and regions that can not be reached by such conditions are pruned from computation. With anti-pruning, regions of solutions are identified and pruning is not applied. We propose the multiway pruning strategy combining exclusive, inclusive and anti-pruning with bounding aggregate functions in iceberg cube computation. Preliminary experiments demonstrate that the multiway-pruning strategy improves the efficiency of iceberg cubing algorithms with only exclusive pruning
Diamond Dicing
In OLAP, analysts often select an interesting sample of the data. For
example, an analyst might focus on products bringing revenues of at least 100
000 dollars, or on shops having sales greater than 400 000 dollars. However,
current systems do not allow the application of both of these thresholds
simultaneously, selecting products and shops satisfying both thresholds. For
such purposes, we introduce the diamond cube operator, filling a gap among
existing data warehouse operations.
Because of the interaction between dimensions the computation of diamond
cubes is challenging. We compare and test various algorithms on large data sets
of more than 100 million facts. We find that while it is possible to implement
diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a
hundred times faster than popular database engines (including a row-store and a
column-store).Comment: 29 page
Conceptual Views on Tree Ensemble Classifiers
Random Forests and related tree-based methods are popular for supervised
learning from table based data. Apart from their ease of parallelization, their
classification performance is also superior. However, this performance,
especially parallelizability, is offset by the loss of explainability.
Statistical methods are often used to compensate for this disadvantage. Yet,
their ability for local explanations, and in particular for global
explanations, is limited. In the present work we propose an algebraic method,
rooted in lattice theory, for the (global) explanation of tree ensembles. In
detail, we introduce two novel conceptual views on tree ensemble classifiers
and demonstrate their explanatory capabilities on Random Forests that were
trained with standard parameters
- …