Visual analytics in FCA-based clustering
Visual analytics is a subdomain of data analysis which combines both human
and machine analytical abilities and is applied mostly in decision-making and
data mining tasks. Triclustering, based on Formal Concept Analysis (FCA), was
developed to detect groups of objects with similar properties under similar
conditions. It is used in Social Network Analysis (SNA) and is a basis for
certain types of recommender systems. A known problem with triclustering
algorithms is that they do not always produce meaningful clusters. This article
describes a specific triclustering algorithm and a prototype of a visual
analytics platform for working with the obtained clusters. The tool is designed
as a testing framework and is intended to help an analyst grasp the results of
triclustering and recommender algorithms, and decide on the meaningfulness of
particular triclusters and recommendations.
Comment: 11 pages, 3 figures, 2 algorithms, 3rd International Conference on
Analysis of Images, Social Networks and Texts (AIST'2014). In Supplementary
Proceedings of the 3rd International Conference on Analysis of Images, Social
Networks and Texts (AIST 2014), Vol. 1197, CEUR-WS.org, 201
Semi-supervised learning on closed set lattices
We propose a new approach to semi-supervised learning using closed set lattices, which have recently been used for frequent pattern mining within the data analysis framework of Formal Concept Analysis (FCA). We present a learning algorithm, called SELF (SEmi-supervised Learning via FCA), which acts as a multiclass classifier and a label ranker for mixed-type data containing both discrete and continuous variables; only a few learning algorithms, such as decision tree-based classifiers, can handle mixed-type data directly. From both labeled and unlabeled data, SELF constructs a closed set lattice, that is, a set of data clusters partially ordered by subset inclusion. The lattice is built via FCA together with discretization of the continuous variables, and classification rules are then learned by finding maximal clusters on the lattice. Moreover, SELF can weight each classification rule using the lattice, which gives a partial order of preference over class labels. We illustrate experimentally the competitive performance of SELF in classification and ranking compared to other learning algorithms using UCI datasets.
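As an illustration of the closed set lattice mentioned in the abstract above, the following is a minimal sketch, not the SELF algorithm itself. It enumerates the formal concepts (closed object/attribute pairs) of a tiny binary context by brute force. The toy context and the function names `extent`, `intent` and `formal_concepts` are assumptions made for this example; discretization, rule learning and rule weighting are not shown.

```python
from itertools import combinations

def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)

def intent(context, objs, all_attrs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)

def formal_concepts(context):
    """Brute-force enumeration: close every attribute subset and collect the results.

    Exponential in the number of attributes, so only suitable for toy contexts.
    """
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), r):
            objs = extent(context, frozenset(subset))
            concepts.add((objs, intent(context, objs, all_attrs)))
    return concepts

# Hypothetical toy context: object -> set of binary attributes.
toy_context = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
}

concepts = formal_concepts(toy_context)
for objs, attrs in sorted(concepts, key=lambda c: (len(c[1]), sorted(c[1]))):
    print(sorted(objs), sorted(attrs))
```

Ordering the resulting pairs by inclusion of their object sets yields the lattice. Practical FCA implementations use far more efficient enumeration than this exhaustive closure.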
Concept discovery innovations in law enforcement: a perspective.
In the past decades, the amount of information available to law enforcement agencies has increased significantly. Most of this information is in textual form; however, analyses have mainly focused on structured data. In this paper, we give an overview of the concept discovery projects at the Amsterdam-Amstelland police, where Formal Concept Analysis (FCA) is used as a text mining instrument. FCA is combined with statistical techniques such as Hidden Markov Models (HMM) and Emergent Self Organizing Maps (ESOM). Combining this concept discovery and refinement technique with statistical techniques for analyzing high-dimensional data resulted not only in new insights but often in actual improvements to the investigation procedures.
Formal concept analysis; Intelligence led policing; Knowledge discovery;
Distributed Computation of Generalized One-Sided Concept Lattices on Sparse Data Tables
In this paper we present a study of a distributed version of the algorithm for generalized one-sided concept lattices (GOSCL), which provides a special case of the fuzzy version of the data analysis approach called formal concept analysis (FCA). Methods of this type create a conceptual model of the input data based on the theory of concept lattices and have been successfully applied in several domains. GOSCL can create one-sided concept lattices for data tables whose attributes are of different types and are processed as fuzzy sets. One of the problems with creating FCA-based models is their computational complexity. To reduce computation times, we have designed a distributed version of the GOSCL algorithm. The algorithm works especially well on data for which the number of newly generated concepts is reduced, i.e., on sparse input data tables, which are common in domains such as text mining and information retrieval. We therefore present experimental results on sparse data tables to show the applicability of the algorithm to generated data and to selected text-mining datasets.
A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data
Neural-symbolic approaches to machine learning incorporate the advantages
from both connectionist and symbolic methods. Typically, these models employ a
first module based on a neural architecture to extract features from complex
data. Then, these features are processed as symbols by a symbolic engine that
provides reasoning, concept structures, composability, better generalization
and out-of-distribution learning among other possibilities. However, neural
approaches to the grounding of symbols in sensory data, albeit powerful, still
require heavy training and tedious labeling for the most part. This paper
presents a new symbolic-only method for the generation of hierarchical concept
structures from complex spatial sensory data. The approach is based on
Bateson's notion of difference as the key to the genesis of an idea or a
concept. Following his suggestion, the model extracts atomic features from raw
data by computing elemental sequential comparisons in a stream of multivariate
numerical values. Higher-level constructs are built from these features by
subjecting them to further comparisons in a recursive process. At any stage in
the recursion, a concept structure may be obtained from these constructs and
features by means of Formal Concept Analysis. Results show that the model is
able to produce fairly rich yet human-readable conceptual representations
without training. Additionally, the concept structures obtained through the
model (i) present high composability, which potentially enables the generation
of 'unseen' concepts, (ii) allow formal reasoning, and (iii) have inherent
abilities for generalization and out-of-distribution learning. Consequently,
this method may offer an interesting angle to current neural-symbolic research.
Future work is required to develop a training methodology so that the model can
be tested against a larger dataset.
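The recursive comparison idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: the symbols `up`, `down` and `same`, the `keeps(...)`/`drops(...)` constructs, the tolerance parameter `tol` and all function names are hypothetical choices made for this example.

```python
def compare(prev, curr, tol=0.0):
    """Elemental comparison of two consecutive values of one channel."""
    if curr - prev > tol:
        return "up"
    if prev - curr > tol:
        return "down"
    return "same"

def first_order(stream, names, tol=0.0):
    """One set of atomic features per consecutive pair of samples."""
    return [
        {f"{n}:{compare(a[i], b[i], tol)}" for i, n in enumerate(names)}
        for a, b in zip(stream, stream[1:])
    ]

def second_order(features):
    """Compare consecutive feature sets: which features persist, which disappear."""
    return [
        {f"keeps({x})" for x in a & b} | {f"drops({x})" for x in a - b}
        for a, b in zip(features, features[1:])
    ]

# Hypothetical two-channel stream of numerical samples.
stream = [(0.0, 5.0), (1.0, 5.0), (2.0, 4.0)]
level1 = first_order(stream, names=("x", "y"))
level2 = second_order(level1)
print(level1)
print(level2)
```

Each feature set can serve as one row of a formal context (time step as object, features as attributes), from which a concept structure could then be derived with Formal Concept Analysis at any level of the recursion.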