Galois Connections between Semimodules and Applications in Data Mining
In [1] a generalisation of Formal Concept Analysis was introduced
with data mining applications in mind, K-Formal Concept Analysis,
where incidences take values in certain kinds of semirings, instead
of the standard Boolean carrier set. A fundamental result was missing
there, namely the second half of the equivalent of the main theorem of
Formal Concept Analysis. In this continuation we introduce the structural
lattice of such generalised contexts, providing a limited equivalent
to the main theorem of K-Formal Concept Analysis, which allows the standard
version to be interpreted as a privileged case in yet another direction.
We motivate our results by providing instances of their use to analyse
the confusion matrices of multiple-input multiple-output classifiers.
Towards the algebraization of Formal Concept Analysis over complete dioids
Proceedings of: XVII Congreso Español sobre Tecnologías y Lógica Fuzzy (ESTYLF 2014). Zaragoza, 5-7 February 2014. Complete dioids are already complete residuated lattices. Formal contexts with entries in them generate Concept Lattices with the help of the polar maps. Previous work has already established the spectral nature of some formal concepts for contexts over certain kinds of dioids. This paper tries to raise awareness that linear algebra over exotic semirings should be one place to look to understand the properties of FCA over L-lattices. FJVA was partially supported by EU FP7 project LiMoSINe (contract 288024) for this research. CPM was partially supported by the Spanish Government-Comisión Interministerial de Ciencia y Tecnología project 2011-268007/TEC.
Two Information-Theoretic Tools to Assess the Performance of Multi-class Classifiers
We develop two tools to analyze the behavior of multiple-class, or multi-class, classifiers by means of entropic measures on their confusion matrix or contingency table. First we obtain a balance equation on the entropies that captures interesting properties of the classifier. Second, by normalizing this balance equation we first obtain a 2-simplex in a three-dimensional entropy space and then the de Finetti entropy diagram or entropy triangle. We also give examples of the assessment of classifiers with these tools. Funding: Spanish Government-Comisión Interministerial de Ciencia y Tecnología projects 2008-06382/TEC and 2008-02473/TEC and the regional projects S-505/TIC/0223 (DGUI-CM) and CCG08-UC3M/TIC-4457 (Comunidad Autónoma de Madrid – UC3M).
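The abstract above does not spell out its balance equation. As a hedged illustration only, the sketch below assumes the form H_U = ΔH + 2·MI + VI, where H_U is the entropy of uniform marginals, ΔH the divergence of the actual marginals from uniformity, MI the mutual information and VI the variation of information. This is an algebraic identity for any joint distribution, but whether it matches the paper's exact notation is an assumption. The function names `entropy_balance` and `triangle_coordinates` are hypothetical, and the input is assumed to be a confusion matrix with at least two classes on one side.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def entropy_balance(counts):
    """Split the uniform entropy of a confusion matrix into three parts.

    Assumed form of the balance equation (not quoted from the abstract):
        H_U = delta_H + 2 * MI + VI
    """
    total = sum(sum(row) for row in counts)
    joint = [[c / total for c in row] for row in counts]
    p_x = [sum(row) for row in joint]        # row marginal
    p_y = [sum(col) for col in zip(*joint)]  # column marginal
    h_x, h_y = entropy(p_x), entropy(p_y)
    h_xy = entropy([p for row in joint for p in row])
    h_uniform = math.log2(len(p_x)) + math.log2(len(p_y))
    return {
        "h_uniform": h_uniform,
        "delta_h": h_uniform - (h_x + h_y),
        "mutual_information": h_x + h_y - h_xy,
        "variation_of_information": 2 * h_xy - h_x - h_y,
    }


def triangle_coordinates(counts):
    """Normalize the balance by H_U so the three coordinates sum to 1."""
    balance = entropy_balance(counts)
    h_uniform = balance["h_uniform"]
    return (
        balance["delta_h"] / h_uniform,
        2 * balance["mutual_information"] / h_uniform,
        balance["variation_of_information"] / h_uniform,
    )
```

Under this assumed normalization, a perfectly diagonal, balanced confusion matrix such as `[[5, 0], [0, 5]]` puts all the mass on the mutual-information coordinate, while a uniform matrix such as `[[1, 1], [1, 1]]` puts it all on the variation-of-information coordinate.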
The evaluation of data sources using multivariate entropy tools
We introduce from first principles an analysis of the information content of multivariate distributions as information sources. Specifically, we generalize a balance equation and a visualization device, the Entropy Triangle, for multivariate distributions and find notable differences with similar analyses done on joint distributions as models of information channels.
As an example application, we extend a framework for the analysis of classifiers to also encompass the analysis of data sets. With such tools we analyze a handful of UCI machine learning tasks to start addressing the question of how well datasets convey the information they are supposed to capture about the phenomena they stand for.
Spectral Lattices of reducible matrices over completed idempotent semifields
Proceedings of: 10th International Conference on Concept Lattices and Their Applications (CLA 2013). La Rochelle, France, October 15-18, 2013. Previous work has shown a relation between L-valued extensions of FCA and the spectra of some matrices related to L-valued contexts. We investigate the spectra of reducible matrices over completed idempotent semifields in the framework of dioids, naturally-ordered semirings, that encompass several of those extensions. Considering special sets of eigenvectors also brings complete lattices into the picture, and we argue that such structure may be more important than standard eigenspace structure for matrices over completed idempotent semifields. FJVA is supported by EU FP7 project LiMoSINe (contract 288024). CPM has been partially supported by the Spanish Government-Comisión Interministerial de Ciencia y Tecnología project TEC2011-26807 for this paper.
Four-fold Formal Concept Analysis based on Complete Idempotent Semifields
Formal Concept Analysis (FCA) is a well-known supervised boolean data-mining technique rooted in Lattice and Order Theory, which has several extensions to, e.g., fuzzy and idempotent semirings. At the heart of FCA lies a Galois connection between two powersets. In this paper we extend the FCA formalism to include all four Galois connections between four different semivector spaces over idempotent semifields, at the same time. The result is $\overline{K}$-four-fold Formal Concept Analysis ($\overline{K}$-4FCA), where $\overline{K}$ is the idempotent semifield biasing the analysis. Since complete idempotent semifields come in dually-ordered pairs (e.g., the complete max-plus and min-plus semirings), the basic construction shows dual-order-, row–column- and Galois-connection-induced dualities that appear simultaneously a number of times to provide the full spectrum of variability. Our results lead to a fundamental theorem of $\overline{K}$-four-fold Formal Concept Analysis that properly defines quadrilattices as 4-tuples of (order-dually) isomorphic lattices of vectors, and we discuss its relevance vis-à-vis previous formal conceptual analyses and some affordances of their results.
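For readers unfamiliar with the Galois connection between two powersets mentioned above, here is a minimal sketch of the standard Boolean case only, not of the four-fold semifield construction. The toy context and the names `intent`, `extent` and `is_concept` are hypothetical and chosen for illustration.

```python
# Hypothetical toy context: each object maps to the set of attributes it has.
TOY_CONTEXT = {
    "duck": {"flies", "swims"},
    "swan": {"flies", "swims"},
    "frog": {"swims"},
}


def intent(context, objects):
    """Attributes shared by every object in `objects` (one polar map)."""
    shared = set().union(*context.values())  # start from all attributes
    for obj in objects:
        shared &= context[obj]
    return shared


def extent(context, attributes):
    """Objects that have every attribute in `attributes` (the other polar map)."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def is_concept(context, objects, attributes):
    """A formal concept is a pair that each polar map sends to the other."""
    return (intent(context, objects) == attributes
            and extent(context, attributes) == objects)
```

The two maps are order-reversing, and a set of objects A is contained in `extent(B)` exactly when the attribute set B is contained in `intent(A)`, which is the Galois-connection property. In the toy context, the pair ({duck, swan}, {flies, swims}) is a formal concept.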
The Singular Value Decomposition over Completed Idempotent Semifields
In this paper, we provide a basic technique for Lattice Computing: an analogue of the Singular Value Decomposition for rectangular matrices over complete idempotent semifields (i-SVD). These algebras are already complete lattices and many of their instances (the complete schedule algebra or completed max-plus semifield, the tropical algebra, and the max-times algebra) are useful in a range of applications, e.g., morphological processing. We further the task of eliciting the relation between i-SVD and the extension of Formal Concept Analysis to complete idempotent semifields (K-FCA) started in a prior work. We find that for a matrix with entries considered in a complete idempotent semifield, the Galois connection at the heart of K-FCA provides two bases of left- and right-singular vectors to choose from for reconstructing the matrix. These are join-dense or meet-dense sets of object or attribute concepts of the concept lattice created by the connection, and they are almost surely not pairwise orthogonal. We conclude with an attempted analogue of the fundamental theorem of linear algebra that gathers all results and discuss it in the wider setting of matrix factorization. This research was funded by the Spanish Government-MinECo project TEC2017-84395-P and the Dept. of Research and Innovation of Madrid Regional Authority project EMPATIA-CM (Y2018/TCS-5046).
K-Formal Concept Analysis as linear algebra over idempotent semifields
We report on progress in characterizing K-valued FCA in algebraic terms, where K is an idempotent semifield. In this data mining-inspired approach, incidences are matrices and sets of objects and attributes are vectors. The algebraization allows us to write matrix-calculus formulae describing the polars and the fixpoint equations for extents and intents. Adopting also the point of view of the theory of linear operators between vector spaces, we explore the similarities and differences of the idempotent semimodules of extents and intents with the subspaces related to a linear operator in standard algebra. This allows us to shed some light on Formal Concept Analysis from the point of view of the theory of linear operators over idempotent semimodules.
In the opposite direction, we state the importance of FCA-related concepts for dual order homomorphisms of linear spaces over idempotent semifields, especially congruences, the lattices of extents, intents and formal concepts.
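The abstract does not give its matrix-calculus formulae for the polars, so the following is not a reconstruction of them. It is only a hedged sketch of a neighbouring, well-known fact: over the max-plus semiring, the matrix-vector product and its residual form an order-preserving Galois connection (an adjunction). The function names are hypothetical, and entries are assumed finite so that the infinite elements of the completed semifield need no special handling.

```python
def maxplus_matvec(matrix, vector):
    """Max-plus product: (A ⊗ x)_i = max_j (a_ij + x_j)."""
    return [max(a + x for a, x in zip(row, vector)) for row in matrix]


def maxplus_residual(matrix, target):
    """Largest x with A ⊗ x <= y, computed as x_j = min_i (y_i - a_ij)."""
    n_cols = len(matrix[0])
    return [min(t - row[j] for row, t in zip(matrix, target))
            for j in range(n_cols)]


def leq(u, v):
    """Componentwise order on vectors."""
    return all(a <= b for a, b in zip(u, v))
```

With these definitions, `leq(maxplus_matvec(A, x), y)` holds exactly when `leq(x, maxplus_residual(A, y))` holds, and applying the product after the residual never exceeds the original target. Fixpoints of such composed maps play a role loosely comparable to the extents and intents of the abstract.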
Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle
Data transformation, e.g., feature transformation and selection, is an integral part of any machine learning procedure. In this paper, we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze the transformation of a discrete, multivariate source of information $\overline{X}$ into a discrete, multivariate sink of information $\overline{Y}$ related by a distribution $P_{\overline{X}\,\overline{Y}}$. The first contribution is a decomposition of the maximal potential entropy of $(\overline{X}, \overline{Y})$, which we call a balance equation, into its (a) non-transferable, (b) transferable, but not transferred, and (c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate channel multivariate entropy triangle, is a visual exploratory tool to assess the effectiveness of multivariate data transformations in transferring information from input to output variables. We also show how these decompositions and balance equations also apply to the entropies of $\overline{X}$ and $\overline{Y}$, respectively, and generate entropy triangles for them. As an example, we present the application of these tools to the assessment of information transfer efficiency for Principal Component Analysis and Independent Component Analysis as unsupervised feature transformation and selection procedures in supervised classification tasks. This research was funded by the Spanish Government-MinECo projects TEC2014-53390-P and TEC2017-84395-P.
- …