3 research outputs found

    Cheating to achieve Formal Concept Analysis over a large formal context

    Researchers are facing one of the main problems of the Information Era: as more articles become electronically available, it gets harder to follow trends across the different domains of research. Cheap, coherent, and fast-to-construct knowledge models of research domains will be in high demand as information becomes unmanageable. While Formal Concept Analysis (FCA) has been widely used in several areas to construct knowledge artifacts for this purpose (ontology development, information retrieval, software refactoring, knowledge discovery), the large number of documents and the terminology used in research domains make it a poor option, because of the high computational cost and the humanly unprocessable output. In this article we propose a novel heuristic that creates a taxonomy from a large term-document dataset using Latent Semantic Analysis and Formal Concept Analysis. We provide and discuss its implementation on a real dataset from the Software Architecture community, obtained from the ISI Web of Knowledge (4,400 documents).
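    The abstract applies FCA to a term-document context. As a minimal sketch of the underlying idea (not the paper's LSA-based heuristic), a toy context with hypothetical documents and terms can be used to enumerate formal concepts via the two derivation operators:

```python
# Toy Formal Concept Analysis: objects are documents, attributes are terms.
# The context and names below are invented for illustration only; this is
# naive enumeration, not the paper's scalable LSA+FCA heuristic.
from itertools import combinations

context = {
    "doc1": {"fca", "lattice"},
    "doc2": {"fca", "lsa"},
    "doc3": {"lsa", "taxonomy"},
}
attributes = set().union(*context.values())

def extent(terms):
    """All documents that contain every given term."""
    return {d for d, ts in context.items() if terms <= ts}

def intent(docs):
    """All terms shared by every given document."""
    return set.intersection(*(context[d] for d in docs)) if docs else set(attributes)

# A formal concept is a pair (A, B) with extent(B) == A and intent(A) == B.
concepts = set()
for r in range(len(attributes) + 1):
    for terms in combinations(sorted(attributes), r):
        a = extent(set(terms))
        b = intent(a)
        if extent(b) == a:
            concepts.add((frozenset(a), frozenset(b)))

print(len(concepts))  # 7 concepts for this tiny context
```

    Even this three-document example yields seven concepts, which illustrates the abstract's point: on thousands of documents the lattice becomes computationally expensive and humanly unprocessable.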

    FCA2VEC: Embedding Techniques for Formal Concept Analysis

    Embedding large, high-dimensional data into low-dimensional vector spaces is a necessary task for computationally coping with contemporary data sets. Superseding latent semantic analysis, recent approaches such as word2vec and node2vec are well-established tools in this realm. In the present paper we add to this line of research by introducing fca2vec, a family of embedding techniques for Formal Concept Analysis (FCA). Our investigation contributes to two distinct lines of research. First, we enable the application of FCA notions to large data sets. In particular, we demonstrate how the cover relation of a concept lattice can be retrieved from a computationally feasible embedding. Second, we show an enhancement of the classical node2vec approach in low dimensions. In both directions, FCA's overall constraint of explainable results is preserved. We evaluate our novel procedures by computing fca2vec on several data sets: wiki44 (a dense part of the Wikidata knowledge graph), the Mushroom data set, and a publication network derived from the FCA community.
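    The abstract's first contribution is recovering the cover relation of a concept lattice from an embedding. For reference, a naive direct computation of the cover relation (the transitive reduction of the subconcept order) on a small, hypothetical set of concepts, identified here by their extents, looks like this; it is not the fca2vec procedure itself:

```python
# Cover relation of a small concept lattice, computed directly from
# extent inclusion. Extents below are invented for illustration.
extents = [
    frozenset(),                           # bottom concept
    frozenset({"doc1"}),
    frozenset({"doc2"}),
    frozenset({"doc1", "doc2"}),
    frozenset({"doc1", "doc2", "doc3"}),   # top concept
]

def covers(lower, upper):
    """upper covers lower: strictly above, with no concept in between."""
    return lower < upper and not any(lower < mid < upper for mid in extents)

cover_pairs = [(a, b) for a in extents for b in extents if covers(a, b)]
print(len(cover_pairs))  # 5 edges in the lattice's Hasse diagram
```

    This quadratic scan over all concept pairs is exactly what becomes infeasible on large lattices, motivating the embedding-based retrieval the paper proposes.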