13 research outputs found

    Mining Definitions from RDF Annotations Using Formal Concept Analysis

    The popularization and quick growth of Linked Open Data (LOD) have led to challenging issues regarding quality assessment and data exploration of the RDF triples that shape the LOD cloud. In particular, we are interested in the completeness of the data and in their potential to provide concept definitions in terms of necessary and sufficient conditions. In this work we propose a novel technique based on Formal Concept Analysis which organizes RDF data into a concept lattice. This allows data exploration as well as the discovery of implication rules, which are used to automatically detect missing information and then to complete the RDF data. Moreover, this is a way of reconciling syntax and semantics in the LOD cloud. Finally, experiments on the DBpedia knowledge base show that the approach is well-founded and effective.
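
    The following minimal Python sketch (not the authors' implementation; the resources, predicates and confidence threshold are invented for illustration) mimics the idea: RDF subjects act as formal objects described by the predicates they use, and a resource that matches the premise of a high-confidence rule but lacks its conclusion is flagged as a candidate for completion.

```python
from itertools import combinations

context = {                      # subject -> set of predicates it uses (toy data)
    "dbr:Berlin": {"dbo:country", "dbo:populationTotal", "dbo:mayor"},
    "dbr:Paris":  {"dbo:country", "dbo:populationTotal", "dbo:mayor"},
    "dbr:Lyon":   {"dbo:country", "dbo:populationTotal", "dbo:mayor"},
    "dbr:Nancy":  {"dbo:country", "dbo:populationTotal"},   # mayor not stated
}
attributes = sorted(set().union(*context.values()))

def extent(attrs):
    """Objects whose description contains every attribute in attrs."""
    return {o for o, desc in context.items() if attrs <= desc}

def rules(min_conf=0.75):
    """Yield (premise, conclusion, confidence) rules with single-attribute conclusions."""
    for size in (1, 2):
        for premise in map(set, combinations(attributes, size)):
            support = extent(premise)
            if not support:
                continue
            for concl in set(attributes) - premise:
                conf = len(extent(premise | {concl})) / len(support)
                if conf >= min_conf:
                    yield premise, concl, conf

for premise, concl, conf in rules():
    # Resources matching the premise but lacking the conclusion are candidates
    # for completion with a new triple using the predicate `concl`.
    missing = [o for o in extent(premise) if concl not in context[o]]
    if missing:
        print(f"{sorted(premise)} -> {concl} (conf={conf:.2f}); complete: {missing}")
```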

    FCA modelling for CPS interoperability optimization in Industry 4.0

    Cyber-Physical Systems (CPS) are driving the fourth Industrial Revolution (Industry 4.0), which promises highly flexible production and easier, more accessible participation of all parties involved in business processes. The Industry 4.0 production paradigm is characterized by the autonomous behaviour and intercommunication of its production elements across all levels of the manufacturing process, so one of the key concepts in this domain is the semantic interoperability of systems. This goal can benefit from formal methods that are well known in various scientific domains such as artificial intelligence, machine learning and algebra. The present investigation therefore focuses on Formal Concept Analysis (FCA), a promising approach to structure the knowledge and to optimize CPS interoperability.

    Characterizing covers of functional dependencies using FCA

    Functional dependencies (FDs) can be used for various important operations on data, for instance, checking the consistency and the quality of a database (including databases that contain complex data). Consequently, a generic framework that allows mining a sound, complete, non-redundant and yet compact set of FDs is an important tool for many different applications. There are different definitions of such sets of FDs (usually called covers). In this paper, we present the characterization of two different kinds of covers for FDs in terms of pattern structures. The convenience of such a characterization is that it allows an easy implementation of efficient mining algorithms which can later be adapted to other kinds of similar dependencies. Finally, we present empirical evidence that the proposed approach can perform better than state-of-the-art FD miner algorithms on large databases.
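
    The underlying notion can be illustrated with a short sketch: an FD X -> Y holds when the partition of tuples induced by agreement on X refines the partition induced by Y. The toy Python snippet below (not the paper's pattern-structure algorithm; the relation is made up) checks exactly that.

```python
from collections import defaultdict

rows = [
    {"emp": "ana",  "dept": "sales", "city": "Lyon"},
    {"emp": "ben",  "dept": "sales", "city": "Lyon"},
    {"emp": "carl", "dept": "it",    "city": "Nancy"},
]

def partition(attrs):
    """Group row indices by their values on attrs (the 'agreement' classes)."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(i)
    return list(groups.values())

def holds(lhs, rhs):
    """X -> Y holds iff every X-class is contained in some Y-class."""
    rhs_classes = partition(rhs)
    return all(any(c <= d for d in rhs_classes) for c in partition(lhs))

print(holds(["dept"], ["city"]))   # True: same dept implies same city here
print(holds(["city"], ["emp"]))    # False
```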

    Discovering and Comparing Relational Knowledge, the Example of Pharmacogenomics

    Article in Proceedings of the EKAW Doctoral Consortium 2018, co-located with the 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW 2018). Pharmacogenomics (PGx) studies the influence of the genome on drug response, with knowledge units in the form of ternary genomic variation-drug-phenotype relationships. State-of-the-art PGx knowledge is available in the biomedical literature as well as in specialized knowledge bases. Additionally, Electronic Health Records of hospitals can be mined to discover such knowledge units, which can then be compared with the state of the art in order to confirm or temper relationships lacking validation or a clinical counterpart. However, both discovering and comparing PGx relationships face multiple challenges: heterogeneous descriptions of knowledge units (languages, vocabularies and granularities), missing values and the importance of the time dimension. In this research, we aim at proposing a framework based on Semantic Web technologies and Formal Concept Analysis to discover, represent and compare PGx knowledge units. We present the first results, which consist in creating an integrated knowledge base of PGx knowledge units from various sources and defining comparison methods, as well as the remaining issues to tackle.
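
    As a rough illustration only (the abstract does not fix a data model), a PGx knowledge unit could be held as a ternary (variation, drug, phenotype) tuple whose components are normalized to pivot vocabularies before comparison. In the hypothetical Python sketch below, all identifiers and the normalization table are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PGxUnit:
    variation: str   # e.g. a dbSNP identifier
    drug: str        # e.g. a DrugBank identifier
    phenotype: str   # e.g. a MeSH or HPO identifier
    source: str      # "literature", "knowledge base", "EHR", ...

# Hypothetical mapping from local labels to pivot-vocabulary identifiers.
NORMALIZE = {"warfarine": "drugbank:DB00682", "warfarin": "drugbank:DB00682"}

def normalize(unit: PGxUnit) -> PGxUnit:
    return PGxUnit(unit.variation, NORMALIZE.get(unit.drug, unit.drug),
                   unit.phenotype, unit.source)

def same_relationship(u: PGxUnit, v: PGxUnit) -> bool:
    """Crude comparison: identical components after normalization."""
    u, v = normalize(u), normalize(v)
    return (u.variation, u.drug, u.phenotype) == (v.variation, v.drug, v.phenotype)

state_of_art  = PGxUnit("rs9923231", "warfarin",  "mesh:D006470", "knowledge base")
ehr_candidate = PGxUnit("rs9923231", "warfarine", "mesh:D006470", "EHR")
print(same_relationship(state_of_art, ehr_candidate))  # True
```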

    Characterizing approximate-matching dependencies in formal concept analysis with pattern structures

    Functional dependencies (FDs) provide valuable knowledge on the relations between attributes of a data table. A functional dependency holds when the values of an attribute can be determined by another. It has been shown that FDs can be expressed in terms of partitions of tuples that are in agreement w.r.t. the values taken by some subsets of attributes. To extend the use of FDs, several generalizations have been proposed. In this work, we study approximate-matching dependencies that generalize FDs by relaxing the constraints on the attributes, i.e. agreement is based on a similarity relation rather than on equality. Such dependencies are attracting attention in the database field since they relax the crisp nature of FDs, extending their application to many different fields, such as data quality, data mining, behavior analysis, data cleaning or data partitioning, among others. We show that these dependencies can be formalized in the framework of Formal Concept Analysis (FCA) using a previous formalization introduced for standard FDs. Our new results state that, starting from the conceptual structure of a pattern structure and generalizing the notion of relation between tuples, approximate-matching dependencies can be characterized as implications in a pattern concept lattice. We finally show how to use basic FCA algorithms to construct a pattern concept lattice that entails these dependencies after a slight and tractable binarization of the original data.
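
    A toy sketch of the relaxed agreement relation (not the paper's pattern-concept-lattice characterization): tuples agree on an attribute when their values are similar, here within an invented numeric tolerance, rather than strictly equal. The relation and thresholds below are made up.

```python
rows = [
    {"sensor": "s1", "temp": 20.1, "level": 5.0},
    {"sensor": "s2", "temp": 20.3, "level": 5.1},
    {"sensor": "s3", "temp": 35.0, "level": 9.0},
]
TOLERANCE = {"temp": 0.5, "level": 0.3}

def similar(attr, x, y):
    """Similarity relation replacing strict equality."""
    return abs(x - y) <= TOLERANCE[attr] if attr in TOLERANCE else x == y

def approx_holds(lhs, rhs):
    """X ~> Y holds when every pair of tuples similar on X is also similar on Y."""
    for i, r in enumerate(rows):
        for s in rows[i + 1:]:
            if all(similar(a, r[a], s[a]) for a in lhs) and \
               not all(similar(a, r[a], s[a]) for a in rhs):
                return False
    return True

print(approx_holds(["temp"], ["level"]))   # True on this toy data
print(approx_holds(["level"], ["sensor"])) # False: s1 and s2 have close levels
```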

    Using formal concept analysis for checking the structure of an ontology in LOD: the example of DBpedia

    Linked Open Data (LOD) constitute a large and growing collection of inter-domain data sets. LOD are represented as RDF graphs that allow interlinking with ontologies, facilitating data integration, knowledge engineering and, in a certain sense, knowledge discovery. However, the ontologies associated with LOD are of varying quality and are not necessarily adapted to all data sets under study. In this paper, we propose an original approach, based on Formal Concept Analysis (FCA), which builds an optimal lattice-based structure for classifying RDF resources w.r.t. their predicates. We introduce the notion of lattice annotation, which enables comparing our classification with an ontology schema in order to confirm subsumption axioms or suggest new ones. We conducted experiments on the DBpedia data set and its domain ontologies, the DBpedia Ontology and YAGO. Results show that our approach is well-founded and illustrate the ability of FCA to guide a possible structuring of LOD.
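
    The comparison step can be pictured with a small sketch (it is not the paper's lattice-annotation procedure, and the resources, types and predicates are invented): classify resources by the predicates they use, then suggest a subsumption A subClassOf B whenever every A-resource satisfies the predicate description shared by the B-resources.

```python
resources = {
    "dbr:R1": {"types": {"City", "Settlement"}, "preds": {"dbo:mayor", "dbo:populationTotal"}},
    "dbr:R2": {"types": {"City", "Settlement"}, "preds": {"dbo:mayor", "dbo:populationTotal"}},
    "dbr:R3": {"types": {"Settlement"},         "preds": {"dbo:populationTotal"}},
}

def extent_of_type(t):
    return {r for r, d in resources.items() if t in d["types"]}

def description(objs):
    """Predicates shared by all resources in objs (a concept intent)."""
    return set.intersection(*(resources[r]["preds"] for r in objs)) if objs else set()

types = {"City", "Settlement"}
for a in types:
    for b in types - {a}:
        # Annotate: does every A-resource satisfy the shared description of B?
        if all(description(extent_of_type(b)) <= resources[r]["preds"]
               for r in extent_of_type(a)):
            print(f"candidate axiom: {a} subClassOf {b}")
```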

    Three Approaches for Mining Definitions from Relational Data in the Web of Data

    In this paper we study a classification process on relational data that can be applied to the web of data. We start with a set of objects, relations between objects, and extensional classes of objects. We then study how to provide a definition for each class, i.e. to build an intensional description of the class w.r.t. the relations involving class objects. To this end, we propose three different approaches based on Formal Concept Analysis (FCA), redescription mining and Minimum Description Length (MDL). Relying on experiments on RDF data from DBpedia, where objects correspond to resources, relations to predicates and classes to categories, we compare the capabilities and the complementarity of the three approaches. This research work is a contribution to understanding the connections between FCA and other data mining formalisms which are gaining importance in knowledge discovery, namely redescription mining and MDL.
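
    None of the three approaches reduces to the few lines below, but their shared goal can be sketched: take the description common to all members of an extensional class as a necessary condition, then test whether it is also sufficient. The objects, predicates and category in this Python snippet are invented.

```python
objects = {
    "dbr:A": {"dbo:author", "dbo:publisher", "dbo:isbn"},
    "dbr:B": {"dbo:author", "dbo:publisher", "dbo:isbn"},
    "dbr:C": {"dbo:author", "dbo:director"},
}
category_books = {"dbr:A", "dbr:B"}   # extensional class

# Necessary condition: predicates shared by every member of the class.
necessary = set.intersection(*(objects[o] for o in category_books))

# Sufficient? The objects matching that description should be exactly the class.
matching = {o for o, preds in objects.items() if necessary <= preds}
print("necessary condition:", sorted(necessary))
print("definition (necessary and sufficient):", matching == category_books)
```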

    Elements About Exploratory, Knowledge-Based, Hybrid, and Explainable Knowledge Discovery

    Knowledge Discovery in Databases (KDD), and especially pattern mining, can be interpreted along several dimensions, namely data, knowledge, problem-solving and interactivity. These dimensions are not disconnected and have a direct impact on the quality, applicability, and efficiency of KDD. Accordingly, we discuss some objectives of KDD based on these dimensions, namely exploration, knowledge orientation, hybridization, and explanation. The data space and the pattern space can be explored in several ways, depending on specific evaluation functions and heuristics, possibly related to domain knowledge. Furthermore, numerical data are complex, and supervised numerical machine learning methods are usually the best candidates for efficiently mining such data. However, the work and output of numerical methods are most of the time hard to understand, while symbolic methods are usually more intelligible. This calls for hybridization, combining numerical and symbolic mining methods to improve the applicability and interpretability of KDD. Moreover, suitable explanations about the operating models and possible subsequent decisions should complete KDD, and this is far from being the case at the moment. To illustrate these dimensions and objectives, we analyze a concrete case about the mining of biological data, where we characterize these dimensions and their connections. We also discuss the dimensions and objectives in the framework of Formal Concept Analysis and we draw some perspectives for future research.

    Using Redescriptions and Formal Concept Analysis for Mining Definitions in Linked Data

    In this article, we compare the use of Redescription Mining (RM) and Association Rule Mining (ARM) for discovering class definitions in Linked Open Data (LOD). RM aims at mining alternate descriptions from two datasets related to the same set of individuals. We reuse RM for providing category definitions in DBpedia in terms of necessary and sufficient conditions (NSC). Implications and association rules can be jointly used for mining category definitions, again in terms of NSC. In this paper, we first recall the basics of redescription mining and make precise the principles of definition discovery. Then we detail a series of experiments carried out on datasets extracted from DBpedia. We analyze the different outputs of the RM and ARM applications, and we discuss the strengths and limitations of both approaches. Finally, we point out possible improvements of the approaches.
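
    The NSC criterion can be pictured with a short sketch (not the paper's RM or ARM pipeline; the resources and descriptions are invented): a description D defines a category C when both rules C -> D and D -> C hold with confidence 1 over the resources.

```python
resources = {
    "dbr:X": {"cat": {"Category:French_films"}, "desc": {"dbo:director", "dbo:country=France"}},
    "dbr:Y": {"cat": {"Category:French_films"}, "desc": {"dbo:director", "dbo:country=France"}},
    "dbr:Z": {"cat": set(),                     "desc": {"dbo:director"}},
}

def confidence(premise_key, premise, concl_key, concl):
    """Confidence of the rule premise -> concl over all resources."""
    support = [r for r in resources.values() if premise <= r[premise_key]]
    if not support:
        return 0.0
    return sum(concl <= r[concl_key] for r in support) / len(support)

category    = {"Category:French_films"}
description = {"dbo:director", "dbo:country=France"}

necessary  = confidence("cat", category, "desc", description)   # C -> D
sufficient = confidence("desc", description, "cat", category)   # D -> C
print(f"C -> D: {necessary:.2f}, D -> C: {sufficient:.2f}")
print("definition found:", necessary == 1.0 and sufficient == 1.0)
```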
