
    Exploring finite models in the Description Logic ELgfp

    In a previous ICFCA paper we showed that, in the Description Logics EL and ELgfp, the set of general concept inclusions holding in a finite model always has a finite basis. In this paper, we address the problem of computing this basis efficiently by adapting methods from formal concept analysis.
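
    The abstract only names the approach; the sketch below is not the paper's algorithm, but a minimal illustration of the formal concept analysis machinery it refers to: the standard derivation operators of a finite formal context and a check whether an attribute implication holds, the propositional analogue of a GCI holding in a finite model. All object and attribute names are made up.

        # Minimal sketch (hypothetical data, not the paper's method) of the FCA
        # derivation operators over a finite formal context.
        context = {
            "obj1": {"A", "B"},
            "obj2": {"A", "B", "C"},
            "obj3": {"A", "C"},
        }
        all_attributes = set().union(*context.values())

        def extent(attrs):
            """All objects having every attribute in `attrs`."""
            return {g for g, row in context.items() if attrs <= row}

        def intent(objs):
            """All attributes shared by every object in `objs`."""
            rows = [context[g] for g in objs]
            return set.intersection(*rows) if rows else set(all_attributes)

        def implication_holds(premise, conclusion):
            """P -> C holds in the context iff C is contained in the closure P''."""
            return conclusion <= intent(extent(set(premise)))

        print(implication_holds({"B"}, {"A"}))   # True: every object with B also has A
        print(implication_holds({"A"}, {"C"}))   # False: obj1 has A but not C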

    Computing iceberg concept lattices with Titanic

    We introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as a starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows a significant gain in efficiency, especially for weakly correlated data.
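
    TITANIC itself is not reproduced here. As a rough sketch of the level-wise idea behind iceberg lattices, the following code generates candidate attribute sets by size, uses their support as the weight function, and keeps the closures of candidates that reach a minimum support threshold; the toy context and all names are invented for the example.

        from itertools import combinations

        # Toy formal context (made-up data): objects mapped to their attribute sets.
        context = {
            "g1": {"a", "b"},
            "g2": {"a", "b", "c"},
            "g3": {"a", "c"},
            "g4": {"b", "c"},
        }
        attributes = set().union(*context.values())

        def support(attrs):
            """Weight function: fraction of objects having all attributes in `attrs`."""
            return sum(1 for row in context.values() if attrs <= row) / len(context)

        def closure(attrs):
            """The intent of the extent of `attrs`, i.e. the closure attrs''."""
            rows = [row for row in context.values() if attrs <= row]
            return frozenset(set.intersection(*rows)) if rows else frozenset(attributes)

        def iceberg_intents(min_support):
            """Level-wise search for frequent closed attribute sets (iceberg intents)."""
            found = set()
            for size in range(len(attributes) + 1):
                frequent_at_level = False
                for cand in combinations(sorted(attributes), size):
                    if support(set(cand)) >= min_support:
                        frequent_at_level = True
                        found.add(closure(set(cand)))
                if not frequent_at_level:   # no frequent candidate left: stop early
                    break
            return found

        for b in sorted(iceberg_intents(0.5), key=len):
            print(sorted(b), support(b))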

    Polynomial growth of concept lattices, canonical bases and generators: extremal set theory in Formal Concept Analysis

    We prove that there exist three distinct, comprehensive classes of (formal) contexts with polynomially many concepts, namely contexts that are nowhere dense, of bounded breadth, or highly convex. The notion of breadth of a lattice is already present in G. Birkhoff's classic monograph; it equals the number of atoms of a largest Boolean suborder. Even though it is natural to define the breadth of a context as that of its concept lattice, this idea had not been exploited before. We do so and establish many equivalences. Amongst them, it is shown that the breadth of a context equals the size of its largest minimal generator, the size of its largest contranominal-scale subcontext, and the Vapnik-Chervonenkis dimension of both its system of extents and its system of intents. The polynomiality of the aforementioned classes is proven via upper bounds (also known as majorants) for the number of maximal bipartite cliques in bipartite graphs, results obtained by various authors over the last decades. The fact that they yield statements about formal contexts is a reward for investigating how two established fields, Formal Concept Analysis (FCA) and graph theory, interact. We improve the breadth bound considerably. The improvement is twofold: besides giving a much tighter expression, we prove that it limits the number of minimal generators. This is strictly more general than upper bounding the number of concepts: it automatically implies a bound on the number of concepts as well as on the number of proper premises, and as a corollary it also bounds the number of implications in the canonical basis. With respect to the number of concepts, this sharper majorant is shown to be best possible. This is established by constructing contexts whose concept lattices have exactly that many elements. These structures are termed extremal contexts and extremal lattices, respectively. The usual procedure of taking the standard context allows one to work interchangeably with either of these two extremal structures. Extremal lattices are equivalently defined as finite lattices with as many elements as possible, subject to two upper limits: one on the number of join-irreducibles, the other on the breadth. Subsequently, these structures are characterized in two ways. Our first characterization takes the lattice perspective. We first construct extremal lattices by the iterated operation of finding smaller extremal subsemilattices and duplicating their elements, and then show that every extremal lattice must be obtained through a recursive application of this construction principle. A byproduct of this contribution is that extremal lattices are always meet-distributive. Although this approach is revealing, it leaves relevant combinatorial questions unanswered; most notably, the number of meet-irreducibles of extremal lattices is not controlled by the construction. To get a grip on the number of meet-irreducibles, we prove an alternative characterization of these structures. This second approach is based on implication logic and exposes an interesting link between the numbers of proper premises, pseudo-extents and concepts. A guiding idea here is to use implications to construct lattices. It turns out that constructing extremal structures with this method is simpler, in the sense that a recursive application of the construction principle is not needed. Moreover, we easily obtain a general, explicit formula for the Whitney numbers of extremal lattices, which reveals that they are unimodal, too. Like the first, this second construction method is shown to be characteristic. A particular case of the construction can force, with precision, a high number (in the sense of exponentially many) of meet-irreducibles. This occasional explosion of meet-irreducibles motivates a generalization of the notion of extremal lattices, obtained by considering a more refined partition of the class of all finite lattices. In this finer-grained setting, each extremal class consists of lattices with bounded breadth, bounded number of join-irreducibles and bounded number of meet-irreducibles. The generalized problem of finding the maximum number of concepts turns out to be challenging. Instead of attempting to classify these structures completely, we pose questions inspired by TurĂĄn's seminal result in extremal combinatorics, most prominently: do extremal lattices (in this more general sense) have the maximum permitted breadth? We show a general statement in this setting: for every choice of limits (breadth, number of join-irreducibles and number of meet-irreducibles), we produce some extremal lattice with the maximum permitted breadth. The tools underpinning the intuitions here are hypergraphs and exact set covers. As an unexpected but interesting byproduct, we obtain for free a simple theorem about the general existence of "rich" subcontexts; precisely, every context contains an object/attribute pair whose removal results in a context with at least half the original number of concepts.
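
    To make the role of breadth and contranominal scales above concrete, the following small sketch (not taken from the thesis; brute force, so only usable for tiny contexts) builds a contranominal-scale context with k objects and k attributes and enumerates its formal concepts, showing that such a context already forces 2^k concepts.

        from itertools import chain, combinations

        def contranominal_context(k):
            """k objects and k attributes; object i has every attribute except i."""
            return {i: {j for j in range(k) if j != i} for i in range(k)}

        def concepts(context, attributes):
            """Brute-force enumeration of all formal concepts (extent, intent).
            Exponential in the number of objects; for tiny examples only."""
            objects = set(context)

            def intent(objs):
                rows = [context[g] for g in objs]
                return frozenset(set.intersection(*rows)) if rows else frozenset(attributes)

            def extent(attrs):
                return frozenset(g for g in objects if attrs <= context[g])

            found = set()
            for objs in chain.from_iterable(combinations(sorted(objects), r)
                                            for r in range(len(objects) + 1)):
                b = intent(objs)
                found.add((extent(b), b))
            return found

        # A contranominal scale of size k yields exactly 2**k concepts, which is
        # why breadth (= largest contranominal-scale subcontext) governs lattice size.
        for k in range(1, 5):
            print(k, len(concepts(contranominal_context(k), set(range(k)))))
        # prints: 1 2 / 2 4 / 3 8 / 4 16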

    Conceptual Factors and Fuzzy Data

    With the growing number of large data sets, the need for complexity reduction is greater today than ever before. Moreover, some data may also be vague or uncertain. Thus, whenever we have an instrument for data analysis, the questions of how to apply complexity reduction methods and how to treat fuzzy data arise rather naturally. In this thesis, we discuss these issues for the very successful data analysis tool Formal Concept Analysis. In particular, we propose different methods for complexity reduction based on qualitative analyses, and we elaborate on various methods for handling fuzzy data. These two topics split the thesis into two parts: data reduction is mainly dealt with in the first part, whereas we focus on fuzzy data in the second. Although each chapter can be read largely on its own, each builds on and uses results from its predecessors. The main crosslink between the chapters is given by the reduction methods and fuzzy data; in particular, we also discuss complexity reduction methods for fuzzy data, combining the two issues that motivate this thesis.

    Learning Terminological Knowledge with High Confidence from Erroneous Data

    Description logic knowledge bases are a popular approach to representing terminological and assertional knowledge in a form suitable for computers to work with. Despite that, the practicality of description logics is impaired by the difficulties one has to overcome to construct such knowledge bases. Previous work has addressed this issue by providing methods to learn valid terminological knowledge from data, making use of ideas from formal concept analysis. A basic assumption there is that the data is free of errors, an assumption that in general cannot be made for practical applications. This thesis presents extensions of these results that make it possible to handle errors in the data. To this end, knowledge that is "almost valid" in the data is retrieved, where the notion of "almost valid" is formalized using the notion of confidence from data mining. The thesis presents two algorithms that achieve this retrieval: the first simply extracts all almost valid knowledge from the data, while the second uses expert interaction to distinguish errors from rare but valid counterexamples.
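
    The thesis's algorithms are not reproduced here; the sketch below only illustrates how the confidence measure from data mining makes knowledge "almost valid": a single erroneous row no longer invalidates an implication once a confidence threshold below 1.0 is used. The toy data, names and threshold are made up.

        # Confidence of an attribute implication P -> C in a formal context
        # (illustrative only; data, names and threshold are invented).
        context = {
            "x1": {"Cat", "Mammal"},
            "x2": {"Cat", "Mammal"},
            "x3": {"Cat", "Mammal"},
            "x4": {"Cat"},            # erroneous row: a cat recorded as non-mammal
        }

        def extent(attrs):
            return {g for g, row in context.items() if attrs <= row}

        def confidence(premise, conclusion):
            """conf(P -> C) = |extent(P u C)| / |extent(P)|; 1.0 means no exception."""
            base = extent(set(premise))
            if not base:
                return 1.0                # vacuously valid
            return len(extent(set(premise) | set(conclusion))) / len(base)

        print(confidence({"Cat"}, {"Mammal"}))          # 0.75: broken only by the error
        print(confidence({"Cat"}, {"Mammal"}) >= 0.7)   # True: kept as "almost valid"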

    Relational Approach to the L-Fuzzy Concept Analysis

    Modern industrial production systems benefit from the classification and processing of objects and their attributes. In general, the classification of objects is affected by vagueness, a common problem in object analysis that arises at various stages: ambiguity in the input data, overlapping boundaries between classes or regions, and uncertainty in defining or extracting the properties and relationships of objects. One way to manage this ambiguity is a framework of L-fuzzy relations, which can represent such uncertainties. Obtaining the least unreliable and uncertain output for the original data is the main concern of this thesis. Our general approach can be summarized as follows. We develop an L-fuzzy Concept Analysis as a generalization of regular Concept Analysis. We start by providing the input data, stored in a table (database). The next step is the creation of contexts and concepts from the given data. In the next stage, rules or patterns (attribute implications) are generated from the data; this includes all rules as well as a minimal base of rules. All of them use L-fuzziness to express uncertainty, which requires L-fuzzy relations implemented as L-valued matrices. Finally, everything is packed into a convenient application implemented in the Java programming language. Our approach is carried out in an algebraic framework that covers both regular and L-fuzzy FCA simultaneously. The tables we start with are already L-valued (not crisp): we work with the L-fuzzy data directly, relating objects to their attributes by L-valued tables of vague data. The purpose of this thesis is to generate attribute implications from many-valued contexts by means of a relational theory, i.e., a range of degrees is used to indicate the relationship between objects and their properties: the smallest degree corresponds to the classical "no" and the greatest degree to the classical "yes" in the table.
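
    The thesis itself works in a relation-algebraic framework implemented in Java; as a simplified, hedged illustration of the underlying idea, the sketch below applies one common choice of L-fuzzy derivation operators (L = [0, 1] with the Ɓukasiewicz residuum) to an L-valued object/attribute matrix. The data and the particular residuated structure are assumptions made for this example, not taken from the thesis.

        # L-fuzzy derivation operators over an L-valued incidence matrix,
        # with L = [0, 1] and the Ɓukasiewicz residuum (illustrative sketch only).
        objects = ["o1", "o2", "o3"]
        attributes = ["m1", "m2"]
        incidence = [            # degree to which each object has each attribute
            [1.0, 0.4],          # o1
            [0.7, 0.9],          # o2
            [0.2, 0.6],          # o3
        ]

        def residuum(a, b):
            """Ɓukasiewicz residuum: a -> b = min(1, 1 - a + b)."""
            return min(1.0, 1.0 - a + b)

        def attr_derivation(obj_degrees):
            """Degree to which each attribute is shared by the fuzzy set of objects."""
            return [min(residuum(obj_degrees[i], incidence[i][j]) for i in range(len(objects)))
                    for j in range(len(attributes))]

        def obj_derivation(attr_degrees):
            """Degree to which each object has the fuzzy set of attributes."""
            return [min(residuum(attr_degrees[j], incidence[i][j]) for j in range(len(attributes)))
                    for i in range(len(objects))]

        # Closure of the crisp attribute set {m1}: extent first, then intent.
        A = [1.0, 0.0]
        print(attr_derivation(obj_derivation(A)))   # [1.0, 0.4]: m2 follows from m1 only to degree 0.4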
    • 
