
    Gene Expression Array Exploration Using K-Formal Concept Analysis

    Proceedings of the 9th International Conference on Formal Concept Analysis (ICFCA 2011), Nicosia, Cyprus, May 2-6, 2011. DNA micro-arrays are a mechanism for eliciting gene expression values, the concentrations of the transcription products of a set of genes, under different chemical conditions. The phenomena of interest (up-regulation, down-regulation and co-regulation) are hypothesized to stem from the functional relationships among transcription products. In [1,2,3] a generalisation of Formal Concept Analysis, K-Formal Concept Analysis, was developed with data mining applications in mind; there, incidences take values in certain kinds of semirings instead of the usual Boolean carrier set. In this paper, we use the min-plus and max-plus semirings to analyse gene expression data for Arabidopsis thaliana. We introduce the mechanism to render the data in the appropriate algebra and profit from the wealth of different Galois connections available in Generalized Formal Concept Analysis to carry out separate analyses for up- and down-regulated genes.
    Funding: Spanish Government, Comisión Interministerial de Ciencia y Tecnología, projects 2008-06382/TEC and 2008-02473/TEC, and regional projects S-505/TIC/0223 (DGUI-CM) and CCG08-UC3M/TIC-4457 (Comunidad Autónoma de Madrid - UC3M).
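
    To give a concrete flavour of semiring-valued derivation, the sketch below implements a residuation-based pair of maps over the max-plus semiring on a toy expression matrix; the two maps form an antitone Galois connection between degree vectors over genes and over conditions. The matrix and function names are our own illustration, not the paper's exact K-Formal Concept Analysis construction.

```python
import numpy as np

# Hypothetical gene-expression matrix R: rows = genes (objects),
# columns = conditions (attributes); entries are expression levels viewed
# as elements of the max-plus semiring (R ∪ {-inf}, max, +).
R = np.array([[2.0, 5.0, 1.0],
              [3.0, 4.0, 0.5],
              [1.0, 6.0, 2.0]])

def intent(x):
    # Residuation in max-plus: intent(x)[m] = min_g (R[g, m] - x[g]).
    return (R - x[:, None]).min(axis=0)

def extent(y):
    # Dual residual: extent(y)[g] = min_m (R[g, m] - y[m]).
    return (R - y[None, :]).min(axis=1)

# intent/extent satisfy intent(x) >= y  <=>  extent(y) >= x, i.e. they
# form an antitone Galois connection, so extent(intent(x)) is a closure.
x = np.array([1.0, 1.0, 0.0])
print(intent(x))            # condition degrees shared by the gene vector x
print(extent(intent(x)))    # closed gene-degree vector
```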

    Revisiting Pattern Structure Projections

    Formal concept analysis (FCA) is a well-founded method for data analysis and has many applications in data mining. Pattern structures are an extension of FCA for dealing with complex data such as sequences or graphs. However, the computational complexity of computing with pattern structures is high, and projections of pattern structures were introduced to simplify computation. In this paper we introduce o-projections of pattern structures, a generalization of projections that defines a wider class of projections preserving the properties of the original approach. Moreover, we show that o-projections form a semilattice, and we discuss the correspondence between o-projections and the representation contexts of o-projected pattern structures.
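
    As a concrete flavour of projections on pattern structures (a plain projection here, not the more general o-projections this paper defines), consider interval patterns over numeric data: a projection may coarsen interval bounds, and must be monotone, contractive, and idempotent. The data and function names below are our own illustration.

```python
from math import floor, ceil

# Interval patterns: a pattern is a tuple of (lo, hi) intervals, one per
# numeric attribute. The meet (similarity) is the componentwise convex hull.
def meet(d1, d2):
    return tuple((min(a1, a2), max(b1, b2))
                 for (a1, b1), (a2, b2) in zip(d1, d2))

def leq(d1, d2):
    # d1 subsumes (is more general than) d2 iff d1 ⊓ d2 = d1.
    return meet(d1, d2) == d1

# One concrete projection: widen every interval to integer bounds. It is
# monotone, contractive (the result is more general), and idempotent --
# exactly the conditions a projection must satisfy.
def project(d):
    return tuple((floor(lo), ceil(hi)) for lo, hi in d)

d1 = ((1.2, 2.5),)
d2 = ((1.8, 3.1),)
print(meet(d1, d2))            # ((1.2, 3.1),)
print(project(d1))             # ((1, 3),)
print(leq(project(d1), d1))    # True: the projection only loses precision
```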

    A Partial-Closure Canonicity Test to Increase the Efficiency of CbO-Type Algorithms

    Computing formal concepts is a fundamental part of Formal Concept Analysis, and the design of increasingly efficient algorithms to carry out this task is a continuing strand of FCA research. Most approaches suffer from the repeated computation of the same formal concepts and, initially, algorithms concentrated on efficient searches through already computed results to detect these repeats, until the so-called canonicity test was introduced. The canonicity test meant that it was sufficient to examine the attributes of a computed concept to determine its newness: searching through previously computed concepts was no longer necessary. The employment of this test in Close-by-One (CbO) type algorithms has proved to be highly effective. The typical CbO approach is to compute a concept and then test its canonicity. This paper describes a more efficient approach, whereby a concept need only be partially computed in order to carry out the test. Only if it passes the test does the computation of the concept need to be completed. This paper presents this ‘partial-closure’ canonicity test in the In-Close algorithm and compares it to a traditional CbO algorithm to demonstrate the increase in efficiency.
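
    The idea can be sketched compactly; the code below is our own illustration in the spirit of In-Close, not the published pseudocode. A candidate extent C is formed first, and the canonicity test inspects only C and the attributes below the current one, so an intent is ever completed only along canonical branches.

```python
# Compact CbO-style sketch with a partial-closure canonicity test.
# The context maps each object to its set of attribute indices.
context = [{0, 1}, {0, 2}, {1, 2}]
n_attrs = 3
concepts = []

def in_close(A, B, y):
    """A: extent (object ids), B: intent found so far, y: first attribute to try."""
    B = set(B)
    for j in range(y, n_attrs):
        if j in B:
            continue
        C = {g for g in A if j in context[g]}        # extent of B ∪ {j}
        if C == A:                                    # j is implied: absorb it
            B.add(j)
        elif all(any(i not in context[g] for g in C)
                 for i in range(j) if i not in B):
            # Partial-closure test: (C, ...) is canonical iff no attribute
            # i < j outside B is shared by every object of C. Only if the
            # test passes is the child's intent completed, by the recursive
            # call itself; rejected candidates are never fully closed.
            in_close(C, B | {j}, j + 1)
    concepts.append((frozenset(A), frozenset(B)))     # B is now closed

in_close(set(range(len(context))), set(), 0)
print(len(concepts), "concepts")                      # 8 for this context
```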

    Elements About Exploratory, Knowledge-Based, Hybrid, and Explainable Knowledge Discovery

    Knowledge Discovery in Databases (KDD), and especially pattern mining, can be interpreted along several dimensions, namely data, knowledge, problem-solving and interactivity. These dimensions are not disconnected and have a direct impact on the quality, applicability, and efficiency of KDD. Accordingly, we discuss some objectives of KDD based on these dimensions, namely exploration, knowledge orientation, hybridization, and explanation. The data space and the pattern space can be explored in several ways, depending on specific evaluation functions and heuristics, possibly related to domain knowledge. Furthermore, numerical data are complex, and supervised numerical machine learning methods are usually the best candidates for efficiently mining such data. However, the workings and output of numerical methods are most of the time hard to understand, while symbolic methods are usually more intelligible. This calls for hybridization, combining numerical and symbolic mining methods to improve the applicability and interpretability of KDD. Moreover, suitable explanations about the operating models and possible subsequent decisions should complete KDD, and this is far from being the case at the moment. To illustrate these dimensions and objectives, we analyze a concrete case about the mining of biological data, where we characterize these dimensions and their connections. We also discuss dimensions and objectives in the framework of Formal Concept Analysis, and we draw some perspectives for future research.

    Creating corroborated crisis reports from social media data through formal concept analysis

    During a crisis, citizens reach for their smartphones to report, comment, and explore information surrounding the crisis. These actions often involve social media, and this data forms a large repository of real-time, crisis-related information. Law enforcement agencies (LEAs) and other first responders see this information as having untapped potential: it has the capacity to extend their situational awareness beyond the scope of a usual command and control centre. Despite this potential, the sheer volume of social media data, the speed at which it arrives, and its unstructured nature mean that making sense of it is not a trivial task, and one that is not yet satisfactorily solved, both in crisis management and beyond. We therefore propose a multi-stage process to extract meaning from this data that will provide relevant and near real-time information to command and control to assist in decision support. The process begins with the capture of real-time social media data, followed by the development of specific LEA- and crisis-focused taxonomies for categorisation and entity extraction, the application of formal concept analysis for aggregation and corroboration, and the presentation of this data via map-based and other visualisations. We demonstrate that this novel use of formal concept analysis, in combination with context-based entity extraction, has the potential to inform law enforcement and/or humanitarian responders about ongoing crisis events using social media data, in the context of the 2015 Nepal earthquake.
    Keywords: formal concept analysis, crisis management, disaster response, visualisation, entity extraction
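
    To give a flavour of the corroboration step, here is a minimal, self-contained sketch with hypothetical posts, tags, and function names of our own: posts are reduced to sets of taxonomy tags by the upstream entity-extraction stage, and FCA-style derivation groups posts whose shared tags are reported by two or more independent sources.

```python
# Hypothetical post id -> taxonomy tags produced by entity extraction.
tweets = {
    "t1": {"collapse", "Kathmandu", "injured"},
    "t2": {"collapse", "Kathmandu"},
    "t3": {"collapse", "Kathmandu", "trapped"},
    "t4": {"road-blocked", "Bhaktapur"},
}

def extent(tags):
    # All posts whose tag set contains every tag in `tags`.
    return {t for t, ts in tweets.items() if tags <= ts}

def intent(ids):
    # All tags shared by every post in `ids`.
    return set.intersection(*(tweets[t] for t in ids)) if ids else set()

# Object concepts: for each post, the posts sharing all of its tags.
# An extent of size >= 2 means independent posts corroborate the report.
for t, tags in tweets.items():
    ids = extent(tags)
    if len(ids) >= 2:
        print(sorted(ids), "corroborate", sorted(intent(ids)))
```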

    Why and How Knowledge Discovery Can Be Useful for Solving Problems with CBR

    In this talk, we discuss and illustrate the links between knowledge discovery in databases (KDD), knowledge representation and reasoning (KRR), and case-based reasoning (CBR). KDD techniques, especially those based on Formal Concept Analysis (FCA), are well formalized and allow the design of concept lattices from binary and complex data. These concept lattices provide a realistic basis for knowledge base organization and ontology engineering. More generally, they can be used for representing knowledge and reasoning in knowledge systems and CBR systems as well.

    Quantitative Concept Analysis

    Formal Concept Analysis (FCA) begins from a context, given as a binary relation between some objects and some attributes, and derives a lattice of concepts, where each concept is given as a set of objects and a set of attributes, such that the first set consists of all objects that satisfy all attributes in the second, and vice versa. Many applications, though, provide contexts with quantitative information, telling not just whether an object satisfies an attribute, but also quantifying this satisfaction. Contexts in this form arise as rating matrices in recommender systems, as occurrence matrices in text analysis, as pixel intensity matrices in digital image processing, etc. Such applications have attracted a lot of attention, and several numeric extensions of FCA have been proposed. We propose the framework of proximity sets (proxets), which subsume partially ordered sets (posets) as well as metric spaces. One feature of this approach is that it extracts quantified concepts from quantified contexts, and thus allows full use of the available information. Another feature is that the categorical approach allows analyzing any universal properties that the classical FCA and the new versions may have, and thus provides structural guidance for aligning and combining the approaches.
    Comment: 16 pages, 3 figures, ICFCA 201
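
    Proxets are developed categorically in the paper; as a rough, concrete flavour of degree-valued concept-forming operators, the sketch below uses the residuum-based construction familiar from fuzzy FCA over [0,1]-valued contexts. This is an assumption on our part for illustration, not the paper's definition.

```python
import numpy as np

# R: [0,1]-valued context, e.g. normalized ratings (rows = users, cols = items).
R = np.array([[1.0, 0.6, 0.2],
              [0.8, 0.7, 0.1]])

def impl(a, b):
    # Gödel residuum: a => b is 1 where a <= b, else b.
    return np.where(a <= b, 1.0, b)

def up(x):
    # Degree to which every object, weighted by x, has each attribute.
    return impl(x[:, None], R).min(axis=0)

def down(y):
    # Degree to which each object has every attribute, weighted by y.
    return impl(y[None, :], R).min(axis=1)

x = np.array([1.0, 0.5])
print(up(x))          # quantified intent
print(down(up(x)))    # closure: extent of a quantified concept
```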

    Contextual Subgraph Discovery With Mobility Models

    Starting from a relational database that gathers information on people's mobility (such as origin/destination places, date and time, and means of transport) as well as demographic data, we adopt a graph-based representation that results from the aggregation of individual travels. In such a graph, the vertices are places or points of interest (POIs) and the edges stand for the trips. Travel information as well as user demographics are labels associated with the edges. We tackle the problem of discovering exceptional contextual subgraphs, i.e., subgraphs related to a context (a restriction on the attribute values) that are unexpected according to a model. Previous work considers a simple model based on the number of trips associated with an edge, without taking into account its length or the surrounding demography. In this article, we consider richer models based on statistical physics and demonstrate their ability to capture complex phenomena that were previously ignored.
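
    Schematically, the approach restricts trip records to a context, aggregates them into edge weights, and scores each edge against what a model predicts. The sketch below uses hypothetical trip tuples and a deliberately naive uniform-share baseline where the paper substitutes richer statistical-physics models.

```python
from collections import Counter

# Hypothetical trip records: (origin, destination, hour, age_group).
trips = [
    ("A", "B", 8, "18-25"), ("A", "B", 9, "18-25"), ("A", "B", 8, "26-40"),
    ("B", "C", 18, "18-25"), ("A", "C", 8, "26-40"), ("B", "C", 19, "18-25"),
]

def edge_counts(records):
    # Aggregate individual travels into per-edge trip counts.
    return Counter((o, d) for o, d, *_ in records)

# A context is a restriction on attribute values, e.g. morning trips.
context = [t for t in trips if 7 <= t[2] <= 10]

overall = edge_counts(trips)
observed = edge_counts(context)
share = len(context) / len(trips)   # baseline: uniform context share per edge

# Exceptionality of an edge under the context: how far the observed count
# departs from the model's prediction. Richer models replace `expected`.
for edge, n in observed.items():
    expected = share * overall[edge]
    print(edge, "observed", n, "expected", round(expected, 2))
```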