516 research outputs found

    FCA and pattern structures for mining care trajectories

    In this paper, we are interested in the analysis of sequential data and we propose an original framework based on Formal Concept Analysis (FCA). To this end, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Pattern structures are used in FCA for dealing with complex data such as intervals or graphs. Here they are adapted to sequences. For that, we introduce a subsumption operation for sequence comparison, based on subsequence matching. Then, a projection, i.e. a kind of data reduction of sequential pattern structures, is suggested in order to increase the efficiency of the approach. Finally, we discuss an application to a dataset including patient trajectories (the motivation of this work), which is a sequential dataset and can be processed with the introduced framework. This research work provides a new and efficient extension of FCA to deal with complex (not binary) data, which can be an alternative to the analysis of sequential dataset
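
    The abstract above mentions a subsumption operation based on subsequence matching. As a rough illustration only, the sketch below checks the classical (gapped) subsequence relation between sequences of itemsets. The function name `is_subsequence`, the itemset encoding, and the toy trajectory are assumptions made for this example; the paper's own definition may differ (for instance, it may restrict matching to contiguous elements).

```python
from typing import FrozenSet, Sequence

# Hypothetical encoding: a sequence is an ordered list of itemsets.
Itemset = FrozenSet[str]


def is_subsequence(pattern: Sequence[Itemset], sequence: Sequence[Itemset]) -> bool:
    """Return True if `pattern` embeds into `sequence` in order.

    Each itemset of `pattern` must be a subset of a distinct itemset of
    `sequence`, and the matched itemsets must appear in the same order.
    Gaps between matched itemsets are allowed (an assumption of this sketch).
    """
    pos = 0  # index of the next pattern itemset still to be matched
    for itemset in sequence:
        # Greedy earliest match is sufficient for this order-preserving check.
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


# Toy care trajectory with made-up event labels.
trajectory = [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"d"})]

in_order = is_subsequence([frozenset({"a"}), frozenset({"d"})], trajectory)
reversed_order = is_subsequence([frozenset({"d"}), frozenset({"a"})], trajectory)
same_step = is_subsequence([frozenset({"b"}), frozenset({"c"})], trajectory)
```

    Under this relation, one sequence subsumes another when it occurs inside it as a subsequence; a pattern structure would then take the common subsequences of two descriptions as their similarity, which this sketch does not attempt to compute.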

    On mining complex sequential data by means of FCA and pattern structures

    Nowadays data sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures, along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patient patterns from a French healthcare data set on cancer. The quantitative and qualitative results (with annotations and analysis from a physician) are reported in this use case, which is the main motivation for this work. Keywords: data mining; formal concept analysis; pattern structures; projections; sequences; sequential data. Comment: An accepted publication in International Journal of General Systems. The paper is created in the wake of the conference on Concept Lattices and Their Applications (CLA'2013). 27 pages, 9 figures, 3 table

    Linked data and online classifications to organise mined patterns in patient data

    In this paper, we investigate the use of web data resources in medicine, especially through medical classifications made available using the principles of Linked Data, to support the interpretation of patterns mined from patient care trajectories. Interpreting such patterns is naturally a challenge for an analyst, as it requires going through large amounts of results and access to sufficient background knowledge. We employ linked data, especially as exposed through the BioPortal system, to create a navigation structure within the patterns obtained from sequential pattern mining. We show how this approach provides a flexible way to explore data about trajectories of diagnoses and treatments according to different medical classifications

    RCA-Seq: an Original Approach for Enhancing the Analysis of Sequential Data Based on Hierarchies of Multilevel Closed Partially-Ordered Patterns

    Methods for analysing sequential data generally produce a huge number of sequential patterns that then have to be evaluated and interpreted by domain experts. To diminish this number and thus the difficulty of the interpretation task, methods that directly extract a more compact representation of sequential patterns, namely closed partially-ordered patterns (CPO-patterns), were introduced. In spite of the smaller number of obtained CPO-patterns, their analysis is still a challenging task for experts since they are unorganised and, besides, do not provide a global view of the discovered regularities. To address these problems, we present and formalise an original approach within the framework of Relational Concept Analysis (RCA), referred to as RCA-Seq, that focuses on facilitating the interpretation task of experts. The hierarchical RCA result makes it possible to directly obtain and organise the relationships between the extracted CPO-patterns. Moreover, a generalisation order on items is also revealed, and multilevel CPO-patterns are obtained. Therefore, a hierarchy of such CPO-patterns guides the interpretation task, helps experts in better understanding the extracted patterns, and minimises the chance of overlooking interesting CPO-patterns. RCA-Seq is compared with another approach that relies on pattern structures. In addition, we highlight the adaptability of RCA-Seq by integrating a user-defined taxonomy over the items, and by considering user-specified constraints on the order relations on itemsets

    LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification

    Formal Concept Analysis (FCA) is a mathematical theory based on lattice and order theory used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, semantic web, software development, chemistry, biology, medicine, data analytics and ontology engineering. This thesis reviews the state of the art of the theory of Formal Concept Analysis (FCA) and its various extensions that have been developed and well-studied in the past several years. We discuss their historical roots, reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and various approaches adopted by researchers in the areas of data analysis and knowledge management, with emphasis on data-learning and classification problems. We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of FuzzyLattice which has been developed to store class labels and probability vectors and has the capability to be used for classifying instances with encoded and unlabelled features. We evaluate LearnFCA on encodings from three datasets - mnist, omniglot and cancer images with interesting results and varying degrees of success. Adviser: Dr Jitender Deogu

    A FCA-based analysis of sequential care trajectories

    This paper presents a research work in the domains of sequential pattern mining and formal concept analysis. Using a combined method, we show how concept lattices and interestingness measures such as stability can improve the task of discovering knowledge in symbolic sequential data. We give an example of a real medical application to illustrate how this approach can be useful to discover patterns of trajectories of care in a French medico-economical database

    Discovering and Comparing Relational Knowledge, the Example of Pharmacogenomics

    Article in Proceedings of the EKAW Doctoral Consortium 2018 co-located with the 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW 2018). Pharmacogenomics (PGx) studies the influence of the genome in drug response, with knowledge units in the form of ternary relationships genomic variation-drug-phenotype. State-of-the-art PGx knowledge is available in the biomedical literature as well as in specialized knowledge bases. Additionally, Electronic Health Records of hospitals can be mined to discover such knowledge units that can then be compared with the state of the art, in order to confirm or temper relationships lacking validation or clinical counterpart. However, both discovering and comparing PGx relationships face multiple challenges: heterogeneous descriptions of knowledge units (languages, vocabularies and granularities), missing values and the importance of the time dimension. In this research, we aim at proposing a framework based on Semantic Web technologies and Formal Concept Analysis to discover, represent and compare PGx knowledge units. We present the first results, consisting of creating an integrated knowledge base of PGx knowledge units from various sources and defining comparison methods, as well as the remaining issues to tackle

    The representation of sequential patterns and their projections within Formal Concept Analysis

    Nowadays data sets are available in very complex and heterogeneous ways. The mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using an elegant mathematical framework: Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures, along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patients' patterns from a French healthcare data set of cancer patients. The quantitative and qualitative results are reported in this use case, which is the main motivation for this work