14 research outputs found
On mining complex sequential data by means of FCA and pattern structures
Nowadays data sets are available in very complex and heterogeneous ways.
Mining of such data collections is essential to support many real-world
applications ranging from healthcare to marketing. In this work, we focus on
the analysis of "complex" sequential data by means of interesting sequential
patterns. We approach the problem using the elegant mathematical framework of
Formal Concept Analysis (FCA) and its extension based on "pattern structures".
Pattern structures are used for mining complex data (such as sequences or
graphs) and are based on a subsumption operation, which in our case is defined
with respect to the partial order on sequences. We show how pattern structures,
along with projections (i.e., a data reduction of sequential structures), are
able to enumerate more meaningful patterns and increase the computing
efficiency of the approach. Finally, we show the applicability of the presented
method for discovering and analyzing interesting patient patterns from a French
healthcare data set on cancer. The quantitative and qualitative results (with
annotations and analysis from a physician) are reported in this use case which
is the main motivation for this work.
Keywords: data mining; formal concept analysis; pattern structures;
projections; sequences; sequential data.
Comment: Accepted for publication in the International Journal of General Systems.
The paper was written in the wake of the conference on Concept Lattices and
Their Applications (CLA'2013). 27 pages, 9 figures, 3 tables
Advancing FCA Workflow in FCART System for Knowledge Discovery in Quantitative Data
We describe new features of the FCART software system, an integrated environment for knowledge and data engineers with a set of research tools based on Formal Concept Analysis. The system is intended for knowledge discovery from various data sources, including structured quantitative data and text collections. We consider the final version of the data transformation from an external data source into a concept lattice. We introduce a new version of the local data storage, a query language for conceptual scaling of data snapshots as multi-valued contexts, and new tools for working with formal concepts.
Towards an Automatic Extraction of Smartphone Users' Contextual Behaviors
This paper presents a new method for automatically extracting smartphone users' contextual behaviors from the digital traces collected during their interactions with their devices. Our goal is in particular to understand the impact of users' context (e.g., location, time, environment, etc.) on the applications they run on their smartphones. We propose a methodology to analyze digital traces and to automatically identify the significant information that characterizes users' behaviors. In earlier work, we have used Formal Concept Analysis and Galois lattices to extract relevant knowledge from heterogeneous and complex contextual data; however, the interpretation of the obtained Galois lattices was performed manually. In this article, we aim at automating this interpretation process, through the provision of original metrics. Therefore our methodology returns relevant information without requiring any expertise in data analysis. We illustrate our contribution on real data collected from volunteer users
FCA and pattern structures for mining care trajectories
In this paper, we are interested in the analysis of sequential data and we propose an original framework based on Formal Concept Analysis (FCA). To this end, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Pattern structures are used in FCA for dealing with complex data such as intervals or graphs. Here they are adapted to sequences. We introduce a subsumption operation for sequence comparison, based on subsequence matching. Then a projection, i.e., a kind of data reduction of sequential pattern structures, is suggested in order to increase the efficiency of the approach. Finally, we discuss an application to a dataset of patient trajectories (the motivation of this work), which is a sequential dataset and can be processed with the introduced framework. This research work provides a new and efficient extension of FCA to deal with complex (non-binary) data, which can be an alternative for the analysis of sequential datasets.
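The subsumption by subsequence matching mentioned in this abstract can be illustrated with a minimal sketch. It assumes the classical definition, in which a pattern is a subsequence of a sequence when its elements can be matched in order (gaps allowed) to elements that contain them; the abstract does not spell out the exact definition used in the paper, and the function name and the toy care trajectory below are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` can be embedded in `sequence` in order.

    Both arguments are lists of sets. An element p of the pattern matches
    an element s of the sequence when p is a subset of s (assumption:
    classical gapped subsequence matching, not necessarily the paper's).
    """
    remaining = iter(sequence)
    # Each pattern element consumes the iterator up to its first match,
    # so later elements can only match later positions.
    return all(any(p <= s for s in remaining) for p in pattern)


# Hypothetical patient trajectory: one set of events per hospital stay.
trajectory = [{"consultation"}, {"surgery", "imaging"}, {"chemotherapy"}]

print(is_subsequence([{"consultation"}, {"chemotherapy"}], trajectory))  # True
print(is_subsequence([{"chemotherapy"}, {"consultation"}], trajectory))  # False
```

Under this reading, a pattern subsumes every sequence in which it occurs as a subsequence, which gives the partial order on which the pattern structure is built.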
Combining Formal Concept Analysis and Translation to Assign Frames and Thematic Role Sets to French Verbs
We present an application of Formal Concept Analysis in the domain of Natural Language Processing: we give a general overview of the framework, describe its goals, the data it is based on, the way it works, and we illustrate the kind of data we expect as a result. More specifically, we examine the ability of the stability, separation and probability indices to select the most relevant concepts with respect to our FCA application. We show that the sum of stability and separation gives results close to those obtained when using the entire lattice.
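As an illustration of one of the indices named in this abstract, the sketch below computes concept stability in its usual sense: the fraction of subsets of a concept's extent whose intent equals the concept's intent. The toy context and the function names are hypothetical, the brute-force enumeration is only practical for small extents, and the separation and probability indices are not covered.

```python
from itertools import combinations

# Hypothetical toy context: object -> set of attributes (not from the paper).
CONTEXT = {
    "g1": {"a", "b"},
    "g2": {"a", "c"},
    "g3": {"a", "b"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by all given objects (all attributes if none given)."""
    shared = set(ALL_ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def stability(extent):
    """Fraction of subsets of `extent` whose intent equals the concept intent."""
    extent = sorted(extent)
    target = intent(extent)
    hits = sum(
        1
        for size in range(len(extent) + 1)
        for subset in combinations(extent, size)
        if intent(subset) == target
    )
    return hits / 2 ** len(extent)


print(stability({"g1", "g2", "g3"}))  # 0.375
```

Here the concept with extent {g1, g2, g3} and intent {a} is recovered by 3 of the 8 subsets of its extent, hence the value 0.375; a higher value indicates a concept that depends less on any individual object.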
L’analyse relationnelle de concepts pour la fouille de données temporelles – Application à l’étude de données hydroécologiques [Relational concept analysis for temporal data mining: application to the study of hydroecological data]
This article presents a method for exploring temporal data, based on relational concept analysis (RCA) and applied to sequential data built from physico-chemical and biological samples taken from watercourses. Our goal is to uncover relevant, hierarchically organized subsequences that associate the two types of parameters. For ease of reading, these subsequences are represented as partially ordered patterns (po-patterns). The data mining process consists of several steps: construction of an ad hoc temporal model and application of RCA; extraction of the subsequences, summarized as po-patterns; selection of the interesting po-patterns using a measure that exploits the distribution of concept extents. The process was tested on a real data set and evaluated quantitatively and qualitatively.
Automatic Validation of Terminology by Means of Formal Concept Analysis
Term extraction tools extract candidate terms and annotate their occurrences in the texts. However, not all of these occurrences are terminological and, at present, distinguishing when a candidate term is really used with a terminological meaning remains a very challenging issue. The validation of term annotations is presented as a binary classification model that classifies each term occurrence as a terminological or non-terminological occurrence. A context-based hypothesis approach is applied to a training corpus: we assume that the words in the sentence which contains the studied occurrence can be used to build positive and negative hypotheses that are further used to classify undetermined examples. The method is applied and evaluated on a French corpus in the linguistic domain, and we also mention some improvements suggested by a quantitative and qualitative evaluation.
The representation of sequential patterns and their projections within Formal Concept Analysis
Nowadays data sets are available in very complex and heterogeneous ways. The mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using an elegant mathematical framework: Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures, along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patients' patterns from a French healthcare data set of cancer patients. The quantitative and qualitative results are reported in this use case which is the main motivation for this work
Theoretical lattices and formal concept analysis, tools for metatheoretic structuralism
We propose to take advantage of the computational methodologies of formal concept analysis and network visualization to represent and study the internal structure of axiomatized theories. This exercise was put into practice by comparing more than 44 theoretical models of space-time and gravitation. The lattices can be explored with interactive visualizations known as macroscopes, which highlight relations of specialization, theorization, hierarchical orderings, communities and classes of components. In this text we illustrate the approach with the reconstruction of classical particle mechanics and of theories of space-time and gravitation.