    LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification

    Formal Concept Analysis (FCA) is a mathematical theory based on lattice and order theory, used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains, including data mining, machine learning, knowledge management, the semantic web, software development, chemistry, biology, medicine, data analytics, and ontology engineering. This thesis reviews the state of the art of FCA theory and the various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and the various approaches adopted by researchers in the areas of data analysis and knowledge management, with emphasis on data-learning and classification problems. We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of the FuzzyLattice, which has been developed to store class labels and probability vectors, and which can be used to classify instances with encoded and unlabelled features. We evaluate LearnFCA on encodings from three datasets (MNIST, Omniglot, and cancer images) with interesting results and varying degrees of success. Adviser: Dr Jitender Deogun
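
    To make the FCA machinery behind an approach like LearnFCA concrete, here is a minimal Python sketch of the two derivation operators on a tiny binary context, plus a toy lattice node carrying a class-label probability vector in the spirit described above. The context, all names, and the node layout are illustrative assumptions, not the thesis's actual implementation.

    ```python
    # Minimal sketch: FCA derivation operators on a hypothetical binary
    # context, plus a toy "labelled lattice node" (assumption, not the
    # thesis's LearnFCA code).

    # objects mapped to the attributes they have (hypothetical context)
    context = {
        "img1": {"round", "dark"},
        "img2": {"round", "light"},
        "img3": {"square", "dark"},
    }
    attributes = {"round", "square", "dark", "light"}

    def intent(objects):
        """Attributes shared by every object in the given set."""
        if not objects:
            return set(attributes)
        return set.intersection(*(context[o] for o in objects))

    def extent(attrs):
        """Objects possessing every attribute in the given set."""
        return {o for o, has in context.items() if attrs <= has}

    # (E, I) is a formal concept when extent(I) == E and intent(E) == I.
    E = extent({"round"})
    I = intent(E)
    print(E, I, extent(I) == E)  # {'img1', 'img2'} {'round'} True

    # Toy node in the spirit of a lattice storing class labels:
    # a probability vector over labels attached to a concept.
    node = {"extent": E, "intent": I, "p": {"digit": 0.8, "non_digit": 0.2}}
    ```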

    Concept learning consistency under three‑way decision paradigm

    Concept Mining is one of the main challenges in both Cognitive Computing and Machine Learning. The ongoing improvement of solutions to address this issue raises the need to analyze whether the consistency of the learning process is preserved. This paper addresses a particular problem, namely, how the concept mining capability changes under reconsideration of the hypothesis class. The issue is raised from the point of view of the so-called Three-Way Decision (3WD) paradigm, which provides a sound framework to reconsider decision-making processes, including those assisted by Machine Learning. Thus, the paper aims to analyze the influence of 3WD techniques on the Concept Learning Process itself. For this purpose, we introduce new versions of the Vapnik-Chervonenkis dimension. Likewise, to illustrate how the formal approach can be instantiated in a particular model, the case of concept learning in (Fuzzy) Formal Concept Analysis is considered. This work is supported by the State Investigation Agency (Agencia Estatal de Investigación), project PID2019-109152GB-100/AEI/10.13039/501100011033. We acknowledge the reviewers for their suggestions and guidance on additional references that have enriched our paper. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
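
    For readers unfamiliar with the 3WD paradigm the paper builds on, the following minimal Python sketch shows its core idea: a pair of thresholds splits decisions into acceptance, rejection, and deferral regions. The thresholds and scores here are illustrative assumptions, not values from the paper.

    ```python
    # Minimal sketch of Three-Way Decision (3WD): map a score to
    # accept / reject / defer via two thresholds (illustrative values).

    def three_way(score, alpha=0.7, beta=0.3):
        """Classify into positive, negative, or boundary regions."""
        if score >= alpha:
            return "accept"    # positive region
        if score <= beta:
            return "reject"    # negative region
        return "defer"         # boundary region: postpone the decision

    for s in (0.9, 0.5, 0.1):
        print(s, three_way(s))
    # 0.9 accept / 0.5 defer / 0.1 reject
    ```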

    Simplification logic for the management of unknown information

    This paper aims to contribute to the extension of classical Formal Concept Analysis (FCA) to allow the management of unknown information. In a preliminary paper, we defined a new kind of attribute implication to represent the knowledge available in the current information. The whole FCA framework has to be appropriately extended to manage unknown information. This paper introduces a new logic for reasoning with this kind of implication, which belongs to the family of logics with an underlying Simplification paradigm. Specifically, we introduce a new algebra, named weak dual Heyting algebra, that allows us to extend the Simplification logic to these new implications. To provide a solid framework, we also prove its soundness and completeness and show the advantages of the Simplification paradigm. Finally, to enable further use of this extension of FCA in applications, an algorithm for automated reasoning, directly built from the logic, is defined. Funding for open access charge: Universidad de Málaga / CBUA. This article is supported by grants TIN2017-89023-P, PRE2018-085199 and PID2021-127870OB-I00 of the Ministry of Science and Innovation of Spain and UMA2018-FEDERJA-001 of the Junta de Andalucía and the European Social Fund.
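
    As background, the following minimal Python sketch shows the classical simplification equivalence that gives the Simplification paradigm its name: when A ⊆ C, the pair of implications {A → B, C → D} can be replaced by {A → B, (C − B) → (D − B)}. The attribute sets are illustrative assumptions; the paper's extension to implications with unknown information is not modelled here.

    ```python
    # Minimal sketch of the classical simplification equivalence:
    # if A <= C then {A -> B, C -> D} is equivalent to
    # {A -> B, (C - B) -> (D - B)} (illustrative attribute sets).

    def simplify(imp1, imp2):
        """Simplify imp2 using imp1 when the rule's premise holds."""
        A, B = imp1
        C, D = imp2
        if A <= C:                    # premise: A is a subset of C
            return (C - B, D - B)     # equivalent, smaller implication
        return imp2                   # rule not applicable

    imp1 = ({"a"}, {"b"})
    imp2 = ({"a", "b", "c"}, {"b", "d"})
    print(simplify(imp1, imp2))       # ({'a', 'c'}, {'d'})
    ```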

    Explainable temporal data mining techniques to support the prediction task in Medicine

    In the last decades, the increasing amount of data available in all fields has raised the need to discover new knowledge and to explain the hidden information found. On the one hand, the rapid increase of interest in, and use of, artificial intelligence (AI) in computer applications has raised a parallel concern about its ability (or lack thereof) to provide understandable, or explainable, results to users. In the biomedical informatics and computer science communities, there is considerable discussion about the "un-explainable" nature of artificial intelligence, where algorithms and systems often leave users, and even developers, in the dark with respect to how results were obtained. Especially in the biomedical context, the need to explain the result of an artificial intelligence system is legitimated by the importance of patient safety. On the other hand, current database systems enable us to store huge quantities of data, and their analysis through data mining techniques makes it possible to extract relevant knowledge and useful hidden information. Relationships and patterns within these data could provide new medical knowledge. The analysis of such healthcare/medical data collections could greatly help to observe the health conditions of the population and to extract useful information that can be exploited in the assessment of healthcare/medical processes. In particular, the prediction of medical events is essential for preventing disease, understanding disease mechanisms, and increasing patient quality of care. In this context, an important aspect is to verify whether the database content supports the capability of predicting future events.

    In this thesis, we start by addressing the problem of explainability, discussing some of the most significant challenges that need to be addressed with scientific and engineering rigor in a variety of biomedical domains. We analyze the "temporal component" of explainability, detailing different perspectives such as the use of temporal data, the temporal task, temporal reasoning, and the dynamics of explainability with respect to the user perspective and to knowledge.

    Starting from this panorama, we focus our attention on two temporal data mining techniques. The first, based on trend abstractions, starts from the concept of Trend-Event Pattern and, moving through the concept of prediction, proposes a new kind of predictive temporal pattern, namely Predictive Trend-Event Patterns (PTE-Ps). The framework combines complex temporal features to extract a compact and non-redundant predictive set of patterns composed of such temporal features. The second, based on functional dependencies, proposes a methodology for deriving a new kind of approximate temporal functional dependency, called Approximate Predictive Functional Dependencies (APFDs), based on a three-window framework. We then discuss the concept of approximation, the data complexity of deriving an APFD, two new error measures, and finally the quality of APFDs in terms of coverage and reliability. Exploiting these methodologies, we analyze intensive care unit data from the MIMIC dataset.
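
    To illustrate the kind of dependency checking involved, here is a minimal Python sketch of an approximate functional dependency X → Y on tabular data, using the classical g3-style error (the fraction of rows to remove for the dependency to hold exactly). The toy records and threshold are illustrative assumptions; the thesis's APFDs additionally involve a three-window temporal framework and their own error measures, not modelled here.

    ```python
    # Minimal sketch: g3-style error of an approximate functional
    # dependency X -> Y over toy rows (illustrative data).
    from collections import Counter, defaultdict

    rows = [
        ("p1", "high", "event"),    # (patient, X value, Y value)
        ("p2", "high", "event"),
        ("p3", "high", "no_event"),
        ("p4", "low",  "no_event"),
    ]

    def g3_error(rows, x_idx=1, y_idx=2):
        """Minimum fraction of rows to delete so X determines Y."""
        groups = defaultdict(Counter)
        for r in rows:
            groups[r[x_idx]][r[y_idx]] += 1
        # keep, per X group, only the rows with the majority Y value
        kept = sum(c.most_common(1)[0][1] for c in groups.values())
        return 1 - kept / len(rows)

    err = g3_error(rows)
    print(err, err <= 0.3)  # 0.25 True: holds approximately at threshold 0.3
    ```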

    Cognitive Models and Computational Approaches for improving Situation Awareness Systems

    The world of the Internet of Things is pervaded by complex environments offering smart services at any time and everywhere. In such a context, a serious open issue is the capability of information systems to support adaptive and collaborative decision processes in perceiving and elaborating huge amounts of data. This requires the design and realization of novel socio-technical systems based on the “human-in-the-loop” paradigm. The presence of both humans and software in such systems demands adequate levels of Situation Awareness (SA). Achieving and maintaining proper levels of SA is a daunting task due to the intrinsic technical characteristics of systems and the limitations of human cognitive mechanisms. In the scientific literature, the issues hindering the SA formation process are known as SA demons. The objective of this research is to contribute to the resolution of the SA demons by identifying information processing paradigms that can originally support SA and by defining new theoretical and practical approaches based on cognitive models and computational techniques.

    The research work starts with an in-depth analysis and some preliminary verifications of methods, techniques, and systems for SA. A major outcome of this analysis is that the Granular Computing paradigm (GrC) has seen only limited use in the SA field, despite the fact that SA and GrC share many concepts and principles. The research work continues with the definition of contributions and original results for the resolution of significant SA demons, exploiting some of the approaches identified in the analysis phase (i.e., ontologies, data mining, and GrC). The first contribution addresses the issues related to users' poor perception of data. We propose a semantic approach for quality-aware sensor data management which uses a data imputation technique based on association rule mining. The second contribution proposes an original ontological approach to situation management, namely Adaptive Goal-driven Situation Management. The approach uses ontological modeling of goals and situations and a mechanism that suggests the most relevant goals to the users at a given moment. Lastly, the adoption of the GrC paradigm allows the definition of a novel model for representing and reasoning on situations based on a set-theoretical framework. This model has been instantiated using rough set theory. The proposed approaches and models have been implemented in prototype systems, and their capabilities in improving SA in real applications have been evaluated with methodologies typically used for SA systems.
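
    The rough-set instantiation mentioned above can be illustrated with a minimal Python sketch: a target situation (a set of states) is bounded from below and above by the blocks of an indiscernibility partition. The states and blocks are hypothetical assumptions, not the system's actual model.

    ```python
    # Minimal sketch: rough-set lower/upper approximations of a
    # situation over an indiscernibility partition (illustrative data).

    blocks = [{"s1", "s2"}, {"s3"}, {"s4", "s5"}]   # indiscernibility classes
    situation = {"s1", "s2", "s3", "s4"}            # target situation

    # lower approximation: blocks entirely inside the situation
    lower = set().union(*(b for b in blocks if b <= situation))
    # upper approximation: blocks overlapping the situation
    upper = set().union(*(b for b in blocks if b & situation))

    print(lower)  # {'s1', 's2', 's3'}: states certainly in the situation
    print(upper)  # all five states: states possibly in the situation
    ```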

    Semantic and Visual Analysis of Metadata to Search and Select Heterogeneous Information Resources

    An increasing number of activities in several disciplinary and industrial fields, such as scientific research, industrial design, and environmental management, rely on the production and employment of informative resources representing objects, information, and knowledge. The vast availability of digitalized information resources (documents, images, maps, videos, 3D models) highlights the need for appropriate methods to effectively share and employ all these resources. In particular, tools to search and select information resources produced by third parties are required to successfully carry out our daily work activities. Headway in this direction is made by adopting metadata, a description of the most relevant features characterizing an information resource. However, plenty of features have to be considered to fully describe information resources in sophisticated fields such as those mentioned. This leads to complex metadata and to a growing need for tools that cope with this complexity. The thesis aims at developing methods to analyze metadata, easing the search and comparison of information resources. The goal is to select the resources which best fit the user's needs in specific activities. In particular, the thesis tackles the problem of metadata complexity and supports the discovery of selection criteria which are unknown to the user. The metadata analysis consists of two approaches: visual and semantic analysis. The visual analysis involves the user as much as possible to let them discover the most suitable selection criteria. The semantic analysis supports the browsing and selection of information resources, taking into account the user's knowledge, which is properly formalized.

    Attribute Exploration of Gene Regulatory Processes

    This thesis aims at the logical analysis of discrete processes, in particular those generated by gene regulatory networks. States, transitions, and operators from temporal logics are expressed in the language of Formal Concept Analysis. By the attribute exploration algorithm, an expert or a computer program is enabled to validate a minimal and complete set of implications, e.g. by comparing predictions derived from the literature with observed data. Here, these rules represent temporal dependencies within gene regulatory networks, including coexpression of genes, reachability of states, invariants, or possible causal relationships. This new approach is embedded into the theory of universal coalgebras, particularly automata, Kripke structures, and Labelled Transition Systems. A comparison with the temporal expressivity of Description Logics is made. The main theoretical results concern the integration of background knowledge into the successive exploration of the defined data structures (formal contexts). Applying the method, a Boolean network from the literature modelling sporulation of Bacillus subtilis is examined. Finally, we developed an asynchronous Boolean network for extracellular matrix formation and destruction in the context of rheumatoid arthritis. (111 pages, 9 figures; PhD thesis, University of Jena, Germany, Faculty of Mathematics and Computer Science, 2011. Available online at http://www.db-thueringen.de/servlets/DocumentServlet?id=1960)
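
    To give a flavour of the discrete processes being explored, the following minimal Python sketch steps a toy synchronous Boolean network and enumerates reachable states, the kind of temporal fact (e.g., reachability) that attribute exploration then encodes as implications. The two-gene network is an illustrative assumption, not the sporulation model from the thesis.

    ```python
    # Minimal sketch: synchronous update and naive reachability for a
    # toy two-gene Boolean network (illustrative, not the thesis model).

    def step(state):
        """One synchronous update: g0 activates g1, g1 represses g0."""
        g0, g1 = state
        return (not g1, g0)

    def reachable(start):
        """All states reachable from `start` under repeated updates."""
        seen, frontier = {start}, [start]
        while frontier:
            nxt = step(frontier.pop())
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
        return seen

    print(sorted(reachable((True, False))))
    # a 4-cycle: every state of the toy network is reachable
    ```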

    Big Data Computing for Geospatial Applications

    The convergence of big data and geospatial computing has brought forth challenges and opportunities for Geographic Information Science with regard to geospatial data management, processing, analysis, modeling, and visualization. This book highlights recent advancements in integrating new computing approaches, spatial methods, and data management strategies to tackle geospatial big data challenges, while demonstrating opportunities for using big data in geospatial applications. Crucial to the advancements highlighted in this book is the integration of computational thinking and spatial thinking and the transformation of abstract ideas and models into concrete data structures and algorithms.

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. Beyond supporting a deep understanding of each section, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.