
    LearnFCA: A Fuzzy FCA and Probability Based Approach for Learning and Classification

    Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory, used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in domains including data mining, machine learning, knowledge management, the semantic web, software development, chemistry, biology, medicine, data analytics and ontology engineering. This thesis reviews the state of the art of FCA theory and the various extensions that have been developed and well studied in recent years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide a literature review of its applications and of the various approaches adopted by researchers in data analysis and knowledge management, with emphasis on learning and classification problems. We propose LearnFCA, a novel approach based on FuzzyFCA and probability theory for learning and classification problems. LearnFCA uses an enhanced version of FuzzyLattice, developed to store class labels and probability vectors, and it can classify instances with encoded, unlabelled features. We evaluate LearnFCA on encodings from three datasets - MNIST, Omniglot and cancer images - with interesting results and varying degrees of success. Adviser: Dr Jitender Deogu
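
    A minimal sketch of the lattice-as-classifier idea described above, assuming a toy fuzzy context: objects carry attribute membership degrees in [0, 1], each stored "concept" keeps a class-probability vector, and an unlabelled instance is assigned to the class of the most similar concept. The names FuzzyConcept, build_concepts and classify are illustrative, not the thesis's actual API.

        # Illustrative sketch only -- not the LearnFCA implementation.
        import numpy as np

        class FuzzyConcept:
            """A stored concept: a fuzzy intent plus a class-probability vector."""
            def __init__(self, intent, class_probs):
                self.intent = np.asarray(intent, dtype=float)  # attribute degrees in [0, 1]
                self.class_probs = dict(class_probs)           # label -> probability

        def build_concepts(X, y):
            """One concept per class, whose intent is the mean fuzzy feature vector."""
            y = np.asarray(y)
            return [FuzzyConcept(X[y == label].mean(axis=0), {label: 1.0})
                    for label in np.unique(y)]

        def classify(x, concepts):
            """Assign x to the class of the concept with highest fuzzy subsethood degree."""
            x = np.asarray(x, dtype=float)
            def subsethood(a, b):
                return float(np.minimum(a, b).sum() / max(a.sum(), 1e-9))
            best = max(concepts, key=lambda c: subsethood(x, c.intent))
            return max(best.class_probs, key=best.class_probs.get)

        # Toy usage with encoded (fuzzy) features
        X = np.array([[0.9, 0.1, 0.8], [0.8, 0.2, 0.9], [0.1, 0.9, 0.2], [0.2, 0.8, 0.1]])
        y = ["circle", "circle", "cross", "cross"]
        print(classify([0.85, 0.15, 0.7], build_concepts(X, y)))  # -> circle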

    MULTIVALUED SUBSETS UNDER INFORMATION THEORY

    In the fields of finance, engineering and the varied sciences, data mining and machine learning hold an eminent position in predictive analysis. Complex algorithms and adaptive decision models have contributed to streamlining directed research as well as improving forecasting accuracy. Researchers in mathematics and computer science have made significant contributions to the development of this field. Classification-based modeling, which holds a significant position among rule-based algorithms, is one of the most widely used decision-making tools, and the decision tree has a place of profound significance within it. A number of heuristics have been developed over the years to prune the decision-making process. Some key benchmarks in the evolution of the decision tree can be attributed to researchers such as Quinlan (ID3 and C4.5) and Fayyad (GID3/3*, continuous value discretization). The most common heuristic applied in these trees is the entropy of Shannon's information theory. The usual application of entropy, covered under the term 'information gain', is directed towards individual assessment of the attribute-value sets. The proposed study examines the effects of combining attribute-value sets, aimed at improving the information gain. A couple of key applications have been tested and presented with statistical conclusions: the first is the feature selection process, a key step in data mining, while the second targets the discretization of data. A search-based heuristic tool is applied to identify the subsets sharing a better gain value than the ones presented in the GID approach.
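
    A hedged sketch of the quantity being optimised, assuming a toy weather/play table: standard Shannon entropy and information gain for a categorical attribute, and the gain recomputed after two attribute values are merged into a single subset. Whether a particular merge improves the gain is exactly what the study's search heuristic must decide; the example data and merge choice here are made up.

        # Illustrative only: per-value information gain vs. gain after merging values.
        import math
        from collections import Counter

        def entropy(labels):
            n = len(labels)
            return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

        def info_gain(values, labels):
            """Information gain of splitting `labels` by the categorical `values`."""
            n = len(labels)
            remainder = 0.0
            for v in set(values):
                subset = [l for x, l in zip(values, labels) if x == v]
                remainder += len(subset) / n * entropy(subset)
            return entropy(labels) - remainder

        outlook = ["sunny", "sunny", "overcast", "rain", "rain", "overcast"]
        play    = ["no",    "no",    "yes",      "yes",  "no",   "yes"]
        print(info_gain(outlook, play))                               # split on every value
        merged = ["sunny+rain" if v in {"sunny", "rain"} else v for v in outlook]
        print(info_gain(merged, play))                                # split on a merged value subset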

    Proceedings of the 5th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2016 (co-located with ECAI 2016, The Hague, Netherlands, August 30th 2016)

    These are the proceedings of the fifth edition of the FCA4AI workshop (http://www.fca4ai.hse.ru/). Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification that can be used for many purposes, especially for Artificial Intelligence (AI) needs. The objective of the FCA4AI workshop is to investigate two main issues: how FCA can support various AI activities (knowledge discovery, knowledge representation and reasoning, learning, data mining, NLP, information retrieval), and how FCA can be extended in order to help AI researchers solve new and complex problems in their domains. Accordingly, topics of interest are related to the following: (i) extensions of FCA for AI: pattern structures, projections, abstractions; (ii) knowledge discovery based on FCA: classification, data mining, pattern mining, functional dependencies, biclustering, stability, visualization; (iii) knowledge processing based on concept lattices: modeling, representation, reasoning; (iv) application domains: natural language processing, information retrieval, recommendation, mining of the web of data and of social networks, etc.

    Conceptual Factors and Fuzzy Data

    With the growing number of large data sets, the necessity of complexity reduction applies today more than ever before. Moreover, some data may also be vague or uncertain. Thus, whenever we have an instrument for data analysis, the questions of how to apply complexity-reduction methods and how to treat fuzzy data arise rather naturally. In this thesis, we discuss these issues for the very successful data analysis tool Formal Concept Analysis. In fact, we propose different methods for complexity reduction based on qualitative analyses, and we elaborate on various methods for handling fuzzy data. These two topics split the thesis into two parts: data reduction is mainly dealt with in the first part, whereas we focus on fuzzy data in the second. Although each chapter may be read almost on its own, each one builds on and uses results from its predecessors. The main cross-link between the chapters is given by the reduction methods and fuzzy data. In particular, we also discuss complexity-reduction methods for fuzzy data, combining the two issues that motivate this thesis.
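
    The abstract does not spell out its reduction methods, so the following is only a sketch of the most elementary FCA-style reduction, context clarification (merging objects with identical intents and attributes with identical extents); the thesis itself develops conceptual-factor and fuzzy variants that go well beyond this.

        # Illustrative sketch of context clarification, not the thesis's methods.
        def clarify(context):
            """context: dict object -> frozenset of attributes (a binary formal context)."""
            # Merge objects that share exactly the same intent.
            by_intent = {}
            for obj, intent in context.items():
                by_intent.setdefault(intent, []).append(obj)
            objects = {"/".join(sorted(objs)): intent for intent, objs in by_intent.items()}
            # Merge attributes that share exactly the same extent.
            attrs = set().union(*objects.values()) if objects else set()
            by_extent = {}
            for a in attrs:
                extent = frozenset(o for o, intent in objects.items() if a in intent)
                by_extent.setdefault(extent, []).append(a)
            renamed = {a: "/".join(sorted(grp)) for grp in by_extent.values() for a in grp}
            return {o: frozenset(renamed[a] for a in intent) for o, intent in objects.items()}

        ctx = {"o1": frozenset({"a", "b"}),
               "o2": frozenset({"a", "b"}),   # same intent as o1 -> merged
               "o3": frozenset({"b", "c"})}
        print(clarify(ctx))   # o1 and o2 collapse into a single object row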

    A graph theoretic approach to scene matching

    The ability to match two scenes is a fundamental requirement in a variety of computer vision tasks. A graph theoretic approach to inexact scene matching is presented which is useful in dealing with problems due to imperfect image segmentation. A scene is described by a set of graphs, with nodes representing objects and arcs representing relationships between objects. Each node has a set of values representing the relations between pairs of objects, such as angle, adjacency, or distance. With this method of scene representation, the task in scene matching is to match two sets of graphs. Because of segmentation errors, variations in camera angle, illumination, and other conditions, an exact match between the sets of observed and stored graphs is usually not possible. In the developed approach, the problem is represented as an association graph, in which each node represents a possible mapping of an observed region to a stored object, and each arc represents the compatibility of two mappings. Nodes and arcs have weights indicating the merit of a region-object mapping and the degree of compatibility between two mappings. A match between the two graphs corresponds to a clique, or fully connected subgraph, in the association graph. The task is to find the clique that represents the best match. Fuzzy relaxation is used to update the node weights using the contextual information contained in the arcs and neighboring nodes, which simplifies the evaluation of cliques. A method of handling oversegmentation and undersegmentation problems is also presented. The approach is tested with a set of realistic images which exhibit many types of segmentation errors.
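
    A hedged sketch of the association-graph formulation described above: each node is a candidate (region, object) mapping with a merit weight, each arc carries a compatibility weight between two mappings, and the best match is the maximum-weight clique. Brute-force enumeration is used here purely for clarity on a toy example; the paper instead evaluates cliques after fuzzy relaxation has updated the node weights.

        # Toy maximum-weight clique search over an association graph (illustrative only).
        from itertools import combinations

        def best_clique(node_w, edge_w):
            """node_w: dict node -> merit; edge_w: dict frozenset({u, v}) -> compatibility.
            Returns (clique, total weight) maximizing the sum of node and edge weights."""
            nodes = list(node_w)
            best, best_score = (), float("-inf")
            for r in range(1, len(nodes) + 1):
                for clique in combinations(nodes, r):
                    pairs = [frozenset(p) for p in combinations(clique, 2)]
                    if any(p not in edge_w for p in pairs):
                        continue          # missing arc -> not a fully connected subgraph
                    score = sum(node_w[n] for n in clique) + sum(edge_w[p] for p in pairs)
                    if score > best_score:
                        best, best_score = clique, score
            return best, best_score

        # Candidate (region, object) mappings and their merits
        node_w = {("r1", "roof"): 0.9, ("r2", "wall"): 0.8, ("r2", "roof"): 0.4}
        # Compatible mapping pairs (e.g. consistent adjacency or angle relations)
        edge_w = {frozenset({("r1", "roof"), ("r2", "wall")}): 0.7}
        print(best_clique(node_w, edge_w))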

    Rice Blast Disease Forecasting for Northern Philippines

    Rice blast disease has become an enigmatic problem in several rice growing ecosystems of both tropical and temperate regions of the world. In this study, we develop models for predicting the occurrence and severity of rice blast disease, with the aim of helping to prevent or at least mitigate its spread. Data from two government agencies in selected provinces of northern Philippines were gathered, cleaned and synchronized for the purpose of building the predictive models. After the data synchronization, dimensionality reduction of the feature space was done using Principal Component Analysis (PCA) to determine the most important weather features that contribute to the occurrence of rice blast. Using these identified features, ANN and SVM binary classifiers (for prediction of the occurrence or non-occurrence of rice blast) and regression models (for estimation of the severity of an occurring rice blast) were built and tested. These classifiers and regression models produced sufficiently accurate results, with the SVM models showing significantly better predictive power than the corresponding ANN models. These findings can be used in developing a system for forecasting rice blast, which may help reduce the occurrence of the disease.
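
    A minimal sketch of the modelling pipeline described above, using scikit-learn: PCA for dimensionality reduction of weather features, an SVM classifier for blast occurrence and an SVM regressor for severity. The data below is synthetic and the hyperparameters are placeholders, not the study's actual configuration or results.

        # Illustrative PCA + SVM pipeline on synthetic weather-like data.
        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.decomposition import PCA
        from sklearn.svm import SVC, SVR
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 6))                      # e.g. temperature, humidity, rainfall, ...
        occurrence = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic occurrence labels
        severity = np.clip(X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200), 0, None)

        X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(X, occurrence, severity, random_state=0)

        clf = make_pipeline(StandardScaler(), PCA(n_components=3), SVC(kernel="rbf"))
        clf.fit(X_tr, y_tr)
        print("occurrence accuracy:", clf.score(X_te, y_te))

        reg = make_pipeline(StandardScaler(), PCA(n_components=3), SVR(kernel="rbf"))
        reg.fit(X_tr, s_tr)
        print("severity R^2:", reg.score(X_te, s_te))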