78 research outputs found

    Learning programs by learning from failures

    Full text link
    We describe an inductive logic programming (ILP) approach called learning from failures. In this approach, an ILP system (the learner) decomposes the learning problem into three separate stages: generate, test, and constrain. In the generate stage, the learner generates a hypothesis (a logic program) that satisfies a set of hypothesis constraints (constraints on the syntactic form of hypotheses). In the test stage, the learner tests the hypothesis against training examples. A hypothesis fails when it does not entail all the positive examples or entails a negative example. If a hypothesis fails, then, in the constrain stage, the learner learns constraints from the failed hypothesis to prune the hypothesis space, i.e. to constrain subsequent hypothesis generation. For instance, if a hypothesis is too general (entails a negative example), the constraints prune generalisations of the hypothesis. If a hypothesis is too specific (does not entail all the positive examples), the constraints prune specialisations of the hypothesis. This loop repeats until either (i) the learner finds a hypothesis that entails all the positive and none of the negative examples, or (ii) there are no more hypotheses to test. We introduce Popper, an ILP system that implements this approach by combining answer set programming and Prolog. Popper supports infinite problem domains, reasoning about lists and numbers, learning textually minimal programs, and learning recursive programs. Our experimental results on three domains (toy game problems, robot strategies, and list transformations) show that (i) constraints drastically improve learning performance, and (ii) Popper can outperform existing ILP systems, both in terms of predictive accuracies and learning times.Comment: Accepted for the machine learning journa

    dRAP-Independent: A Data Distribution Algorithm for Mining First-Order Frequent Patterns

    Get PDF
    In this paper we present dRAP-Independent, an algorithm for independent distributed mining of first-order frequent patterns. This system is based on RAP, an algorithm for finding maximal frequent patterns in first-order logic. dRAP-Independent utilizes a modified data partitioning schema introduced by Savasere et al. and offers good performance and low communication overhead. We analyze the performance of the algorithm on four different tasks: Mutagenicity prediction -- a standard ILP benchmark, information extraction from biological texts, context-sensitive spelling correction, and morphological disambiguation of Czech. The results of the analysis show that the algorithm can generate more patterns than the serial algorithm RAP in the same overall time

    From Statistical Relational to Neurosymbolic Artificial Intelligence: a Survey

    Full text link
    This survey explores the integration of learning and reasoning in two different fields of artificial intelligence: neurosymbolic and statistical relational artificial intelligence. Neurosymbolic artificial intelligence (NeSy) studies the integration of symbolic reasoning and neural networks, while statistical relational artificial intelligence (StarAI) focuses on integrating logic with probabilistic graphical models. This survey identifies seven shared dimensions between these two subfields of AI. These dimensions can be used to characterize different NeSy and StarAI systems. They are concerned with (1) the approach to logical inference, whether model or proof-based; (2) the syntax of the used logical theories; (3) the logical semantics of the systems and their extensions to facilitate learning; (4) the scope of learning, encompassing either parameter or structure learning; (5) the presence of symbolic and subsymbolic representations; (6) the degree to which systems capture the original logic, probabilistic, and neural paradigms; and (7) the classes of learning tasks the systems are applied to. By positioning various NeSy and StarAI systems along these dimensions and pointing out similarities and differences between them, this survey contributes fundamental concepts for understanding the integration of learning and reasoning.Comment: To appear in Artificial Intelligence. Shorter version at IJCAI 2020 survey track, https://www.ijcai.org/proceedings/2020/0688.pd

    Complex Aggregates over Clusters of Elements

    Get PDF
    Complex aggregates have been proposed as a way to bridge the gap between approaches that handle sets by imposing conditions on specific elements, and approaches that handle them by imposing conditions on aggregated values. A complex aggregate summarises a subset of the elements in a set, where this subset is defined by conditions on the attribute values. In this paper, we present a new type of complex aggregate, where this subset is defined to be a cluster of the set. This is useful if subsets that are relevant for the task at hand are difficult to describe in terms of attribute conditions. This work is motivated from the analysis of flow cytometry data, where the sets are cells, and the subsets are cell populations. We describe two approaches to aggregate over clusters on an abstract level, and validate one of them empirically, motivating future research in this direction

    Automatic & Semi-Automatic Methods for Supporting Ontology Change

    Get PDF

    Multi-Instance Multi-Label Learning

    Get PDF
    In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the D-MimlSvm algorithm which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects and thus cannot capture more information from real objects by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single-instances into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning the single-instances or single-label examples directly.Comment: 64 pages, 10 figures; Artificial Intelligence, 201

    Conducting Inductive Logic Programming Directly in Database Management Systems

    Get PDF
    University of Minnesota M.S. thesis.July 2015. Major: Computer Science. Advisor: Richard Maclin. 1 computer file (PDF); vii, 70 pages.Inductive logic programming (ILP) is a research area formed at the intersection of machine learning and logic programming. Given a set of background knowledge as well as positive and negative examples of a concept, an ILP system attempts to learn rules that cover all the positive examples and none of the negative examples by using the background knowledge. Over the years, ILP is being used extensively in medical applications. Existing ILP systems are implemented in Prolog, using first­-order logic. But, Prolog does not integrate well with database systems, where a lot of the data of interest is stored. Prolog is also not often used in business applications. This thesis presents a novel approach of storing the facts (background knowledge, examples) required for ILP in databases and using Java for easy access and retrieval of the stored knowledge. Since most of the ILP machine learning data sets can be stored easily in databases, this approach provides an easier to use technique. Facts are stored in the form of tables in database and rules are stored as database views by using a database join on the multiple predicates in a fact. A Sequential covering algorithm that uses the best­ first search approach to learn rules for ILP problems is implemented in this thesis. The results obtained on two real­-world test data sets by using this approach are compared with traditional systems. The accuracy of the system presented in this thesis is on par with the accuracy of the traditional systems. These results are very assuring and the system provides an easy-­to-­use approach for the ILP users

    Kernel Methods for Knowledge Structures

    Get PDF

    LOGIC AND CONSTRAINT PROGRAMMING FOR COMPUTATIONAL SUSTAINABILITY

    Get PDF
    Computational Sustainability is an interdisciplinary field that aims to develop computational and mathematical models and methods for decision making concerning the management and allocation of resources in order to help solve environmental problems. This thesis deals with a broad spectrum of such problems (energy efficiency, water management, limiting greenhouse gas emissions and fuel consumption) giving a contribution towards their solution by means of Logic Programming (LP) and Constraint Programming (CP), declarative paradigms from Artificial Intelligence of proven solidity. The problems described in this thesis were proposed by experts of the respective domains and tested on the real data instances they provided. The results are encouraging and show the aptness of the chosen methodologies and approaches. The overall aim of this work is twofold: both to address real world problems in order to achieve practical results and to get, from the application of LP and CP technologies to complex scenarios, feedback and directions useful for their improvement
    corecore