Learning programs by learning from failures
We describe an inductive logic programming (ILP) approach called learning
from failures. In this approach, an ILP system (the learner) decomposes the
learning problem into three separate stages: generate, test, and constrain. In
the generate stage, the learner generates a hypothesis (a logic program) that
satisfies a set of hypothesis constraints (constraints on the syntactic form of
hypotheses). In the test stage, the learner tests the hypothesis against
training examples. A hypothesis fails when it does not entail all the positive
examples or entails a negative example. If a hypothesis fails, then, in the
constrain stage, the learner learns constraints from the failed hypothesis to
prune the hypothesis space, i.e. to constrain subsequent hypothesis generation.
For instance, if a hypothesis is too general (entails a negative example), the
constraints prune generalisations of the hypothesis. If a hypothesis is too
specific (does not entail all the positive examples), the constraints prune
specialisations of the hypothesis. This loop repeats until either (i) the
learner finds a hypothesis that entails all the positive and none of the
negative examples, or (ii) there are no more hypotheses to test. We introduce
Popper, an ILP system that implements this approach by combining answer set
programming and Prolog. Popper supports infinite problem domains, reasoning
about lists and numbers, learning textually minimal programs, and learning
recursive programs. Our experimental results on three domains (toy game
problems, robot strategies, and list transformations) show that (i) constraints
drastically improve learning performance, and (ii) Popper can outperform
existing ILP systems, both in terms of predictive accuracies and learning
times. Comment: Accepted for the machine learning journal.
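The generate, test, and constrain loop described in this abstract can be illustrated with a minimal sketch. It is a toy over propositional conjunctions, not Popper's answer set programming and Prolog implementation. It assumes that a hypothesis is a set of required features, that an example is covered when it contains all of them, and that hypotheses are enumerated smallest first. The names `covers` and `learn` and the example data are hypothetical.

```python
from itertools import combinations


def covers(hypothesis, example):
    """A hypothesis (set of required features) covers an example
    when every required feature is present in the example."""
    return hypothesis <= example


def learn(features, positives, negatives):
    """Toy generate-test-constrain loop over conjunctions of features.

    Returns (solution, number_of_hypotheses_tested); solution is None
    when the hypothesis space is exhausted.
    """
    too_general = []   # failed hypotheses: prune their generalisations (subsets)
    too_specific = []  # failed hypotheses: prune their specialisations (supersets)
    tested = 0
    # Enumerating by size means the first solution found is a smallest one.
    for size in range(len(features) + 1):
        for combo in combinations(sorted(features), size):
            h = frozenset(combo)
            # Generate: skip hypotheses excluded by the learned constraints.
            if any(h <= g for g in too_general):
                continue
            if any(h >= s for s in too_specific):
                continue
            # Test: check the hypothesis against the training examples.
            tested += 1
            misses_positive = any(not covers(h, e) for e in positives)
            covers_negative = any(covers(h, e) for e in negatives)
            if not misses_positive and not covers_negative:
                return h, tested
            # Constrain: record why the hypothesis failed.
            if covers_negative:
                too_general.append(h)
            if misses_positive:
                too_specific.append(h)
    return None, tested


positives = [frozenset("abd"), frozenset("bcd")]
negatives = [frozenset("abc"), frozenset("acd")]
solution, tested = learn(set("abcd"), positives, negatives)
print(sorted(solution), tested)
```

In this toy setting, removing a feature generalises a hypothesis and adding one specialises it, so a hypothesis that misses a positive example rules out all of its supersets before they are ever tested.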
dRAP-Independent: A Data Distribution Algorithm for Mining First-Order Frequent Patterns
In this paper we present dRAP-Independent, an algorithm for independent distributed mining of first-order frequent patterns. This system is based on RAP, an algorithm for finding maximal frequent patterns in first-order logic. dRAP-Independent utilizes a modified data partitioning schema introduced by Savasere et al. and offers good performance and low communication overhead. We analyze the performance of the algorithm on four different tasks: mutagenicity prediction -- a standard ILP benchmark, information extraction from biological texts, context-sensitive spelling correction, and morphological disambiguation of Czech. The results of the analysis show that the algorithm can generate more patterns than the serial algorithm RAP in the same overall time.
From Statistical Relational to Neurosymbolic Artificial Intelligence: a Survey
This survey explores the integration of learning and reasoning in two
different fields of artificial intelligence: neurosymbolic and statistical
relational artificial intelligence. Neurosymbolic artificial intelligence
(NeSy) studies the integration of symbolic reasoning and neural networks, while
statistical relational artificial intelligence (StarAI) focuses on integrating
logic with probabilistic graphical models. This survey identifies seven shared
dimensions between these two subfields of AI. These dimensions can be used to
characterize different NeSy and StarAI systems. They are concerned with (1) the
approach to logical inference, whether model or proof-based; (2) the syntax of
the used logical theories; (3) the logical semantics of the systems and their
extensions to facilitate learning; (4) the scope of learning, encompassing
either parameter or structure learning; (5) the presence of symbolic and
subsymbolic representations; (6) the degree to which systems capture the
original logic, probabilistic, and neural paradigms; and (7) the classes of
learning tasks the systems are applied to. By positioning various NeSy and
StarAI systems along these dimensions and pointing out similarities and
differences between them, this survey contributes fundamental concepts for
understanding the integration of learning and reasoning. Comment: To appear in Artificial Intelligence. Shorter version at IJCAI 2020
survey track, https://www.ijcai.org/proceedings/2020/0688.pd
Complex Aggregates over Clusters of Elements
Complex aggregates have been proposed as a way to bridge the gap between approaches that handle sets by imposing conditions on specific elements, and approaches that handle them by imposing conditions on aggregated values. A complex aggregate summarises a subset of the elements in a set, where this subset is defined by conditions on the attribute values. In this paper, we present a new type of complex aggregate, where this subset is defined to be a cluster of the set. This is useful if subsets that are relevant for the task at hand are difficult to describe in terms of attribute conditions. This work is motivated by the analysis of flow cytometry data, where the sets are cells, and the subsets are cell populations. We describe two approaches to aggregate over clusters on an abstract level, and validate one of them empirically, motivating future research in this direction.
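The difference between the two kinds of complex aggregate in this abstract can be shown with a small sketch. It assumes each element is a dictionary of attribute values and that cluster labels have already been computed by some clustering step, which is not shown. The function `complex_aggregate`, the attribute names, and the data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def complex_aggregate(elements, in_subset, attribute, aggregate):
    """Aggregate one attribute over the elements selected by in_subset.

    Returns None when the selected subset is empty.
    """
    values = [e[attribute] for e in elements if in_subset(e)]
    return aggregate(values) if values else None


# One set (for example, the cells of one sample). The "cluster" label is
# assumed to come from a separate clustering step.
cells = [
    {"size": 1.0, "marker": 0.2, "cluster": 0},
    {"size": 1.2, "marker": 0.3, "cluster": 0},
    {"size": 5.0, "marker": 0.9, "cluster": 1},
    {"size": 5.4, "marker": 0.7, "cluster": 1},
]

# Subset defined by a condition on attribute values.
by_condition = complex_aggregate(cells, lambda c: c["size"] > 4.0, "marker", mean)

# Subset defined as a cluster of the set.
by_cluster = complex_aggregate(cells, lambda c: c["cluster"] == 1, "marker", mean)

print(by_condition, by_cluster)
```

Here the two subsets happen to coincide. The cluster-based form is aimed at cases where no simple attribute condition such as `size > 4.0` describes the relevant subset.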
Multi-Instance Multi-Label Learning
In this paper, we propose the MIML (Multi-Instance Multi-Label learning)
framework where an example is described by multiple instances and associated
with multiple class labels. Compared to traditional learning frameworks, the
MIML framework is more convenient and natural for representing complicated
objects which have multiple semantic meanings. To learn from MIML examples, we
propose the MimlBoost and MimlSvm algorithms based on a simple degeneration
strategy, and experiments show that solving problems involving complicated
objects with multiple semantic meanings in the MIML framework can lead to good
performance. Considering that the degeneration process may lose information, we
propose the D-MimlSvm algorithm which tackles MIML problems directly in a
regularization framework. Moreover, we show that even when we do not have
access to the real objects and thus cannot capture more information from real
objects by using the MIML representation, MIML is still useful. We propose the
InsDif and SubCod algorithms. InsDif works by transforming single-instances
into the MIML representation for learning, while SubCod works by transforming
single-label examples into the MIML representation for learning. Experiments
show that in some tasks they are able to achieve better performance than
learning the single-instances or single-label examples directly. Comment: 64 pages, 10 figures; Artificial Intelligence, 201
Conducting Inductive Logic Programming Directly in Database Management Systems
University of Minnesota M.S. thesis. July 2015. Major: Computer Science. Advisor: Richard Maclin. 1 computer file (PDF); vii, 70 pages. Inductive logic programming (ILP) is a research area formed at the intersection of machine learning and logic programming. Given a set of background knowledge as well as positive and negative examples of a concept, an ILP system attempts to learn rules that cover all the positive examples and none of the negative examples by using the background knowledge. Over the years, ILP has been used extensively in medical applications. Existing ILP systems are implemented in Prolog, using first-order logic. However, Prolog does not integrate well with database systems, where much of the data of interest is stored, and it is not often used in business applications. This thesis presents a novel approach that stores the facts (background knowledge, examples) required for ILP in databases and uses Java for easy access and retrieval of the stored knowledge. Since most ILP machine learning data sets can be stored easily in databases, this approach provides an easier-to-use technique. Facts are stored as tables in the database, and rules are stored as database views by using a database join on the multiple predicates in a fact. A sequential covering algorithm that uses best-first search to learn rules for ILP problems is implemented in this thesis. The results obtained on two real-world test data sets by using this approach are compared with traditional systems. The accuracy of the system presented in this thesis is on par with the accuracy of the traditional systems. These results are reassuring, and the system provides an easy-to-use approach for ILP users.
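The idea of storing facts as tables and rules as views built from joins can be sketched as follows. The thesis uses Java with a database system. This sketch swaps in Python's standard `sqlite3` module with an in-memory database for brevity. The `parent` and `grandparent` relations, the example tuples, and the `covered` helper are hypothetical illustrations, not material from the thesis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Background knowledge: each fact parent(x, y) is one row.
    CREATE TABLE parent (x TEXT, y TEXT);
    INSERT INTO parent VALUES
        ('ann', 'bob'), ('bob', 'cat'), ('bob', 'dan'), ('eve', 'fay');

    -- Candidate rule grandparent(X, Z) :- parent(X, Y), parent(Y, Z),
    -- expressed as a view that joins the body predicates on the shared Y.
    CREATE VIEW grandparent AS
        SELECT p1.x AS x, p2.y AS z
        FROM parent AS p1
        JOIN parent AS p2 ON p1.y = p2.x;
    """
)


def covered(example):
    """Return True if the rule (the view) derives the example tuple."""
    row = conn.execute(
        "SELECT 1 FROM grandparent WHERE x = ? AND z = ?", example
    ).fetchone()
    return row is not None


positives = [("ann", "cat"), ("ann", "dan")]
negatives = [("ann", "bob"), ("eve", "cat")]

print([covered(e) for e in positives], [covered(e) for e in negatives])
```

A covering-style learner could score each candidate rule by counting how many positive and negative examples its view derives, which is the role `covered` plays in this sketch.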
Logic and Constraint Programming for Computational Sustainability
Computational Sustainability is an interdisciplinary field that aims to develop computational
and mathematical models and methods for decision making concerning
the management and allocation of resources in order to help solve environmental
problems.
This thesis deals with a broad spectrum of such problems (energy efficiency, water
management, limiting greenhouse gas emissions and fuel consumption) giving
a contribution towards their solution by means of Logic Programming (LP) and
Constraint Programming (CP), declarative paradigms from Artificial Intelligence
of proven solidity.
The problems described in this thesis were proposed by experts of the respective
domains and tested on the real data instances they provided. The results are encouraging
and show the aptness of the chosen methodologies and approaches.
The overall aim of this work is twofold: both to address real world problems
in order to achieve practical results and to get, from the application of LP and
CP technologies to complex scenarios, feedback and directions useful for their
improvement.