Efficient Learning and Evaluation of Complex Concepts in Inductive Logic Programming
Inductive Logic Programming (ILP) is a subfield of Machine Learning with foundations in logic
programming. In ILP, logic programming, a subset of first-order logic, is used as a uniform
representation language for the problem specification and induced theories. ILP has been
successfully applied to many real-world problems, especially in the biological domain (e.g. drug
design, protein structure prediction), where relational information is of particular importance.
The expressiveness of logic programs grants flexibility in specifying the learning task and understandability
to the induced theories. However, this flexibility comes at a high computational
cost, constraining the applicability of ILP systems. Constructing and evaluating complex concepts
remain two of the main issues that prevent ILP systems from tackling many learning
problems. These learning problems are interesting both from a research perspective, as they
raise the standards for ILP systems, and from an application perspective, where these target
concepts naturally occur in many real-world applications. Such complex concepts cannot
be constructed or evaluated by parallelizing existing top-down ILP systems or improving the
underlying Prolog engine. Novel search strategies and cover algorithms are needed.
The main focus of this thesis is on how to efficiently construct and evaluate complex hypotheses
in an ILP setting. In order to construct such hypotheses we investigate two approaches.
The first, the Top Directed Hypothesis Derivation framework, implemented in the ILP system
TopLog, involves the use of a top theory to constrain the hypothesis space. In the second approach
we revisit the bottom-up search strategy of Golem, lifting its restriction on determinate
clauses which had rendered Golem inapplicable to many key areas. These developments led to
the bottom-up ILP system ProGolem. A challenge that arises with a bottom-up approach is the
coverage computation of long, non-determinate, clauses. Prolog’s SLD-resolution is no longer
adequate. We developed a new, Prolog-based, theta-subsumption engine which is significantly
more efficient than SLD-resolution in computing the coverage of such complex clauses.
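The coverage test mentioned above can be illustrated with a minimal sketch of theta-subsumption: a clause covers a ground example if some substitution maps every literal of the clause onto a fact of the example. This is a naive backtracking matcher written for illustration only, not the engine developed in the thesis; the tuple encoding of literals, the uppercase-variable convention, and the names `is_var`, `match_literal` and `theta_subsumes` are all assumptions.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def match_literal(literal, fact, theta):
    """Extend substitution theta so that literal maps onto the ground fact.

    Literals and facts are tuples: (predicate, arg1, arg2, ...).
    Returns the extended substitution, or None if no match is possible.
    """
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    theta = dict(theta)  # copy, so failed branches leave the caller's theta intact
    for term, ground in zip(literal[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or check consistency with an earlier binding.
            if theta.setdefault(term, ground) != ground:
                return None
        elif term != ground:
            return None
    return theta


def theta_subsumes(clause, ground_clause, theta=None):
    """Return a substitution mapping every literal of clause into ground_clause,
    or None if no such substitution exists (naive backtracking)."""
    theta = {} if theta is None else theta
    if not clause:
        return theta
    first, rest = clause[0], clause[1:]
    for fact in ground_clause:
        extended = match_literal(first, fact, theta)
        if extended is not None:
            result = theta_subsumes(rest, ground_clause, extended)
            if result is not None:
                return result
    return None


example = [("bond", "a", "b"), ("bond", "b", "c")]
path = [("bond", "X", "Y"), ("bond", "Y", "Z")]
print(theta_subsumes(path, example))
```

Because this matcher tries literals strictly left to right, its worst-case cost grows exponentially with clause length, which is the kind of behaviour on long non-determinate clauses that motivates a dedicated subsumption engine.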
We provide evidence that ProGolem achieves the goal of learning complex concepts by presenting
a protein-hexose binding prediction application. The theory ProGolem induced has
a statistically significantly better predictive accuracy than that of other learners. More importantly,
the biological insights ProGolem’s theory provided were judged by domain experts to
be relevant and, in some cases, novel.
Relational extensions to feature logic: applications to constraint based grammars
This thesis investigates the logical and computational foundations of unification-based,
or more appropriately constraint-based, grammars. The thesis explores extensions to
feature logics (which provide the basic knowledge representation services to constraint
based grammars) with multi-valued or relational features. These extensions are useful
for knowledge representation tasks that cannot be expressed within current feature
logics. The approach bridges the gap between concept languages (such as KL-ONE), which
are the mainstay of knowledge representation languages in AI, and feature logics. Various
constraints on relational attributes are considered such as existential membership,
universal membership, set descriptions, transitive relations and linear precedence
constraints. The specific contributions of this thesis can be summarised as follows:
1. Development of an integrated feature/concept logic
2. Development of a constraint logic for so-called partial set descriptions
3. Development of a constraint logic for expressing linear precedence constraints
4. The design of a constraint language CL-ONE that incorporates the central ideas
provided by the above study
5. A study of the application of CL-ONE for constraint based grammars
The thesis takes into account current insights in the areas of constraint logic programming, object-oriented languages, computational linguistics and knowledge representation.
A workbench to develop ILP systems
Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201
Constructive approaches to Program Induction
Search is a key technique in artificial intelligence, machine learning and Program Induction. No
matter how efficient a search procedure, there exist spaces that are too large to search effectively
and they include the search space of programs. In this dissertation we show that in the context
of logic-program induction (Inductive Logic Programming, or ILP) it is not necessary to search
for a correct program, because if one exists, there also exists a unique object that is the most
general correct program, and that can be constructed directly, without a search, in polynomial
time and from a polynomial number of examples. The existence of this unique object, that we
term the Top Program because of its maximal generality, does not so much solve the problem
of searching a large program search space, as it completely sidesteps it, thus improving the
efficiency of the learning task by orders of magnitude commensurate with the complexity of a
program space search.
The existence of a unique Top Program and the ability to construct it given finite resources
rely on imposing a strong inductive bias, relevant to the learning task, on the language of
hypotheses from which programs are constructed. In common practice, in machine
learning, Program Induction and ILP, such relevant inductive bias is selected, or created,
manually, by the human user of a learning system, with intuition or knowledge of the problem
domain, and in the form of various kinds of program templates. In this dissertation we show
that by abandoning the reliance on such extra-logical devices as program templates, and instead
defining inductive bias exclusively as First- and Higher-Order Logic formulae, it is possible to
learn inductive bias itself from examples, automatically, and efficiently, by Higher-Order Top
Program construction.
In Chapter 4 we describe the Top Program in the context of the Meta-Interpretive Learning
approach to ILP (MIL) and describe an algorithm for its construction, the Top Program
Construction algorithm (TPC). We prove the efficiency and accuracy of TPC and describe
its implementation in a new MIL system called Louise. We support theoretical results with
experiments comparing Louise to the state-of-the-art, search-based MIL system, Metagol, and
find that Louise improves on Metagol’s efficiency and accuracy. In Chapter 5 we re-frame MIL as
specialisation of metarules, Second-Order clauses used as inductive bias in MIL, and prove that
problem-specific metarules can be derived by specialisation of maximally general metarules, by
MIL. We describe a sub-system of Louise, called TOIL, that learns new metarules by MIL and
demonstrate empirically that the metarules learned by TOIL match those selected manually,
while maintaining the accuracy and efficiency of learning.
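The idea of constructing a most general correct program directly, rather than searching over candidate programs, can be sketched as a toy example. The sketch collects every instance of a single "chain" metarule, target(X,Y) :- q(X,Z), r(Z,Y), that derives at least one positive example, then discards the instances that derive a negative example. The two-step generalise/specialise structure, the restriction to one metarule, the extensional background encoding, and the names `covers` and `top_program` are assumptions made for illustration; this is not the implementation in Louise.

```python
from itertools import product


def covers(clause, example, background):
    """True if the chain clause target(X,Y) :- q(X,Z), r(Z,Y) derives example.

    clause is a pair (q, r) of background predicate names; background maps
    each predicate name to a set of (first, second) argument pairs.
    """
    q, r = clause
    x, y = example
    return any((z, y) in background[r] for (a, z) in background[q] if a == x)


def top_program(positives, negatives, background):
    """Collect all chain-metarule instances consistent with the examples."""
    candidates = list(product(sorted(background), repeat=2))
    # Generalise: keep every instance that derives at least one positive example.
    generalised = [c for c in candidates
                   if any(covers(c, e, background) for e in positives)]
    # Specialise: discard every instance that derives a negative example.
    return [c for c in generalised
            if not any(covers(c, e, background) for e in negatives)]


background = {
    "parent": {("ann", "bob"), ("bob", "cat"), ("bob", "dan")},
    "sibling": {("cat", "dan"), ("dan", "cat")},
}
print(top_program([("ann", "cat")], [("bob", "dan")], background))
```

Each candidate clause is tested once against the examples, so the work in this toy grows with the number of metarule instances and examples rather than with the number of possible sets of clauses.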
Using Natural Language as Knowledge Representation in an Intelligent Tutoring System
Knowledge used in an intelligent tutoring system to teach students is usually acquired from authors who are experts in the domain. A problem is that they cannot directly add and update knowledge unless they learn the formal language used in the system. Using natural language to represent knowledge can allow authors to update knowledge easily. This thesis presents a new approach that uses unconstrained natural language as the knowledge representation for a physics tutoring system, so that non-programmers can add knowledge without learning a new knowledge representation. This approach allows domain experts to add not only problem statements but also background knowledge, such as commonsense and domain knowledge including principles, in natural language. Rather than being translated into a formal language, the natural language representation is used directly in inference, so that domain experts can understand the internal process, detect knowledge bugs, and revise the knowledge base easily. Authoring task studies with the new system based on this approach showed that the size of the added knowledge was small enough for a domain expert to add, and that it converged to near zero as more problems were added in one mental model test. After entering the no-new-knowledge state in the test, 5 out of 13 problems (38 percent) were automatically solved by the system without adding new knowledge.
Inductive logic programming at 30
Inductive logic programming (ILP) is a form of logic-based machine learning.
The goal of ILP is to induce a hypothesis (a logic program) that generalises
given training examples and background knowledge. As ILP turns 30, we survey
recent work in the field. In this survey, we focus on (i) new meta-level search
methods, (ii) techniques for learning recursive programs that generalise from
few examples, (iii) new approaches for predicate invention, and (iv) the use of
different technologies, notably answer set programming and neural networks. We
conclude by discussing some of the current limitations of ILP and directions
for future research.
Comment: Extension of IJCAI20 survey paper. arXiv admin note: substantial text
overlap with arXiv:2002.11002, arXiv:2008.0791