Efficient Learning and Evaluation of Complex Concepts in Inductive Logic Programming
Inductive Logic Programming (ILP) is a subfield of Machine Learning with foundations in logic
programming. In ILP, logic programming, a subset of first-order logic, is used as a uniform
representation language for the problem specification and induced theories. ILP has been
successfully applied to many real-world problems, especially in the biological domain (e.g. drug
design, protein structure prediction), where relational information is of particular importance.
The expressiveness of logic programs grants flexibility in specifying the learning task and understandability
to the induced theories. However, this flexibility comes at a high computational
cost, constraining the applicability of ILP systems. Constructing and evaluating complex concepts
remain two of the main issues that prevent ILP systems from tackling many learning
problems. These learning problems are interesting both from a research perspective, as they
raise the standards for ILP systems, and from an application perspective, where these target
concepts naturally occur in many real-world applications. Such complex concepts cannot
be constructed or evaluated by parallelizing existing top-down ILP systems or improving the
underlying Prolog engine. Novel search strategies and cover algorithms are needed.
The main focus of this thesis is on how to efficiently construct and evaluate complex hypotheses
in an ILP setting. In order to construct such hypotheses we investigate two approaches.
The first, the Top Directed Hypothesis Derivation framework, implemented in the ILP system
TopLog, involves the use of a top theory to constrain the hypothesis space. In the second approach
we revisit the bottom-up search strategy of Golem, lifting its restriction on determinate
clauses which had rendered Golem inapplicable to many key areas. These developments led to
the bottom-up ILP system ProGolem. A challenge that arises with a bottom-up approach is the
coverage computation of long, non-determinate, clauses. Prolog’s SLD-resolution is no longer
adequate. We developed a new, Prolog-based, theta-subsumption engine which is significantly
more efficient than SLD-resolution in computing the coverage of such complex clauses.
We provide evidence that ProGolem achieves the goal of learning complex concepts by presenting
a protein-hexose binding prediction application. The theory ProGolem induced has
a statistically significantly better predictive accuracy than that of other learners. More importantly,
the biological insights ProGolem’s theory provided were judged by domain experts to
be relevant and, in some cases, novel.
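To illustrate the coverage test mentioned in the abstract above, here is a minimal sketch of a theta-subsumption check: clause C theta-subsumes clause D if some substitution θ makes every literal of Cθ appear in D. This is a naive backtracking version written for illustration only; it is not the Prolog-based engine developed in the thesis, and the names `is_variable`, `match_literal` and `theta_subsumes` are hypothetical. It assumes literals are flat tuples (no nested terms, sign folded into the predicate name) and that the terms of D are treated as constants.

```python
from typing import Dict, Optional, Sequence, Tuple

# A literal is (predicate, arg1, arg2, ...); all arguments are plain strings.
Literal = Tuple[str, ...]
Substitution = Dict[str, str]


def is_variable(term: str) -> bool:
    # Prolog convention: variables start with an uppercase letter or underscore.
    return term[:1].isupper() or term.startswith("_")


def match_literal(pattern: Literal, target: Literal,
                  theta: Substitution) -> Optional[Substitution]:
    """Extend theta so that pattern maps onto target, or return None."""
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    new_theta = dict(theta)
    for p, t in zip(pattern[1:], target[1:]):
        if is_variable(p):
            # Bind p to t, or fail if p is already bound to something else.
            if new_theta.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new_theta


def theta_subsumes(c: Sequence[Literal], d: Sequence[Literal],
                   theta: Optional[Substitution] = None) -> Optional[Substitution]:
    """Return a substitution mapping every literal of c into d, or None."""
    theta = {} if theta is None else theta
    if not c:
        return theta
    first, rest = c[0], c[1:]
    for target in d:  # try each literal of d, backtracking on failure
        extended = match_literal(first, target, theta)
        if extended is not None:
            result = theta_subsumes(rest, d, extended)
            if result is not None:
                return result
    return None


# Hypothetical example: a two-bond chain pattern against a small ground clause.
chain = [("bond", "X", "Y"), ("bond", "Y", "Z")]
facts = [("bond", "a", "b"), ("bond", "b", "c")]
print(theta_subsumes(chain, facts))  # {'X': 'a', 'Y': 'b', 'Z': 'c'}
```

The search is exponential in the worst case, which is consistent with the abstract's point that long, non-determinate clauses need a more efficient subsumption engine.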
A comparative survey of integrated learning systems
This paper presents the duction framework for unifying the three basic forms of inference (deduction, abduction, and induction) by specifying the possible relationships and influences among them in the context of integrated learning. Special assumptive forms of inference are defined that extend the use of these inference methods, and the properties of these forms are explored. A comparison to a related inference-based learning framework is made. Finally, several existing integrated learning programs are examined from the perspective of the duction framework.
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it along with tutorials.
Proceedings of the Workshop on the lambda-Prolog Programming Language
The expressiveness of logic programs can be greatly increased over first-order Horn clauses through a stronger emphasis on logical connectives and by admitting various forms of higher-order quantification. The logic of hereditary Harrop formulas and the notion of uniform proof have been developed to provide a foundation for more expressive logic programming languages. The λ-Prolog language is actively being developed on top of these foundational considerations. The rich logical foundations of λ-Prolog provide it with declarative approaches to modular programming, hypothetical reasoning, higher-order programming, polymorphic typing, and meta-programming. These aspects of λ-Prolog have made it valuable as a higher-level language for the specification and implementation of programs in numerous areas, including natural language, automated reasoning, program transformation, and databases.
A knowledge level analysis of learning programs
This chapter develops a taxonomy of learning methods using techniques based on Newell’s knowledge level. Two properties of each system are defined: knowledge level predictability and knowledge level learning. A system is predictable at the knowledge level if the principle of rationality can be applied to predict its behavior. A system learns at the knowledge level if its knowledge level description changes over time. These two definitions can be used to generate the three-class taxonomy. The taxonomy formalizes the intuition that there are two kinds of learning systems: systems that simply improve their efficiency (symbol-level learning; SLL) and systems that acquire new knowledge (knowledge-level learning; KLL). The implications of the taxonomy for learning research are explored. Automatic programming research can provide ideas for SLL. Development of methods for KLL must rely either on the development of a principle of plausible rationality or on the construction of learning methods that work well only for certain kinds of environments. Explanation-based generalization and chunking methods address only SLL and do not provide solutions to the problems of KLL.
Analytical learning and term-rewriting systems
Analytical learning is a set of machine learning techniques for revising the representation of a theory based on a small set of examples of that theory. When the representation of the theory is correct and complete but perhaps inefficient, an important objective of such analysis is to improve the computational efficiency of the representation. Several algorithms with this purpose have been suggested, most of which are closely tied to a first-order logical language and are variants of goal regression, such as the familiar explanation-based generalization (EBG) procedure. But because predicate calculus is a poor representation for some domains, these learning algorithms are extended to apply to other computational models. It is shown that the goal regression technique applies to a large family of programming languages, all based on a kind of term rewriting system. Included in this family are three language families of importance to artificial intelligence: logic programming, such as Prolog; lambda calculus, such as LISP; and combinatorial based languages, such as FP. A new analytical learning algorithm, AL-2, is exhibited that learns from success but is otherwise quite different from EBG. These results suggest that term rewriting systems are a good framework for analytical learning research in general, and that further research should be directed toward developing new techniques.
A workbench to develop ILP systems
Integrated master's thesis. Informatics and Computing Engineering. Faculty of Engineering. University of Porto. 201