8 research outputs found

A sequence-length sensitive approach to learning biological grammars using inductive logic programming.

Author: Mamer Thierry
Publication date: 31/01/2011

This thesis investigates whether the ideas behind compression principles, such as Minimum Description Length, can improve the process of learning biological grammars from protein sequences using Inductive Logic Programming (ILP). Unlike the examples in most traditional ILP learning problems, biological sequences often vary widely in length. This variation is an important feature of biological sequences and should not be ignored by ILP systems. However, we have identified that some ILP systems do not take the length of examples into account when evaluating their proposed hypotheses. During the learning process, many ILP systems use clause evaluation functions to assign a score to induced hypotheses, estimating their quality and effectively influencing the search. Traditionally, clause evaluation functions do not take into account the length of the examples covered by the clause. We propose L-modification, a way of modifying existing clause evaluation functions so that they take into account the length of the examples they learn from. An empirical study was undertaken to investigate whether significant improvements can be achieved by applying L-modification to a standard clause evaluation function. Furthermore, we investigated more generally how ILP systems cope with the length of examples in training data. We show that our L-modified clause evaluation function outperforms our benchmark function in every experiment we conducted, and thus we prove that L-modification is a useful concept. We also show that the length of the examples in the training data used by ILP systems does have an undeniable impact on the results.

University of Salford Institutional Repository

Open Access Institutional Repository at Robert Gordon University

Learning natural language syntax

Author: Watkinson Stephen
Publication venue: University of York
Publication date: 01/01/2002

White Rose E-theses Online

Object-oriented data mining

Author: Rawles Simon Alan
Publication date: 01/01/2007

EThOS - Electronic Theses Online Service, United Kingdom

OpenGrey Repository

Explore Bristol Research

Experiments in inductive chart parsing

Authors: James Cussens, Stephen Pulman
Publication date: 01/01/2000

We use Inductive Logic Programming (ILP) within a chart-parsing framework for grammar learning. Given an existing grammar G, together with some sentences which G cannot parse, we use ILP to find the "missing" grammar rules or lexical items. Our aim is to exploit the inductive capabilities of chart parsing, i.e. the ability to efficiently determine what is needed for a parse. For each unparsable sentence, we find actual edges and needed edges: those which are needed to allow a parse. The former are used as background knowledge for the ILP algorithm (P-Progol) and the latter are used as examples for the ILP algorithm. We demonstrate our approach with a number of experiments using context-free grammars and a feature grammar.

CiteSeerX

Oxford University Research Archive