FOIL is an Inductive Logic Programming algorithm
that discovers first-order rules to explain the patterns involved
in a domain of knowledge. Domains such as Information Retrieval
or Information Extraction are challenging for FOIL due to the
huge amount of information it needs to manage to devise the rules.
Current solutions to problems in these domains are restricted to
devising ad hoc, domain-dependent inductive algorithms that use
a less expressive formalism to encode rules.
We work on optimising the FOIL learning process to deal with
such complex domain problems while retaining expressiveness.
Our hypothesis is that changing the information gain scoring
function, used by FOIL to decide how rules are learnt, can reduce
the number of steps the algorithm performs. We have analysed 15
scoring functions, normalised them into a common notation and
carried out a test in which they are computed. The learning process
is evaluated according to its efficiency, and the quality of
the rules according to their precision, recall, complexity and
specificity. The results reinforce our hypothesis, demonstrating
that replacing the information gain can optimise both the FOIL
algorithm execution and the learnt rules.

Funding: Ministerio de Educación y Ciencia TIN2007-64119; Junta de Andalucía P07-TIC-2602; Junta de Andalucía P08-TIC-4100; Ministerio de Ciencia e Innovación TIN2008-04718-
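
To make the role of the scoring function concrete, the following is a minimal sketch of the standard FOIL information gain, together with a literal selector that accepts any scoring function. It is an illustration only, not the implementation evaluated in this work. The names foil_gain and best_literal, the tuple layout of the candidates, and the convention of returning 0.0 when no positive bindings are covered are assumptions made for this example.

```python
import math


def foil_gain(p0, n0, p1, n1, t):
    """Standard FOIL information gain for adding one literal to a rule.

    p0, n0: positive / negative bindings covered before adding the literal.
    p1, n1: positive / negative bindings covered after adding the literal.
    t:      positive bindings of the original rule still covered afterwards.
    """
    # Assumed convention: a rule covering no positives yields no gain.
    if p0 == 0 or p1 == 0:
        return 0.0
    info_before = -math.log2(p0 / (p0 + n0))
    info_after = -math.log2(p1 / (p1 + n1))
    return t * (info_before - info_after)


def best_literal(candidates, score=foil_gain):
    """Pick the candidate literal with the highest score.

    candidates: list of (name, p0, n0, p1, n1, t) tuples (hypothetical layout).
    score:      any function with the same signature as foil_gain, so an
                alternative scoring function can be swapped in.
    """
    return max(candidates, key=lambda c: score(*c[1:]))[0]


# Hypothetical coverage counts for two candidate literals.
candidates = [
    ("literal_a", 4, 4, 3, 1, 3),  # removes most negative bindings
    ("literal_b", 4, 4, 2, 2, 2),  # leaves the positive ratio unchanged
]
chosen = best_literal(candidates)
```

Because best_literal only depends on the signature of the scoring function, replacing the information gain amounts to passing a different function as score; the rest of the search loop is unaffected.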