393 research outputs found

A review of the state of the art in Machine Learning on the Semantic Web: Technical Report CSTR-05-003

Author: Price S
Publication venue: Department of Computer Science, University of Bristol
Publication date: 01/01/2004

Efficient Learning and Evaluation of Complex Concepts in Inductive Logic Programming

Author: Santos Jose Carlos Almeida
Publication venue: Computing, Imperial College London
Publication date: 01/03/2011

Inductive Logic Programming (ILP) is a subfield of Machine Learning with foundations in logic programming. In ILP, logic programming, a subset of first-order logic, is used as a uniform representation language for the problem specification and induced theories. ILP has been successfully applied to many real-world problems, especially in the biological domain (e.g. drug design, protein structure prediction), where relational information is of particular importance. The expressiveness of logic programs grants flexibility in specifying the learning task and understandability to the induced theories. However, this flexibility comes at a high computational cost, constraining the applicability of ILP systems. Constructing and evaluating complex concepts remain two of the main issues that prevent ILP systems from tackling many learning problems. These learning problems are interesting both from a research perspective, as they raise the standards for ILP systems, and from an application perspective, where these target concepts naturally occur in many real-world applications. Such complex concepts cannot be constructed or evaluated by parallelizing existing top-down ILP systems or improving the underlying Prolog engine. Novel search strategies and cover algorithms are needed. The main focus of this thesis is on how to efficiently construct and evaluate complex hypotheses in an ILP setting. In order to construct such hypotheses we investigate two approaches. The first, the Top Directed Hypothesis Derivation framework, implemented in the ILP system TopLog, involves the use of a top theory to constrain the hypothesis space. In the second approach we revisit the bottom-up search strategy of Golem, lifting its restriction on determinate clauses which had rendered Golem inapplicable to many key areas. These developments led to the bottom-up ILP system ProGolem. A challenge that arises with a bottom-up approach is the coverage computation of long, non-determinate clauses. Prolog’s SLD-resolution is no longer adequate. We developed a new Prolog-based theta-subsumption engine which is significantly more efficient than SLD-resolution in computing the coverage of such complex clauses. We provide evidence that ProGolem achieves the goal of learning complex concepts by presenting a protein-hexose binding prediction application. The theory ProGolem induced has statistically significantly better predictive accuracy than that of other learners. More importantly, the biological insights ProGolem’s theory provided were judged by domain experts to be relevant and, in some cases, novel.

Spiral - Imperial College Digital Repository

A workbench to develop ILP systems

Author: Azevedo João de Campos
Publication date: 01/01/2010

Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

Repositório Aberto da Universidade do Porto

Transforming Graph Representations for Statistical Relational Learning

Author: Aha David W.
McDowell Luke K.
Neville Jennifer
Rossi Ryan A.
Publication date: 01/01/2012

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

arXiv.org e-Print Archive

CiteSeerX

Modeling Complex Networks For (Electronic) Commerce

Author: Provost Foster
Sundararajan Arun
Publication date: 01/01/2007

NYU, Stern School of Business, IOMS Department, Center for Digital Economy Researc

Crossref

New York University Faculty Digital Archive

Fast learning of relational kernels

Author: A. Argyriou
A. Bordes
A. Karalič
A. Srinivasan
Andrea Passerini
C. Cortes
C. Micchelli
C. A. Micchelli
C. S. Ong
F. Cucker
G. R. G. Lanckriet
G. S. Kimeldorf
H. Blockeel
H. Fang
H. Lodhi
H. Saigo
J. Davis
J. Quinlan
J. Ramon
J. Shawe-Taylor
J. Weston
J. W. Lloyd
K. Kersting
K. Khan
K.-U. Höffgen
L. Getoor
L. Raedt De
Luc De Raedt
M. Kirsten
M. D. Reid
N. Cristianini
N. Lavrac
Niels Landwehr
O. Chapelle
P. Frasconi
P. L. Bartlett
Paolo Frasconi
R. Caruana
R. King
S. Ben-David
S. Kok
S. Kramer
S. Muggleton
S. Slattery
S. J. Swamidass
T. Evgeniou
T. Gärtner
T. Joachims
T. Poggio
U. Rückert
Y. Bengio
Y. Freund
Publication venue: Springer Science and Business Media LLC

Crossref

Strategies to parallelize ILP systems

Author: A. Clare
A. Grama
A. Srinivasan
B. Dolsak
D. Page
F. Železný
H. Blockeel
H. Ohwada
J. Fürnkranz
J. Graham
J.M. Squyres
J.R. Quinlan
L. Raedt De
N. Fonseca
R.G. Smith
S. Muggleton
W. Gropp
Y. Wang
Publication date: 01/01/2005

It is well known by Inductive Logic Programming (ILP) practitioners that ILP systems usually take a long time to find valuable models (theories). The problem is especially critical for large datasets, preventing ILP systems from scaling up to larger applications. One approach to reducing the execution time has been the parallelization of ILP systems. In this paper we overview the state of the art in parallel ILP implementations and present work on the evaluation of some major parallelization strategies for ILP. Conclusions about the applicability of each strategy are presented.

Crossref

Repositório Aberto da Universidade do Porto