70 research outputs found

Schema Independent Relational Learning

Author: Abiteboul S.
Anderson M.
Arias M.
Kraska T.
Muggleton S.
Yin X.
Publication venue
Publication date: 06/11/2017
Field of study

Learning novel concepts and relations from relational databases is an important problem with many applications in database systems and machine learning. Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. Nevertheless, the same data set may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary substantially with the choice of schema, in terms of both learning accuracy and efficiency. This variation complicates their off-the-shelf application. In this paper, we introduce and formalize the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de)composition schema transformations. We study both sample-based learning algorithms, which learn from sets of labeled examples, and query-based algorithms, which learn by asking queries to an oracle. We prove that current relational learning algorithms are generally not schema independent. For query-based learning algorithms we show that the (de)composition transformations influence their query complexity. We propose Castor, a sample-based relational learning algorithm that achieves schema independence by leveraging data dependencies. We support the theoretical results with an empirical study that demonstrates the schema dependence/independence of several algorithms on existing benchmark and real-world datasets under (de)compositions.
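As a rough illustration of the (de)composition transformations the abstract refers to, the sketch below splits one relation into two projections that share a key and then recomposes them with a natural join. The relation `student(id, name, advisor)`, its rows, and the helper functions `decompose` and `compose` are hypothetical and are not taken from the paper.

```python
# Hypothetical relation student(id, name, advisor); illustrative data only.
student = {
    (1, "ann", "prof_x"),
    (2, "bob", "prof_y"),
}

def decompose(rows):
    """Split student(id, name, advisor) into two projections on the key id."""
    names = {(sid, name) for sid, name, _ in rows}
    advisors = {(sid, adv) for sid, _, adv in rows}
    return names, advisors

def compose(names, advisors):
    """Natural join of the two projections on id."""
    return {
        (sid, name, adv)
        for sid, name in names
        for aid, adv in advisors
        if sid == aid
    }

names, advisors = decompose(student)
recomposed = compose(names, advisors)
```

Because `id` is assumed to be a key, the join recovers the original relation, so both schemas carry the same information. A schema-independent learner, in the sense the abstract describes, would be expected to produce equivalent definitions over either representation.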

arXiv.org e-Print Archive

Crossref

Application of abductive ILP to learning metabolic network inhibition from temporal data

Author: A. Varma
A.C. Kakas
A.W. Nicholls
Alireza Tamaddoni-Nezhad
Antonis Kakas
B. Hess
B. Zupan
D.J. Crockford
E. Alm
E. Ravasz
H. J. Zimmerman
H. Jeong
H. Ogata
J.A. Papin
J.J. Tyson
Nir Friedman
O. Boutaud
R. Alves
R.D. King
Raphael Chaleil
S. Muggleton
Stephen Muggleton
T.A. Świerkosz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2006
Field of study

Crossref

Spiral - Imperial College Digital Repository

Inductive logic programming at 30: a new introduction

Author: Cropper Andrew
Dumančić Sebastijan
Publication venue
Publication date: 07/12/2021
Field of study

Inductive logic programming (ILP) is a form of machine learning. The goal of ILP is to induce a hypothesis (a set of logical rules) that generalises training examples. As ILP turns 30, we provide a new introduction to the field. We introduce the necessary logical notation and the main learning settings; describe the building blocks of an ILP system; compare several systems on several dimensions; describe four systems (Aleph, TILDE, ASPAL, and Metagol); highlight key application areas; and, finally, summarise current limitations and directions for future research. Comment: Paper under review

arXiv.org e-Print Archive

Oxford University Research Archive

Meta-interpretive learning of higher-order dyadic datalog: predicate invention revisited

Author: A Blumer
A Srinivasan
Alireza Tamaddoni-Nezhad
C Feng
CAR Hoare
D Knuth
D Miller
Dianhuan Lin
G Huet
J Larson
J McCarthy
JR Quinlan
JW Lloyd
L Raedt De
LG Valiant
RS Sutton
S Džeroski
S Moyle
S-A Tärnlund
SH Muggleton
SJ Russell
ST Kedar-Cabelli
Stephen H. Muggleton
TM Mitchell
W Cohen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2015
Field of study

Since the late 1990s predicate invention has been under-explored within inductive logic programming due to difficulties in formulating efficient search mechanisms. However, a recent paper demonstrated that both predicate invention and the learning of recursion can be efficiently implemented for regular and context-free grammars, by way of metalogical substitutions with respect to a modified Prolog meta-interpreter which acts as the learning engine. New predicate symbols are introduced as constants representing existentially quantified higher-order variables. The approach demonstrates that predicate invention can be treated as a form of higher-order logical reasoning. In this paper we generalise the approach of meta-interpretive learning (MIL) to that of learning higher-order dyadic datalog programs. We show that with an infinite signature the higher-order dyadic datalog class $H^2_2$ has universal Turing expressivity though $H^2_2$ is decidable given a finite signature. Additionally we show that Knuth–Bendix ordering of the hypothesis space together with logarithmic clause bounding allows our MIL implementation Metagol$_D$ to PAC-learn minimal cardinality $H^2_2$ definitions. This result is consistent with our experiments which indicate that Metagol$_D$ efficiently learns compact $H^2_2$ definitions involving predicate invention for learning robotic strategies, the East–West train challenge and NELL. Additionally higher-order concepts were learned in the NELL language learning domain. The Metagol code and datasets described in this paper have been made publicly available on a website to allow reproduction of results in this paper.

Crossref

University of Surrey

Object-oriented data mining

Author: Rawles Simon Alan
Publication venue
Publication date: 01/01/2007
Field of study

EThOS - Electronic Theses Online Service (GB, United Kingdom)

OpenGrey Repository

Explore Bristol Research

Learning relational event models from video

Author: Bhatt M
Cohn AG
Dubba KSR
Dylla F
Hogg DC
Publication venue: 'AI Access Foundation'
Publication date: 15/05/2015
Field of study

Event models obtained automatically from video can be used in applications ranging from abnormal event detection to content-based video retrieval. When multiple agents are involved in the events, characterizing events naturally suggests encoding interactions as relations. Learning event models from this kind of relational spatio-temporal data using relational learning techniques such as Inductive Logic Programming (ILP) holds promise, but has not been successfully applied to the very large datasets that result from video data. In this paper, we present a novel framework REMIND (Relational Event Model INDuction) for supervised relational learning of event models from large video datasets using ILP. Efficiency is achieved through the learning from interpretations setting and a typing system that exploits the type hierarchy of objects in a domain. The use of types also helps prevent overgeneralization. Furthermore, we present a type-refining operator and prove that it is optimal. The learned models can be used for recognizing events from previously unseen videos. We also present an extension to the framework that integrates an abduction step, which improves the learning performance when there is noise in the input data. The experimental results on several hours of video data from two challenging real-world domains (an airport domain and a physical action verbs domain) suggest that the techniques are suitable for real-world scenarios.

CiteSeerX

Crossref

White Rose Research Online

Distribution-based aggregation for relational learning with identifier attributes

Author: Perlich Claudia
Provost Foster
Publication venue: 'International Journal of Machine Learning and Networked Collaborative Engineering'
Publication date: 27/01/2006
Field of study

Identifier attributes (very high-dimensional categorical attributes such as particular product ids or people’s names) are rarely incorporated in statistical modeling. However, they can play an important role in relational modeling: it may be informative to have communicated with a particular set of people or to have purchased a particular set of products. A key limitation of existing relational modeling techniques is how they aggregate bags (multisets) of values from related entities. The aggregations used by existing methods are simple summaries of the distributions of features of related entities: e.g., MEAN, MODE, SUM, or COUNT. This paper’s main contribution is the introduction of aggregation operators that capture more information about the value distributions, by storing meta-data about value distributions and referencing this meta-data when aggregating, for example by computing class-conditional distributional distances. Such aggregations are particularly important for aggregating values from high-dimensional categorical attributes, for which the simple aggregates provide little information. In the first half of the paper we provide general guidelines for designing aggregation operators, introduce the new aggregators in the context of the relational learning system ACORA (Automated Construction of Relational Attributes), and provide theoretical justification. We also conjecture special properties of identifier attributes, e.g., they proxy for unobserved attributes and for information deeper in the relationship network. In the second half of the paper we provide extensive empirical evidence that the distribution-based aggregators indeed do facilitate modeling with high-dimensional categorical attributes, and in support of the aforementioned conjectures. NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research
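The idea of aggregating a bag of identifier values by its distance to class-conditional reference distributions can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the toy product ids, and the choice of Euclidean distance are all hypothetical, and the code does not reproduce ACORA's actual operators.

```python
from collections import Counter
import math

def normalize(counter):
    """Turn raw counts into a relative-frequency distribution."""
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()} if total else {}

def class_reference(bags, labels, target):
    """Pool all values from bags with the given label into one distribution."""
    pooled = Counter()
    for bag, label in zip(bags, labels):
        if label == target:
            pooled.update(bag)
    return normalize(pooled)

def euclidean(p, q):
    """Euclidean distance between two sparse distributions (assumed metric)."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))

def distance_features(bag, ref_pos, ref_neg):
    """Two features for one bag: distance to each class reference."""
    dist = normalize(Counter(bag))
    return euclidean(dist, ref_pos), euclidean(dist, ref_neg)

# Hypothetical training data: bags of purchased product ids with class labels.
train_bags = [["p1", "p2"], ["p1", "p1"], ["p3"], ["p3", "p4"]]
train_labels = [1, 1, 0, 0]

ref_pos = class_reference(train_bags, train_labels, 1)
ref_neg = class_reference(train_bags, train_labels, 0)
d_pos, d_neg = distance_features(["p1", "p2", "p1"], ref_pos, ref_neg)
```

In this toy data the new bag shares its product ids with the positive class only, so `d_pos` comes out smaller than `d_neg`. A simple aggregate such as COUNT (3 for this bag) would not distinguish it from a bag of three negative-class products.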

New York University Faculty Digital Archive

Learning non-monotonic Logic Programs to Reason about Actions and Change

Author: Lorenzo Blanco David
Publication venue
Publication date: 01/01/2001
Field of study

[Abstract] The goal of this thesis is the design of machine learning methods capable of finding a model of a dynamic system that determines how the properties of the system are affected by the execution of actions. This makes it possible to obtain automatically the domain-specific knowledge needed for planning or diagnosis tasks, as well as to predict the future behavior of the system. The approach followed differs from previous approaches in two respects: first, the use of non-monotonic formalisms for reasoning about actions and change, in contrast to the classical STRIPS-like operators or those based on formalisms specialized for very specific tasks; and second, the use of methods for learning logic programs (Inductive Logic Programming). The combination of these two fields yields a declarative framework for learning, in which the specification of actions and their effects is very intuitive and natural, and which allows more expressive theories to be learned than in previous approaches.

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Representationally Robust and Scalable Learning over Relational Databases

Author: Picado Leiva Jose Manuel
Publication venue: 'Oregon State University'
Publication date
Field of study

Learning novel concepts from relational databases is an important problem with applications in several disciplines, such as data management, natural language processing, and bioinformatics. For a learning algorithm to be effective, the input data should be clean and in some desired representation. However, real-world data is usually heterogeneous: the same data may be represented under different representations. The current approach to using learning algorithms effectively is to find the desired representations for these algorithms, transform the data to these representations, and clean the data. These tasks are hard and time-consuming and are major obstacles to unlocking the value of data. This thesis demonstrates that it is possible to develop robust learning algorithms that learn in the presence of representational variations. We develop two systems called Castor and CastorX, which exploit data dependencies to be robust against different types of representational variations. Further, we propose several techniques that allow these systems to learn efficiently over large databases. The proposed systems learn over the original data, removing the need for transforming the data before applying learning algorithms. Our results show that Castor and CastorX learn accurately and efficiently over real-world databases. This work paves the way for new approaches that replace pre-processing tasks such as data wrangling with robust learning algorithms.

ScholarsArchive@OSU