Neural Relational Learning Through Semi-Propositionalization of Bottom Clauses
Relational learning can be described as the task of learning first-order logic rules from examples. It has enabled a number of new machine learning applications, e.g. graph mining and link analysis in social networks. CILP++ is a neural-symbolic system that performs efficient relational learning by encoding first-order logic knowledge into a neural network. CILP++ relies on BCP, a recently introduced propositionalization algorithm, to perform relational learning. However, efficient knowledge extraction from such networks is an open issue, and features generated by BCP do not have an independent relational description, which prevents sound knowledge extraction. We present a methodology for generating independent propositional features for BCP by using semi-propositionalization of bottom clauses. Empirical results obtained in comparison with the original version of BCP show that this approach has comparable accuracy and runtimes, while allowing proper relational knowledge representation of features for knowledge extraction from CILP++ networks.
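To make the idea of propositionalizing bottom clauses concrete, here is a minimal Python sketch. It assumes that each distinct body literal becomes one binary feature, which is a simplification and not the CILP++ or BCP implementation. The bottom clauses are written by hand instead of being computed, and the function name `bcp_features` and the example identifiers are hypothetical.

```python
def bcp_features(bottom_clauses):
    """Turn body literals of bottom clauses into binary feature vectors.

    bottom_clauses maps an example id to the list of body literals
    (as strings) of that example's bottom clause.
    """
    # Each distinct body literal, in first-seen order, becomes one feature.
    vocabulary = []
    seen = set()
    for body in bottom_clauses.values():
        for literal in body:
            if literal not in seen:
                seen.add(literal)
                vocabulary.append(literal)

    # An example's vector has a 1 wherever its bottom clause has the literal.
    vectors = {}
    for example, body in bottom_clauses.items():
        present = set(body)
        vectors[example] = [1 if lit in present else 0 for lit in vocabulary]
    return vocabulary, vectors


# Hand-written toy bottom clauses (assumed, for illustration only).
clauses = {
    "e1": ["parent(A,C)", "wife(C,B)"],
    "e2": ["parent(A,C)", "husband(C,B)"],
}
vocabulary, vectors = bcp_features(clauses)
```

The resulting vectors could be fed to any propositional learner. In this flat encoding, literals that share the variable `C` are split into separate features, which illustrates why the features lack an independent relational description.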
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it, along with tutorials.
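The graphicalization step described in this abstract can be sketched roughly as follows, assuming that a grounded entity/relationship diagram is a bipartite graph with one node per entity, one node per relationship tuple, and an edge from each relationship node to the entities it involves. This is an illustrative Python sketch, not kLog's code; the function `graphicalize` and the toy molecule data are hypothetical.

```python
def graphicalize(entities, relationships):
    """Build a bipartite graph from ground entities and relationships.

    entities: list of (kind, identifier) pairs, e.g. ("atom", "a1").
    relationships: list of (name, identifiers) pairs, e.g. ("bond", ("a1", "a2")).
    Returns (nodes, edges): nodes maps a node key to its label,
    edges is a set of (relationship_node, entity_node) pairs.
    """
    nodes = {}
    edges = set()
    for kind, ident in entities:
        nodes[("E", ident)] = kind
    for name, args in relationships:
        rel_node = ("R", name, args)
        nodes[rel_node] = name
        # Link the relationship node to every entity taking part in it.
        for ident in args:
            edges.add((rel_node, ("E", ident)))
    return nodes, edges


# Toy interpretation (assumed): two atoms joined by one bond.
nodes, edges = graphicalize(
    entities=[("atom", "a1"), ("atom", "a2")],
    relationships=[("bond", ("a1", "a2"))],
)
```

A graph kernel of the user's choice would then be applied to graphs of this form to define the feature space.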
Using ILP to Identify Pathway Activation Patterns in Systems Biology
We present a logical aggregation method that, combined with propositionalization methods, can construct novel structured biological features from gene expression data. We do this to gain understanding of pathway mechanisms, for instance, those associated with a particular disease. We illustrate this method on the task of distinguishing between two types of lung cancer: squamous cell carcinoma (SCC) and adenocarcinoma (AC). We identify pathway activation patterns in pathways previously implicated in the development of cancers. Our method identified a model with comparable predictive performance to the winning algorithm of a recent challenge, while providing biologically relevant explanations that may be useful to a biologist.
Structural Logistic Regression for Link Analysis
We present Structural Logistic Regression, an extension of logistic regression to modeling relational data. It is an integrated approach to building regression models from data stored in relational databases, in which potential predictors, both boolean and real-valued, are generated by structured search in the space of queries to the database and then tested with statistical information criteria for inclusion in a logistic regression. Using statistics and relational representation allows modeling in noisy domains with complex structure. Link prediction is a task of high interest with exactly such characteristics. Be it in the domain of scientific citations, social networks or hypertext, the underlying data are extremely noisy and the features useful for prediction are not readily available in a flat file format. We propose the application of Structural Logistic Regression to building link prediction models, and present experimental results for the task of predicting citations made in scientific literature using relational data taken from the CiteSeer search engine. These data include the citation graph, authorship and publication venues of papers, as well as their word content.
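As a rough illustration of generating boolean and real-valued predictors from database queries, the Python sketch below evaluates two hand-picked candidate queries against a toy in-memory SQLite database. The schema, the data, and the function `candidate_features` are assumptions made for this example. The structured search over the query space and the information-criterion test for inclusion are not shown.

```python
import sqlite3

# Toy citation database (assumed schema and contents).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cites(src TEXT, dst TEXT);
    CREATE TABLE venue(paper TEXT, name TEXT);
    INSERT INTO cites VALUES ('p1','p3'), ('p2','p3'), ('p1','p4'), ('p2','p4');
    INSERT INTO venue VALUES ('p1','ICML'), ('p2','ICML'), ('p3','KDD'), ('p4','KDD');
""")


def candidate_features(conn, a, b):
    """Evaluate two candidate queries for the paper pair (a, b)."""
    # Real-valued predictor: number of papers cited by both a and b.
    shared = conn.execute(
        "SELECT COUNT(*) FROM cites c1 JOIN cites c2 ON c1.dst = c2.dst "
        "WHERE c1.src = ? AND c2.src = ?",
        (a, b),
    ).fetchone()[0]
    # Boolean predictor: do a and b appear in the same venue?
    same_venue = conn.execute(
        "SELECT COUNT(*) FROM venue v1 JOIN venue v2 ON v1.name = v2.name "
        "WHERE v1.paper = ? AND v2.paper = ?",
        (a, b),
    ).fetchone()[0] > 0
    return {"shared_citations": shared, "same_venue": same_venue}


features = candidate_features(conn, "p1", "p2")
```

In a fuller pipeline, each such feature would be kept only if adding it to the logistic regression improved the chosen information criterion.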
Finding relational redescriptions
We introduce relational redescription mining, that is, the task of finding two structurally different patterns that describe nearly the same set of object pairs in a relational dataset. By extending redescription mining beyond propositional and real-valued attributes, it provides a powerful tool to match different relational descriptions of the same concept.
We propose an alternating scheme for solving this problem. Its core consists of a novel relational query miner that efficiently identifies discriminative connection patterns between pairs of objects. Compared to a baseline Inductive Logic Programming (ILP) approach, our query miner is able to mine more complex queries, and does so much faster. We performed extensive experiments on three real-world relational datasets, and present examples of the redescriptions found, exhibiting the power of the method to expressively capture relations present in these networks.
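One way to quantify whether two patterns "describe nearly the same set of object pairs" is the Jaccard similarity of the pair sets they cover. The abstract does not state the measure used, so this Python sketch is an assumption, and the pair sets are invented toy data.

```python
def jaccard(pairs_a, pairs_b):
    """Jaccard similarity between two sets of object pairs."""
    union = pairs_a | pairs_b
    if not union:
        return 0.0  # Convention chosen here for two empty sets.
    return len(pairs_a & pairs_b) / len(union)


# Object pairs covered by two structurally different queries (toy data).
covered_by_query_1 = {("ann", "bob"), ("ann", "cy"), ("bob", "cy")}
covered_by_query_2 = {("ann", "bob"), ("ann", "cy"), ("dee", "cy")}

score = jaccard(covered_by_query_1, covered_by_query_2)
```

Under this assumed measure, a score close to 1 would mean the two queries are good redescriptions of each other.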