756 research outputs found

    From sequential to parallel Inductive Logic Programming

    Inductive Logic Programming (ILP) has achieved considerable success in a wide range of domains. It is recognized, however, that efficiency is a major obstacle to the use of ILP systems in applications requiring large amounts of data. In this paper we address the problem of efficiency in ILP in three steps: i) we survey speedup techniques proposed for sequential execution of ILP systems; ii) we survey different ways of parallelizing an ILP system; and iii) we adapt and combine the sequential execution speedup techniques in the parallel implementations of an ILP system. We also propose a novel technique to partition the search space into independent sub-spaces that may be adequately searched in parallel.
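
    The abstract does not say how the search space is partitioned, so the following is only a minimal sketch of the general idea, not the paper's technique. It assumes hypotheses are conjunctions of literals represented as sets, that a hypothesis covers an example when it is a subset of it, and that the space is split by the smallest literal of each hypothesis so the sub-spaces are disjoint. The names `score`, `search_subspace`, and `parallel_search` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def score(hypothesis, positives, negatives):
    """Positives covered minus negatives covered (assumed toy scoring)."""
    covered_pos = sum(hypothesis <= e for e in positives)
    covered_neg = sum(hypothesis <= e for e in negatives)
    return covered_pos - covered_neg


def search_subspace(first, rest, positives, negatives):
    """Exhaustively search every hypothesis whose smallest literal is `first`."""
    best = (float("-inf"), frozenset())
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            h = frozenset((first,) + extra)
            candidate = (score(h, positives, negatives), h)
            # Compare on the score only; ties keep the earlier, smaller hypothesis.
            best = max(best, candidate, key=lambda t: t[0])
    return best


def parallel_search(literals, positives, negatives, workers=4):
    ordered = sorted(literals)
    # Sub-space i holds the hypotheses whose smallest literal is ordered[i],
    # so no hypothesis belongs to two sub-spaces and workers never communicate.
    tasks = [(lit, ordered[i + 1:]) for i, lit in enumerate(ordered)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda t: search_subspace(t[0], t[1], positives, negatives), tasks
        )
        return max(results, key=lambda t: t[0])


positives = [{"a", "b"}, {"a", "b", "c"}]
negatives = [{"a"}, {"b", "c"}]
print(parallel_search({"a", "b", "c"}, positives, negatives))
```

    On this toy data the best hypothesis is `{a, b}` with score 2. A thread pool keeps the sketch short; in CPython it will not speed up a CPU-bound search like this one, so a process pool or a distributed setup would be needed for real gains.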

    Schema Independent Relational Learning

    Learning novel concepts and relations from relational databases is an important problem with many applications in database systems and machine learning. Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. Nevertheless, the same data set may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary quite substantially over the choice of schema, both in terms of learning accuracy and efficiency. This variation complicates their off-the-shelf application. In this paper, we introduce and formalize the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de)composition schema transformations. We study both sample-based learning algorithms, which learn from sets of labeled examples, and query-based algorithms, which learn by asking queries to an oracle. We prove that current relational learning algorithms are generally not schema independent. For query-based learning algorithms we show that the (de)composition transformations influence their query complexity. We propose Castor, a sample-based relational learning algorithm that achieves schema independence by leveraging data dependencies. We support the theoretical results with an empirical study that demonstrates the schema dependence/independence of several algorithms on existing benchmark and real-world datasets under (de)compositions.

    Improving the efficiency of ILP systems

    Inductive Logic Programming (ILP) is a promising technology for knowledge extraction applications. ILP has produced intelligible solutions for a wide variety of domains where it has been applied. ILP's lack of efficiency is, however, a major impediment to its scalability to applications requiring large amounts of data. In this paper we propose a set of techniques that improve the efficiency of ILP systems and make them more likely to scale up to applications of knowledge extraction from large datasets. We propose and evaluate the lazy evaluation of examples to improve the efficiency of ILP systems. Lazy evaluation is essentially a way to avoid or postpone the evaluation of the generated hypotheses (coverage tests). The techniques were evaluated using the IndLog system on ILP datasets referenced in the literature. The proposals lead to substantial efficiency improvements and are generally applicable to any ILP system.
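
    The abstract does not give the mechanics of lazy evaluation, so the sketch below illustrates just one plausible way to avoid coverage tests: stop testing negative examples as soon as the outcome is settled. It assumes a hypothesis is rejected once it covers more than `max_negatives` negatives. The function name, the `covers` callback, and the `max_negatives` parameter are all hypothetical and are not taken from IndLog.

```python
def lazy_negative_check(hypothesis, negatives, covers, max_negatives=0):
    """Return (accepted, tests_run), stopping at the first test that settles the result."""
    covered = 0
    tests_run = 0
    for example in negatives:
        tests_run += 1
        if covers(hypothesis, example):
            covered += 1
            if covered > max_negatives:
                # The hypothesis is already rejected; the remaining
                # coverage tests cannot change that, so skip them.
                return False, tests_run
    return True, tests_run


def subset_covers(hypothesis, example):
    """Toy coverage test: a set of literals covers any example that contains it."""
    return hypothesis <= example


negatives = [{"a"}, {"a", "b"}, {"b"}, {"a", "c"}]
print(lazy_negative_check({"a"}, negatives, subset_covers))            # (False, 1)
print(lazy_negative_check({"a", "b", "c"}, negatives, subset_covers))  # (True, 4)
```

    In the first call a single coverage test is enough to reject the hypothesis, where an eager evaluator would run all four.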

    Learning programs by learning from failures

    We describe an inductive logic programming (ILP) approach called learning from failures. In this approach, an ILP system (the learner) decomposes the learning problem into three separate stages: generate, test, and constrain. In the generate stage, the learner generates a hypothesis (a logic program) that satisfies a set of hypothesis constraints (constraints on the syntactic form of hypotheses). In the test stage, the learner tests the hypothesis against training examples. A hypothesis fails when it does not entail all the positive examples or entails a negative example. If a hypothesis fails, then, in the constrain stage, the learner learns constraints from the failed hypothesis to prune the hypothesis space, i.e. to constrain subsequent hypothesis generation. For instance, if a hypothesis is too general (entails a negative example), the constraints prune generalisations of the hypothesis. If a hypothesis is too specific (does not entail all the positive examples), the constraints prune specialisations of the hypothesis. This loop repeats until either (i) the learner finds a hypothesis that entails all the positive and none of the negative examples, or (ii) there are no more hypotheses to test. We introduce Popper, an ILP system that implements this approach by combining answer set programming and Prolog. Popper supports infinite problem domains, reasoning about lists and numbers, learning textually minimal programs, and learning recursive programs. Our experimental results on three domains (toy game problems, robot strategies, and list transformations) show that (i) constraints drastically improve learning performance, and (ii) Popper can outperform existing ILP systems, both in terms of predictive accuracies and learning times. Comment: Accepted for the machine learning journal.
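
    The generate, test, and constrain loop described above can be sketched on a deliberately tiny propositional stand-in. This is not Popper: it uses neither answer set programming nor Prolog, and it assumes a hypothesis is a single conjunction of literals (a set) that entails an example when it is a subset of it. Under that assumption, subsets are generalisations and supersets are specialisations. All names are hypothetical.

```python
from itertools import combinations


def covers(hypothesis, example):
    """Toy entailment: a conjunction covers an example that contains all its literals."""
    return hypothesis <= example


def learn_from_failures(features, positives, negatives):
    too_general = []   # failed hypotheses; their generalisations (subsets) are pruned
    too_specific = []  # failed hypotheses; their specialisations (supersets) are pruned
    tested = 0
    # Generate: enumerate smallest hypotheses first, skipping pruned ones.
    for size in range(len(features) + 1):
        for literals in combinations(sorted(features), size):
            h = frozenset(literals)
            # With smallest-first enumeration every strict generalisation has
            # already been visited, so the first check only matters under
            # other enumeration orders; the second does the pruning here.
            if any(h <= g for g in too_general) or any(h >= s for s in too_specific):
                continue
            # Test: check the hypothesis against the training examples.
            tested += 1
            misses_positive = not all(covers(h, e) for e in positives)
            hits_negative = any(covers(h, e) for e in negatives)
            if not misses_positive and not hits_negative:
                return h, tested
            # Constrain: record why the hypothesis failed.
            if hits_negative:
                too_general.append(h)
            if misses_positive:
                too_specific.append(h)
    return None, tested


positives = [{"a", "b"}, {"a", "b", "c"}]
negatives = [{"a"}, {"b", "c"}]
print(learn_from_failures({"a", "b", "c"}, positives, negatives))
```

    On this data the loop returns `{a, b}` after testing 5 of the 8 candidate hypotheses. When no solution exists, for example with `{a}` as both the only positive and the only negative example, the specialisation constraints cut the search to 4 tests before the loop reports failure.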

    As lazy as it can be

    Inductive Logic Programming (ILP) is a promising technology for knowledge extraction applications. ILP has produced intelligible solutions for a wide variety of domains where it has been applied. ILP's lack of efficiency is, however, a major impediment to its scalability to applications requiring large amounts of data. In this paper we address important issues that must be solved to make ILP scalable to applications of knowledge extraction in large amounts of data. The issues include efficiency and storage requirements. We propose and evaluate a set of techniques, globally called lazy evaluation of examples, to improve the efficiency of ILP systems. Lazy evaluation is essentially a way to avoid or postpone the evaluation of the generated hypotheses (coverage tests). To reduce the storage amount, a representation schema called interval trees is proposed and evaluated. All the techniques were evaluated using the IndLog ILP system and a set of ILP datasets referenced in the literature. The proposals lead to substantial efficiency improvements and memory savings and are generally applicable to any ILP system.
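
    The abstract names interval trees but gives no detail, so the sketch below shows only the underlying storage idea under a stated assumption: a hypothesis keeps the list of example indices it covers, and runs of consecutive indices can be stored as intervals. It uses a flat sorted list rather than a tree, and the function names are hypothetical.

```python
from bisect import bisect_right


def to_intervals(covered_ids):
    """Collapse example indices into inclusive (start, end) runs."""
    intervals = []
    for i in sorted(set(covered_ids)):
        if intervals and i == intervals[-1][1] + 1:
            # Extend the current run by one index.
            intervals[-1] = (intervals[-1][0], i)
        else:
            intervals.append((i, i))
    return intervals


def is_covered(intervals, example_id):
    """Binary-search for the last interval starting at or before example_id."""
    pos = bisect_right(intervals, (example_id, float("inf"))) - 1
    return pos >= 0 and intervals[pos][0] <= example_id <= intervals[pos][1]


intervals = to_intervals([1, 2, 3, 7, 8, 10])
print(intervals)                 # [(1, 3), (7, 8), (10, 10)]
print(is_covered(intervals, 2))  # True
print(is_covered(intervals, 5))  # False
```

    Six indices shrink to three pairs here; the saving depends entirely on how contiguous the covered indices are.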

    An OpenCL software compilation framework targeting an SoC-FPGA VLIW chip multiprocessor

    Modern systems-on-chip augment their baseline CPU with coprocessors and accelerators to increase overall computational capability and power efficiency, and thus have evolved into heterogeneous multi-core systems. Several languages have been developed to enable this paradigm shift, including CUDA and OpenCL. This paper discusses a unified compilation environment to enable heterogeneous system design through the use of OpenCL and a highly configurable VLIW Chip Multiprocessor architecture known as the LE1. An LLVM compilation framework was researched and a prototype developed to enable the execution of OpenCL applications on a number of hardware configurations of the LE1 CMP. The presented OpenCL framework fully automates the compilation flow and supports work-item coalescing, which better maps onto the ILP processor cores of the LE1 architecture. This paper discusses in detail both the software stack and target hardware architecture and evaluates the scalability of the proposed framework by running 12 industry-standard OpenCL benchmarks drawn from the AMD SDK and the Rodinia suites. The benchmarks are executed on 40 LE1 configurations, with 10 implemented on an SoC-FPGA and the remainder on a cycle-accurate simulator. Across the 12 OpenCL benchmarks, results demonstrate near-linear wall-clock performance improvement of 1.8x (using 2 dual-issue cores), up to 5.2x (using 8 dual-issue cores) and, in one case, super-linear improvement of 8.4x (FixOffset kernel, 8 dual-issue cores). The number of OpenCL benchmarks evaluated makes this study one of the most complete in the literature.

    ILP - Just trie it

    Despite the considerable success of Inductive Logic Programming (ILP), deployed ILP systems still have efficiency problems when applied to complex problems. Several techniques have been proposed to address the efficiency issue. Such proposals include query transformations, query packs, lazy evaluation and parallel execution of ILP systems, to mention just a few. We propose a novel technique that avoids the procedure of deducing each example to evaluate each constructed clause. The technique takes advantage of the two-stage procedure of Mode Directed Inverse Entailment (MDIE) systems. In the first stage of an MDIE system, where the bottom clause is constructed, we store not only the bottom clause but also valuable additional information. The information stored is sufficient to evaluate the clauses constructed in the second stage without the need for a theorem prover. We used a data structure called a Trie to efficiently store all bottom clauses produced using all examples (positive and negative) as seeds. The technique was implemented and evaluated using two well-known data sets from the ILP literature. The results are promising both in terms of execution time and accuracy.
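
    As a rough illustration of storing bottom clauses in a trie and evaluating a clause by lookup instead of theorem proving, the sketch below makes a strong simplifying assumption: literals are plain strings, and a clause covers a seed example when its literals are a subset of that seed's bottom clause. Variable bindings, which a real MDIE system has to handle, are ignored, and the "additional information" the abstract mentions is not modelled. The class and method names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.pos = 0  # positive seeds whose literal sequence passes through this node
        self.neg = 0  # negative seeds whose literal sequence passes through this node


class ClauseTrie:
    def __init__(self):
        self.root = TrieNode()

    @staticmethod
    def _bump(node, positive):
        if positive:
            node.pos += 1
        else:
            node.neg += 1

    def insert(self, literals, positive):
        """Store one bottom clause as a sorted, duplicate-free literal sequence."""
        node = self.root
        self._bump(node, positive)
        for lit in sorted(set(literals)):
            node = node.children.setdefault(lit, TrieNode())
            self._bump(node, positive)

    def coverage(self, clause_literals):
        """Count (positive, negative) seeds whose bottom clause contains the clause."""
        return self._count(self.root, sorted(set(clause_literals)), 0)

    def _count(self, node, query, i):
        if i == len(query):
            # Every seed passing through this node contains all query literals.
            return node.pos, node.neg
        pos = neg = 0
        for lit, child in node.children.items():
            if lit == query[i]:
                p, n = self._count(child, query, i + 1)
            elif lit < query[i]:
                p, n = self._count(child, query, i)  # skip a literal the query lacks
            else:
                continue  # sequences are sorted, so query[i] cannot appear below
            pos += p
            neg += n
        return pos, neg


trie = ClauseTrie()
trie.insert(["p", "q", "r"], positive=True)
trie.insert(["p", "r"], positive=True)
trie.insert(["q", "r"], positive=False)
trie.insert(["p"], positive=False)
print(trie.coverage(["p", "r"]))  # (2, 0)
print(trie.coverage(["r"]))       # (2, 1)
```

    Shared prefixes are stored once, and a coverage query touches only the branches that can still contain the remaining literals.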