Search CORE

7,887 research outputs found

An information theoretic approach to rule induction from databases

Author: Goodman Rodney M.
Smyth Padhraic
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/1992
Field of study

The knowledge acquisition bottleneck in obtaining rules directly from an expert is well known. Hence, the problem of automated rule acquisition from data is a well-motivated one, particularly for domains where a database of sample data exists. In this paper we introduce a novel algorithm for the induction of rules from examples. The algorithm is novel in the sense that it not only learns rules for a given concept (classification), but it simultaneously learns rules relating multiple concepts. This type of learning, known as generalized rule induction is considerably more general than existing algorithms which tend to be classification oriented. Initially we focus on the problem of determining a quantitative, well-defined rule preference measure. In particular, we propose a quantity called the J-measure as an information theoretic alternative to existing approaches. The J-measure quantifies the information content of a rule or a hypothesis. We will outline the information theoretic origins of this measure and examine its plausibility as a hypothesis preference measure. We then define the ITRULE algorithm which uses the newly proposed measure to learn a set of optimal rules from a set of data samples, and we conclude the paper with an analysis of experimental results on real-world data

Caltech Authors

An information theoretic approach to rule induction from databases

Author: Goodman Rodney M.
Smyth Padhraic
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/1992
Field of study

Computationally efficient induction of classification rules with the PMCRI and J-PMCRI frameworks

Author: Berrar
Bramer
Bramer
Bramer
Bramer
Bramer
Cendrowska
Cohen
Corkill
Frederic Stahl
Hennessy
Hunt
Hwang
Jiang
Max Bramer
Michalski
Mutlu
Nolle
Pham
Provost
Quinlan
Quinlan
Smyth
Stahl
Stahl
Stahl
Stahl
Szalay
Witten
Xavier
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012
Field of study

In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classiﬁcation rule induction, parallelisation of classiﬁcation rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classiﬁcation rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classiﬁcation rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classiﬁcation rule induction, parallelisation of classiﬁcation rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classiﬁcation rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classiﬁcation rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach

Bournemouth University Research Online

Programming in logic without logic programming

Author: Kowalski Robert
Sadri Fariba
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 14/12/2015
Field of study

In previous work, we proposed a logic-based framework in which computation is the execution of actions in an attempt to make reactive rules of the form if antecedent then consequent true in a canonical model of a logic program determined by an initial state, sequence of events, and the resulting sequence of subsequent states. In this model-theoretic semantics, reactive rules are the driving force, and logic programs play only a supporting role. In the canonical model, states, actions and other events are represented with timestamps. But in the operational semantics, for the sake of efficiency, timestamps are omitted and only the current state is maintained. State transitions are performed reactively by executing actions to make the consequents of rules true whenever the antecedents become true. This operational semantics is sound, but incomplete. It cannot make reactive rules true by preventing their antecedents from becoming true, or by proactively making their consequents true before their antecedents become true. In this paper, we characterize the notion of reactive model, and prove that the operational semantics can generate all and only such models. In order to focus on the main issues, we omit the logic programming component of the framework.Comment: Under consideration in Theory and Practice of Logic Programming (TPLP

arXiv.org e-Print Archive

CiteSeerX

Spiral - Imperial College Digital Repository

Random Prism: An Alternative to Random Forests.

Author: Bramer Max
Stahl Frederic
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Field of study

Ensemble learning techniques generate multiple classifiers, so called base classifiers, whose combined classification results are used in order to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule based classifiers is the Prism family of algorithms. Prism algorithms produce modular classification rules that do not necessarily fit into a decision tree structure. Prism classification rulesets achieve a comparable and sometimes higher classification accuracy compared with decision tree classifiers, if the data is noisy and large. Yet Prism still suffers from overfitting on noisy and large datasets. In practice ensemble techniques tend to reduce the overfitting, however there exists no ensemble learner for modular classification rule inducers such as the Prism family of algorithms. This article describes the first development of an ensemble learner based on the Prism family of algorithms in order to enhance Prism’s classification accuracy by reducing overfitting

Bournemouth University Research Online

Super Logic Programs

Author: Brass Stefan
Dix Juergen
Przymusinski Teodor C.
Publication venue
Publication date: 01/01/2002
Field of study

The Autoepistemic Logic of Knowledge and Belief (AELB) is a powerful nonmonotic formalism introduced by Teodor Przymusinski in 1994. In this paper, we specialize it to a class of theories called `super logic programs'. We argue that these programs form a natural generalization of standard logic programs. In particular, they allow disjunctions and default negation of arbibrary positive objective formulas. Our main results are two new and powerful characterizations of the static semant ics of these programs, one syntactic, and one model-theoretic. The syntactic fixed point characterization is much simpler than the fixed point construction of the static semantics for arbitrary AELB theories. The model-theoretic characterization via Kripke models allows one to construct finite representations of the inherently infinite static expansions. Both characterizations can be used as the basis of algorithms for query answering under the static semantics. We describe a query-answering interpreter for super programs which we developed based on the model-theoretic characterization and which is available on the web.Comment: 47 pages, revised version of the paper submitted 10/200

arXiv.org e-Print Archive

CiteSeerX