69 research outputs found

Application of decision trees and multivariate regression trees in design and optimization

Author: Forouraghi Babak
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/1995

Induction of decision trees and regression trees is a powerful technique not only for performing ordinary classification and regression analysis but also for discovering the often complex knowledge which describes the input-output behavior of a learning system in qualitative forms. In the area of classification (discrimination analysis), a new technique called IDea is presented for performing incremental learning with decision trees. It is demonstrated that IDea's incremental learning can greatly reduce the spatial complexity of a given set of training examples. Furthermore, it is shown that this reduction in complexity can also be used as an effective tool for improving the learning efficiency of other types of inductive learners such as standard backpropagation neural networks. In the area of regression analysis, a new methodology for performing multiobjective optimization has been developed. Specifically, we demonstrate that multiple-objective optimization through induction of multivariate regression trees is a powerful alternative to the conventional vector optimization techniques. Furthermore, in an attempt to investigate the effect of various types of splitting rules on the overall performance of the optimizing system, we present a tree partitioning algorithm which utilizes a number of techniques derived from diverse fields of statistics and fuzzy logic. These include: two multivariate statistical approaches based on dispersion matrices, an information-theoretic measure of covariance complexity which is typically used for obtaining multivariate linear models, two newly-formulated fuzzy splitting rules based on Pearson's parametric and Kendall's nonparametric measures of association, Bellman and Zadeh's fuzzy decision-maximizing approach within an inductive framework, and finally, the multidimensional extension of a widely-used fuzzy entropy measure.
The advantages of this new approach to optimization are highlighted by presenting three examples which respectively deal with the design of a three-bar truss, a beam, and an electric discharge machining (EDM) process.

Digital Repository @ Iowa State University (ISU)

Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning

Author: Mooney Raymond J.
Publication date: 01/01/1996

This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word "line" using the words in the current and preceding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.

arXiv.org e-Print Archive

CiteSeerX

An information theoretic approach to rule induction from databases

Author: Goodman Rodney M.
Smyth Padhraic
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/1992

The knowledge acquisition bottleneck in obtaining rules directly from an expert is well known. Hence, the problem of automated rule acquisition from data is a well-motivated one, particularly for domains where a database of sample data exists. In this paper we introduce a novel algorithm for the induction of rules from examples. The algorithm is novel in the sense that it not only learns rules for a given concept (classification), but it simultaneously learns rules relating multiple concepts. This type of learning, known as generalized rule induction, is considerably more general than existing algorithms, which tend to be classification oriented. Initially we focus on the problem of determining a quantitative, well-defined rule preference measure. In particular, we propose a quantity called the J-measure as an information theoretic alternative to existing approaches. The J-measure quantifies the information content of a rule or a hypothesis. We will outline the information theoretic origins of this measure and examine its plausibility as a hypothesis preference measure. We then define the ITRULE algorithm, which uses the newly proposed measure to learn a set of optimal rules from a set of data samples, and we conclude the paper with an analysis of experimental results on real-world data.

Caltech Authors
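
The abstract above names the J-measure but does not give its formula. As a minimal sketch, assuming the form in which the measure is commonly stated for a rule "IF Y=y THEN X=x", namely J = p(y) * [p(x|y) log2(p(x|y)/p(x)) + (1 - p(x|y)) log2((1 - p(x|y))/(1 - p(x)))], it could be computed as follows. The function and parameter names are illustrative and not taken from the paper.

```python
import math

def j_measure(p_y, p_x, p_x_given_y):
    """Information content (in bits) of the rule 'IF Y=y THEN X=x'.

    Assumes 0 < p_x < 1; the names here are hypothetical, chosen for illustration.
    """
    def term(posterior, prior):
        # By convention, 0 * log(0 / q) is taken to be 0.
        if posterior == 0.0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    # Cross-entropy between the posterior and the prior of X, given Y=y.
    inner = term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x)
    # Weight by how often the rule's left-hand side actually fires.
    return p_y * inner

# A rule whose condition leaves the belief in X unchanged carries no information.
uninformative = j_measure(p_y=0.5, p_x=0.5, p_x_given_y=0.5)
# A rule that makes X certain, and fires half of the time.
certain = j_measure(p_y=0.5, p_x=0.5, p_x_given_y=1.0)
```

Under this form the measure trades off how often a rule applies, p(y), against how much it changes the belief in the conclusion, so a rare but decisive rule and a frequent but weak rule can score similarly.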

A Hybrid heuristic-exhaustive search approach for rule extraction

Author: Rodic Daniel
Publication venue: 'University of Pretoria - Department of Philosophy'
Publication date: 29/05/2006

The topic of this thesis is knowledge discovery and artificial intelligence based knowledge discovery algorithms. The knowledge discovery process and associated problems are discussed, followed by an overview of three classes of artificial intelligence based knowledge discovery algorithms. Typical representatives of each of these classes are presented and discussed in greater detail. Then a new knowledge discovery algorithm, called Hybrid Classifier System (HCS), is presented. The guiding concept behind the new algorithm was simplicity. The new knowledge discovery algorithm is loosely based on schemata theory. It is evaluated against one of the discussed algorithms from each class, namely CN2, C4.5, BRAINNE and BGP. Results are discussed and compared. A comparison was done using a benchmark of classification problems. These results show that the new knowledge discovery algorithm performs satisfactorily, yielding accurate, crisp rule sets. Probably the main strength of the HCS algorithm is its simplicity, so it can be the foundation for many possible future extensions. Some of the possible extensions of the newly proposed algorithm are suggested in the final part of this thesis. Dissertation (MSc), University of Pretoria, 2007. Computer Science.

UPSpace at the University of Pretoria

Distributed Data Mining: The JAM system architecture

Author: Kalina David
Prodromidis Andreas L.
Sherwin Jeffrey
Stolfo Salvatore
Truta Terrance
Tselepis Shelley
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2001

This paper describes the system architecture of JAM (Java Agents for Meta-learning), a distributed data mining system that scales up to large and physically separated data sets. An early version of the JAM system was described in Stolfo-et-al-97-KDD-JAM. Since then, JAM has evolved both architecturally and functionally, and here we present the final design and implementation details of this system architecture. JAM is an extensible agent-based distributed data mining system that supports the remote dispatch and exchange of agents among participating data sites and employs meta-learning techniques to combine the multiple models that are learned. One of JAM's target applications is fraud and intrusion detection in financial information systems. A brief description of this learning task and JAM's applicability and summary results are also discussed.

Columbia University Academic Commons
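
The JAM abstract above describes combining models learned at separate data sites through meta-learning, without detailing the mechanism. The sketch below is not JAM's implementation. It illustrates one generic combiner strategy under stated assumptions: each site trains a toy one-dimensional threshold classifier, and a meta-level lookup table learns which true label goes with each pattern of base predictions on a shared validation set. All names and data are hypothetical.

```python
from collections import Counter, defaultdict

def train_stump(rows):
    """Toy base learner: threshold at the midpoint between the two class means."""
    mean = lambda values: sum(values) / len(values)
    m0 = mean([x for x, y in rows if y == 0])
    m1 = mean([x for x, y in rows if y == 1])
    threshold = (m0 + m1) / 2
    high_label = 1 if m1 > m0 else 0
    return lambda x: high_label if x > threshold else 1 - high_label

def train_combiner(base_models, validation_rows):
    """Meta-learner: map each tuple of base predictions to its most common true label."""
    votes = defaultdict(Counter)
    for x, y in validation_rows:
        votes[tuple(model(x) for model in base_models)][y] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

    def predict(x):
        key = tuple(model(x) for model in base_models)
        # For prediction patterns never seen in validation, fall back to majority vote.
        return table.get(key, Counter(key).most_common(1)[0][0])

    return predict

# Two "sites" with locally held data; only the trained models are exchanged.
site_a = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
site_b = [(0.5, 0), (3.0, 0), (7.0, 1), (9.5, 1)]
validation = [(1.5, 0), (2.5, 0), (7.5, 1), (8.5, 1)]

base_models = [train_stump(site_a), train_stump(site_b)]
meta = train_combiner(base_models, validation)
```

The point of the design is that raw records never leave their site: only the trained base models and their predictions on the validation set are needed to build the combined classifier.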

Knowledge Acquisition from Data Bases

Author: Wu Xindong
Publication venue: The University of Edinburgh: College of Science and Engineering: The School of Informatics
Publication date: 01/07/1993

Centre for Intelligent Systems and their Applications. Grant No. 6897502. Knowledge acquisition from databases is a research frontier for both database technology and machine learning (ML) techniques, and has seen sustained research over recent years. It also acts as a link between the two fields, thus offering a dual benefit. Firstly, since database technology has already found wide application in many fields, ML research obviously stands to gain from this greater exposure and established technological foundation. Secondly, ML techniques can augment the ability of existing database systems to represent, acquire, and process a collection of expertise such as those which form part of the semantics of many advanced applications (e.g. CAD/CAM). The major contribution of this thesis is the introduction of an efficient induction algorithm to facilitate the acquisition of such knowledge from databases. There are three typical families of inductive algorithms: the generalisation-specialisation based AQ11-like family, the decision tree based ID3-like family, and the extension matrix based family. A heuristic induction algorithm, HCV, based on the newly-developed extension matrix approach, is described in this thesis. By dividing the positive examples (PE) of a specific class in a given example set into intersecting groups and adopting a set of strategies to find a heuristic conjunctive rule in each group which covers all the group's positive examples and none of the negative examples (NE), HCV can find rules in the form of variable-valued logic for PE against NE in low-order polynomial time. The rules generated in HCV are shown empirically to be more compact than the rules produced by AQ11-like algorithms and the decision trees produced by the ID3-like algorithms. KEshell2, an intelligent learning database system, which makes use of the HCV algorithm and couples ML techniques with database and knowledge base technology, is also described.

Edinburgh Research Archive

Rule Set Quality Measures For Inductive Learning Algorithms

Author: Klinkenberg Ralf
Publication date: 12/12/2003

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Decision tree learning for intelligent mobile robot navigation

Author: G. Hossein Shah Hamzei
Publication date: 01/01/1998

The replication of human intelligence, learning and reasoning by means of computer algorithms is termed Artificial Intelligence (AI), and the interaction of such algorithms with the physical world can be achieved using robotics. The work described in this thesis investigates the applications of concept learning (an approach which takes its inspiration from biological motivations and from survival instincts in particular) to robot control and path planning. The methodology of concept learning has been applied using learning decision trees (DTs) which induce domain knowledge from a finite set of training vectors which in turn describe systematically a physical entity and are used to train a robot to learn new concepts and to adapt its behaviour. To achieve behaviour learning, this work introduces the novel approach of hierarchical learning and knowledge decomposition to the frame of the reactive robot architecture. Following the analogy with survival instincts, the robot is first taught how to survive in very simple and homogeneous environments, namely a world without any disturbances or any kind of "hostility". Once this simple behaviour, named a primitive, has been established, the robot is trained to adapt new knowledge to cope with increasingly complex environments by adding further worlds to its existing knowledge. The repertoire of the robot behaviours in the form of symbolic knowledge is retained in a hierarchy of clustered decision trees (DTs) accommodating a number of primitives. To classify robot perceptions, control rules are synthesised using symbolic knowledge derived from searching the hierarchy of DTs. A second novel concept is introduced, namely that of multi-dimensional fuzzy associative memories (MDFAMs). These are clustered fuzzy decision trees (FDTs) which are trained locally and accommodate specific perceptual knowledge. Fuzzy logic is incorporated to deal with inherent noise in sensory data and to merge conflicting behaviours of the DTs.
In this thesis, the feasibility of the developed techniques is illustrated in the robot applications, and their benefits and drawbacks are discussed.

Loughborough University Institutional Repository

Pruning methods for rule induction

Author: Othman O
Publication date: 05/05/2017

Machine learning is a research area within computer science that is mainly concerned with discovering regularities in data. Rule induction is a powerful technique used in machine learning wherein the target concept is represented as a set of rules. The attraction of rule induction is that rules are more transparent and easier to understand compared to other induction methods (e.g., regression methods or neural networks). Rule induction has been shown to outperform other learners on many problems. However, it is not suitable to handle exceptions and noisy data in training sets, which can be solved by pruning. This thesis is concerned with investigating whether preceding rule induction with instance reduction techniques can help in reducing the complexity of rule sets by reducing the number of rules generated without adversely affecting the predictive accuracy. An empirical study is undertaken to investigate the application of three different rule classifiers to datasets that were previously reduced with promising instance-reduction methods. Furthermore, we propose a new instance reduction method based on Ant Colony Optimization (ACO). We evaluate the effectiveness of this instance reduction method for k nearest neighbour algorithms in terms of predictive accuracy and amount of reduction. We then compare it with other instance reduction methods. We show that pruning classification rules with instance-reduction methods leads to a statistically significant decrease in the number of generated rules, without adversely affecting performance. On the other hand, applying instance-reduction methods enhances the predictive accuracy on some datasets. Moreover, the results provide evidence that: (1) our proposed instance reduction method based on ACO is competitive with the well-known k-NN algorithm; (2) the reduced sets computed by our method offer better classification accuracy than those obtained by the compared algorithms.

University of Salford Institutional Repository
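
The abstract above evaluates instance reduction ahead of nearest-neighbour classification but does not describe the ACO-based method itself. As a stand-in, and explicitly not the thesis's method, the sketch below shows Hart's condensed nearest neighbour rule, a classic instance-reduction technique: it keeps only those training instances that the current reduced set would misclassify with 1-NN. The data and function names are illustrative assumptions.

```python
def nearest_label(point, store):
    """Label of the stored instance closest to point (squared Euclidean distance)."""
    def distance(instance):
        return sum((a - b) ** 2 for a, b in zip(point, instance[0]))
    return min(store, key=distance)[1]

def condense(training):
    """Hart's condensed nearest neighbour: grow a subset until it classifies all of training."""
    store = [training[0]]
    changed = True
    while changed:
        changed = False
        for instance in training:
            # Add an instance only if the current subset gets its label wrong.
            if instance not in store and nearest_label(instance[0], store) != instance[1]:
                store.append(instance)
                changed = True
    return store

# Two well-separated clusters: one representative per cluster is enough.
training = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
            ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.9, 5.2), "b")]
reduced = condense(training)
```

On this toy data the reduced set holds one instance per cluster yet still labels every training point correctly, which is the kind of size-versus-accuracy trade-off the abstract measures.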