69 research outputs found

    Application of decision trees and multivariate regression trees in design and optimization

    Get PDF
    Induction of decision trees and regression trees is a powerful technique not only for performing ordinary classification and regression analysis but also for discovering the often complex knowledge which describes the input-output behavior of a learning system in qualitative forms;In the area of classification (discrimination analysis), a new technique called IDea is presented for performing incremental learning with decision trees. It is demonstrated that IDea\u27s incremental learning can greatly reduce the spatial complexity of a given set of training examples. Furthermore, it is shown that this reduction in complexity can also be used as an effective tool for improving the learning efficiency of other types of inductive learners such as standard backpropagation neural networks;In the area of regression analysis, a new methodology for performing multiobjective optimization has been developed. Specifically, we demonstrate that muitiple-objective optimization through induction of multivariate regression trees is a powerful alternative to the conventional vector optimization techniques. Furthermore, in an attempt to investigate the effect of various types of splitting rules on the overall performance of the optimizing system, we present a tree partitioning algorithm which utilizes a number of techniques derived from diverse fields of statistics and fuzzy logic. These include: two multivariate statistical approaches based on dispersion matrices, an information-theoretic measure of covariance complexity which is typically used for obtaining multivariate linear models, two newly-formulated fuzzy splitting rules based on Pearson\u27s parametric and Kendall\u27s nonparametric measures of association, Bellman and Zadeh\u27s fuzzy decision-maximizing approach within an inductive framework, and finally, the multidimensional extension of a widely-used fuzzy entropy measure. The advantages of this new approach to optimization are highlighted by presenting three examples which respectively deal with design of a three-bar truss, a beam, and an electric discharge machining (EDM) process

    Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning

    Full text link
    This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word ``line'' using the words in the current and proceeding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.Comment: 10 page

    An information theoretic approach to rule induction from databases

    Get PDF
    The knowledge acquisition bottleneck in obtaining rules directly from an expert is well known. Hence, the problem of automated rule acquisition from data is a well-motivated one, particularly for domains where a database of sample data exists. In this paper we introduce a novel algorithm for the induction of rules from examples. The algorithm is novel in the sense that it not only learns rules for a given concept (classification), but it simultaneously learns rules relating multiple concepts. This type of learning, known as generalized rule induction is considerably more general than existing algorithms which tend to be classification oriented. Initially we focus on the problem of determining a quantitative, well-defined rule preference measure. In particular, we propose a quantity called the J-measure as an information theoretic alternative to existing approaches. The J-measure quantifies the information content of a rule or a hypothesis. We will outline the information theoretic origins of this measure and examine its plausibility as a hypothesis preference measure. We then define the ITRULE algorithm which uses the newly proposed measure to learn a set of optimal rules from a set of data samples, and we conclude the paper with an analysis of experimental results on real-world data

    A Hybrid heuristic-exhaustive search approach for rule extraction

    Get PDF
    The topic of this thesis is knowledge discovery and artificial intelligence based knowledge discovery algorithms. The knowledge discovery process and associated problems are discussed, followed by an overview of three classes of artificial intelligence based knowledge discovery algorithms. Typical representatives of each of these classes are presented and discussed in greater detail. Then a new knowledge discovery algorithm, called Hybrid Classifier System (HCS), is presented. The guiding concept behind the new algorithm was simplicity. The new knowledge discovery algorithm is loosely based on schemata theory. It is evaluated against one of the discussed algorithms from each class, namely: CN2; C4.5, BRAINNE and BGP. Results are discussed and compared. A comparison was done using a benchmark of classification problems. These results show that the new knowledge discovery algorithm performs satisfactory, yielding accurate, crisp rule sets. Probably the main strength of the HCS algorithm is its simplicity, so it can be the foundation for many possible future extensions. Some of the possible extensions of the new proposed algorithm are suggested in the final part of this thesis.Dissertation (MSc)--University of Pretoria, 2007.Computer Scienceunrestricte

    Knowledge Acquisition from Data Bases

    Get PDF
    Centre for Intelligent Systems and their ApplicationsGrant No.6897502Knowledge acquisition from databases is a research frontier for both data base technology and machine learning (ML) techniques,and has seen sustained research over recent years.It also acts as a link between the two fields,thus offering a dual benefit. Firstly, since database technology has already found wide application in many fields ML research obviously stands to gain from this greater exposure and established technological foundation. Secondly, ML techniques can augment the ability of existing database systems to represent acquire,and process a collection of expertise such as those which form part of the semantics of many advanced applications (e.gCAD/CAM).The major contribution of this thesis is the introduction of an effcient induction algorithm to facilitate the acquisition of such knowledge from databases. There are three typical families of inductive algorithms: the generalisation- specialisation based AQ11-like family, the decision tree based ID3-like family,and the extension matrix based family. A heuristic induction algorithm, HCV based on the newly-developed extension matrix approach is described in this thesis. By dividing the positive examples (PE) of a specific class in a given example set into intersect in groups and adopting a set of strategies to find a heuristic conjunctive rule in each group which covers all the group's positiv examples and none of the negativ examples(NE),HCV can find rules in the form of variable-valued logic for PE against NE in low-order polynomial time. The rules generated in HCV are shown empirically to be more compact than the rules produced by AQ1-like algorithms and the decision trees produced by the ID3-like algorithms. KEshell2, an intelligent learning database system, which makes use of the HCV algorithm and couples ML techniques with database and knowledgebase technology, is also described

    Decision tree learning for intelligent mobile robot navigation

    Get PDF
    The replication of human intelligence, learning and reasoning by means of computer algorithms is termed Artificial Intelligence (Al) and the interaction of such algorithms with the physical world can be achieved using robotics. The work described in this thesis investigates the applications of concept learning (an approach which takes its inspiration from biological motivations and from survival instincts in particular) to robot control and path planning. The methodology of concept learning has been applied using learning decision trees (DTs) which induce domain knowledge from a finite set of training vectors which in turn describe systematically a physical entity and are used to train a robot to learn new concepts and to adapt its behaviour. To achieve behaviour learning, this work introduces the novel approach of hierarchical learning and knowledge decomposition to the frame of the reactive robot architecture. Following the analogy with survival instincts, the robot is first taught how to survive in very simple and homogeneous environments, namely a world without any disturbances or any kind of "hostility". Once this simple behaviour, named a primitive, has been established, the robot is trained to adapt new knowledge to cope with increasingly complex environments by adding further worlds to its existing knowledge. The repertoire of the robot behaviours in the form of symbolic knowledge is retained in a hierarchy of clustered decision trees (DTs) accommodating a number of primitives. To classify robot perceptions, control rules are synthesised using symbolic knowledge derived from searching the hierarchy of DTs. A second novel concept is introduced, namely that of multi-dimensional fuzzy associative memories (MDFAMs). These are clustered fuzzy decision trees (FDTs) which are trained locally and accommodate specific perceptual knowledge. Fuzzy logic is incorporated to deal with inherent noise in sensory data and to merge conflicting behaviours of the DTs. In this thesis, the feasibility of the developed techniques is illustrated in the robot applications, their benefits and drawbacks are discussed

    Pruning methods for rule induction

    Get PDF
    Machine learning is a research area within computer science that is mainly concerned with discovering regularities in data. Rule induction is a powerful technique used in machine learning wherein the target concept is represented as a set of rules. The attraction of rule induction is that rules are more transparent and easier to understand compared to other induction methods (e.g., regression methods or neural network). Rule induction has been shown to outperform other learners on many problems. However, it is not suitable to handle exceptions and noisy data in training sets, which can be solved by pruning. This thesis is concerned with investigating whether preceding rule induction with instance reduction techniques can help in reducing the complexity of rule sets by reducing the number of rules generated without adversely affecting the predictive accuracy. An empirical study is undertaken to investigate the application of three different rule classifiers to datasets that were previously reduced with promising instance-reduction methods. Furthermore, we propose a new instance reduction method based on Ant Colony Optimization (ACO). We evaluate the effectiveness of this instance reduction method for k nearest neighbour algorithms in term of predictive accuracy and amount of reduction. Then we compared it with other instance reduction methods.We show that pruning classification rules with instance-reduction methods lead to a statistically significant decrease in the number of generated rules, without adversely affecting performance. On the other hand, applying instance-reduction methods enhances the predictive accuracy on some datasets. Moreover, the results provide evidence that: (1) our proposed instance reduction method based on ACO is competitive with the well-known k-NN algorithm; (2) the reduced sets computed by our method offers better classification accuracy than those obtained by the compared algorithms
    • …
    corecore