7 research outputs found

    Automated fuzzy-clustering for Doctus expert system

    Our Knowledge-Based Expert System Shell 'Doctus' is capable of deduction, also called rule-based reasoning, and of induction, which is the symbolic version of reasoning by cases. If connected to databases or data warehouses, the inductive reasoning of Doctus can also be used for data mining. To handle numerical domains, Doctus uses a statistical clustering algorithm. We define the problem in three steps: how to perform a clustering that is neither rigid nor sensitive to noise, benefiting from the properties of the application domain, reducing the complexity as much as possible, and supplying the decision maker with useful information that enables interaction? In this paper we present the conception of Automated Fuzzy-Clustering using triangular and trapezoidal fuzzy sets, which provides an overlapping fuzzy-set covering of the domain.
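To make the idea of an overlapping fuzzy-set covering concrete, here is a minimal sketch of triangular and trapezoidal membership functions. It is an illustration only, not the Doctus implementation: the function names, the 0 to 100 domain and the breakpoints of the three sets are all assumptions chosen for the example.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Membership of x in a trapezoidal fuzzy set with feet a and d and plateau [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def memberships(x):
    """Degrees of membership of x in three overlapping sets on a hypothetical 0..100 domain.

    The outer feet (-1 and 101) lie just outside the domain so that the
    edge values 0 and 100 still receive full membership.
    """
    return {
        "low": trapezoidal(x, -1, 0, 20, 40),
        "medium": triangular(x, 20, 50, 80),
        "high": trapezoidal(x, 60, 80, 100, 101),
    }


if __name__ == "__main__":
    # A value in the overlap between "low" and "medium" belongs partly to both.
    print(memberships(30))
```

Because neighbouring sets overlap, a value near a boundary is not forced into a single cluster, which is the sense in which such a covering is less rigid than a crisp partition.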

    Model of Learning Ability

    The problem domain of the investigation presented in this dissertation is knowledge increase; in particular, the research is concerned with the process of knowledge increase. The research problem is formulated a posteriori: "Which factors determine the increase of personal knowledge that occurs when an individual who is a member of an organization absorbs a particular piece of new knowledge, and how do these factors work?" To explore and shed light on this problem, a number of disciplinary boundaries were crossed and models, tools and descriptions were borrowed from several related disciplines. These areas are briefly presented in the dissertation, with the presentation restricted to the relevant issues. Three models are developed for this thesis and subsequently integrated into a fourth. First, the 'Model of Learning Willingness' (MLW) is developed to consider personal and organizational value systems. For this model, new concepts have been created to indicate the position of new knowledge in both the personal and the organizational value system. The stable and unstable states of the model are identified, as well as how it is possible to pass from one state to another as a result of the two value systems influencing each other. Applying a 'systems theory approach' to the cognitive psychology conception of knowledge, the impact of the characteristics of existing knowledge on the absorption of new knowledge is described. The resulting second model is called the 'Model of Learning Capability' (MLC). It is also necessary to pay attention to the ability to acquire new knowledge; this is described by the third model, the 'Model of Attention' (MA), which is based on two main factors, namely cognitive and social conditions. These three models are then integrated into a fourth, called the 'Model of Learning Ability' (MLA).
For exploration/validation the model is implemented with the Doctus Knowledge-Based Expert System, which was also the means of comparing the evolved hypotheses with the input from reality, namely observations and thought experiments. The first insight from the model is a better understanding of the process of 'knowledge increase'. The model can also be used to support choosing the right person to learn a particular piece of new knowledge, to identify the reason why someone is not performing well with regard to learning, and to identify a possible way of improving the process. Using the logic of the model, experts can also be evaluated in the process of knowledge acquisition when building an expert system. Considering the achieved results, some new problems emerge: it is not known what motivates the personal value system during knowledge absorption; it is not known whether the model can be extended to forms of knowledge increase other than learning; and it is not known how social factors apart from love (i.e. power and money) affect attention. Some new research ideas also evolved from this investigation, e.g. an attempt to model knowledge using dimensions of understanding.

    Data mining using neural networks

    Data mining is about the search for relationships and global patterns in large databases that are increasing in size. Data mining is beneficial for anyone who has a huge amount of data, for example customer and business data, or transaction, marketing, financial, manufacturing and web data. The results of data mining are also referred to as knowledge in the form of rules, regularities and constraints. Rule mining is one of the popular data mining methods, since rules provide concise statements of potentially important information that are easily understood by end users, as well as actionable patterns. Rule mining has received a good deal of attention and enthusiasm from data mining researchers because it is capable of solving many data mining problems such as classification, association, customer profiling, summarization, segmentation and many others. This thesis makes several contributions by proposing rule mining methods using genetic algorithms and neural networks. The thesis first proposes rule mining methods using a genetic algorithm. These methods are based on an integrated framework but are capable of mining three major classes of rules. Moreover, the rule mining processes in these methods are controlled by tuning two data mining measures, support and confidence. The thesis shows how to build data mining predictive models using the resultant rules of the proposed methods. Another key contribution of the thesis is the proposal of rule mining methods using supervised neural networks. The thesis mathematically analyses the Widrow-Hoff learning algorithm of a single-layered neural network, which results in a foundation for rule mining algorithms using single-layered neural networks. Three rule mining algorithms using single-layered neural networks are proposed for the three major classes of rules on the basis of the proposed theorems. The thesis also looks at the problem of rule mining where user guidance is absent. 
The thesis proposes a guided rule mining system to overcome this problem. The thesis extends this work further by comparing the performance of the algorithm used in the proposed guided rule mining system with that of the Apriori data mining algorithm. Finally, the thesis studies the Kohonen self-organizing map as an unsupervised neural network for rule mining algorithms. Two approaches are adopted, based on how self-organizing maps are applied in rule mining models. In the first approach, the self-organizing map is used for clustering, which provides class information to the rule mining process. In the second approach, automated rule mining takes the place of trained neurons as it grows in a hierarchical structure.
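The support and confidence measures that this abstract names as tuning controls have standard definitions, which the following minimal sketch computes for a single rule. It is not the thesis's method: the function names and the toy transactions are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of the antecedent."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / antecedent_support


# Hypothetical toy data: each transaction is a set of purchased items.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

if __name__ == "__main__":
    # Rule under test: IF bread THEN milk
    print(support(TRANSACTIONS, {"bread", "milk"}))
    print(confidence(TRANSACTIONS, {"bread"}, {"milk"}))
```

A rule miner would typically keep only the rules whose support and confidence both exceed user-chosen thresholds; raising either threshold yields fewer but stronger rules.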

    Design and analysis of scalable rule induction systems

    Machine learning has been studied intensively during the past two decades. One motivation has been the desire to automate the process of knowledge acquisition during the construction of expert systems. The recent emergence of data mining as a major application for machine learning algorithms has led to the need for algorithms that can handle very large data sets. In real data mining applications, data sets with millions of training examples, thousands of attributes and hundreds of classes are common. Designing learning algorithms appropriate for such applications has thus become an important research problem. A great deal of research in machine learning has focused on classification learning. Among the various machine learning approaches developed for classification, rule induction is of particular interest for data mining because it generates models in the form of IF-THEN rules which are more expressive and easier for humans to comprehend. One weakness with rule induction algorithms is that they often scale relatively poorly with large data sets, especially on noisy data. The work reported in this thesis aims to design and develop scalable rule induction algorithms that can process large data sets efficiently while building from them the best possible models. There are two main approaches for rule induction, represented respectively by CN2 and the AQ family of algorithms. These approaches vary in the search strategy employed for examining the space of possible rules, each of which has its own advantages and disadvantages. The first part of this thesis introduces a new rule induction algorithm for learning classification rules, which broadly follows the approach of algorithms represented by CN2. The algorithm presents a new search method which employs several novel search-space pruning rules and rule-evaluation techniques. This results in a highly efficient algorithm with improved induction performance. 
Real-world data contain not only nominal attributes but also continuous attributes. The ability to handle continuously valued data is thus crucial to the success of any general purpose learning algorithm. Most current discretisation approaches are developed as pre-processes for learning algorithms. The second part of this thesis proposes a new approach which discretises continuous-valued attributes during the learning process. Incorporating discretisation into the learning process has the advantage of taking into account the bias inherent in the learning system as well as the interactions between the different attributes. This in turn leads to improved performance. Overfitting the training data is a major problem in machine learning, particularly when noise is present. Overfitting increases learning time and reduces both the accuracy and the comprehensibility of the generated rules, making learning from large data sets more difficult. Pruning is a technique widely used for addressing such problems and consequently forms an essential component of practical learning algorithms. The third part of this thesis presents three new pruning techniques for rule induction based on the Minimum Description Length (MDL) principle. The result is an effective learning algorithm that not only produces an accurate and compact rule set, but also significantly accelerates the learning process. RULES-3 Plus is a simple rule induction algorithm developed at the author's laboratory which follows a similar approach to the AQ family of algorithms. Despite having been successfully applied to many learning problems, it has some drawbacks which adversely affect its performance. The fourth part of this thesis reports on an attempt to overcome these drawbacks by utilising the ideas presented in the first three parts of the thesis. A new version of RULES-3 Plus is reported that is a general and efficient algorithm with a wide range of potential applications.
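As a small illustration of the IF-THEN rule form that rule induction systems produce, the sketch below represents one rule as a conjunction of attribute tests and measures how many examples it covers and how accurate it is on them. This is not CN2, AQ or RULES-3 Plus; the data layout, function names and toy examples are all assumptions.

```python
def rule_matches(conditions, attrs):
    """True if the example's attributes satisfy every attribute = value test."""
    return all(attrs.get(name) == value for name, value in conditions.items())


def evaluate_rule(conditions, label, examples):
    """Return (number of covered examples, accuracy of the rule on them)."""
    covered = [ex for ex in examples if rule_matches(conditions, ex["attrs"])]
    correct = sum(1 for ex in covered if ex["label"] == label)
    accuracy = correct / len(covered) if covered else 0.0
    return len(covered), accuracy


# Hypothetical training examples with nominal attributes only.
EXAMPLES = [
    {"attrs": {"outlook": "sunny", "windy": False}, "label": "yes"},
    {"attrs": {"outlook": "sunny", "windy": True}, "label": "no"},
    {"attrs": {"outlook": "sunny", "windy": False}, "label": "no"},
    {"attrs": {"outlook": "rainy", "windy": False}, "label": "yes"},
]

if __name__ == "__main__":
    # Rule under test: IF outlook = sunny AND windy = False THEN play = yes
    rule = {"outlook": "sunny", "windy": False}
    print(evaluate_rule(rule, "yes", EXAMPLES))
```

Search-based induction algorithms score many candidate rules with measures of this kind, and pruning removes conditions or rules whose contribution does not justify the added complexity.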