
    A comparison of the utility of data mining algorithms in an open distance learning context

    The use of data mining within the higher education context has been gaining traction. A parallel examination of the accuracy, robustness, and utility of the algorithms applied within educational data mining (EDM) is argued to be a necessary step toward entrenching its use. This paper provides a comparative analysis of several classification algorithms within an Open Distance Learning institution in South Africa. The study compares the performance of the ZeroR, OneR, Naïve Bayes, IBk, Simple Logistic Regression, and J48 algorithms in classifying students within a cohort over an eight-year timespan. The initial results suggest that, given the data quality and structure of the institution under study, J48 most consistently performed with the highest accuracy.
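    The six classifiers named above are standard Weka implementations. As an illustration only (not the authors' setup), a rough equivalent of such a comparison can be sketched with scikit-learn analogues and cross-validated accuracy; the data below is a synthetic placeholder for the institution's student records:

```python
# A minimal sketch of comparing classifiers, using scikit-learn analogues
# of the Weka algorithms named in the abstract. Dataset, labels, and all
# parameters here are illustrative assumptions, not the study's own.
import numpy as np
from sklearn.dummy import DummyClassifier            # ~ ZeroR
from sklearn.naive_bayes import GaussianNB           # ~ Naive Bayes
from sklearn.neighbors import KNeighborsClassifier   # ~ IBk
from sklearn.linear_model import LogisticRegression  # ~ Simple Logistic
from sklearn.tree import DecisionTreeClassifier      # ~ J48 (C4.5-style)
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))            # placeholder student-record features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # placeholder pass/fail labels

models = {
    "ZeroR":          DummyClassifier(strategy="most_frequent"),
    "OneR":           DecisionTreeClassifier(max_depth=1),  # rough analogue
    "NaiveBayes":     GaussianNB(),
    "IBk":            KNeighborsClassifier(n_neighbors=3),
    "SimpleLogistic": LogisticRegression(max_iter=1000),
    "J48":            DecisionTreeClassifier(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)  # 10-fold CV accuracy
    print(f"{name:15s} mean accuracy = {scores.mean():.3f}")
```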

    Connectionist Theory Refinement: Genetically Searching the Space of Network Topologies

    An algorithm that learns from a set of examples should ideally be able to exploit the available resources of (a) abundant computing power and (b) domain-specific knowledge to improve its ability to generalize. Connectionist theory-refinement systems, which use background knowledge to select a neural network's topology and initial weights, have proven effective at exploiting domain-specific knowledge; however, most do not exploit available computing power. This weakness occurs because they lack the ability to refine the topology of the neural networks they produce, thereby limiting generalization, especially when given impoverished domain theories. We present the REGENT algorithm, which uses (a) domain-specific knowledge to help create an initial population of knowledge-based neural networks and (b) genetic operators of crossover and mutation (specifically designed for knowledge-based networks) to continually search for better network topologies. Experiments on three real-world domains indicate that our new algorithm significantly increases generalization compared to a standard connectionist theory-refinement system, as well as to our previous algorithm for growing knowledge-based networks.
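    REGENT itself operates on knowledge-based networks built from domain theories; the toy sketch below illustrates only the genetic-search idea (crossover and mutation over hidden-layer topologies, with cross-validated accuracy as fitness). The data, operators, and parameters are invented for illustration:

```python
# A toy sketch of genetically searching network topologies, in the spirit
# of REGENT but without its knowledge-based initialisation. All names and
# parameters here are illustrative assumptions.
import random
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = random.Random(0)
X = np.random.default_rng(0).normal(size=(300, 8))  # synthetic inputs
y = (X[:, 0] * X[:, 1] > 0).astype(int)             # synthetic labels

def fitness(topology):
    """Cross-validated accuracy of an MLP with the given hidden layers."""
    net = MLPClassifier(hidden_layer_sizes=topology, max_iter=300,
                        random_state=0)
    return cross_val_score(net, X, y, cv=3).mean()

def crossover(a, b):
    """Join a prefix of one parent's layer tuple to a suffix of the other's."""
    return a[:rng.randint(1, len(a))] + b[rng.randint(1, len(b)):]

def mutate(t):
    """Resize one randomly chosen hidden layer."""
    i = rng.randrange(len(t))
    return t[:i] + (max(2, t[i] + rng.choice([-4, 4])),) + t[i + 1:]

population = [(rng.randint(4, 32),) for _ in range(6)]  # initial topologies
for generation in range(5):
    survivors = sorted(population, key=fitness, reverse=True)[:3]
    children = [mutate(crossover(rng.choice(survivors), rng.choice(survivors)))
                for _ in range(3)]
    population = survivors + children
print("best topology found:", max(population, key=fitness))
```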

    Progressing Towards Responsible AI

    The field of Artificial Intelligence (AI), and in particular the Machine Learning area, relies on a wide range of performance metrics and benchmark data sets to assess the problem-solving effectiveness of its solutions. However, the emergence of research centres, projects, and institutions addressing AI from a multidisciplinary and multi-stakeholder perspective suggests a new approach to assessment, comprising ethical guidelines, reports, tools, and frameworks that help both academia and business move towards a responsible conceptualisation of AI. They all highlight the relevance of three key aspects: (i) enhancing cooperation among the different stakeholders involved in the design, deployment, and use of AI; (ii) promoting multidisciplinary dialogue by including different domains of expertise in this process; and (iii) fostering public engagement to build a trusted relationship with new technologies and practitioners. In this paper, we introduce the Observatory on Society and Artificial Intelligence (OSAI), an initiative that grew out of the AI4EU project and aims to stimulate reflection on a broad spectrum of AI issues (ethical, legal, social, economic, and cultural). In particular, we describe our work in progress on OSAI and suggest how this and similar initiatives can promote a wider appraisal of progress in AI. This gives us the opportunity to present our vision and modus operandi for enhancing the implementation of these three fundamental dimensions.

    Supervised Classification: Quite a Brief Overview

    The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects. Classifiers are the tools that implement the actual functional mapping from these measurements (also called features or inputs) to the so-called class label (or output). The fields of pattern recognition and machine learning study ways of constructing such classifiers. The main idea behind supervised methods is that of learning from examples: given a number of example input-output relations, to what extent can the general mapping be learned that takes any new and unseen feature vector to its correct class? This chapter provides a basic introduction to the underlying ideas of how to arrive at a supervised classification problem. In addition, it provides an overview of some specific classification techniques, delves into the issues of object representation and classifier evaluation, and (very) briefly covers some variations on the basic supervised classification task that may also be of interest to the practitioner.
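    The learning-from-examples idea the abstract describes can be made concrete in a few lines; the sketch below uses synthetic placeholder data and a logistic-regression classifier purely as one possible instance of the general workflow:

```python
# A minimal sketch of the supervised classification workflow: learn a
# mapping from feature vectors to class labels on example pairs, then
# measure how well it generalises to unseen data. Data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))   # measurements (features / inputs)
y = (X @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) > 0).astype(int)  # class labels

# Hold out part of the examples to estimate generalisation, not training fit.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)   # learn the mapping
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```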

    Feature interval learning algorithms for classification

    This paper presents Feature Interval Learning (FIL) algorithms, which represent multi-concept descriptions in the form of disjoint feature intervals. The FIL algorithms are batch supervised inductive learning algorithms that use feature projections of the training instances to represent induced classification knowledge. The concept description is learned separately for each feature and takes the form of a set of disjoint intervals. The class of an unseen instance is determined by weighted-majority voting over the feature predictions. The basic FIL algorithm is enhanced with adaptive interval and feature-weight schemes in order to handle noisy and irrelevant features. The algorithms are empirically evaluated on twelve data sets from the UCI repository and compared with the k-NN, k-NNFP, and NBC classification algorithms. The experiments demonstrate that the FIL algorithms are robust to irrelevant features and missing feature values, achieving accuracy comparable to the best of the existing algorithms with significantly lower average running times.
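    The interval-plus-voting idea can be sketched in miniature. The toy below is not the authors' FIL implementation (it uses fixed equal-width intervals and unweighted votes, whereas FIL builds disjoint intervals from feature projections and adds adaptive interval and feature weights), but it shows per-feature interval models voting on a class:

```python
# A toy sketch of classification by per-feature intervals with majority
# voting, in the spirit of FIL. Binning scheme, data, and parameters are
# illustrative assumptions; FIL's adaptive weighting is omitted.
from collections import Counter
import numpy as np

def fit_intervals(X, y, bins=5):
    """Per feature: split its range into equal-width intervals and record
    the majority class of the training points falling in each interval."""
    model = []
    for f in range(X.shape[1]):
        edges = np.linspace(X[:, f].min(), X[:, f].max(), bins + 1)
        idx = np.digitize(X[:, f], edges[1:-1])  # interval index, 0..bins-1
        labels = {}
        for b in range(bins):
            members = y[idx == b]
            if len(members):
                labels[b] = Counter(members).most_common(1)[0][0]
        model.append((edges, labels))
    return model

def predict(model, x):
    """Each feature votes with the class of the interval x falls into;
    the majority vote across features is the prediction."""
    votes = Counter()
    for f, (edges, labels) in enumerate(model):
        b = int(np.digitize(x[f], edges[1:-1]))
        if b in labels:
            votes[labels[b]] += 1
    return votes.most_common(1)[0][0] if votes else 0  # default class

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)
model = fit_intervals(X, y)
print("prediction for first point:", predict(model, X[0]))
```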