5,393 research outputs found

    Decision diagrams in machine learning: an empirical study on real-life credit-risk data.

    Get PDF
    Decision trees are a widely used knowledge representation in machine learning. However, one of their main drawbacks is the inherent replication of isomorphic subtrees, as a result of which the produced classifiers might become too large to be comprehensible by the human experts that have to validate them. Alternatively, decision diagrams, a generalization of decision trees taking on the form of a rooted, acyclic digraph instead of a tree, have occasionally been suggested as a potentially more compact representation. Their application in machine learning has nonetheless been criticized, because the theoretical size advantages of subgraph sharing did not always directly materialize in the relatively scarce reported experiments on real-world data. Therefore, in this paper, starting from a series of rule sets extracted from three real-life credit-scoring data sets, we will empirically assess to what extent decision diagrams are able to provide a compact visual description. Furthermore, we will investigate the practical impact of finding a good attribute ordering on the achieved size savings.Advantages; Classifiers; Credit scoring; Data; Decision; Decision diagrams; Decision trees; Empirical study; Knowledge; Learning; Real life; Representation; Size; Studies;

    Learning understandable classifier models.

    Get PDF
    The topic of this dissertation is the automation of the process of extracting understandable patterns and rules from data. An unprecedented amount of data is available to anyone with a computer connected to the Internet. The disciplines of Data Mining and Machine Learning have emerged over the last two decades to face this challenge. This has led to the development of many tools and methods. These tools often produce models that make very accurate predictions about previously unseen data. However, models built by the most accurate methods are usually hard to understand or interpret by humans. In consequence, they deliver only decisions, and are short of any explanations. Hence they do not directly lead to the acquisition of new knowledge. This dissertation contributes to bridging the gap between the accurate opaque models and those less accurate but more transparent for humans. This dissertation first defines the problem of learning from data. It surveys the state-of-the-art methods for supervised learning of both understandable and opaque models from data, as well as unsupervised methods that detect features present in the data. It describes popular methods of rule extraction from unintelligible models which rewrite them into an understandable form. Limitations of rule extraction are described. A novel definition of understandability which ties computational complexity and learning is provided to show that rule extraction is an NP-hard problem. Next, a discussion whether one can expect that even an accurate classifier has learned new knowledge. The survey ends with a presentation of two approaches to building of understandable classifiers. On the one hand, understandable models must be able to accurately describe relations in the data. On the other hand, often a description of the output of a system in terms of its input requires the introduction of intermediate concepts, called features. Therefore it is crucial to develop methods that describe the data with understandable features and are able to use those features to present the relation that describes the data. Novel contributions of this thesis follow the survey. Two families of rule extraction algorithms are considered. First, a method that can work with any opaque classifier is introduced. Artificial training patterns are generated in a mathematically sound way and used to train more accurate understandable models. Subsequently, two novel algorithms that require that the opaque model is a Neural Network are presented. They rely on access to the network\u27s weights and biases to induce rules encoded as Decision Diagrams. Finally, the topic of feature extraction is considered. The impact on imposing non-negativity constraints on the weights of a neural network is considered. It is proved that a three layer network with non-negative weights can shatter any given set of points and experiments are conducted to assess the accuracy and interpretability of such networks. Then, a novel path-following algorithm that finds robust sparse encodings of data is presented. In summary, this dissertation contributes to improved understandability of classifiers in several tangible and original ways. It introduces three distinct aspects of achieving this goal: infusion of additional patterns from the underlying pattern distribution into rule learners, the derivation of decision diagrams from neural networks, and achieving sparse coding with neural networks with non-negative weights

    An overview of decision table literature 1982-1995.

    Get PDF
    This report gives an overview of the literature on decision tables over the past 15 years. As much as possible, for each reference, an author supplied abstract, a number of keywords and a classification are provided. In some cases own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country or origin (not necessarily country of publication) and the language of the document. After a description of the scope of the interview, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.

    A System for Induction of Oblique Decision Trees

    Full text link
    This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hill-climbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned especially for domains in which the attributes are numeric, although they can be adapted to symbolic or mixed symbolic/numeric attributes. We present extensive empirical studies, using both real and artificial data, that analyze OC1's ability to construct oblique trees that are smaller and more accurate than their axis-parallel counterparts. We also examine the benefits of randomization for the construction of oblique decision trees.Comment: See http://www.jair.org/ for an online appendix and other files accompanying this articl

    Knowledge-based incremental induction of clinical algorithms

    Get PDF
    The current approaches for the induction of medical procedural knowledge suffer from several drawbacks: the structures produced may not be explicit medical structures, they are only based on statistical measures that do not necessarily respect medical criteria which can be essential to guarantee medical correct structures, or they are not prepared to deal with the incremental arrival of new data. In this thesis we propose a methodology to automatically induce medically correct clinical algorithms (CAs) from hospital databases. These CAs are represented according to the SDA knowledge model. The methodology considers relevant background knowledge and it is able to work in an incremental way. The methodology has been tested in the domains of hypertension, diabetes mellitus and the comborbidity of both diseases. As a result, we propose a repository of background knowledge for these pathologies and provide the SDA diagrams obtained. Later analyses show that the results are medically correct and comprehensible when validated with health care professionals

    Tractable nonlinear memory functions as a tool to capture and explain dynamical behaviours

    Full text link
    Mathematical approaches from dynamical systems theory are used in a range of fields. This includes biology where they are used to describe processes such as protein-protein interaction and gene regulatory networks. As such networks increase in size and complexity, detailed dynamical models become cumbersome, making them difficult to explore and decipher. This necessitates the application of simplifying and coarse graining techniques in order to derive explanatory insight. Here we demonstrate that Zwanzig-Mori projection methods can be used to arbitrarily reduce the dimensionality of dynamical networks while retaining their dynamical properties. We show that a systematic expansion around the quasi-steady state approximation allows an explicit solution for memory functions without prior knowledge of the dynamics. The approach not only preserves the same steady states but also replicates the transients of the original system. The method also correctly predicts the dynamics of multistable systems as well as networks producing sustained and damped oscillations. Applying the approach to a gene regulatory network from the vertebrate neural tube, a well characterised developmental transcriptional network, identifies features of the regulatory network responsible dfor its characteristic transient behaviour. Taken together, our analysis shows that this method is broadly applicable to multistable dynamical systems and offers a powerful and efficient approach for understanding their behaviour.Comment: (8 pages, 8 figures

    Machine learning and its applications in reliability analysis systems

    Get PDF
    In this thesis, we are interested in exploring some aspects of Machine Learning (ML) and its application in the Reliability Analysis systems (RAs). We begin by investigating some ML paradigms and their- techniques, go on to discuss the possible applications of ML in improving RAs performance, and lastly give guidelines of the architecture of learning RAs. Our survey of ML covers both levels of Neural Network learning and Symbolic learning. In symbolic process learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety supported by two functions: risk analysis function, i.e., failure mode effect analysis (FMEA) ; and diagnosis function, i.e., real-time fault location (RTFL). Three approaches have been discussed in creating the RAs. According to the result of our survey, we suggest currently the best design of RAs is to embed model-based RAs, i.e., MORA (as software) in a neural network based computer system (as hardware). However, there are still some improvement which can be made through the applications of Machine Learning. By implanting the 'learning element', the MORA will become learning MORA (La MORA) system, a learning Reliability Analysis system with the power of automatic knowledge acquisition and inconsistency checking, and more. To conclude our thesis, we propose an architecture of La MORA

    Implementability Among Predicates

    Get PDF
    Much work has been done to understand when given predicates (relations) on discrete variables can be conjoined to implement other predicates. Indeed, the lattice of "co-clones" (sets of predicates closed under conjunction, variable renaming, and existential quantification of variables) has been investigated steadily from the 1960's to the present. Here, we investigate a more general model, where duplicatability of values is not taken for granted. This model is motivated in part by large scale neural models, where duplicating a value is similar in cost to computing a function, and by quantum mechanics, where values cannot be duplicated. Implementations in this case are naturally given by a graph fragment in which vertices are predicates, internal edges are existentially quantified variables, and "dangling edges" (edges emanating from a vertex but not yet connected to another vertex) are the free variables of the implemented predicate. We examine questions of implementability among predicates in this scenario, and we present the solution to all implementability problems for single predicates on up to three boolean values. However, we find that a variety of proof methods are required, and the question of implementability indeed becomes undecidable for larger predicates, although this is tricky to prove. We find that most predicates cannot implement the 3-way equality predicate, which reaffirms the view that duplicatability of values should not be assumed a priori

    Modulating the Granularity of Category Formation by Global Cortical States

    Get PDF
    The unsupervised categorization of sensory stimuli is typically attributed to feedforward processing in a hierarchy of cortical areas. This purely sensory-driven view of cortical processing, however, ignores any internal modulation, e.g., by top-down attentional signals or neuromodulator release. To isolate the role of internal signaling on category formation, we consider an unbroken continuum of stimuli without intrinsic category boundaries. We show that a competitive network, shaped by recurrent inhibition and endowed with Hebbian and homeostatic synaptic plasticity, can enforce stimulus categorization. The degree of competition is internally controlled by the neuronal gain and the strength of inhibition. Strong competition leads to the formation of many attracting network states, each being evoked by a distinct subset of stimuli and representing a category. Weak competition allows more neurons to be co-active, resulting in fewer but larger categories. We conclude that the granularity of cortical category formation, i.e., the number and size of emerging categories, is not simply determined by the richness of the stimulus environment, but rather by some global internal signal modulating the network dynamics. The model also explains the salient non-additivity of visual object representation observed in the monkey inferotemporal (IT) cortex. Furthermore, it offers an explanation of a previously observed, demand-dependent modulation of IT activity on a stimulus categorization task and of categorization-related cognitive deficits in schizophrenic patients
    corecore