47 research outputs found

    Regularizing soft decision trees

    Recently, we proposed a new decision tree family called soft decision trees, in which a node chooses both its left and right children with different probabilities given by a gating function, unlike a hard decision node, which chooses one of the two. In this paper, we extend the original algorithm by introducing local dimension reduction via L1 and L2 regularization for feature selection and smoother fitting. We compare our approach with standard decision tree algorithms on 27 classification data sets. Both regularized versions have similar generalization ability with lower complexity in terms of number of nodes, and L2 seems to work slightly better than L1.
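
As a rough illustration of the soft node described above, the sketch below assumes a sigmoid gate over a linear function of the input and constant leaf values. The abstract does not specify these details, and the names `SoftNode`, `response`, and `penalty` are hypothetical. The `penalty` function only shows where an L1 or L2 term on the gate weights would enter a training objective; no training loop is included.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


@dataclass
class SoftNode:
    # A leaf uses only `value`; an internal node uses the gate and both children.
    value: float = 0.0
    weights: Optional[List[float]] = None
    bias: float = 0.0
    left: Optional["SoftNode"] = None
    right: Optional["SoftNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None or self.right is None

    def response(self, x: Sequence[float]) -> float:
        """Blend both subtrees, weighted by the gate (assumed sigmoid)."""
        if self.is_leaf():
            return self.value
        g = sigmoid(sum(w * xi for w, xi in zip(self.weights, x)) + self.bias)
        return g * self.left.response(x) + (1.0 - g) * self.right.response(x)


def penalty(node: SoftNode, lam: float, norm: str = "l2") -> float:
    """Sum of L1 or L2 penalties over the gate weights of all internal nodes."""
    if node.is_leaf():
        return 0.0
    if norm == "l1":
        own = sum(abs(w) for w in node.weights)
    else:
        own = sum(w * w for w in node.weights)
    return lam * own + penalty(node.left, lam, norm) + penalty(node.right, lam, norm)
```

A hard decision node corresponds to the limit where the gate output is exactly 0 or 1, so only one child contributes to the response.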

    Fisher’s decision tree

    Univariate decision trees are classifiers currently used in many data mining applications. This classifier discovers partitions in the input space via hyperplanes that are orthogonal to the attribute axes, producing a model that can be understood by human experts. One disadvantage of univariate decision trees is that they produce complex and inaccurate models when decision boundaries are not orthogonal to the axes. In this paper we introduce the Fisher’s Tree, a classifier that combines the dimensionality reduction of Fisher’s linear discriminant with the decomposition strategy of decision trees to produce an oblique decision tree. Our proposal generates an artificial attribute that is used to split the data recursively. The Fisher’s decision tree induces oblique trees whose accuracy, size, number of leaves and training time are competitive with those of other decision trees reported in the literature. We use more than ten publicly available data sets to demonstrate the effectiveness of our method.
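
To make the idea of an artificial attribute concrete, the sketch below computes Fisher's linear discriminant direction for two classes with two features and uses the projection onto it as a single oblique split. It is a minimal sketch under stated assumptions: the restriction to two features, the midpoint threshold between projected class means, and the function names are illustrative choices, not the paper's procedure.

```python
from typing import Sequence, Tuple

Point = Sequence[float]


def mean2(points: Sequence[Point]) -> Tuple[float, float]:
    """Mean of a set of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points: Sequence[Point], m: Tuple[float, float]) -> Tuple[float, float, float]:
    """Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix around m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(class0: Sequence[Point], class1: Sequence[Point]) -> Tuple[float, float]:
    """Fisher direction w = S_W^{-1} (m1 - m0) for two features."""
    m0, m1 = mean2(class0), mean2(class1)
    a0, b0, d0 = scatter2(class0, m0)
    a1, b1, d1 = scatter2(class1, m1)
    a, b, d = a0 + a1, b0 + b1, d0 + d1  # within-class scatter S_W
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Inverse of [[a, b], [b, d]] is (1/det) * [[d, -b], [-b, a]].
    return ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)


def project(w: Tuple[float, float], p: Point) -> float:
    """Value of the artificial attribute for point p."""
    return w[0] * p[0] + w[1] * p[1]


def oblique_split(class0: Sequence[Point], class1: Sequence[Point]) -> Tuple[Tuple[float, float], float]:
    """Return the direction and an assumed midpoint threshold on the projection."""
    w = fisher_direction(class0, class1)
    z0 = sum(project(w, p) for p in class0) / len(class0)
    z1 = sum(project(w, p) for p in class1) / len(class1)
    return w, (z0 + z1) / 2.0
```

A tree builder would apply `oblique_split` at each node and recurse on the two sides of the threshold, which is the decomposition strategy the abstract refers to.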

    Quadratic programming for class ordering in rule induction

    Separate-and-conquer rule induction algorithms such as Ripper solve a K > 2 class problem by converting it into a sequence of K - 1 two-class problems. As a usual heuristic, the classes are fed into the algorithm in order of increasing prior probability. Although the heuristic works well in practice, there is much room for improvement. In this paper, we propose a novel approach to improve this heuristic. The approach transforms the ordering search problem into a quadratic optimization problem and uses the solution of the optimization problem to extract the optimal ordering. We compared the new Ripper (guided by the ordering found with our approach) with the original Ripper (guided by the heuristic ordering) on 27 datasets. Simulation results show that our approach produces rulesets that are significantly better than those produced by the original Ripper.
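
The baseline heuristic mentioned above can be sketched in a few lines. The example below only covers the increasing-prior ordering and the resulting K - 1 two-class problems; it does not attempt the quadratic-programming formulation, and the function names and tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def heuristic_class_order(labels: Sequence[Hashable]) -> List[Hashable]:
    """Order classes by increasing empirical prior (ties broken by label text)."""
    counts = Counter(labels)
    return sorted(counts, key=lambda c: (counts[c], str(c)))


def two_class_problems(order: Sequence[Hashable]) -> List[Tuple[Hashable, List[Hashable]]]:
    """Each class is learned against all classes that come later in the ordering.

    The last class in the ordering needs no rules and acts as the default.
    """
    return [(order[i], list(order[i + 1:])) for i in range(len(order) - 1)]
```

For K classes this yields exactly K - 1 (positive class, remaining classes) pairs, so a different ordering changes which examples each two-class problem sees.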

    Dynamic learning of cases from data streams

    This paper presents a dynamic adaptive framework for building a case library that can cope with a data stream in the field of Case-Based Reasoning. The framework provides a three-layer architecture formed by a set of dynamically built case libraries. This Dynamic and Adaptive Case Library (DACL) can process a data stream incrementally and can be used as a classification model or a regression model, depending on the predicted variable. In this paper, the work is focused on classification tasks. Each case library has a first layer formed by the dynamic clusters of cases, a second one formed by the meta-cases or prototypes of the clusters, and a third one formed by an incremental indexing structure. In our approach, some variants of k-d trees have been used, together with an exploration technique, to obtain a more efficient retrieval time. This three-layer framework can be constructed incrementally. Several meta-case learning approaches are proposed, as well as some case learning strategies. The framework has been tested with several datasets. The experimental results show very good performance in comparison with a batch learning scheme over the same data.

    Model selection in omnivariate decision trees using Structural Risk Minimization

    As opposed to trees that use a single type of decision node, an omnivariate decision tree contains nodes of different types. We propose to use Structural Risk Minimization (SRM) to choose between node types in omnivariate decision tree construction, matching the complexity of a node to the complexity of the data reaching that node. In order to apply SRM for model selection, one needs the VC-dimension of the candidate models. In this paper, we first derive the VC-dimension of the univariate model and estimate the VC-dimension of all three models (univariate, linear multivariate and quadratic multivariate) experimentally. Second, we compare SRM with other model selection techniques, including Akaike's Information Criterion (AIC), Bayesian Information Criterion (BIC) and cross-validation (CV), on standard datasets from the UCI and Delve repositories. We see that SRM induces omnivariate trees that have a small percentage of multivariate nodes close to the root and that generalize better than, or at least as accurately as, those constructed using other model selection techniques. The authors thank the three anonymous referees and the editor for their constructive comments, pointers to related literature, and pertinent questions, which allowed us to better situate our work as well as organize the manuscript and improve the presentation. This work has been supported by the Turkish Scientific Technical Research Council TUBITAK EEEAG 107E127.

    A dynamic adaptive framework for improving case-based reasoning system performance

    Optimal performance of a Case-Based Reasoning (CBR) system means that the system must be efficient both in time and in size, and must be optimally competent. Efficiency in time is closely related to an efficient and optimal retrieval process over the Case Base of the CBR system. Efficiency in size means that the Case Library (CL) size should be minimal, so it is closely related to optimal case learning policies, optimal meta-case learning policies, optimal case forgetting policies, etc. Optimal competence means that the number of problems that the CBR system can satisfactorily solve must be maximal. Improving or optimizing all three dimensions of a CBR system at the same time is a difficult challenge because they are interrelated, and it becomes even more difficult when the CBR system is applied to a dynamic or continuous domain (data stream). In this thesis, a Dynamic Adaptive Case Library framework (DACL) is proposed to improve CBR system performance, in particular by reducing the retrieval time, increasing the CBR system competence, and maintaining and adapting the CL to be efficient in size, especially in continuous domains. DACL learns cases and organizes them into dynamic cluster structures. The DACL is able to adapt itself to a dynamic environment, where new clusters, meta-cases or prototypes of cases, and associated indexing structures (discriminant trees, k-d trees, etc.) can be formed, updated, or even removed. DACL offers a possible solution to the management of the large amount of data generated in an unsupervised continuous domain (data stream). In addition, we propose the use of a Multiple Case Library (MCL), which is a static version of a DACL, with the same structure but defined statically for use in supervised domains. The thesis work proposes some techniques for improving the indexing and retrieval tasks.
The most important indexing method is the NIAR k-d tree algorithm, which improves the retrieval time and competence compared with the baseline approach (a flat CL) and with well-known techniques based on standard k-d tree strategies. The proposed Partial Matching Exploration (PME) technique explores a hierarchical case library with a tree indexing structure, aiming not to lose the cases most similar to a query case. This technique explores not only the best matching path but also several alternative partial matching paths. The results of experiments with a set of well-known supervised benchmark databases show an improvement in competence and in the retrieval time of similar cases. The dynamic building of prototypes in DACL has been tested in an unsupervised domain (environmental domain) where air pollution is evaluated. The core task of building prototypes in a DACL is the implementation of a stochastic method for the learning of new cases and the management of prototypes. Finally, the whole dynamic framework, integrating all the main proposed approaches of the research work, has been tested in simulated unsupervised domains with several well-known databases in an incremental way, as data streams are processed in real life. The experimental results show that the proposed dynamic adaptive framework (DACL/MCL), together with the contributed indexing strategies and exploration techniques and the proposed stochastic case learning and meta-case learning policies, improves the performance of standard CBR systems both in supervised domains (MCL) and in unsupervised continuous domains (DACL).

    Omnivariate rule induction using a novel pairwise statistical test

    Rule learning algorithms, for example RIPPER, induce univariate rules, that is, a propositional condition in a rule uses only one feature. In this paper, we propose an omnivariate induction of rules where, for each condition, both a univariate and a multivariate condition are trained, and the best is chosen according to a novel statistical test. This paper has three main contributions. First, we propose a novel statistical test to compare two classifiers, the combined 5 x 2 cv t test, which is a variant of the 5 x 2 cv t test, and give its connections to other tests such as the 5 x 2 cv F test and the k-fold paired t test. Second, we propose a multivariate version of RIPPER, where a support vector machine with a linear kernel is used to find multivariate linear conditions. Third, we propose an omnivariate version of RIPPER, where the model selection is done via the combined 5 x 2 cv t test. Our results indicate that 1) the combined 5 x 2 cv t test has higher power (lower type II error), lower type I error, and higher replicability compared to the 5 x 2 cv t test, and 2) omnivariate rules are better in that they choose whichever condition is more accurate, selecting the right model automatically and separately for each condition in a rule.
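
For context, the sketch below computes the two baseline statistics the abstract compares against: the standard 5 x 2 cv t statistic and the 5 x 2 cv F statistic, both as usually defined in the literature. It does not implement the combined 5 x 2 cv t test, whose formula is not given in the abstract. The function name and input layout are assumptions: `diffs[i][j]` is the difference in error rates between the two classifiers on fold j of replication i.

```python
import math
from typing import Sequence, Tuple


def five_by_two_stats(diffs: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (t, f) for the standard 5 x 2 cv t test and 5 x 2 cv F test."""
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications of 2 folds each")
    # Variance estimate of each replication from its two fold differences.
    variances = []
    for p1, p2 in diffs:
        m = (p1 + p2) / 2.0
        variances.append((p1 - m) ** 2 + (p2 - m) ** 2)
    total_var = sum(variances)
    if total_var == 0.0:
        raise ValueError("all variance estimates are zero; statistics are undefined")
    # t statistic: first difference over the root of the mean variance (5 degrees of freedom).
    t = diffs[0][0] / math.sqrt(total_var / 5.0)
    # F statistic: sum of all 10 squared differences over twice the summed variances (10 and 5 degrees of freedom).
    f = sum(p * p for pair in diffs for p in pair) / (2.0 * total_var)
    return t, f
```

At the 0.05 level, the t statistic is compared in absolute value against about 2.571 (two-sided, 5 degrees of freedom) and the F statistic against about 4.74. Note that the t statistic uses only the first of the ten differences in its numerator, so its value depends on which fold is listed first.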

    On the feature extraction in discrete space

    In many pattern recognition applications, feature space expansion is a key step for improving the performance of the classifier. In this paper, we (i) expand the discrete feature space by exhaustively generating all orderings of the values of k discrete attributes, and (ii) modify the well-known decision tree and rule induction classifiers (ID3, Quinlan, 1986 [1] and Ripper, Cohen, 1995 [2]) to use these orderings as the new attributes. Our simulation results on 15 datasets from the UCI repository [3] show that the novel classifiers perform better than the original ones in terms of error rate and complexity. This work has been supported by the Turkish Scientific Technical Research Council (TUBITAK) EEEAG 107E127.