155 research outputs found

    Parallel Perceptrons and Training Set Selection for Imbalanced Classification Problems

    This is an electronic version of the paper presented at Learning 2004, held in Spain in 2004. Parallel perceptrons are a novel approach to the study of committee machines that allows, among other things, for fast training with minimal communication between outputs and hidden units. Moreover, their training makes it natural to define margins for hidden unit activations. In this work we show how to use those margins to perform subsample selections over a given training set that reduce training complexity while enhancing classification accuracy and allowing for balanced classifier performance when class sizes are greatly different. With partial support of Spain's CICyT, TIC 01-57

    Parallel Perceptrons, Activation Margins and Imbalanced Training Set Pruning

    The final publication is available at Springer via http://dx.doi.org/10.1007/11492542_6 (Proceedings of the Second Iberian Conference, IbPRIA 2005, Estoril, Portugal, June 7-9, 2005, Part II). A natural way to deal with training samples in imbalanced class problems is to prune them, removing both redundant patterns, which are easy to classify and probably overrepresented, and label-noisy patterns, which belong to one class but are labelled as members of another. This allows classifier construction to focus on borderline patterns, likely to be the most informative ones. To define these subsets appropriately, in this work we use as base classifiers the so-called parallel perceptrons, a novel approach to committee machine training that, among other things, makes it natural to define margins for hidden unit activations. We use these margins to define the above pattern types and to iteratively perform subsample selections in an initial training set that enhance classification accuracy and allow for balanced classifier performance even when class sizes are greatly different. With partial support of Spain’s CICyT, TIC 01–572, TIN2004–0767

    Making decision trees feasible in ultrahigh feature and label dimensions

    ©2017 Weiwei Liu and Ivor W. Tsang. Owing to their non-linear but highly interpretable representations, decision tree (DT) models have attracted considerable attention from researchers. However, DT models are difficult to understand and interpret in ultrahigh dimensions; they usually suffer from the curse of dimensionality and show degraded performance when there are many noisy features. To address these issues, this paper first presents a novel data-dependent generalization error bound for the perceptron decision tree (PDT), which provides the theoretical justification for learning a sparse linear hyperplane in each decision node and for pruning the tree. Following our analysis, we introduce the notion of a budget-aware classifier (BAC) with a budget constraint on the weight coefficients, and propose a supervised budgeted tree (SBT) algorithm to achieve non-linear prediction performance. To avoid generating an unstable and complicated decision tree and to improve the generalization of the SBT, we present a pruning strategy that learns classifiers to minimize cross-validation errors on each BAC. To deal with ultrahigh label dimensions, based on three important phenomena of real-world data sets from a variety of application domains, we develop a sparse coding tree framework for multi-label annotation problems and provide the theoretical analysis. Extensive empirical studies verify that 1) SBT is easy to understand and interpret in ultrahigh dimensions and is more resilient to noisy features; and 2) compared with state-of-the-art algorithms, our proposed sparse coding tree framework is more efficient, yet accurate, in ultrahigh label and feature dimensions.

    Evolving rules for document classification

    We describe a novel method for using Genetic Programming to create compact classification rules based on combinations of N-Grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters 21578 dataset. We also suggest that, because the induced rules are meaningful to a human analyst, they may have a number of other uses beyond classification and provide a basis for text mining applications.
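
    The fitness idea in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical rule form (the rule fires if any of its character n-grams occurs in the document) and uses F1 as the single fitness value; the abstract only says fitness is measured "in terms of precision and recall", so both choices are assumptions, and the function names and toy documents are invented for illustration.

```python
from typing import Iterable, Sequence


def rule_matches(ngrams: Iterable[str], document: str) -> bool:
    """Hypothetical rule form: fire if ANY of the n-grams occurs in the text."""
    text = document.lower()
    return any(ng.lower() in text for ng in ngrams)


def f1_fitness(ngrams: Sequence[str],
               documents: Sequence[str],
               labels: Sequence[bool]) -> float:
    """Score a rule by F1 (an assumed way to combine precision and recall)."""
    tp = fp = fn = 0
    for doc, is_positive in zip(documents, labels):
        predicted = rule_matches(ngrams, doc)
        if predicted and is_positive:
            tp += 1          # rule fired on a document of the category
        elif predicted:
            fp += 1          # rule fired on a document outside the category
        elif is_positive:
            fn += 1          # rule missed a document of the category
    if tp == 0:
        return 0.0           # avoids division by zero; no useful matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy training set (invented, not taken from Reuters 21578).
docs = [
    "Crude oil prices rose sharply",
    "Wheat harvest exceeds forecast",
    "OPEC agrees to cut oil output",
    "Central bank holds interest rates",
]
is_oil = [True, False, True, False]

print(f1_fitness(["oil"], docs, is_oil))
print(f1_fitness(["cru"], docs, is_oil))
```

    A genetic program would evolve the n-gram combinations themselves; this sketch only covers how one candidate rule might be scored.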

    On the optimality of classifier chain for multi-label classification

    To capture the interdependencies between labels in multi-label classification problems, the classifier chain (CC) model takes the multiple labels of each instance into account under a deterministic high-order Markov chain model. Since its performance is sensitive to the choice of label order, the key issue is how to determine the optimal label order for CC. In this work, we first generalize the CC model over a random label order. Then, we present a theoretical analysis of the generalization error for the proposed generalized model. Based on our results, we propose a dynamic programming based classifier chain (CC-DP) algorithm to search for the globally optimal label order for CC and a greedy classifier chain (CC-Greedy) algorithm to find a locally optimal CC. Comprehensive experiments on a number of real-world multi-label data sets from various domains demonstrate that our proposed CC-DP algorithm outperforms state-of-the-art approaches and that the CC-Greedy algorithm achieves prediction performance comparable to CC-DP.
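
    To make the role of the label order concrete, here is a minimal sketch of a plain classifier chain for a fixed, caller-supplied order. It is not CC-DP or CC-Greedy, which the abstract does not describe in enough detail to reproduce. The 1-nearest-neighbour base learner is a stand-in chosen only to keep the example self-contained, and all names and data are hypothetical.

```python
class NearestNeighbour:
    """Tiny stand-in base learner (1-NN); any binary classifier would do."""

    def fit(self, X, y):
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x):
        # Squared Euclidean distance to every stored training row.
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]


def fit_chain(X, Y, order):
    """Train one classifier per label, visiting labels in `order`."""
    chain = []
    for position, label in enumerate(order):
        earlier = order[:position]
        # Augment features with the TRUE values of labels earlier in the chain.
        rows = [list(x) + [y[j] for j in earlier] for x, y in zip(X, Y)]
        chain.append(NearestNeighbour().fit(rows, [y[label] for y in Y]))
    return chain


def predict_chain(chain, x, order):
    """Predict all labels for one instance, feeding predictions down the chain."""
    predicted = {}
    for clf, label in zip(chain, order):
        # Augment features with the PREDICTED values of earlier labels.
        row = list(x) + [predicted[j] for j in order[:len(predicted)]]
        predicted[label] = clf.predict_one(row)
    return [predicted[j] for j in sorted(predicted)]


# Toy data: label 0 is "x >= 2", label 1 is its negation.
X = [[0.0], [1.0], [2.0], [3.0]]
Y = [[0, 1], [0, 1], [1, 0], [1, 0]]
order = [0, 1]

chain = fit_chain(X, Y, order)
print(predict_chain(chain, [2.9], order))
```

    Changing `order` changes which labels each link can condition on, which is why the search for a good order matters.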

    Infinite Ensemble Learning with Support Vector Machines

    Ensemble learning algorithms such as boosting can achieve better performance by averaging over the predictions of base learners. However, existing algorithms are limited to combining only a finite number of base learners, and the generated ensemble is usually sparse. It is not clear whether we should construct an ensemble classifier with a larger or even an infinite number of base learners. In addition, constructing an infinite ensemble itself is a challenging task. In this paper, we formulate an infinite ensemble learning framework based on SVM. The framework could output an infinite and nonsparse ensemble, and can be applied to construct new kernels for SVM as well as to interpret existing ones. We demonstrate the framework with a concrete application, the stump kernel, which embodies infinitely many decision stumps. The stump kernel is simple, yet powerful. Experimental results show that SVM with the stump kernel usually achieves better performance than boosting, even with noisy data.
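
    The abstract does not state a formula for the stump kernel. The sketch below assumes the closed form usually attributed to it, K(x, z) = delta - ||x - z||_1, where `delta` is a constant that the caller supplies here; treat both the formula and the handling of `delta` as assumptions to check against the paper.

```python
def stump_kernel(x, z, delta):
    """Assumed closed form: delta minus the L1 distance between x and z."""
    return delta - sum(abs(a - b) for a, b in zip(x, z))


def gram_matrix(points, delta):
    """Kernel (Gram) matrix of the kind a kernel SVM would consume."""
    return [[stump_kernel(p, q, delta) for q in points] for p in points]


# Toy points and an arbitrary constant, both invented for illustration.
points = [[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]]
K = gram_matrix(points, delta=5.0)
for row in K:
    print(row)
```

    A precomputed matrix like `K` is one common way to hand a custom kernel to an SVM solver.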

    Evolving Lucene search queries for text classification

    We describe a method for generating accurate, compact, human-understandable text classifiers. Text datasets are indexed using Apache Lucene, and Genetic Programs are used to construct Lucene search queries. Genetic programs acquire fitness by producing queries that are effective binary classifiers for a particular category when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from classification tasks.

    Interactive volumetric segmentation for textile micro-tomography data using wavelets and nonlocal means

    This work addresses segmentation of volumetric images of woven carbon fiber textiles from micro-tomography data. We propose a semi-supervised algorithm to classify carbon fibers that requires sparse input as opposed to completely labeled images. The main contributions are: (a) design of effective discriminative classifiers, for three-dimensional textile samples, trained on wavelet features for segmentation; (b) coupling of the previous step with nonlocal means as a simple, efficient alternative to the Potts model; and (c) demonstration of reuse of the classifier on diverse samples containing similar content. We evaluate our work by curating test sets of voxels in the absence of a complete ground truth mask. The algorithm obtains an average F1 score of 0.95 on test sets and an average F1 score of 0.93 on new samples. We conclude with a discussion of failure cases and propose future directions toward analysis of spatiotemporal high-resolution micro-tomography images.

    Evolving text classification rules with genetic programming

    We describe a novel method for using genetic programming to create compact classification rules using combinations of N-grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters 21578 dataset. We also suggest that the rules may have a number of other uses beyond classification and provide a basis for text mining applications.

    Supervised Classification and Mathematical Optimization

    Data Mining techniques often require solving optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful for addressing important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers, or dealing with vagueness or noise in the data.