5,022 research outputs found

    Fast Single-Class Classification and the Principle of Logit Separation

    Full text link
    We consider neural network training in applications where there are many possible classes, but at test time the task is a binary one: determining whether the given example belongs to a specific class, where the class of interest can be different each time the classifier is applied. For instance, this is the case for real-time image search. We define the Single Logit Classification (SLC) task: training the network so that at test time it is possible to accurately identify whether the example belongs to a given class in a computationally efficient manner, based only on the output logit for this class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing losses suitable for the SLC task. We show that the cross-entropy loss function is not aligned with the Principle of Logit Separation. In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle. In total, we study seven loss functions. Our experiments show that in almost all cases, losses that are aligned with the Principle of Logit Separation obtain at least a 20% relative accuracy improvement in the SLC task compared to losses that are not aligned with it, and sometimes considerably more. Furthermore, we show that fast SLC does not cause any drop in binary classification accuracy compared to standard classification in which all logits are computed, and yields a speedup that grows with the number of classes. For instance, we demonstrate a 10x speedup when the number of classes is 400,000. TensorFlow code for optimizing the new batch losses is publicly available at https://github.com/cruvadom/Logit Separation. Comment: Published as a conference paper in ICDM 201
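
    The efficiency claim in this abstract is easy to see with a small sketch: if only membership in one class is needed at test time, a single inner product against that class's output-layer weights suffices, instead of materializing every logit for a softmax. The random weights, shapes, and decision threshold below are illustrative assumptions, not the authors' code or their proposed losses.

```python
# Minimal sketch of why single-logit (SLC) inference is cheap; weights and
# threshold are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim = 100_000, 128
W = rng.standard_normal((num_classes, dim)).astype(np.float32)  # output-layer weights
b = np.zeros(num_classes, dtype=np.float32)                     # output-layer biases

def belongs_to_class(x, class_id, threshold=0.0):
    """Binary decision for one class from its logit alone: O(dim) work,
    versus O(num_classes * dim) for computing all logits."""
    logit = W[class_id] @ x + b[class_id]
    return logit > threshold

x = rng.standard_normal(dim).astype(np.float32)
print(belongs_to_class(x, class_id=1234))
```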

    Latent class analysis for segmenting preferences of investment bonds

    Get PDF
    Market segmentation is a key component of conjoint analysis, which addresses consumer preference heterogeneity. Members of a segment are assumed to be homogeneous in their views and preferences when evaluating an item, but distinctly heterogeneous from members of other segments. Latent class methodology is one of several conjoint segmentation procedures that overcome the limitations of aggregate analysis and a priori segmentation. The main benefit of latent class models is that market segment membership and the regression parameters of each derived segment are estimated simultaneously. The latent class model presented in this paper uses mixtures of multivariate conditional normal distributions to analyze rating data, where the likelihood is maximized using the EM algorithm. The application focuses on customer preferences for investment bonds described by four attributes: currency, coupon rate, redemption term, and price. A number of demographic variables are used to generate segments that are accessible and actionable.
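
    As a rough illustration of the EM-based segmentation idea (not the paper's exact model, which additionally ties segment-specific regression parameters to the bond attributes), fitting a plain Gaussian mixture to respondents' rating vectors already yields soft segment memberships estimated jointly with the segment parameters. The data and segment structure below are invented for the example.

```python
# Simplified stand-in for latent class segmentation: EM-fitted Gaussian
# mixture over synthetic rating vectors. Not the paper's conditional model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical data: 300 respondents, each rating 16 bond profiles.
ratings = np.vstack([
    rng.normal(7, 1, size=(150, 16)),   # one hypothetical preference segment
    rng.normal(4, 1, size=(150, 16)),   # a second, distinctly different segment
])

gm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
segments = gm.fit_predict(ratings)        # hard segment assignment per respondent
posteriors = gm.predict_proba(ratings)    # soft membership, as in latent class analysis
print(np.bincount(segments), posteriors[:3].round(2))
```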

    The Default Risk of Firms Examined with Smooth Support Vector Machines

    Get PDF
    In the era of Basel II, a powerful tool for bankruptcy prognosis is vital for banks. The tool must be precise but also easily adaptable to the bank's objectives regarding the relation between false acceptances (Type I errors) and false rejections (Type II errors). We explore the suitability of Smooth Support Vector Machines (SSVM) and investigate how important factors such as the selection of appropriate accounting ratios (predictors), the length of the training period and the structure of the training sample influence the precision of prediction. Furthermore, we show that oversampling can be employed to adjust the tradeoff between the error types. Finally, we illustrate graphically how different variants of SSVM can be used jointly to support the decision task of loan officers.
    Keywords: Insolvency Prognosis, SVMs, Statistical Learning Theory, Non-parametric Classification Models, Local Time-Homogeneity
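
    The oversampling point can be demonstrated with a small stand-in experiment; scikit-learn has no smooth SVM, so an ordinary RBF-kernel SVC on synthetic data is used here purely to show how duplicating defaulting firms in the training set shifts the balance between the two error types. The dataset, oversampling factor, and kernel are all assumptions.

```python
# Stand-in sketch (not the paper's SSVM): oversampling the default class
# shifts the balance between Type I and Type II errors of an SVM classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)  # 1 = default
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def errors(clf):
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    return fn, fp   # Type I: defaulter accepted (fn); Type II: healthy firm rejected (fp)

plain = SVC(kernel="rbf").fit(X_tr, y_tr)

idx = np.where(y_tr == 1)[0]
X_os = np.vstack([X_tr, X_tr[idx].repeat(4, axis=0)])      # oversample defaults 4x (arbitrary factor)
y_os = np.concatenate([y_tr, np.ones(4 * len(idx), dtype=int)])
oversampled = SVC(kernel="rbf").fit(X_os, y_os)

print("plain       (Type I, Type II):", errors(plain))
print("oversampled (Type I, Type II):", errors(oversampled))
```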

    Learning Prototype Classifiers for Long-Tailed Recognition

    Full text link
    The problem of long-tailed recognition (LTR) has received attention in recent years due to the fundamental power-law distribution of objects in the real world. Most recent works in LTR use softmax classifiers, which have a tendency to correlate the classifier norm with the amount of training data for a given class. Prototype classifiers, on the other hand, do not suffer from this shortcoming and can deliver promising results simply using Nearest-Class-Mean (NCM), a special case where the prototypes are empirical centroids. However, the potential of prototype classifiers as an alternative to softmax in LTR is relatively underexplored. In this work, we propose Prototype classifiers, which jointly learn prototypes that minimize the average cross-entropy loss based on probability scores computed from distances to the prototypes. We theoretically analyze the properties of Euclidean-distance-based prototype classifiers that lead to stable gradient-based optimization which is robust to outliers. We further enhance Prototype classifiers by learning channel-dependent temperature parameters to enable independent distance scales along each channel. Our analysis shows that prototypes learned by Prototype classifiers are better separated than empirical centroids. Results on four long-tailed recognition benchmarks show that the Prototype classifier outperforms or is comparable to state-of-the-art methods. Comment: Accepted at IJCAI-2
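
    A minimal sketch of the idea the abstract describes, with the exact parameterization assumed rather than taken from the paper: logits are negative squared Euclidean distances to learnable prototypes, each feature channel is rescaled by a learnable temperature, and the model is trained with ordinary cross-entropy.

```python
# Minimal prototype-classifier sketch; the specific parameterization below is
# an assumption for illustration, not the authors' implementation.
import torch
import torch.nn as nn

class PrototypeClassifier(nn.Module):
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.log_temp = nn.Parameter(torch.zeros(feat_dim))   # channel-dependent temperature

    def forward(self, feats):                                  # feats: (batch, feat_dim)
        scaled = (feats.unsqueeze(1) - self.prototypes) * torch.exp(self.log_temp)
        return -scaled.pow(2).sum(dim=-1)                      # (batch, num_classes) logits

model = PrototypeClassifier(feat_dim=64, num_classes=10)
feats = torch.randn(32, 64)
labels = torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(feats), labels)       # distance-based softmax loss
loss.backward()
print(loss.item())
```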

    Estimating Probabilities of Default With Support Vector Machines

    Get PDF
    This paper proposes a rating methodology that is based on a non-linear classification method, the support vector machine, and on a non-parametric technique for mapping rating scores into probabilities of default. We give an introduction to the underlying statistical models and present the results of testing our approach on German Bundesbank data. In particular, we discuss the selection of variables and give a comparison with more traditional approaches such as discriminant analysis and logit regression. The results demonstrate that the SVM has clear advantages over these methods for all variables tested.
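
    As a rough sketch of the two-stage pipeline described above (not the paper's exact procedure or data), an SVM is trained on synthetic "accounting ratio" features and its raw scores are then mapped monotonically to default probabilities with a nonparametric calibrator; isotonic regression stands in here for the paper's own mapping technique.

```python
# Stand-in pipeline: SVM rating scores -> nonparametric mapping to PDs.
# Data, features, and the isotonic calibrator are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.isotonic import IsotonicRegression

X, y = make_classification(n_samples=3000, n_features=8, weights=[0.93, 0.07], random_state=1)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.4, stratify=y, random_state=1)

svm = SVC(kernel="rbf").fit(X_tr, y_tr)
scores = svm.decision_function(X_cal)                   # raw rating scores
pd_map = IsotonicRegression(out_of_bounds="clip").fit(scores, y_cal)

new_scores = svm.decision_function(X_cal[:5])
print(pd_map.predict(new_scores).round(3))              # calibrated probabilities of default
```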

    Fast DD-classification of functional data

    Full text link
    A fast nonparametric procedure for classifying functional data is introduced. It consists of a two-step transformation of the original data plus a classifier operating on a low-dimensional hypercube. The functional data are first mapped into a finite-dimensional location-slope space and then transformed by a multivariate depth function into the DD-plot, which is a subset of the unit hypercube. This transformation yields a new notion of depth for functional data. Three alternative depth functions are employed for this, as well as two rules for the final classification on [0,1]^q. The resulting classifier has to be cross-validated over only a small range of parameters, which is restricted by a Vapnik-Cervonenkis bound. The entire methodology does not involve smoothing techniques, is completely nonparametric, and allows Bayes optimality to be achieved under standard distributional settings. It is robust, efficiently computable, and has been implemented in an R environment. The applicability of the new approach is demonstrated by simulations as well as a benchmark study.
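
    A simplified Python sketch of the pipeline this abstract describes (the reference implementation is in R, and the paper studies several depths and two classification rules; the Mahalanobis depth and the simple maximum-depth rule below are stand-in assumptions): curves are reduced to location-slope pairs, a depth of each pair is computed with respect to every class, and a new observation is classified from its DD-plot coordinates.

```python
# Simplified DD-classification sketch: location-slope transform, Mahalanobis
# depth per class, maximum-depth rule on the DD-plot coordinates.
import numpy as np

def location_slope(curves, grid):
    """Map each curve (row) to its mean level and least-squares slope."""
    loc = curves.mean(axis=1)
    slope = np.polyfit(grid, curves.T, deg=1)[0]
    return np.column_stack([loc, slope])

def mahalanobis_depth(points, sample):
    mu, cov = sample.mean(axis=0), np.cov(sample, rowvar=False)
    diff = points - mu
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + d2)

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 50)
class0 = rng.normal(0, 0.2, (40, 50)) + grid          # upward-sloping training curves
class1 = rng.normal(0, 0.2, (40, 50)) - grid          # downward-sloping training curves

f0, f1 = location_slope(class0, grid), location_slope(class1, grid)
new = location_slope(rng.normal(0, 0.2, (5, 50)) + grid, grid)

# DD-plot coordinates of the new curves: depth w.r.t. class 0 vs. class 1.
dd = np.column_stack([mahalanobis_depth(new, f0), mahalanobis_depth(new, f1)])
print(dd.argmax(axis=1))                              # 0 = class 0, 1 = class 1
```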