17,224 research outputs found

    Optimization of distributions differences for classification

    In this paper we introduce a new classification algorithm called Optimization of Distributions Differences (ODD). The algorithm aims to find a transformation from the feature space to a new space where the instances in the same class are as close as possible to one another while the gravity centers of the classes are as far as possible from one another. This aim is formulated as a multiobjective optimization problem that is solved by a hybrid of an evolutionary strategy and the Quasi-Newton method. The choice of the transformation function is flexible and could be any continuous function. We experiment with a linear and a non-linear transformation in this paper. We show that the algorithm can outperform six other state-of-the-art classification methods, namely naive Bayes, support vector machines, linear discriminant analysis, multi-layer perceptrons, decision trees, and k-nearest neighbors, on 12 standard classification datasets. Our results show that the method is less sensitive to an imbalanced number of instances compared to these methods. We also show that ODD maintains its performance better than other classification methods on these datasets and hence offers better generalization ability.
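
The two competing aims in this abstract (tight classes, distant class centers) can be made concrete with a small sketch. The abstract does not give the exact objective functions, so the definitions below are assumptions chosen for illustration: `within` is the mean Euclidean distance of each transformed instance to its class center, and `between` is the mean pairwise distance between class centers. The names `transform`, `centroid`, and `odd_objectives` are hypothetical, and only the linear transformation case is shown.

```python
from collections import defaultdict
from math import dist


def transform(W, x):
    """Apply a linear map W (a list of rows) to the feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def odd_objectives(W, X, y):
    """Return (within, between) for the data X, labels y, and linear map W.

    Assumed definitions (not taken from the paper):
      within  = mean distance from each transformed instance to its class center
      between = mean distance over all pairs of class centers
    Requires at least two distinct labels in y.
    """
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(transform(W, x))
    centers = {c: centroid(pts) for c, pts in by_class.items()}

    within = sum(
        dist(p, centers[c]) for c, pts in by_class.items() for p in pts
    ) / len(X)

    labels = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    between = sum(dist(centers[a], centers[b]) for a, b in pairs) / len(pairs)
    return within, between


# Toy data: two classes separated along the first feature only.
X = [[0, 0], [0, 2], [4, 0], [4, 2]]
y = [0, 0, 1, 1]
identity = [[1, 0], [0, 1]]
first_axis_only = [[1, 0]]  # projects away the uninformative second feature

print(odd_objectives(identity, X, y))
print(odd_objectives(first_axis_only, X, y))
```

On this toy data the projection onto the first axis shrinks the within-class term while leaving the distance between centers unchanged, which is the kind of transformation the search is meant to favor. An optimizer would minimize `within` and maximize `between` over `W`; the evolutionary and Quasi-Newton hybrid mentioned in the abstract is not reproduced here.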

    Click-through rate prediction : a comparative study of ensemble techniques in real-time bidding

    Dissertation presented as a partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Business Intelligence and Knowledge Management. Real-Time Bidding is an automated mechanism to buy and sell ads in real time that uses data collected from internet users to accurately deliver the right audience to the best-matched advertisers. It goes beyond contextual advertising by basing bids on user data, and it differs from the sponsored search auction, where the bid price is associated with keywords. There is extensive literature regarding the classification and prediction of performance metrics such as click-through rate, impression rate and bidding price. However, there is limited research on the application of advanced machine learning techniques, such as ensemble methods, to predicting the click-through rate of real-time bidding campaigns. This paper presents an in-depth analysis of predicting click-through rate in real-time bidding campaigns by comparing the classification results from six traditional classification models (Linear Discriminant Analysis, Logistic Regression, Regularized Regression, Decision Trees, k-nearest neighbors and Support Vector Machines) with two popular ensemble learning techniques (Voting and Bootstrap Aggregation). The goal of our research is to determine whether ensemble methods can accurately predict click-through rate and how they compare to standard classifiers. Results showed that ensemble techniques outperformed the simple classifiers. The results also highlight the excellent performance of linear algorithms (Linear Discriminant Analysis and Regularized Regression).
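
The two ensemble techniques named in this abstract, Voting and Bootstrap Aggregation, rest on two simple building blocks that can be sketched in plain Python. This is an illustrative sketch only, not the dissertation's pipeline: `majority_vote` and `bootstrap_sample` are hypothetical helper names, the voting shown is hard (label) voting, and the base classifiers themselves are left out.

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Hard voting: combine per-model label lists into one label list.

    predictions is a list with one entry per model; each entry holds that
    model's predicted label for every instance, in the same order.
    Using an odd number of models avoids ties for binary labels.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement (one bagging resample)."""
    return [rng.choice(rows) for _ in rows]


# Three hypothetical models predicting click (1) or no click (0) for four impressions.
model_a = [1, 0, 0, 1]
model_b = [1, 1, 0, 0]
model_c = [0, 0, 0, 1]
print(majority_vote([model_a, model_b, model_c]))

# One resample of a toy training set, as used to train each bagged model.
rng = random.Random(0)
print(bootstrap_sample(["r1", "r2", "r3", "r4"], rng))
```

In bagging, each base model would be trained on its own resample and the trained models would then be combined with the same vote.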

    Support Vector Machines for Credit Scoring and discovery of significant features

    The assessment of the risk of default on credit is important for financial institutions. Logistic regression and discriminant analysis are techniques traditionally used in credit scoring for determining the likelihood of default based on consumer application and credit reference agency data. We test support vector machines against these traditional methods on a large credit card database. We find that they are competitive and can be used as the basis of a feature selection method to discover those features that are most significant in determining risk of default.

    Neural Class-Specific Regression for face verification

    Face verification is a problem approached in the literature mainly using nonlinear class-specific subspace learning techniques. While it has been shown that kernel-based Class-Specific Discriminant Analysis is able to provide excellent performance in small- and medium-scale face verification problems, its application in today's large-scale problems is difficult due to its training space and computational requirements. In this paper, generalizing our previous work on kernel-based class-specific discriminant analysis, we show that class-specific subspace learning can be cast as a regression problem. This allows us to derive linear, (reduced) kernel and neural network-based class-specific discriminant analysis methods using efficient batch and/or iterative training schemes, suited for large-scale learning problems. We test the performance of these methods in two datasets describing medium- and large-scale face verification problems. Comment: 9 pages, 4 figures

    Partial least squares discriminant analysis: A dimensionality reduction method to classify hyperspectral data

    The recent development of more sophisticated spectroscopic methods allows acquisition of high dimensional datasets from which valuable information may be extracted using multivariate statistical analyses, such as dimensionality reduction and automatic classification (supervised and unsupervised). In this work, a supervised classification through a partial least squares discriminant analysis (PLS-DA) is performed on the hyperspectral data. The obtained results are compared with those obtained by the most commonly used classification approaches.
