
    A Simple Iterative Algorithm for Parsimonious Binary Kernel Fisher Discrimination

    By applying recent results in optimization theory, variously known as optimization transfer or majorize/minimize algorithms, an algorithm for binary kernel Fisher discriminant analysis is introduced that uses a non-smooth penalty on the coefficients to provide a parsimonious solution. The problem is converted into a smooth optimization that can be solved iteratively with no greater overhead than iteratively re-weighted least squares. The result is simple, easily programmed, and is shown to perform as well as or better than a number of leading machine learning algorithms, in both accuracy and parsimony, on two well-studied and substantial benchmarks.
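The majorize/minimize idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain least-squares loss with an L1 penalty, `||y - K a||^2 + lam * sum(|a_i|)`, and majorizes each `|a_i|` by a quadratic around the current iterate, so every step reduces to a re-weighted ridge solve. The names `mm_l1_least_squares`, `objective`, `solve`, `lam`, and `eps` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def objective(K, y, a, lam):
    """Non-smooth objective: squared residual plus L1 penalty."""
    resid = [yi - sum(kij * aj for kij, aj in zip(row, a))
             for row, yi in zip(K, y)]
    return sum(r * r for r in resid) + lam * sum(abs(aj) for aj in a)


def mm_l1_least_squares(K, y, lam, iters=200, eps=1e-8):
    """MM iteration: |a_i| <= a_i^2 / (2 c_i) + c_i / 2 with c_i = |a_i_old| + eps.

    Minimizing the resulting smooth surrogate gives the linear system
    (K^T K + (lam / 2) * diag(1 / c)) a = K^T y, i.e. a re-weighted ridge step.
    """
    n, p = len(K), len(K[0])
    gram = [[sum(K[r][i] * K[r][j] for r in range(n)) for j in range(p)]
            for i in range(p)]
    rhs = [sum(K[r][i] * y[r] for r in range(n)) for i in range(p)]
    a = [1.0] * p  # any nonzero starting point works for this sketch
    for _ in range(iters):
        A = [row[:] for row in gram]
        for i in range(p):
            A[i][i] += 0.5 * lam / (abs(a[i]) + eps)
        a = solve(A, rhs)
    return a
```

With an identity design matrix the iterates approach the soft-thresholded targets, which makes the sparsity effect easy to check by hand. In a kernel setting `K` would be the kernel matrix and `a` the expansion coefficients; that mapping is an assumption of this sketch, not a statement about the paper.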

    Nonlinear Dimension Reduction for Micro-array Data (Small n and Large p)


    Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification

    We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing. Comment: 30 pages, 2 figures
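The two stages described in this abstract (transform each feature by an estimated marginal log density ratio, then fit a penalized logistic regression on the transformed features) can be sketched as follows. This is a simplified illustration, not the FANS implementation: it assumes Gaussian kernel density estimates with a fixed bandwidth, an L1 penalty fitted by proximal gradient descent, and it reuses the same data for both stages. All function names and default parameters are hypothetical.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample` at a point."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in sample)
    return density


def fit_log_ratio_transforms(X, y, bandwidth=0.5, eps=1e-12):
    """One transform per feature: x -> log(f1(x) / f0(x)) from marginal KDEs."""
    transforms = []
    for j in range(len(X[0])):
        f1 = gaussian_kde([r[j] for r, t in zip(X, y) if t == 1], bandwidth)
        f0 = gaussian_kde([r[j] for r, t in zip(X, y) if t == 0], bandwidth)
        # Default arguments bind the current f1/f0 to this lambda.
        transforms.append(
            lambda x, f1=f1, f0=f0: math.log((f1(x) + eps) / (f0(x) + eps)))
    return transforms


def augment(X, transforms):
    """Replace every raw feature value by its estimated log density ratio."""
    return [[t(v) for t, v in zip(transforms, row)] for row in X]


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(v, t):
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(Z, y, lam=0.05, step=0.05, iters=500):
    """Proximal gradient descent for L1-penalized logistic regression."""
    n, p = len(Z), len(Z[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(Z, y):
            err = sigmoid(b + sum(wj * zj for wj, zj in zip(w, row))) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= step * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict(w, b, row):
    return 1 if b + sum(wj * zj for wj, zj in zip(w, row)) > 0 else 0
```

A typical call sequence is `transforms = fit_log_ratio_transforms(X, y)`, `Z = augment(X, transforms)`, then `w, b = fit_l1_logistic(Z, y)`. The decision boundary is linear in the transformed features but can be nonlinear in the original ones, which is the "local complexity and global simplicity" the abstract refers to.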

    Support vector machine for functional data classification

    In many applications, input data are sampled functions taking their values in infinite dimensional spaces rather than standard vectors. This has complex consequences for data analysis algorithms and motivates modifying them. In fact, most of the traditional data analysis tools for regression, classification and clustering have been adapted to functional inputs under the general name of Functional Data Analysis (FDA). In this paper, we investigate the use of Support Vector Machines (SVMs) for functional data analysis and we focus on the problem of curve discrimination. SVMs are large margin classifiers based on implicit nonlinear mappings of the data into high dimensional spaces by means of kernels. We show how to define simple kernels that take into account the functional nature of the data and lead to consistent classification. Experiments conducted on real world data emphasize the benefit of taking into account some functional aspects of the problems. Comment: 13 pages
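As a rough illustration of a kernel that respects the functional nature of sampled curves, the sketch below computes a Gaussian kernel on an approximate L2 distance between two curves, optionally after differentiating them by finite differences. It is an assumption-laden example, not the kernels defined in the paper: it assumes both curves are sampled on the same uniform grid with spacing `dt`, and the names `functional_rbf_kernel`, `gamma`, and `use_derivative` are hypothetical.

```python
import math


def finite_difference(curve, dt):
    """Approximate the derivative of a uniformly sampled curve."""
    return [(b - a) / dt for a, b in zip(curve, curve[1:])]


def l2_sq_distance(f, g, dt):
    """Riemann-sum approximation of the squared L2 distance between curves."""
    return dt * sum((a - b) ** 2 for a, b in zip(f, g))


def functional_rbf_kernel(f, g, dt, gamma=1.0, use_derivative=False):
    """Gaussian kernel on curves, optionally compared through derivatives."""
    if use_derivative:
        f, g = finite_difference(f, dt), finite_difference(g, dt)
    return math.exp(-gamma * l2_sq_distance(f, g, dt))
```

Comparing derivatives makes the kernel insensitive to a constant vertical shift between curves, which is one simple way a functional preprocessing step can change what the classifier treats as similar. The resulting kernel matrix could be passed to any SVM solver that accepts precomputed kernels.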

    Parsimonious Kernel Fisher Discrimination

    By applying recent results in optimization transfer, a new algorithm for kernel Fisher discriminant analysis is provided that uses a non-smooth penalty on the coefficients to provide a parsimonious solution. The algorithm is simple, easily programmed, and is shown to perform as well as or better than a number of leading machine learning algorithms on a substantial benchmark. It is then applied to a set of extreme small-sample-size problems in virtual screening, where it is found to be less accurate than a currently leading approach but still comparable in a number of cases.

    A Noise-Robust Fast Sparse Bayesian Learning Model

    This paper uses the hierarchical model structure from the Bayesian Lasso in the Sparse Bayesian Learning process to develop a new type of probabilistic supervised learning approach. The hierarchical model structure in this Bayesian framework is designed so that the priors not only penalize unnecessary model complexity but are also conditioned on the variance of the random noise in the data. The hyperparameters in the model are estimated by the Fast Marginal Likelihood Maximization algorithm, which can achieve sparsity, low computational cost and a faster learning process. We compare our methodology with two other popular learning models: the Relevance Vector Machine and the Bayesian Lasso. We test our model on examples involving both simulated and empirical data, and the results show that this approach has several performance advantages, such as being fast, sparse and robust to the variance in random noise. In addition, our method gives a more stable estimate of the variance of the random error than the other methods in the study. Comment: 15 pages