17 research outputs found

    Mixing Linear SVMs for Nonlinear Classification

    Non-Gaussian Discriminative Factor Models via the Max-Margin Rank-Likelihood

    We consider the problem of discriminative factor analysis for data that are in general non-Gaussian. A Bayesian model based on the ranks of the data is proposed. We first introduce a new max-margin version of the rank-likelihood. A discriminative factor model is then developed, integrating the max-margin rank-likelihood and (linear) Bayesian support vector machines, which are also built on the max-margin principle. The discriminative factor model is further extended to the nonlinear case through mixtures of local linear classifiers, via Dirichlet processes. Fully local conjugacy of the model yields efficient inference with both Markov chain Monte Carlo and variational Bayes approaches. Extensive experiments on benchmark and real data demonstrate superior performance of the proposed model and its potential for applications in computational biology. Comment: 14 pages, 7 figures, ICML 201

    Don't Just Divide; Polarize and Conquer!

    In data containing heterogeneous subpopulations, classification performance benefits from incorporating knowledge of the cluster structure into the classifier. Previous methods for such combined clustering and classification either 1) are classifier-specific and not generic, or 2) perform clustering and classifier training independently, which may not form clusters that benefit classifier performance. The question of how to perform clustering to improve the performance of classifiers trained on the clusters has received scant attention in previous literature, despite its importance in several real-world applications. In this paper, we design a simple and efficient classification algorithm called Clustering Aware Classification (CAC), to find clusters that are well suited for being used as training datasets by classifiers for each underlying subpopulation. Our experiments on synthetic and real benchmark datasets demonstrate the efficacy of CAC over previous methods for combined clustering and classification. Comment: 19 Pages, 5 figure
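    The independent "cluster first, then train one classifier per cluster" baseline that the abstract contrasts with can be sketched as follows. This is a minimal illustration and not the CAC algorithm: the toy data, the fixed starting centroids, and the nearest-class-mean classifier are all assumptions chosen for brevity, and every function name is hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))

def kmeans(points, centers, iters=10):
    """Plain Lloyd iterations from the given starting centers."""
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        centers = [
            mean([p for p, l in zip(points, labels) if l == k])
            if k in labels else centers[k]  # keep an empty cluster's center
            for k in range(len(centers))
        ]
    return centers

def fit(points, labels, init_centers):
    """Cluster without looking at labels, then fit one
    nearest-class-mean classifier inside each cluster."""
    centers = kmeans(points, init_centers)
    assign = [nearest(p, centers) for p in points]
    models = []
    for k in range(len(centers)):
        class_means = {}
        for y in set(labels):
            members = [p for p, a, l in zip(points, assign, labels)
                       if a == k and l == y]
            if members:
                class_means[y] = mean(members)
        models.append(class_means)
    return centers, models

def predict(p, centers, models):
    """Route p to its nearest cluster, then use that cluster's classifier."""
    class_means = models[nearest(p, centers)]
    return min(class_means, key=lambda y: sq_dist(p, class_means[y]))

# Assumed toy data: two subpopulations whose label rule is reversed,
# so a single global nearest-class-mean classifier cannot separate them.
X = [(0, -1), (1, -1), (0, 1), (1, 1),        # subpopulation near x = 0
     (10, 1), (11, 1), (10, -1), (11, -1)]    # subpopulation near x = 10
y = [0, 0, 1, 1, 0, 0, 1, 1]

centers, models = fit(X, y, init_centers=[(0, 0), (10, 0)])
print(predict((0.2, -0.8), centers, models))   # 0
print(predict((10.2, -0.8), centers, models))  # 1
```

    Here the clustering step never sees the labels, which is the limitation the abstract points to: nothing guarantees that label-agnostic clusters are the ones on which per-cluster classifiers work best.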

    Hemodynamic Analysis for Cognitive Load Assessment and Classification in Motor Learning Tasks Using Type-2 Fuzzy Sets

    The paper presents a novel approach to assess and classify the cognitive load of subjects from their hemodynamic response while they are engaged in motor learning tasks, such as vehicle driving. A set of complex motor-activity-learning stimuli for braking, steering control and acceleration is prepared to experimentally measure the cognitive load of car drivers and classify it into three distinct classes: High, Medium and Low. New models of General and Interval Type-2 Fuzzy classifiers are proposed to reduce the uncertainty in cognitive load classification caused by fluctuation of the hemodynamic features within and across sessions. The proposed classifiers achieve classification accuracy above 96%, outperforming traditional type-1/type-2 fuzzy and other standard classifiers. The experiments also offer biological insight into the shift of brain activation from the orbito-frontal to the ventro-lateral prefrontal cortex during the high-to-low transition in cognitive load. Further, activation of the dorsolateral prefrontal cortex is reduced when subjects are under low cognitive load. The research outcome may be used directly to identify driving learners with low cognitive load for difficult motor learning tasks, such as taking a U-turn in a narrow space, or motion control on the top of a bridge to avoid possible collision with the car ahead.

    A COMPREHENSIVE PIPELINE FOR CLASS COMPARISON AND CLASS PREDICTION IN CANCER RESEARCH

    Personalized medicine is an emerging field that promises to bring radical changes in healthcare and may be defined as “a medical model using molecular profiling technologies for tailoring the right therapeutic strategy for the right person at the right time, and determine the predisposition to disease at the population level and to deliver timely and stratified prevention”. The sequencing of the human genome together with the development and implementation of new high throughput technologies has provided access to large ‘omics’ (e.g. genomics, proteomics) data, bringing a better understanding of cancer biology and enabling new approaches to diagnosis, drug development, and individualized therapy. ‘Omics’ data have the potential as cancer biomarkers but no consolidated guidelines have been established for discovery analyses. In the context of the EDERA project, funded by the Italian Association for Cancer Research, a structured pipeline was developed with innovative applications of existing bioinformatics methods including: 1) the combination of the results of two statistical tests (t and Anderson-Darling) to detect features with significant fold change or general distributional differences in class comparison; 2) the application of a bootstrap selection procedure together with machine learning techniques to guarantee result generalizability and study the interconnections among the selected features in class prediction. Such a pipeline was successfully applied to plasmatic microRNA, identifying five hemolysis related microRNAs and to Secondary ElectroSpray Ionization-Mass Spectrometry data, in which case eight mass spectrometry signals were found able to discriminate exhaled breath from breast cancer patients from that of healthy individuals.