6 research outputs found

    An embedded feature selection framework for hybrid data

    © 2017, Springer International Publishing AG. Feature selection in inductive supervised learning is the process of selecting a subset of features relevant to the target concept while removing irrelevant and redundant features. The majority of feature selection methods developed in recent decades can deal with only numerical or only categorical features. An exception is Recursive Feature Elimination under the clinical kernel function, which is an embedded feature selection method; however, it suffers from low classification performance. In this work, we propose several embedded feature selection methods capable of dealing with hybrid balanced and hybrid imbalanced data sets. In an experimental evaluation on five UCI Machine Learning Repository data sets, we demonstrate the dominance and effectiveness of the proposed methods in terms of dimensionality reduction and classification performance.
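    The embedded Recursive Feature Elimination idea the abstract refers to can be illustrated with a minimal sketch: repeatedly fit a model, then drop the feature whose learned coefficient has the smallest magnitude. This is not the paper's method (no clinical kernel, no imbalance handling); it assumes hybrid features have already been encoded numerically, and uses a plain least-squares fit as the embedded model for simplicity.

```python
import numpy as np

def rfe_rank(X, y, n_keep):
    """RFE sketch: repeatedly fit a least-squares linear model and
    eliminate the feature whose coefficient is smallest in magnitude,
    until n_keep features remain. Returns the surviving column indices."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        coef, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        worst = int(np.argmin(np.abs(coef)))  # least informative feature
        remaining.pop(worst)
    return remaining

# Illustrative run: only columns 0 and 1 actually drive the target.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.standard_normal(200)
selected = rfe_rank(X, y, n_keep=2)
```

    An embedded method of this kind selects features as a by-product of model fitting, rather than as a separate pre-processing filter, which is what distinguishes the family of methods the paper builds on.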

    Trace ratio optimization with feature correlation mining for multiclass discriminant analysis

    Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Fisher's linear discriminant analysis is a widely accepted dimensionality reduction method, which aims to find a transformation matrix that maps the feature space into a smaller subspace by maximising the between-class scatter while minimising the within-class scatter. Although the fast and easy computation of the transformation matrix has made this method attractive, overemphasising large class distances makes its criterion suboptimal: close class pairs tend to overlap in the subspace. Although different weighting methods have been developed to address this problem, there is still room for improvement. In this work, we study a weighted trace ratio obtained by maximising the harmonic mean of the multiple objective reciprocals. To further improve performance, we impose the ℓ2,1-norm on the developed objective function, and we propose an iterative algorithm to optimise it. The proposed method avoids domination by the largest objective and guarantees that no objective becomes too small, which is especially beneficial when the number of classes is large. Extensive experiments on different datasets show the effectiveness of our proposed method compared with four state-of-the-art methods.
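    The baseline this paper improves on, the standard trace-ratio criterion tr(WᵀS_bW)/tr(WᵀS_wW), can be sketched with the classic alternating iteration: fix the ratio λ, take the top-k eigenvectors of S_b − λS_w, then update λ. This is the plain unweighted trace ratio, not the paper's harmonic-mean weighted, ℓ2,1-regularised variant; the data below are synthetic and purely illustrative.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter matrices."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
        Sw += (Xc - mc).T @ (Xc - mc)
    return Sb, Sw

def trace_ratio(Sb, Sw, k, iters=30):
    """Unweighted trace-ratio iteration: W <- top-k eigenvectors of
    Sb - lam*Sw, then lam <- tr(W'SbW)/tr(W'SwW)."""
    lam = 0.0
    for _ in range(iters):
        _, vecs = np.linalg.eigh(Sb - lam * Sw)
        W = vecs[:, -k:]  # eigenvectors of the k largest eigenvalues
        lam = np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)
    return W, lam

# Three well-separated Gaussian classes in 2D, padded with 3 noise dims.
rng = np.random.default_rng(0)
means = np.array([[0, 0], [4, 0], [0, 4]])
X2 = np.concatenate([rng.standard_normal((40, 2)) + m for m in means])
X = np.hstack([X2, rng.standard_normal((120, 3))])
y = np.repeat([0, 1, 2], 40)
Sb, Sw = scatter_matrices(X, y)
W, lam = trace_ratio(Sb, Sw, k=2)
```

    The weighting issue the abstract describes shows up here: a single well-separated class pair can dominate the numerator of λ, which is what the harmonic-mean formulation is designed to prevent.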

    Foliar and Stem Diseases
