108 research outputs found

    Towards ultrahigh dimensional feature selection for big data

    Full text link
    In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an eficient feature generating paradigm. Different from traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features, and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Due to such optimization scheme, some eficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions, and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of million data points with O(1014) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training eficiency. © 2014 Mingkui Tan, Ivor W. Tsang and Li Wang

    Discriminative latent variable models for visual recognition

    Get PDF
    Visual Recognition is a central problem in computer vision, and it has numerous potential applications in many dierent elds, such as robotics, human computer interaction, and entertainment. In this dissertation, we propose two discriminative latent variable models for handling challenging visual recognition problems. In particular, we use latent variables to capture and model various prior knowledge in the training data. In the rst model, we address the problem of recognizing human actions from still images. We jointly consider both poses and actions in a unied framework, and treat human poses as latent variables. The learning of this model follows the framework of latent SVM. Secondly, we propose another latent variable model to address the problem of automated tag learning on YouTube videos. In particular, we address the semantic variations (sub-tags) of the videos which have the same tag. In the model, each video is assumed to be associated with a sub-tag label, and we treat this sub-tag label as latent information. This model is trained using a latent learning framework based on LogitBoost, which jointly considers both the latent sub-tag label and the tag label. Moreover, we propose a novel discriminative latent learning framework, kernel latent SVM, which combines the benet of latent SVM and kernel methods. The framework of kernel latent SVM is general enough to be applied in many applications of visual recognition. It is also able to handle complex latent variables with interdependent structures using composite kernels
    • …
    corecore