11,316 research outputs found

    Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

    Full text link
    Different from the traditional classification tasks which assume mutual exclusion of labels, hierarchical multi-label classification (HMLC) aims to assign multiple labels to every instance with the labels organized under hierarchical relations. Besides the labels, since linguistic ontologies are intrinsic hierarchies, the conceptual relations between words can also form hierarchical structures. Thus it can be a challenge to learn mappings from word hierarchies to label hierarchies. We propose to model the word and label hierarchies by embedding them jointly in the hyperbolic space. The main reason is that the tree-likeness of the hyperbolic space matches the complexity of symbolic data with hierarchical structures. A new Hyperbolic Interaction Model (HyperIM) is designed to learn the label-aware document representations and make predictions for HMLC. Extensive experiments are conducted on three benchmark datasets. The results have demonstrated that the new model can realistically capture the complex data structures and further improve the performance for HMLC comparing with the state-of-the-art methods. To facilitate future research, our code is publicly available

    Multiclass latent locally linear support vector machines

    Get PDF
    Kernelized Support Vector Machines (SVM) have gained the status of off-the-shelf classifiers, able to deliver state of the art performance on almost any problem. Still, their practical use is constrained by their computational and memory complexity, which grows super-linearly with the number of training samples. In order to retain the low training and testing complexity of linear classifiers and the exibility of non linear ones, a growing, promising alternative is represented by methods that learn non-linear classifiers through local combinations of linear ones. In this paper we propose a new multi class local classifier, based on a latent SVM formulation. The proposed classifier makes use of a set of linear models that are linearly combined using sample and class specific weights. Thanks to the latent formulation, the combination coefficients are modeled as latent variables. We allow soft combinations and we provide a closed-form solution for their estimation, resulting in an efficient prediction rule. This novel formulation allows to learn in a principled way the sample specific weights and the linear classifiers, in a unique optimization problem, using a CCCP optimization procedure. Extensive experiments on ten standard UCI machine learning datasets, one large binary dataset, three character and digit recognition databases, and a visual place categorization dataset show the power of the proposed approach

    Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization

    Full text link
    Global covariance pooling in convolutional neural networks has achieved impressive improvement over the classical first-order pooling. Recent works have shown matrix square root normalization plays a central role in achieving state-of-the-art performance. However, existing methods depend heavily on eigendecomposition (EIG) or singular value decomposition (SVD), suffering from inefficient training due to limited support of EIG and SVD on GPU. Towards addressing this problem, we propose an iterative matrix square root normalization method for fast end-to-end training of global covariance pooling networks. At the core of our method is a meta-layer designed with loop-embedded directed graph structure. The meta-layer consists of three consecutive nonlinear structured layers, which perform pre-normalization, coupled matrix iteration and post-compensation, respectively. Our method is much faster than EIG or SVD based ones, since it involves only matrix multiplications, suitable for parallel implementation on GPU. Moreover, the proposed network with ResNet architecture can converge in much less epochs, further accelerating network training. On large-scale ImageNet, we achieve competitive performance superior to existing counterparts. By finetuning our models pre-trained on ImageNet, we establish state-of-the-art results on three challenging fine-grained benchmarks. The source code and network models will be available at http://www.peihuali.org/iSQRT-COVComment: Accepted to CVPR 201
    • …
    corecore