1,633 research outputs found

    Effectiveness of Hierarchical Softmax in Large Scale Classification Tasks

    Typically, Softmax is used in the final layer of a neural network to obtain a probability distribution over the output classes. Its main drawback is that it is computationally expensive for large-scale datasets with a large number of possible outputs. Hierarchical Softmax can be used to approximate class probabilities efficiently on such datasets. We study the performance of Hierarchical Softmax on the LSHTC datasets, which contain a very large number of categories. In this paper we evaluate and report the performance of standard Softmax versus Hierarchical Softmax on the LSHTC datasets, using the macro F1 score as the performance measure. We observe that the performance of Hierarchical Softmax degrades as the number of classes increases.
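    To make the cost contrast concrete, the following is a minimal NumPy sketch of a two-level hierarchical softmax; it is not the paper's code, and the class-to-cluster assignment, weight shapes, and function names are illustrative assumptions. Each prediction scores one cluster distribution plus the classes inside one cluster, instead of all K classes at once.

```python
import numpy as np

# Minimal two-level hierarchical softmax sketch (illustrative, not the paper's code).
# P(class) = P(cluster) * P(class | cluster), so each prediction touches only the
# cluster logits plus the logits of one cluster's members instead of all K classes.

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hierarchical_softmax_prob(h, W_cluster, W_class, cluster_of, members, target):
    """Probability of `target` under a two-level label tree.

    h          : hidden representation, shape (d,)
    W_cluster  : cluster-level weights, shape (n_clusters, d)
    W_class    : class-level weights, shape (K, d)
    cluster_of : array mapping class id -> cluster id
    members    : list of arrays with the class ids in each cluster
    """
    c = cluster_of[target]
    p_cluster = softmax(W_cluster @ h)[c]              # P(cluster | h)
    in_cluster = members[c]
    local = softmax(W_class[in_cluster] @ h)           # P(class | cluster, h)
    p_class = local[np.where(in_cluster == target)[0][0]]
    return p_cluster * p_class

# Toy usage: 12 classes split into 3 clusters of 4, with random weights.
rng = np.random.default_rng(0)
d, K, n_clusters = 8, 12, 3
members = [np.arange(i * 4, (i + 1) * 4) for i in range(n_clusters)]
cluster_of = np.repeat(np.arange(n_clusters), 4)
h = rng.normal(size=d)
W_cluster = rng.normal(size=(n_clusters, d))
W_class = rng.normal(size=(K, d))
print(hierarchical_softmax_prob(h, W_cluster, W_class, cluster_of, members, target=5))
```

    With K classes split into roughly sqrt(K) clusters of sqrt(K) classes each, the output layer needs on the order of sqrt(K) dot products per example instead of K, which is where the speed-up over flat Softmax comes from.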

    Hierarchical Label Partitioning for Large Scale Classification

    Extreme classification, where the number of classes is very large, has received considerable attention over the last decade. Standard multi-class classification approaches were not designed to handle such a large number of classes. A particular issue in large-scale problems is the computational complexity of classification: the best multi-class approaches generally have linear complexity with respect to the number of classes, which prevents them from scaling up. Recent work has focused on hierarchical classification in order to speed up the classification of new instances. A priori information on labels is not always available, nor always useful, for building hierarchical models. Finding a suitable hierarchical organization of the labels is thus a crucial issue, since the accuracy of the model depends strongly on how labels are assigned within the label tree. In this work we propose a new algorithm that iteratively builds a hierarchical label structure, using a partitioning algorithm that simultaneously optimizes the structure in terms of classification complexity and solves the label partitioning problem in order to achieve high classification performance. Starting from a flat tree structure, our algorithm iteratively selects a node to expand by adding a new level of nodes between the selected node and its children. This operation increases the speed-up of the classification process. Once the node is selected, the best partitioning of its classes has to be computed. We propose a measure based on maximizing the expected loss of the sub-levels in order to minimize the global error of the structure. This choice forces hard-to-separate classes to be grouped together in the same partitions at the first levels of the tree, delaying errors to deeper levels of the structure where they have no impact on the accuracy of other classes.
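    The grouping behaviour can be illustrated with a small sketch: the snippet below assigns classes to partitions so that frequently confused classes end up together, which is the qualitative effect the authors aim for. The greedy assignment, the confusion-matrix input, and all names are assumptions for illustration; this is not the expected-loss measure proposed in the paper.

```python
import numpy as np

# Hedged sketch of the grouping idea only (not the paper's exact objective):
# given a class confusion matrix, greedily assign classes to partitions so that
# frequently confused classes share a partition, pushing their separation to
# deeper levels of the label tree.

def partition_by_confusion(conf, n_groups):
    """conf: (K, K) confusion counts; returns a group id per class."""
    K = conf.shape[0]
    sim = conf + conf.T                      # symmetric confusability
    np.fill_diagonal(sim, 0)
    order = np.argsort(-sim.sum(axis=1))     # most-confused classes first
    groups = [[] for _ in range(n_groups)]
    assign = np.full(K, -1)
    for c in order:
        # place class c where it is most confusable with already-placed classes,
        # with a small penalty on group size to keep the tree roughly balanced
        scores = [sim[c, g].sum() - 0.01 * len(g) for g in groups]
        gid = int(np.argmax(scores))
        groups[gid].append(c)
        assign[c] = gid
    return assign

# Toy usage: 6 classes where (0, 1, 2) and (3, 4, 5) confuse each other.
conf = np.eye(6) * 50
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    conf[a, b] = conf[b, a] = 10
print(partition_by_confusion(conf, n_groups=2))   # e.g. [0 0 0 1 1 1]
```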

    Contemporary Art Authentication With Large-Scale Classification

    Art authentication is the process of identifying the artist who created a piece of artwork and is manifested through events of provenance, such as art gallery exhibitions and financial transactions. Art authentication also has a visual dimension: the uniqueness of one artist's style in contrast to the style of another. The significance of this contrast is proportional to the number of artists involved and to the degree of uniqueness of an artist's collection. This visual uniqueness of style can be captured in a mathematical model produced by a machine learning (ML) algorithm trained on painting images. Art authentication is not always possible, as provenance can be obscured or lost through anonymity, forgery, gifting, or theft of artwork. This paper presents an image-only art authentication attribute marker for contemporary art paintings covering a very large number of artists. The experiments in this paper demonstrate that ML-generated models can authenticate contemporary art for pools ranging from 2368 down to 100 artists, with accuracies of 48.97% and 91.23%, respectively. This is the largest image-only art authentication effort to date with respect to the number of artists involved and the accuracy of authentication.

    Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning


    Evaluation of random forests on large-scale classification problems using a bag-of-visual-words representation

    Random Forest is a very efficient classification method that has shown success in tasks such as image segmentation and object detection, but it has not yet been applied in large-scale image classification scenarios using a bag-of-visual-words representation. In this work we evaluate the performance of Random Forest on the ImageNet dataset and compare it to standard approaches in the state of the art.
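    A minimal end-to-end sketch of such a pipeline with scikit-learn is shown below; the random stand-in descriptors, the vocabulary size, and the forest size are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

# Hedged sketch of a bag-of-visual-words + Random Forest pipeline, using random
# vectors in place of real local descriptors (e.g. SIFT) so it runs without
# image data; vocabulary size and tree count are illustrative.

rng = np.random.default_rng(0)
n_images, descr_per_image, descr_dim, vocab_size = 40, 50, 32, 16

# 1) Stand-in local descriptors: class 0 and class 1 drawn from shifted distributions.
labels = np.repeat([0, 1], n_images // 2)
descriptors = [rng.normal(loc=y, size=(descr_per_image, descr_dim)) for y in labels]

# 2) Visual vocabulary: k-means over all descriptors pooled together.
vocab = KMeans(n_clusters=vocab_size, n_init=10, random_state=0)
vocab.fit(np.vstack(descriptors))

# 3) Encode each image as a normalized histogram of visual-word assignments.
def bovw_histogram(d):
    words = vocab.predict(d)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / hist.sum()

X = np.array([bovw_histogram(d) for d in descriptors])

# 4) Random Forest on the histogram features.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print("train accuracy:", clf.score(X, labels))
```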

    Large-scale classification based on support vector machine

    This thesis proposes the fast support vector classifier, an efficient version of the support vector machine (SVM) with a Gaussian kernel for large classification problems. The classifier achieves accuracy close to that of the best available methods while being much faster on datasets with up to 31 million examples, 30,000 inputs, and 131 classes. It also adjusts its memory requirements, allowing it to run on datasets of almost arbitrarily large size. The thesis also proposes the ideal kernel tuning algorithm, an efficient method for tuning the width of the Gaussian kernel of the SVM; compared with 5 alternatives from the literature it is the fastest, with accuracy very close to the best currently available and with low memory consumption.
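    As an illustration of choosing the Gaussian kernel width against an ideal kernel, the sketch below selects the RBF width by kernel-target alignment with the matrix y yᵀ and then trains an SVM with scikit-learn. This is an assumed, simplified stand-in for the general idea; it is not the thesis' fast classifier or its ideal kernel tuning algorithm.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Hedged sketch: pick the Gaussian kernel width by alignment with the "ideal
# kernel" y y^T (kernel-target alignment), then train an RBF SVM with it.

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
y_pm = np.where(y == 1, 1.0, -1.0)
K_ideal = np.outer(y_pm, y_pm)          # ideal kernel: +1 within class, -1 across

def alignment(K, K_star):
    """Cosine similarity between Gram matrices (kernel-target alignment)."""
    return (K * K_star).sum() / (np.linalg.norm(K) * np.linalg.norm(K_star))

gammas = np.logspace(-4, 1, 12)
scores = [alignment(rbf_kernel(X, gamma=g), K_ideal) for g in gammas]
best_gamma = gammas[int(np.argmax(scores))]

clf = SVC(kernel="rbf", gamma=best_gamma, C=1.0).fit(X, y)
print(f"selected gamma = {best_gamma:.4g}, train accuracy = {clf.score(X, y):.3f}")
```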
    • …