26,481 research outputs found

    Evaluation of Output Embeddings for Fine-Grained Image Classification

    Full text link
    Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.Comment: @inproceedings {ARWLS15, title = {Evaluation of Output Embeddings for Fine-Grained Image Classification}, booktitle = {IEEE Computer Vision and Pattern Recognition}, year = {2015}, author = {Zeynep Akata and Scott Reed and Daniel Walter and Honglak Lee and Bernt Schiele}

    PredNet and Predictive Coding: A Critical Review

    Full text link
    PredNet, a deep predictive coding network developed by Lotter et al., combines a biologically inspired architecture based on the propagation of prediction error with self-supervised representation learning in video. While the architecture has drawn a lot of attention and various extensions of the model exist, there is a lack of a critical analysis. We fill in the gap by evaluating PredNet both as an implementation of the predictive coding theory and as a self-supervised video prediction model using a challenging video action classification dataset. We design an extended model to test if conditioning future frame predictions on the action class of the video improves the model performance. We show that PredNet does not yet completely follow the principles of predictive coding. The proposed top-down conditioning leads to a performance gain on synthetic data, but does not scale up to the more complex real-world action classification dataset. Our analysis is aimed at guiding future research on similar architectures based on the predictive coding theory

    Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking

    Full text link
    Extraction from raw text to a knowledge base of entities and fine-grained types is often cast as prediction into a flat set of entity and type labels, neglecting the rich hierarchies over types and entities contained in curated ontologies. Previous attempts to incorporate hierarchical structure have yielded little benefit and are restricted to shallow ontologies. This paper presents new methods using real and complex bilinear mappings for integrating hierarchical information, yielding substantial improvement over flat predictions in entity linking and fine-grained entity typing, and achieving new state-of-the-art results for end-to-end models on the benchmark FIGER dataset. We also present two new human-annotated datasets containing wide and deep hierarchies which we will release to the community to encourage further research in this direction: MedMentions, a collection of PubMed abstracts in which 246k mentions have been mapped to the massive UMLS ontology; and TypeNet, which aligns Freebase types with the WordNet hierarchy to obtain nearly 2k entity types. In experiments on all three datasets we show substantial gains from hierarchy-aware training.Comment: ACL 201

    Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

    Full text link
    Different from the traditional classification tasks which assume mutual exclusion of labels, hierarchical multi-label classification (HMLC) aims to assign multiple labels to every instance with the labels organized under hierarchical relations. Besides the labels, since linguistic ontologies are intrinsic hierarchies, the conceptual relations between words can also form hierarchical structures. Thus it can be a challenge to learn mappings from word hierarchies to label hierarchies. We propose to model the word and label hierarchies by embedding them jointly in the hyperbolic space. The main reason is that the tree-likeness of the hyperbolic space matches the complexity of symbolic data with hierarchical structures. A new Hyperbolic Interaction Model (HyperIM) is designed to learn the label-aware document representations and make predictions for HMLC. Extensive experiments are conducted on three benchmark datasets. The results have demonstrated that the new model can realistically capture the complex data structures and further improve the performance for HMLC comparing with the state-of-the-art methods. To facilitate future research, our code is publicly available
    • …
    corecore