Search CORE

45,822 research outputs found

Gaze Embeddings for Zero-Shot Image Classification

Author: Akata Zeynep
Bulling Andreas
Karessli Nour
Schiele Bernt
Publication venue
Publication date: 01/01/2017
Field of study

Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts. We instead propose a method that relies on human gaze as auxiliary information, exploiting that even non-expert users have a natural ability to judge class membership. We present a data collection paradigm that involves a discrimination task to increase the information content obtained from gaze data. Our method extracts discriminative descriptors from the data and learns a compatibility function between image and gaze using three novel gaze embeddings: Gaze Histograms (GH), Gaze Features with Grid (GFG) and Gaze Features with Sequence (GFS). We introduce two new gaze-annotated datasets for fine-grained image classification and show that human gaze data is indeed class discriminative, provides a competitive alternative to expert-annotated attributes, and outperforms other baselines for zero-shot image classification

arXiv.org e-Print Archive

Crossref

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Aggregated Deep Local Features for Remote Sensing Image Retrieval

Author: Bondarev Egor
de With Peter H. N.
Imbriaco Raffaele
Sebastian Clint
Publication venue: 'MDPI AG'
Publication date: 01/02/2019
Field of study

Remote Sensing Image Retrieval remains a challenging topic due to the special nature of Remote Sensing Imagery. Such images contain various different semantic objects, which clearly complicates the retrieval task. In this paper, we present an image retrieval pipeline that uses attentive, local convolutional features and aggregates them using the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study various system parameters such as the multiplicative and additive attention mechanisms and descriptor dimensionality. We propose a query expansion method that requires no external inputs. Experiments demonstrate that even without training, the local convolutional features and global representation outperform other systems. After system tuning, we can achieve state-of-the-art or competitive results. Furthermore, we observe that our query expansion method increases overall system performance by about 3%, using only the top-three retrieved images. Finally, we show how dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval computation times, e.g. 50% faster than the current systems.Comment: Published in Remote Sensing. The first two authors have equal contributio

arXiv.org e-Print Archive

Pure OAI Repository

Directory of Open Access Journals

Person Search with Natural Language Description

Author: Li Hongsheng
Li Shuang
Wang Xiaogang
Xiao Tong
Yue Dayu
Zhou Bolei
Publication venue
Publication date: 30/03/2017
Field of study

Searching persons in large-scale image databases with the query of natural language description has important applications in video surveillance. Existing methods mainly focused on searching persons with image-based or attribute-based queries, which have major limitations for a practical usage. In this paper, we study the problem of person search with natural language description. Given the textual description of a person, the algorithm of the person search is required to rank all the samples in the person database then retrieve the most relevant sample corresponding to the queried description. Since there is no person dataset or benchmark with textual description available, we collect a large-scale person description dataset with detailed natural language annotations and person samples from various sources, termed as CUHK Person Description Dataset (CUHK-PEDES). A wide range of possible models and baselines have been evaluated and compared on the person search benchmark. An Recurrent Neural Network with Gated Neural Attention mechanism (GNA-RNN) is proposed to establish the state-of-the art performance on person search

arXiv.org e-Print Archive

Crossref