Objects2action: Classifying and localizing actions without any video example
The goal of this paper is to recognize actions in video without the need for
examples. Different from traditional zero-shot approaches we do not demand the
design and specification of attribute classifiers and class-to-attribute
mappings to allow for transfer from seen classes to unseen classes. Our key
contribution is objects2action, a semantic word embedding that is spanned by a
skip-gram model of thousands of object categories. Action labels are assigned
to an object encoding of unseen video based on a convex combination of action
and object affinities. Our semantic embedding has three main characteristics to
accommodate for the specifics of actions. First, we propose a mechanism to
exploit multiple-word descriptions of actions and objects. Second, we
incorporate the automated selection of the most responsive objects per action.
And finally, we demonstrate how to extend our zero-shot approach to the
spatio-temporal localization of actions in video. Experiments on four action
datasets demonstrate the potential of our approach.
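To make the transfer mechanism concrete, the sketch below scores unseen actions for a video from its object-classifier responses via embedding affinities, in the spirit of objects2action. It assumes a precomputed skip-gram embedding vector for each object and action label; all function and variable names here are illustrative, not from the paper.

```python
# Minimal sketch of zero-shot action scoring via a shared word
# embedding: actions and objects live in the same skip-gram space,
# and a video's object-score encoding is transferred to action
# labels through a convex combination of embedding affinities.
import numpy as np

def affinity(a, b):
    """Cosine similarity between two embedding vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def zero_shot_action_scores(object_scores, object_vecs, action_vecs, top_k=5):
    """Score each unseen action for one video.

    object_scores: (n_objects,) non-negative responses of pretrained
                   object classifiers on the video (e.g. softmax scores).
    object_vecs:   (n_objects, d) skip-gram embeddings of object labels.
    action_vecs:   (n_actions, d) skip-gram embeddings of action labels.
    top_k:         keep only the most responsive objects per action.
    """
    scores = []
    for a in action_vecs:
        # Affinity of this action to every object in the embedding space.
        obj_aff = np.array([affinity(a, o) for o in object_vecs])
        # Select the top-k most action-relevant objects, then take a
        # convex combination of their affinities, weighted by the
        # video's (normalized) object responses.
        idx = np.argsort(obj_aff)[-top_k:]
        w = object_scores[idx] / (object_scores[idx].sum() + 1e-12)
        scores.append(float(w @ obj_aff[idx]))
    return np.array(scores)  # argmax gives the predicted unseen action
```

Multi-word action or object descriptions, which the paper also exploits, could be handled in this sketch by averaging the embeddings of the constituent words before passing them in.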
Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions
Visual localization enables autonomous vehicles to navigate in their
surroundings and augmented reality applications to link virtual to real worlds.
Practical visual localization approaches need to be robust to a wide variety of
viewing conditions, including day-night changes, as well as weather and seasonal
variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera
pose estimates. In this paper, we introduce the first benchmark datasets
specifically designed for analyzing the impact of such factors on visual
localization. Using carefully created ground truth poses for query images taken
under a wide variety of conditions, we evaluate the impact of various factors
on 6DOF camera pose estimation accuracy through extensive experiments with
state-of-the-art localization approaches. Based on our results, we draw
conclusions about the difficulty of different conditions, showing that
long-term localization is far from solved, and propose promising avenues for
future work, including sequence-based localization approaches and the need for
better local features. Our benchmark is available at visuallocalization.net. Comment: Accepted to CVPR 2018 as a spotlight.
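As a concrete reference point, the sketch below computes the two pose-error measures commonly used to evaluate 6DOF localization accuracy: the Euclidean distance between camera centers and the angle of the relative rotation. The conventions and threshold usage here are generic assumptions, not taken verbatim from the benchmark.

```python
# Hedged sketch of standard 6DOF pose-error metrics for visual
# localization: translation error between camera centers, rotation
# error as the angle of the relative rotation.
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Translation (meters) and rotation (degrees) error.

    R_*: (3, 3) world-to-camera rotation matrices.
    t_*: (3,) translation vectors; the camera center is c = -R^T t.
    """
    c_est = -R_est.T @ t_est
    c_gt = -R_gt.T @ t_gt
    t_err = np.linalg.norm(c_est - c_gt)
    # Angle of the relative rotation R_est^T R_gt, recovered from its trace.
    cos_theta = np.clip((np.trace(R_est.T @ R_gt) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.degrees(np.arccos(cos_theta))
    return t_err, r_err

def recall_at(errors, t_thresh, r_thresh):
    """Fraction of queries localized within both error thresholds."""
    return np.mean([(t <= t_thresh and r <= r_thresh) for t, r in errors])
```

Benchmarks of this kind typically report `recall_at` for several increasingly loose threshold pairs, so that both fine and coarse localization can be compared across methods.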
Hausdorff-Distance Enhanced Matching of Scale Invariant Feature Transform Descriptors in Context of Image Querying
Reliable and effective matching of visual descriptors is a key step for many vision applications, e.g. image retrieval. In this paper, we propose to integrate Hausdorff-distance matching with our pairing algorithm, in order to obtain a robust yet computationally efficient process for matching feature descriptors in image-to-image querying on standard datasets. For this purpose, Scale Invariant Feature Transform (SIFT) descriptors are matched using the presented algorithm, followed by the computation of our related similarity measure. This approach shows excellent performance in both retrieval accuracy and speed.
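A minimal sketch of the core matching step follows, assuming OpenCV for SIFT extraction and SciPy for pairwise distances. The paper's pairing algorithm and its similarity measure are not specified in the abstract, so only the plain symmetric Hausdorff distance between descriptor sets is shown here.

```python
# Sketch: compare two images by the symmetric Hausdorff distance
# between their SIFT descriptor sets (a stand-in for the paper's
# combined Hausdorff/pairing scheme).
import cv2
import numpy as np
from scipy.spatial.distance import cdist

def hausdorff(A, B):
    """Symmetric Hausdorff distance between descriptor sets A and B."""
    d = cdist(A, B)             # pairwise Euclidean distances
    h_ab = d.min(axis=1).max()  # farthest A-descriptor from its nearest B-descriptor
    h_ba = d.min(axis=0).max()
    return max(h_ab, h_ba)

def image_distance(img1, img2):
    """Lower Hausdorff distance = more similar descriptor sets."""
    sift = cv2.SIFT_create()
    _, d1 = sift.detectAndCompute(img1, None)
    _, d2 = sift.detectAndCompute(img2, None)
    return hausdorff(d1, d2)
```

For querying, one would rank database images by `image_distance` against the query; the single `cdist` call per pair is what keeps this formulation computationally simple.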