Multiple-window Bag of Features for Road Environment Recognition

Authors: Ishikawa Seiji, Kim Hyoungseop, Morita Shou, Tan Joo Kooi
Publication venue: 'Atlantis Press'
Publication date: 01/09/2014

The Bag of Features (BoF) approach has recently been widely used for general object recognition. However, because it does not take the positional relations of detected features into account, its recognition rate is still not high enough for practical use. This paper proposes a method that describes an object with a BoF representation that incorporates the positional information of the features. Whereas the original BoF representation is applied to an entire image, the proposed method places multiple windows on the image and applies the BoF representation to each window to represent the object of interest. The performance of the proposed method is demonstrated experimentally.
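The multiple-window idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regular grid of non-overlapping windows, the function names `bof_histogram` and `multi_window_bof`, and the assumption that features have already been quantised into visual-word indices are all choices made here for illustration.

```python
import numpy as np

def bof_histogram(words, vocab_size):
    """L1-normalised histogram of visual-word indices."""
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def multi_window_bof(points, words, image_size, grid=(2, 2), vocab_size=4):
    """Concatenate one BoF histogram per window of a regular grid.

    points: (N, 2) array of (x, y) feature locations
    words:  (N,) integer array of visual-word indices in [0, vocab_size)
    The grid layout is an assumption; the abstract only says that
    multiple windows are placed on the image.
    """
    width, height = image_size
    cols, rows = grid
    # Map each feature to the grid cell that contains it.
    col = np.minimum((points[:, 0] * cols / width).astype(int), cols - 1)
    row = np.minimum((points[:, 1] * rows / height).astype(int), rows - 1)
    parts = []
    for r in range(rows):
        for c in range(cols):
            in_window = (row == r) & (col == c)
            parts.append(bof_histogram(words[in_window], vocab_size))
    return np.concatenate(parts)
```

Because each window contributes its own histogram, two images with the same visual words in different places produce different descriptors, which a single whole-image histogram cannot distinguish.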

Kyutacar : Kyushu Institute of Technology Academic Repository

Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding

Authors: Brust Clemens-Alexander, Denzler Joachim, Rodner Erik, Sickert Sven, Simon Marcel
Publication date: 01/01/2015

Classifying single image patches is important in many different applications, such as road detection or scene understanding. In this paper, we present convolutional patch networks, which are convolutional networks learned to distinguish different image patches and which can be used for pixel-wise labeling. We also show how to incorporate spatial information of the patch as an input to the network, which allows for learning spatial priors for certain categories jointly with an appearance model. In particular, we focus on road detection and urban scene understanding, two application areas where we are able to achieve state-of-the-art results on the KITTI as well as on the LabelMeFacade dataset. Furthermore, our paper offers a guideline for people working in the area and desperately wandering through all the painstaking details that render training CNs on image patches extremely difficult. Comment: VISAPP 2015 paper
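A minimal sketch of feeding a patch's position to a classifier alongside its appearance is given below. It is illustrative only and not the network described in the paper: the raw flattened pixels stand in for learned convolutional features, the simple concatenation is an assumed way of combining the two inputs, and the function names `patches_with_position` and `with_spatial_input` are hypothetical.

```python
import numpy as np

def patches_with_position(image, patch=3, stride=1):
    """Return flattened patches and their normalised centre positions.

    Assumes a 2-D grayscale image larger than one pixel per side
    and an odd patch size.
    """
    h, w = image.shape[:2]
    half = patch // 2
    feats, pos = [], []
    for y in range(half, h - half, stride):
        for x in range(half, w - half, stride):
            window = image[y - half:y + half + 1, x - half:x + half + 1]
            feats.append(window.reshape(-1))
            # Centre position scaled to [0, 1] in both directions.
            pos.append((x / (w - 1), y / (h - 1)))
    return np.array(feats, dtype=float), np.array(pos, dtype=float)

def with_spatial_input(appearance, positions):
    """Append the (x, y) position to each appearance vector."""
    return np.concatenate([appearance, positions], axis=1)
```

A classifier trained on the combined vectors can learn, for example, that road pixels are more likely near the bottom of the image, which is the kind of spatial prior the abstract refers to.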

arXiv.org e-Print Archive

CiteSeerX

Crossref

The THUMOS Challenge on Action Recognition for Videos "in the Wild"

Authors: Gorban Alex, Idrees Haroon, Jiang Yu-Gang, Laptev Ivan, Shah Mubarak, Sukthankar Rahul, Zamir Amir R.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Automatically recognizing and localizing wide ranges of human actions is of crucial importance for video understanding. Towards this goal, the THUMOS challenge was introduced in 2013 to serve as a benchmark for action recognition. Until then, video action recognition, including the THUMOS challenge, had focused primarily on the classification of pre-segmented (i.e., trimmed) videos, which is an artificial task. In THUMOS 2014, we elevated action recognition to a more practical level by introducing temporally untrimmed videos. These also include 'background videos', which share similar scenes and backgrounds with action videos but are devoid of the specific actions. The three editions of the challenge organized in 2013–2015 have made THUMOS a common benchmark for action classification and detection, and the annual challenge is widely attended by teams from around the world. In this paper we describe the THUMOS benchmark in detail and give an overview of data collection and annotation procedures. We present the evaluation protocols used to quantify results in the two THUMOS tasks of action classification and temporal detection. We also present results of submissions to the THUMOS 2015 challenge and review the participating approaches. Additionally, we include a comprehensive empirical study evaluating the differences in action recognition between trimmed and untrimmed videos, and how well methods trained on trimmed videos generalize to untrimmed videos. We conclude by proposing several directions and improvements for future THUMOS challenges. Comment: Preprint submitted to Computer Vision and Image Understanding

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Quantifying and Transferring Contextual Information in Object Detection

Authors: Gong S, Xiang T, Zheng W-S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

(c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

CiteSeerX

Queen Mary Research Online