Contextual Attention for Hand Detection in the Wild
We present Hand-CNN, a novel convolutional network architecture for detecting hand masks and predicting hand orientations in unconstrained images. Hand-CNN extends MaskRCNN with a novel attention mechanism to incorporate contextual cues in the detection process. This attention mechanism can be implemented as an efficient network module that captures non-local dependencies between features. This network module can be inserted at different stages of an object detection network, and the entire detector can be trained end-to-end. We also introduce large-scale annotated hand datasets containing hands in unconstrained images for training and evaluation. We show that Hand-CNN outperforms existing methods on the newly collected datasets and the publicly available PASCAL VOC human layout dataset. Data and code: https://www3.cs.stonybrook.edu/~cvl/projects/hand_det_attention
Contextual Attention for Hand Detection in the Wild
We present Hand-CNN, a novel convolutional network architecture for detecting
hand masks and predicting hand orientations in unconstrained images. Hand-CNN
extends MaskRCNN with a novel attention mechanism to incorporate contextual
cues in the detection process. This attention mechanism can be implemented as
an efficient network module that captures non-local dependencies between
features. This network module can be inserted at different stages of an object
detection network, and the entire detector can be trained end-to-end.
We also introduce a large-scale annotated hand dataset containing hands in
unconstrained images for training and evaluation. We show that Hand-CNN
outperforms existing methods on several datasets, including our hand detection
benchmark and the publicly available PASCAL VOC human layout challenge. We also
conduct ablation studies on hand detection to show the effectiveness of the
proposed contextual attention module.
Comment: 9 pages, 9 figures
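The abstract describes an attention module that captures non-local dependencies between features. The following is a minimal sketch of that general idea, not the Hand-CNN module itself: every position attends to every other position through a softmax over scaled dot products, and the aggregated context is added back as a residual. The function names are hypothetical, and the learned projections that a trained module would normally apply to queries, keys, and values are deliberately omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def non_local_attention(features):
    """Illustrative non-local block (assumption: no learned weights).

    features: list of N feature vectors, one per spatial position.
    Returns N vectors, each the input plus an attention-weighted
    sum over all positions (a residual connection).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    output = []
    for query in features:
        # Similarity of this position to every position, including itself.
        weights = softmax([dot(query, key) / scale for key in features])
        # Context vector: weighted average of all position features.
        context = [
            sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)
        ]
        output.append([q + c for q, c in zip(query, context)])
    return output
```

Because the output has the same shape as the input, a block of this kind could in principle be placed between existing stages of a detector, which is consistent with the abstract's remark that the module can be inserted at different stages.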
Direction-aware Spatial Context Features for Shadow Detection
Shadow detection is a fundamental and challenging task, since it requires an
understanding of global image semantics and there are various backgrounds
around shadows. This paper presents a novel network for shadow detection by
analyzing image context in a direction-aware manner. To achieve this, we first
formulate the direction-aware attention mechanism in a spatial recurrent neural
network (RNN) by introducing attention weights when aggregating spatial context
features in the RNN. By learning these weights through training, we can recover
direction-aware spatial context (DSC) for detecting shadows. This design is
developed into the DSC module and embedded in a CNN to learn DSC features at
different levels. Moreover, a weighted cross entropy loss is designed to make
the training more effective. We employ two common shadow detection benchmark
datasets and perform various experiments to evaluate our network. Experimental
results show that our network outperforms state-of-the-art methods and achieves
97% accuracy and a 38% reduction in balance error rate.
Comment: Accepted for oral presentation in CVPR 2018. The journal version of
this paper is arXiv:1805.0463
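The abstract reports results in terms of balance error rate (BER) and mentions a weighted cross entropy loss. As a rough illustration, the sketch below computes BER in its usual form (the average of the per-class error rates, as a percentage) and a generic class-balanced binary cross entropy. The weighting scheme shown is a common choice and is an assumption here; the abstract does not specify the weights the authors use. Function names are hypothetical.

```python
import math

def balance_error_rate(pred, truth):
    """BER in percent for binary masks given as flat lists of 0/1.

    Assumes both classes occur in `truth`; otherwise a recall is undefined.
    """
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    shadow_recall = tp / (tp + fn)
    non_shadow_recall = tn / (tn + fp)
    return 100.0 * (1.0 - 0.5 * (shadow_recall + non_shadow_recall))

def weighted_bce(probs, truth, eps=1e-12):
    """Class-balanced binary cross entropy (illustrative weighting only).

    Each class is weighted by the frequency of the other class, so the
    rarer class contributes more per pixel.
    """
    n = len(truth)
    n_pos = sum(truth)
    w_pos = (n - n_pos) / n  # weight for shadow pixels
    w_neg = n_pos / n        # weight for non-shadow pixels
    loss = 0.0
    for p, t in zip(probs, truth):
        if t:
            loss -= w_pos * math.log(p + eps)
        else:
            loss -= w_neg * math.log(1.0 - p + eps)
    return loss / n

# One shadow pixel missed and one false alarm among six pixels.
print(balance_error_rate([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))  # 37.5
```

BER weights shadow and non-shadow pixels equally regardless of how many of each there are, which is why it is often reported alongside plain accuracy when shadows cover only a small part of the image.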
Relation Networks for Object Detection
Although it has long been believed that modeling relations between
objects would help object recognition, there has been no evidence that the
idea works in the deep learning era. All state-of-the-art object detection
systems still rely on recognizing object instances individually, without
exploiting their relations during learning.
This work proposes an object relation module. It processes a set of objects
simultaneously through interaction between their appearance feature and
geometry, thus allowing modeling of their relations. It is lightweight and
in-place. It does not require additional supervision and is easy to embed in
existing networks. It is shown to be effective in improving the object recognition and
duplicate removal steps in the modern object detection pipeline. It verifies
the efficacy of modeling object relations in CNN based detection. It gives rise
to the first fully end-to-end object detector
- …
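The abstract above describes a module that processes a set of objects jointly through interaction between their appearance features and geometry. The sketch below is one hedged way to picture that: each object's feature is augmented with a weighted sum of all objects' features, where the weight combines an appearance similarity with a geometry term. The geometry term used here (a penalty on normalised centre distance) is a hypothetical stand-in chosen for illustration, and the function name is invented; neither is taken from the paper.

```python
import math

def relation_features(appearance, boxes):
    """Illustrative relation step over a set of detected objects.

    appearance: list of N feature vectors.
    boxes: list of N (cx, cy, w, h) tuples with positive w and h.
    Returns N vectors with the same dimension as the input.
    """
    dim = len(appearance[0])
    output = []
    for feat_a, (cx, cy, w, h) in zip(appearance, boxes):
        logits = []
        for feat_b, (ox, oy, _ow, _oh) in zip(appearance, boxes):
            # Appearance term: scaled dot product of the two features.
            app = sum(x * y for x, y in zip(feat_a, feat_b)) / math.sqrt(dim)
            # Hypothetical geometry term: distant objects get less weight.
            dist = math.hypot((cx - ox) / w, (cy - oy) / h)
            logits.append(app - dist)
        # Softmax over all objects in the set.
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        relation = [
            sum(wt * f[d] for wt, f in zip(weights, appearance))
            for d in range(dim)
        ]
        # Residual add keeps the feature dimension unchanged.
        output.append([a + r for a, r in zip(feat_a, relation)])
    return output
```

Keeping the output dimension equal to the input dimension mirrors the abstract's description of the module as in-place and easy to embed in existing networks.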