Feature Selective Networks for Object Detection
Objects for detection usually have distinct characteristics in different
sub-regions and different aspect ratios. However, in prevalent two-stage object
detection methods, Region-of-Interest (RoI) features are extracted by RoI
pooling with little emphasis on these translation-variant feature components.
We present feature selective networks to reform the feature representations of
RoIs by exploiting their disparities among sub-regions and aspect ratios. Our
network produces the sub-region attention bank and aspect ratio attention bank
for the whole image. The RoI-based sub-region attention map and aspect ratio
attention map are selectively pooled from the banks, and then used to refine
the original RoI features for RoI classification. Equipped with a light-weight
detection subnetwork, our network gets a consistent boost in detection
performance based on general ConvNet backbones (ResNet-101, GoogLeNet and
VGG-16). Without bells and whistles, our detectors equipped with ResNet-101
achieve more than 3% mAP improvement compared to counterparts on PASCAL VOC
2007, PASCAL VOC 2012 and MS COCO datasets.
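The selective pooling described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the names `select_aspect_bin`, `crop` and `refine_roi`, the three aspect-ratio bins, and the element-wise product used for refinement are all hypothetical, since the abstract does not specify them. The sub-region attention bank would presumably be handled in an analogous way.

```python
def select_aspect_bin(box, bin_ratios=(0.5, 1.0, 2.0)):
    """Return the index of the bin whose width/height ratio is closest to the RoI's.

    Assumption: the bank holds one attention map per aspect-ratio bin.
    """
    x0, y0, x1, y1 = box  # end-exclusive pixel coordinates
    ratio = (x1 - x0) / (y1 - y0)
    return min(range(len(bin_ratios)), key=lambda i: abs(bin_ratios[i] - ratio))


def crop(feature_map, box):
    """Crop a 2-D map (list of rows) to the RoI box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in feature_map[y0:y1]]


def refine_roi(features, aspect_bank, box):
    """Pool the attention map matching the RoI's aspect ratio and apply it.

    Assumption: refinement is an element-wise product of features and attention.
    """
    attention = crop(aspect_bank[select_aspect_bin(box)], box)
    roi = crop(features, box)
    return [[f * a for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(roi, attention)]
```

For a square RoI the middle bin is selected, so `refine_roi` scales the cropped features by that bin's attention values only; a wide RoI would draw on the last bin instead.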
Salient Object Detection via Augmented Hypotheses
In this paper, we propose using \textit{augmented hypotheses} which consider
objectness, foreground and compactness for salient object detection. Our
algorithm consists of four basic steps. First, our method generates the
objectness map via objectness hypotheses. Based on the objectness map, we
estimate the foreground margin and compute the corresponding foreground map
which prefers the foreground objects. From the objectness map and the
foreground map, the compactness map is formed to favor the compact objects. We
then derive a saliency measure that produces a pixel-accurate saliency map
which uniformly covers the objects of interest and consistently separates fore-
and background. We finally evaluate the proposed framework on two challenging
datasets, MSRA-1000 and iCoSeg. Our extensive experimental results show that
our method outperforms state-of-the-art approaches. Comment: IJCAI 2015 paper
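As a rough picture of the final step, the three intermediate maps could be fused into a single saliency map. The sketch below is an assumption-laden illustration, not the paper's saliency measure: the abstract gives no formula, so the element-wise product and min-max normalization, and the names `normalize` and `combine_saliency`, are hypothetical choices.

```python
def normalize(saliency):
    """Rescale a 2-D map (list of rows) to the range [0, 1]."""
    lo = min(min(row) for row in saliency)
    hi = max(max(row) for row in saliency)
    if hi == lo:  # constant map: nothing stands out
        return [[0.0 for _ in row] for row in saliency]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency]


def combine_saliency(objectness, foreground, compactness):
    """Fuse the three maps pixel by pixel.

    Assumption: a product, so a pixel is salient only if all three cues agree.
    """
    product = [[o * f * c for o, f, c in zip(o_row, f_row, c_row)]
               for o_row, f_row, c_row in zip(objectness, foreground, compactness)]
    return normalize(product)
```

A product penalizes pixels that score low on any one cue, which is one plausible way to obtain the clean separation of foreground and background that the abstract describes.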
What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots
For robots that have the capability to interact with the physical environment
through their end effectors, understanding the surrounding scenes is not merely
a task of image classification or object recognition. To perform actual tasks,
it is critical for the robot to have a functional understanding of the visual
scene. Here, we address the problem of localizing and recognizing functional
areas in an arbitrary indoor scene, formulated as a two-stage
deep-learning-based detection pipeline. A new scene functionality test bed,
which is compiled from two publicly available indoor scene datasets, is used for
evaluation. Our method is evaluated quantitatively on the new dataset,
demonstrating the ability to perform efficient recognition of functional areas
from arbitrary indoor scenes. We also demonstrate that our detection model can
be generalized to novel indoor scenes by cross-validating it with images
from two different datasets.
The reentry hypothesis: The putative interaction of the frontal eye field, ventrolateral prefrontal cortex, and areas V4, IT for attention and eye movement
Attention is known to play a key role in perception, including action selection, object recognition and memory. Despite findings revealing competitive interactions among cell populations, attention remains difficult to explain. The central purpose of this paper is to link up a large number of findings in a single computational approach. Our simulation results suggest that attention can be well explained on a network level involving many areas of the brain. We argue that attention is an emergent phenomenon that arises from reentry and competitive interactions. We hypothesize that guided visual search requires the use of an object-specific template in prefrontal cortex to sensitize V4 and IT cells whose preferred stimuli match the target template. This induces a feature-specific bias and provides guidance for eye movements. Prior to an eye movement, a spatially organized reentry from oculomotor centers, specifically the movement cells of the frontal eye field, occurs and modulates the gain of V4 and IT cells. The processes involved are elucidated by quantitatively comparing the time course of simulated neural activity with experimental data. Using visual search tasks as an example, we provide clear and empirically testable predictions for the participation of IT, V4 and the frontal eye field in attention. Finally, we explain a possible physiological mechanism that can lead to non-flat search slopes as the result of a slow, parallel discrimination process.