46,382 research outputs found

    Online learning of task-driven object-based visual attention control

    Get PDF
    A biologically-motivated computational model for learning task-driven and objectbased visual attention control in interactive environments is proposed. Our model consists of three layers. First, in the early visual processing layer, most salient location of a scene is derived using the biased saliency-based bottom-up model of visual attention. Then a cognitive component in the higher visual processing layer performs an application specific operation like object recognition at the focus of attention. From this information, a state is derived in the decision making and learning layer. Online Learning of Task-driven Object-based Visual Attention Control Ali Borji Top-down attention is learned by the U-TREE Discussions and Conclusions An agent working in an environment receives information momentarily through its visual sensor. It should determine what to look for. For this we use RL to teach the agent simply look for the most task relevant and rewarding entity in the visual scene ( This layer controls both top-down visual attention and motor actions. The learning approach is an extension of the U-TREE algorithm [6] to the visual domain. Attention tree is incrementally built in a quasi-static manner in two phases (iterations): 1) RL-fixed phase and 2) Tree-fixed phase In each Tree-fixed phase, RL algorithm is executed for some episodes by Fig. 1. Proposed model for learning task-driven object-based visual attention control Example scenario: captured scene through the agents' visual sensor undergoes a biased bottom-up saliency detection operation and focus of attention (FOA) is determined. Object at the FOA is recognized (i.e. is either present or not in the scene), then the agent moves in its binary tree in the decision making and leaves. 100% correct policy was achieved. The object at the attended location is recognized by the hierarchical model of object recognition (HMAX) [3] M. Riesenhuber, T. Poggio, Hierarchical models of object recognition in cortex. 
Nature Neuroscience, 2(1999),11, 1019-1025. Basic saliency-based model of visual attention [1] is revised for the purpose of salient region selection (object detection) at this layer where norm(.) is the Euclidean distance between two points in an image. Saliency is the function which takes as input an image and a weight vector and returns the most salient location. t i is the location of target object in the i-th image. In each Tree-fixed phase, RL algorithm is executed for some episodes by following ε-greedy action selection strategy. In this phase, tree is hold fixed and the derived quadruples (s t , a t , r t+1 , s t+1 ) are only used for updating the Q-table: State discretization occurs in the RL-fixed phase where gathered experiences are used to refine aliased states. An object which minimizes aliasing the most is selected for braking an aliased leaf. Acknowledgement This work was funded by the school of cognitive sciences, IPM, Tehran, IRAN. scene), then the agent moves in its binary tree in the decision making and learning layer. This is done repetitively until it reaches a leaf node which determines its state. The best motor action is this state is performed. Outcome of this action over the world is evaluated by a critic and a reinforcement signal is fed back to the agent to update its internal representations (attention tree) and action selection strategy in a quasi-static manner. Following subsections discuss each layer of the model in detail
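    The Tree-fixed phase described above can be sketched as a standard tabular Q-learning step with ε-greedy action selection. This is a minimal illustration of the generic update applied to the quadruples (s_t, a_t, r_{t+1}, s_{t+1}); the state identifiers stand in for leaves of the attention tree, and the function names and constants are illustrative, not from the paper.

    ```python
    import random

    # Illustrative hyperparameters; the paper's actual values are not given here.
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    def select_action(q_table, state, actions):
        """Epsilon-greedy: explore with probability EPSILON, else exploit."""
        if random.random() < EPSILON:
            return random.choice(actions)
        return max(actions, key=lambda a: q_table.get((state, a), 0.0))

    def q_update(q_table, s, a, r, s_next, actions):
        """Apply one (s_t, a_t, r_{t+1}, s_{t+1}) quadruple to the Q-table."""
        best_next = max(q_table.get((s_next, b), 0.0) for b in actions)
        old = q_table.get((s, a), 0.0)
        q_table[(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)
    ```

    In the RL-fixed phase the Q-table would be held fixed while the tree is refined; here only the Tree-fixed update is shown.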

    S4Net: Single Stage Salient-Instance Segmentation

    Full text link
    We consider an interesting problem, salient instance segmentation, in this paper. Other than producing bounding boxes, our network also outputs high-quality instance-level segments. Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework with a novel segmentation branch. Our new branch regards not only the local context inside each detection window but also its surrounding context, enabling us to distinguish instances in the same scope even with obstruction. Our network is end-to-end trainable and runs at a fast speed (40 fps when processing an image with resolution 320x320). We evaluate our approach on a publicly available benchmark and show that it outperforms other alternative solutions. We also provide a thorough analysis of the design choices to help readers better understand the functions of each part of our network. The source code can be found at \url{https://github.com/RuochenFan/S4Net}
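    The idea of attending to context outside a detection window can be sketched as expanding each box before pooling features from it. This is only an illustration of the surrounding-context notion under assumed box conventions; it is not the paper's actual RoI operation, and the function name and scale factor are hypothetical.

    ```python
    def expand_window(box, scale, img_w, img_h):
        """Expand a detection box (x1, y1, x2, y2) by `scale` around its
        center to include surrounding context, clipped to image bounds."""
        x1, y1, x2, y2 = box
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        w, h = (x2 - x1) * scale, (y2 - y1) * scale
        return (max(0.0, cx - w / 2), max(0.0, cy - h / 2),
                min(float(img_w), cx + w / 2), min(float(img_h), cy + h / 2))
    ```

    A segmentation branch pooling from the expanded window sees pixels of neighboring instances, which is what allows it to separate overlapping targets.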

    Instance-Level Salient Object Segmentation

    Full text link
    Image saliency detection has recently witnessed rapid progress due to deep convolutional neural networks. However, none of the existing methods is able to identify object instances in the detected salient regions. In this paper, we present a salient instance segmentation method that produces a saliency mask with distinct object instance labels for an input image. Our method consists of three steps: estimating a saliency map, detecting salient object contours, and identifying salient object instances. For the first two steps, we propose a multiscale saliency refinement network, which generates high-quality salient region masks and salient object contours. Once integrated with multiscale combinatorial grouping and a MAP-based subset optimization framework, our method can generate very promising salient object instance segmentation results. To promote further research and evaluation of salient instance segmentation, we also construct a new database of 1000 images and their pixelwise salient instance annotations. Experimental results demonstrate that our proposed method is capable of achieving state-of-the-art performance on all public benchmarks for salient region detection as well as on our new dataset for salient instance segmentation. Comment: To appear in CVPR201
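    The three-step structure described above can be sketched as a simple pipeline skeleton. The stage functions below are placeholders: in the actual method, the first two stages are a trained multiscale saliency refinement network, and the last combines multiscale combinatorial grouping proposals with MAP-based subset optimization. All names here are illustrative.

    ```python
    import numpy as np

    def estimate_saliency(image):
        # Placeholder: a trained network would predict a per-pixel saliency map.
        return np.zeros(image.shape[:2], dtype=np.float32)

    def detect_contours(image):
        # Placeholder: the same network family predicts salient object contours.
        return np.zeros(image.shape[:2], dtype=np.float32)

    def identify_instances(saliency, contours, threshold=0.5):
        # Placeholder: contour-derived proposals would be scored against the
        # saliency map and filtered to a final set of instance masks.
        mask = saliency > threshold
        return [mask] if mask.any() else []

    def salient_instance_segmentation(image):
        """Run the three steps in order: saliency map, contours, instances."""
        saliency = estimate_saliency(image)
        contours = detect_contours(image)
        return identify_instances(saliency, contours)
    ```

    The key design point is that instance identification consumes both the saliency map and the contour map, rather than segmenting instances directly.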