
    Online learning of task-driven object-based visual attention control

    A biologically-motivated computational model for learning task-driven and object-based visual attention control in interactive environments is proposed (Ali Borji). The model consists of three layers. First, in the early visual processing layer, the most salient location of a scene is derived using a biased saliency-based bottom-up model of visual attention: the basic saliency-based model [1] is revised for salient region selection (object detection), with the bias weights learned by minimizing the sum over training images of norm(Saliency(I_i, w), t_i), where norm(.) is the Euclidean distance between two points in an image, Saliency is the function which takes an image and a weight vector and returns the most salient location, and t_i is the location of the target object in the i-th image. Then a cognitive component in the higher visual processing layer performs an application-specific operation, such as object recognition, at the focus of attention (FOA); the object at the attended location is recognized by the hierarchical model of object recognition (HMAX) [3]. From this information, a state is derived in the decision making and learning layer, which controls both top-down visual attention and motor actions. (Fig. 1: Proposed model for learning task-driven object-based visual attention control.)

    Top-down attention is learned by extending the U-TREE algorithm [6] to the visual domain: an agent working in an environment receives information momentarily through its visual sensor and must determine what to look for, so reinforcement learning is used to teach the agent to look for the most task-relevant and rewarding entity in the visual scene. The attention tree is built incrementally in a quasi-static manner by alternating two phases. In each Tree-fixed phase, the RL algorithm is executed for some episodes following an ε-greedy action selection strategy; the tree is held fixed, and the derived quadruples (s_t, a_t, r_t+1, s_t+1) are used only for updating the Q-table. State discretization occurs in the RL-fixed phase, where the gathered experiences are used to refine aliased states: the object that reduces aliasing the most is selected for breaking an aliased leaf.

    Example scenario: the scene captured through the agent's visual sensor undergoes a biased bottom-up saliency detection operation and the focus of attention is determined. The object at the FOA is recognized (i.e., judged either present or not in the scene), then the agent descends its binary attention tree in the decision making and learning layer until it reaches a leaf node, which determines its state. The best motor action in this state is performed. The outcome of this action on the world is evaluated by a critic, and a reinforcement signal is fed back to the agent to update its internal representation (the attention tree) and its action selection strategy in a quasi-static manner. A 100% correct policy was achieved.

    Acknowledgement: This work was funded by the School of Cognitive Sciences, IPM, Tehran, Iran.

    [3] M. Riesenhuber and T. Poggio, Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11):1019-1025, 1999.
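    The Tree-fixed phase described above (ε-greedy action selection plus a Q-table update from each quadruple) can be sketched as follows. This is a minimal, generic Q-learning sketch, not the paper's implementation; the state and action names are purely illustrative.

```python
import random

def epsilon_greedy(Q, state, actions, eps=0.1):
    """Pick a random action with probability eps, else the greedy one."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One-step Q-learning update from a quadruple (s, a, r, s')."""
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)

# Tree-fixed phase: the attention tree (state discretization) is held
# fixed, and each experienced quadruple only updates the Q-table.
Q = {}
actions = ["look_left", "look_right", "grasp"]  # illustrative motor actions
q_update(Q, "leaf_3", "grasp", 1.0, "leaf_5", actions)
```

    In the alternating RL-fixed phase, only the tree structure (the keys that define states) would change, while this table-update rule stays the same.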

    25th annual computational neuroscience meeting: CNS-2016

    The same neuron may play different functional roles in the neural circuits to which it belongs. For example, neurons in the Tritonia pedal ganglia may participate in variable phases of the swim motor rhythms [1]. While such neuronal functional variability is likely to play a major role in the delivery of the functionality of neural systems, it is difficult to study it in most nervous systems. We work on the pyloric rhythm network of the crustacean stomatogastric ganglion (STG) [2]. Typically, network models of the STG treat neurons of the same functional type as a single model neuron (e.g. PD neurons), assuming the same conductance parameters for these neurons and implying their synchronous firing [3, 4]. However, simultaneous recording of PD neurons shows differences between the timings of spikes of these neurons. This may indicate functional variability of these neurons. Here we modelled separately the two PD neurons of the STG in a multi-neuron model of the pyloric network. Our neuron models comply with known correlations between conductance parameters of ionic currents. Our results reproduce the experimental finding of increasing spike time distance between spikes originating from the two model PD neurons during their synchronised burst phase. The PD neuron with the larger calcium conductance generates its spikes before the other PD neuron. Larger potassium conductance values in the follower neuron imply longer delays between spikes (see Fig. 17). Neuromodulators change the conductance parameters of neurons and maintain the ratios of these parameters [5]. Our results show that such changes may shift the individual contribution of two PD neurons to the PD-phase of the pyloric rhythm, altering their functionality within this rhythm. Our work paves the way towards an accessible experimental and computational framework for the analysis of the mechanisms and impact of functional variability of neurons within the neural circuits to which they belong.
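    The qualitative claim that a larger potassium-like (leak) conductance delays spiking can be illustrated with a toy leaky integrate-and-fire calculation. This is emphatically not the paper's conductance-based STG model; all parameter values here are illustrative.

```python
import math

def time_to_threshold(g, I, C=1.0, v_th=1.0):
    """First-passage time of a leaky integrate-and-fire neuron
    C*V' = I - g*V from rest (V=0) to threshold v_th.
    Returns inf if the steady state I/g never reaches threshold."""
    v_inf = I / g
    if v_inf <= v_th:
        return math.inf
    return (C / g) * math.log(v_inf / (v_inf - v_th))

# Two "PD-like" cells with identical drive but different leak
# conductance: the cell with the larger g reaches threshold later,
# so the inter-cell spike delay grows with the conductance gap.
t_small_g = time_to_threshold(g=1.0, I=2.0)
t_large_g = time_to_threshold(g=1.5, I=2.0)
delay = t_large_g - t_small_g
```

    The closed-form solution makes the monotone dependence of first-spike latency on the conductance explicit, which is the direction of the effect reported for the follower PD neuron.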

    Optimal local basis: A reinforcement learning approach for face recognition

    No full text
    This paper presents a novel learning approach for face recognition by introducing Optimal Local Bases. Optimal local bases are a set of bases derived by reinforcement learning to represent the face space locally. The reinforcement signal is designed to be correlated with the recognition accuracy. The optimal local bases are then derived by finding the most discriminant features for different parts of the face space, which represent either different individuals or different expressions, orientations, poses, illuminations, and other variants of the same individual. Therefore, unlike most of the existing approaches that solve the recognition problem by using a single basis for all individuals, our proposed method benefits from local information by incorporating different bases for its decision. We also introduce a novel classification scheme that uses the reinforcement signal to build a similarity measure in a non-metric space. Experiments on the AR, PIE, ORL and YALE databases indicate that the proposed method facilitates robust face recognition under pose, illumination and expression variations. The performance of our method is compared with that of Eigenface, Fisherface, Subclass Discriminant Analysis, and Random Subspace LDA methods as well. © Springer Science+Business Media, LLC 200
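    The idea of a reinforcement signal correlated with recognition accuracy can be sketched as a bandit-style value update over candidate local bases. This is a hypothetical minimal sketch, not the paper's algorithm; it only assumes that each basis receives accuracy as its reward and that selection is ε-greedy over the estimated values.

```python
import random

def select_basis(values, eps=0.1):
    """ε-greedy choice of a local-basis index by estimated reward."""
    if random.random() < eps:
        return random.randrange(len(values))
    return max(range(len(values)), key=lambda i: values[i])

def update_value(values, counts, i, reward):
    """Incremental mean of the reinforcement signal (here: accuracy)."""
    counts[i] += 1
    values[i] += (reward - values[i]) / counts[i]

# Three candidate local bases; reward basis 1 with two accuracy scores.
values, counts = [0.0, 0.0, 0.0], [0, 0, 0]
update_value(values, counts, 1, 0.9)
update_value(values, counts, 1, 0.7)
```

    With more candidate bases and per-region rewards, the same update would let different parts of the face space settle on different preferred bases.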

    Fast Initialization of Active Contours Towards Practical Visual Interfaces for Human-Robot Interaction

    Abstract — The field of robotics is currently undergoing a change toward the creation of robots that can naturally interact with humans. To achieve this, interactive robots must be endowed with natural interfaces that can sense and respond in real time. Vision can provide handy information for this purpose by detecting and tracking human limbs to analyze gestures, actions and even emotions. However, real-time processing of visual information is a challenging bottleneck. In this paper, we introduce a novel method, namely "self-organized contours", that can significantly accelerate contour initialization, which is the slowest phase in visual tracking. Although the proposed method is general-purpose, it allows immediate initialization of active contours owing to its similarity to the snake structure. The proposed method is inspired by group behavior in insects and animals, particularly fish. Keywords: real-time vision; active contours; multi-agent systems; human-robot interaction; self-organization; swarm intelligence.
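    The flavor of swarm-style contour initialization can be sketched with agents that climb an edge-strength field while staying loosely cohesive, settling near an object boundary that a snake could then refine. This is not the authors' algorithm: the synthetic circular edge map, the cohesion rule, and all parameters below are illustrative assumptions.

```python
import math

def edge_strength(x, y, cx=0.0, cy=0.0, r=5.0):
    """Synthetic edge map: strongest on the circle of radius r around
    (cx, cy) — a stand-in for an image-gradient magnitude."""
    d = math.hypot(x - cx, y - cy)
    return math.exp(-(d - r) ** 2)

def step(agent, neighbors, lr=0.5, cohesion=0.05):
    """Move one agent up the (numerical) edge-strength gradient,
    with a weak pull toward the swarm centroid (self-organization)."""
    x, y = agent
    h = 1e-4
    gx = (edge_strength(x + h, y) - edge_strength(x - h, y)) / (2 * h)
    gy = (edge_strength(x, y + h) - edge_strength(x, y - h)) / (2 * h)
    mx = sum(n[0] for n in neighbors) / len(neighbors)
    my = sum(n[1] for n in neighbors) / len(neighbors)
    return (x + lr * gx + cohesion * (mx - x),
            y + lr * gy + cohesion * (my - y))

# Scatter agents on a wide ring and let them self-organize onto the
# strongest edge; the settled positions seed the initial contour.
agents = [(8.0 * math.cos(2 * math.pi * i / 12),
           8.0 * math.sin(2 * math.pi * i / 12)) for i in range(12)]
for _ in range(200):
    agents = [step(a, agents) for a in agents]
```

    Because the agents converge in parallel from coarse positions, this kind of scheme avoids the slow manual or exhaustive placement that makes contour initialization the bottleneck.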