
    Towards Contextual Action Recognition and Target Localization with Active Allocation of Attention

    Exploratory gaze movements are fundamental for gathering the most relevant information about a partner during social interactions. We have designed and implemented a system for dynamic attention allocation that actively controls gaze movements during a visual action recognition task. While observing a partner's reaching movement, the robot contextually estimates the goal position of the partner's hand and the locations in space of the candidate targets, while moving its gaze around to optimize the gathering of task-relevant information. Experimental results in a simulated environment show that active gaze control provides a substantial advantage over typical passive observation, both in terms of estimation precision and of the time required for action recognition. © 2012 Springer-Verlag
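The active allocation idea above can be sketched as greedy information-gain maximisation: fixate the candidate location whose observation is expected to shrink posterior uncertainty the most. The four-location belief, the binary "target here?" cue, and the sensor reliability `ACC` below are all invented for illustration; the paper's actual observation models are richer.

```python
import numpy as np

# Hypothetical setup: a uniform belief over four candidate target
# locations, and a binary cue whose reliability ACC is an assumed
# sensor parameter.
belief = np.full(4, 0.25)
ACC = 0.9

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def posterior(belief, fixation, cue_present):
    # Likelihood of the cue under each possible true target location.
    like = np.where(np.arange(belief.size) == fixation, ACC, 1 - ACC)
    if not cue_present:
        like = 1 - like
    post = belief * like
    return post / post.sum()

def expected_entropy(belief, fixation):
    # Average posterior entropy over both possible cue outcomes.
    like = np.where(np.arange(belief.size) == fixation, ACC, 1 - ACC)
    p_cue = np.sum(belief * like)
    return (p_cue * entropy(posterior(belief, fixation, True))
            + (1 - p_cue) * entropy(posterior(belief, fixation, False)))

# Greedy active gaze: fixate where expected remaining uncertainty is lowest.
best = min(range(belief.size), key=lambda i: expected_entropy(belief, i))
```

Any fixation here reduces expected entropy below that of the prior belief, which is the sense in which even a single active glance beats passive waiting.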

    Towards Active Event Recognition

    Directing robot attention to recognise activities and to anticipate events like goal-directed actions is a crucial skill for human-robot interaction. Unfortunately, issues like intrinsic time constraints, the spatially distributed nature of the entailed information sources, and the existence of a multitude of unobservable states affecting the system, such as latent intentions, have long rendered the achievement of such skills a rather elusive goal. The problem tests the limits of current attention-control systems: it requires an integrated solution for tracking, exploration and recognition, which have traditionally been treated as separate problems in active vision. We propose a probabilistic generative framework based on a mixture of Kalman filters and information gain maximisation that uses predictions in both recognition and attention control. This framework can efficiently use the observations of one element in a dynamic environment to provide information on other elements, and consequently enables guided exploration. Interestingly, the sensor-control policy, derived directly from first principles, represents the intuitive trade-off between finding the most discriminative clues and maintaining overall awareness. Experiments on a simulated humanoid robot observing a human executing goal-oriented actions demonstrated improvements in recognition time and precision over baseline systems.
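A minimal sketch of the recognition side, assuming a bank of per-hypothesis Kalman filters reweighted by the predictive likelihood of each measurement. The 1-D dynamics, goal positions, and noise values are invented for illustration and stand in for the framework's richer models.

```python
import numpy as np

# 1-D sketch: two hypotheses about which goal the observed hand is
# reaching for; each runs its own Kalman filter, and hypothesis weights
# are updated with the predictive likelihood of each new measurement.
goals = np.array([0.0, 10.0])
weights = np.full(2, 0.5)        # posterior over the two hypotheses
means = np.full(2, 5.0)          # per-hypothesis estimate of hand position
variances = np.full(2, 1.0)
Q, R, RATE = 0.05, 0.2, 0.3      # process noise, obs noise, approach rate

def observe(z):
    global weights
    lik = np.empty(2)
    for m in range(2):
        # Predict: the hand moves a fraction RATE of the way to goal m.
        means[m] += RATE * (goals[m] - means[m])
        variances[m] = (1 - RATE) ** 2 * variances[m] + Q
        # Predictive likelihood of z, then the standard Kalman update.
        s = variances[m] + R
        resid = z - means[m]
        lik[m] = np.exp(-0.5 * resid ** 2 / s) / np.sqrt(2 * np.pi * s)
        gain = variances[m] / s
        means[m] += gain * resid
        variances[m] *= 1 - gain
    weights = weights * lik / np.sum(weights * lik)

# Noisy hand positions heading toward x = 10 should favour hypothesis 1.
for z in [5.5, 7.0, 8.0, 8.8, 9.3]:
    observe(z)
```

The hypothesis weights are exactly the recognition output; the same predictive distributions could also drive an information-gain attention policy as in the previous abstract.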

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al., 2003, Vision Research 43, 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
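The spoke manipulation amounts to a radial displacement of each rectangle centre relative to fixation. A small geometric sketch, assuming a 57 cm viewing distance (at which 1 degree of visual angle spans roughly 1 cm on screen); both the distance and the coordinate convention are assumptions for illustration.

```python
import math

# Hypothetical geometry: shift a rectangle's centre along the imaginary
# "spoke" joining it to central fixation, by 1 degree of visual angle.
VIEW_DIST_CM = 57.0
SHIFT_CM = VIEW_DIST_CM * math.tan(math.radians(1.0))   # ~1 cm

def shift_along_spoke(x, y, direction=1):
    """Move (x, y) radially away from (+1) or toward (-1) fixation at (0, 0)."""
    r = math.hypot(x, y)
    scale = (r + direction * SHIFT_CM) / r
    return x * scale, y * scale

# A rectangle centred 5 cm from fixation moves ~1 cm further out.
nx, ny = shift_along_spoke(3.0, 4.0, direction=1)
```

Because the shift is purely radial, the angular arrangement of the eight rectangles (and hence any Gestalt grouping based on it) is preserved while every absolute position changes.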

    Symbol Emergence in Robotics: A Survey

    Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding the dynamics of symbol systems is crucially important both for understanding human social interactions and for developing robots that can smoothly communicate with human users in the long term. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system, one that is socially self-organized through both semiotic communication and physical interaction among autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and double articulation analysis, which enable a robot to obtain words and their embodied meanings from raw sensory-motor information, including visual, haptic, and auditory information and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER.
    Comment: submitted to Advanced Robotics
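As a toy illustration of multimodal categorization, unlabeled objects described by concatenated feature vectors from two modalities can be clustered without supervision. Plain k-means here is a deliberately simple stand-in for the richer models used in the SER literature (e.g. multimodal LDA), and all features below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic objects: each is a concatenation of "visual" (3-D) and
# "haptic" (2-D) feature vectors drawn around a shared category centre.
def make_objects(center, n=20):
    visual = center + 0.1 * rng.standard_normal((n, 3))
    haptic = center + 0.1 * rng.standard_normal((n, 2))
    return np.hstack([visual, haptic])

data = np.vstack([make_objects(0.0), make_objects(1.0)])

def kmeans(x, k=2, iters=20):
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([
            x[labels == j].mean(0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels

labels = kmeans(data)   # objects 0-19 and 20-39 should form two categories
```

The point of the sketch is only that category structure can emerge from the joint feature space with no labels; associating the resulting categories with discovered word forms is where the SER-specific machinery comes in.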

    Dynamics of perceptual learning in visual search

    The present work is concerned with a phenomenon referred to as contextual cueing. In visual search, if a searched-for target object is consistently encountered within a stable spatial arrangement of distractor objects, detecting the target becomes more efficient over time, relative to non-repeated, random arrangements. This effect is attributed to learned target-distractor spatial associations stored in long-term memory, which expedite visual search. This thesis investigates four aspects of contextual cueing. Study 1 tackled the implicit-explicit debate of contextual cueing from a new perspective. Previous studies tested explicit access to learned displays by applying a recognition test, asking observers whether they had seen a given display in the previous search task. These tests, however, typically yield mixed findings, and there is an ongoing controversy over whether contextual cueing is an implicit or an explicit effect. The current study applied the new perspective of metacognition to contextual cueing and combined a contextual cueing task with metacognitive ratings about the clarity of the visual experience, either of the display configuration or of the target stimulus. Bayesian analysis revealed an effect of repeated context on metacognitive sensitivity for configuration, but not target, ratings. It was concluded that effects of contextual memory on metacognition are content-specific and lead to increased metacognitive access to the display configuration, but not to the target stimulus. The more general implication is that, from the perspective of metacognition, contextual cueing can be considered an explicit effect.
    Study 2 aimed at testing how explicit knowledge affects memory-guided visual search. Two sets of search displays were shown to participants: explicit and implicit displays. Explicit displays were introduced prior to the search experiment, in a dedicated learning session, and observers were instructed to learn them deliberately. Implicit displays, on the other hand, were first shown in the search experiment, and learning was incidental through repeated exposure. Contextual cueing arising from explicit and implicit displays was assessed relative to a baseline condition of non-repeated displays. The results showed a standard contextual cueing effect for explicit displays and, interestingly, a negative cueing effect for implicit displays. Recognition performance was above chance for both types of repeated displays, but higher for explicit displays. This pattern of results confirmed, in part, the predictions of a single memory model of attention-moderated associative learning, in which different display types compete for behavior and explicit representations block the retrieval of implicit representations.
    Study 3 investigated interactions of long-term contextual memory with short-term perceptual hypotheses. Both types of perceptual memory are highly similar with respect to their content; the hypothesis was therefore formulated that they share a common memory resource. In three experiments of interrupted search with repeated and non-repeated displays, it was shown that contextual cueing expedites performance in interrupted search; however, there was no interaction of contextual cueing with the generation or confirmation of perceptual hypotheses. Rather, the analysis of fixational eye movements showed that long-term memory exerts its influence on search performance upon the first glance at a given display, essentially affecting the starting point of the search process. The behavior of approaching the target stimulus is then a product of generating and confirming perceptual hypotheses, with these processes being unaffected by long-term contextual memory. It was concluded that long-term and short-term memory representations of the same search display are independent and exhibit additive effects on search performance.
    Study 4 is concerned with the effects of reward on perceptual learning. Previous work argued that rewarding repeated displays in a contextual cueing paradigm accelerates the learning effect; however, it did not consider whether reward also has an effect in non-repeated displays. In these displays, at least the target position is kept constant while distractor configurations are random across repetitions; usually this is done to account for target-position-specific probability learning in contextual cueing. However, it is possible that probability learning itself is modulated by reward. The current experiment introduced high or low reward to repeated and, importantly, also non-repeated displays. It was shown that reward had a marked effect on non-repeated displays, indicating that rewarding certain target positions, irrespective of the distractor layout, facilitates RT performance. Interestingly, reward effects were even larger for non-repeated than for repeated displays. It was concluded that reward has a strong effect on probability learning rather than context learning.
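The contextual cueing effect throughout these studies is quantified as the reaction-time difference between non-repeated and repeated displays. A minimal sketch with invented RT data (the learning rates and noise level are assumptions, not values from the thesis):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical mean reaction times (ms) per epoch: repeated displays
# benefit from context learning, non-repeated ones only from practice.
epochs = np.arange(5)
rt_repeated = 900 - 40 * epochs + rng.normal(0, 10, 5)  # context learning
rt_random = 900 - 5 * epochs + rng.normal(0, 10, 5)     # practice only

# Cueing effect: positive values = repeated displays are searched faster.
cueing_effect = rt_random - rt_repeated
```

Plotting `cueing_effect` against epoch is the standard way the build-up of contextual learning is visualized; the negative cueing effect found for implicit displays in Study 2 corresponds to this difference going below zero.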

    Sensorimotor Modulations by Cognitive Processes During Accurate Speech Discrimination: An EEG Investigation of Dorsal Stream Processing

    Internal models mediate the transmission of information between anterior and posterior regions of the dorsal stream in support of speech perception, though it remains unclear how this mechanism responds to cognitive processes in service of task demands. The purpose of the current study was to identify the influences of attention and working memory on sensorimotor activity across the dorsal stream during speech discrimination, with set size and signal clarity employed to modulate stimulus predictability and the time course of increased task demands, respectively. Independent Component Analysis of 64-channel EEG data identified bilateral sensorimotor mu and auditory alpha components from a cohort of 42 participants, indexing activity from anterior (mu) and posterior (auditory) aspects of the dorsal stream. Time-frequency (ERSP) analysis evaluated task-related changes in focal activation patterns, with phase coherence measures employed to track patterns of information flow across the dorsal stream. ERSP decomposition of mu clusters revealed event-related desynchronization (ERD) in the beta and alpha bands, interpreted as evidence of forward (beta) and inverse (alpha) internal modeling across the time course of perception events. Stronger pre-stimulus mu alpha ERD in small-set discrimination tasks was interpreted as more efficient attentional allocation due to the reduced sensory search space enabled by predictable stimuli. Mu-alpha and mu-beta ERD in the peri- and post-stimulus periods were interpreted within the framework of Analysis by Synthesis as evidence of working memory activity for stimulus processing and maintenance, with weaker activity in degraded conditions suggesting that covert rehearsal mechanisms are sensitive to the quality of the stimulus being retained in working memory. Similar ERSP patterns across conditions, despite the differences in stimulus predictability and clarity, suggest that subjects may have adapted to the tasks. In light of this, future studies of sensorimotor processing should consider the ecological validity of the tasks employed, as well as the larger cognitive environment in which tasks are performed. The absence of interpretable patterns of mu-auditory coherence modulation across the time course of speech discrimination highlights the need for more sensitive analyses to probe dorsal stream connectivity.
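As a rough illustration of the ERSP logic underlying the ERD findings, band power can be computed in sliding windows over epoched data and expressed in dB relative to the pre-stimulus baseline. The synthetic 10 Hz suppression, sampling rate, and window sizes below are all assumptions for the sketch, not the study's actual analysis pipeline.

```python
import numpy as np

FS = 256                                   # sampling rate (Hz), assumed
t = np.arange(-0.5, 1.0, 1 / FS)           # epoch: 0.5 s baseline, 1 s post
rng = np.random.default_rng(2)

# Synthetic epochs: a 10 Hz (mu/alpha) oscillation whose amplitude drops
# after stimulus onset (event-related desynchronization), plus noise.
amp = np.where(t < 0, 1.0, 0.4)
trials = np.array([
    amp * np.sin(2 * np.pi * 10 * t + rng.uniform(0, 2 * np.pi))
    + 0.3 * rng.standard_normal(t.size)
    for _ in range(30)
])

WIN, HOP = 64, 32
freqs = np.fft.rfftfreq(WIN, 1 / FS)
band = (freqs >= 8) & (freqs <= 12)        # alpha band

times, power = [], []
for i in range(0, t.size - WIN + 1, HOP):
    seg = trials[:, i:i + WIN] * np.hanning(WIN)
    spec = np.abs(np.fft.rfft(seg, axis=1)) ** 2
    times.append(t[i + WIN // 2])
    power.append(spec[:, band].mean())     # trial-averaged alpha power
times, power = np.array(times), np.array(power)

# ERSP: dB change of power relative to the pre-stimulus baseline mean.
baseline = power[times < 0].mean()
ersp = 10 * np.log10(power / baseline)
```

Negative post-stimulus `ersp` values correspond to the ERD reported in the abstract; a real pipeline would use ICA-derived component activations and many more frequency bins rather than a single averaged band.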