790 research outputs found

    GazeStereo3D: seamless disparity manipulations

    Producing a high quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically affects the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly estimated. We propose a new method for stereoscopic depth adjustment that utilizes eye tracking or other gaze prediction information. The key idea that distinguishes our approach from the previous work is to apply gradual depth adjustments at the eye fixation stage, so that they remain unnoticeable. To this end, we measure the limits imposed on the speed of disparity changes in various depth adjustment scenarios, and formulate a new model that can guide such seamless stereoscopic content processing. Based on this model, we propose a real-time controller that applies local manipulations to stereoscopic content to find the optimum between depth reproduction and visual comfort. We show that the controller is mostly immune to the limitations of low-cost eye tracking solutions. We also demonstrate benefits of our model in off-line applications, such as stereoscopic movie production, where skillful directors can reliably guide and predict viewers' attention or where attended image regions are identified during eye tracking sessions. We validate both our model and the controller in a series of user experiments. They show significant improvements in depth perception without sacrificing the visual quality when our techniques are applied
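    The idea of gradual, speed-limited disparity changes described in the abstract above can be illustrated with a minimal sketch. This is not the authors' model or controller: the function names, the single constant speed limit `max_speed`, and the per-frame update are all assumptions made for illustration, whereas the abstract describes limits that are measured for different adjustment scenarios.

```python
def step_disparity(current, target, max_speed, dt):
    """Move `current` toward `target` by at most max_speed * dt.

    Hypothetical helper: `max_speed` stands in for a measured limit on
    how fast disparity may change without the viewer noticing.
    """
    max_step = max_speed * dt
    delta = target - current
    if abs(delta) <= max_step:
        return target  # close enough to reach the target in this frame
    return current + max_step if delta > 0 else current - max_step


def run_to_target(current, target, max_speed, dt, max_frames=10_000):
    """Apply step_disparity once per frame and record the trajectory."""
    trajectory = [current]
    for _ in range(max_frames):
        if current == target:
            break
        current = step_disparity(current, target, max_speed, dt)
        trajectory.append(current)
    return trajectory
```

    A real controller would also pick the target disparity from the current gaze position; here the target is simply given.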

    Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future

    Clark offers a powerful description of the brain as a prediction machine, which offers progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, hierarchical prediction offers progress on a concrete descriptive level for testing and constraining conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models)

    Evolution in 3D

    PhD Thesis. This thesis explores the mechanisms underlying motion vision in the praying mantis (Sphodromantis lineola) and how this visual predator perceives camouflaged prey. By recording the mantis optomotor response to wide-field motion, I was able to define the mantis Dmax, the point where a pattern is displaced by such a distance that coherent motion is no longer perceived. This allowed me to investigate the spatial characteristics of the insect wide-field motion processing pathway. The insect Dmax was found to be very similar to that observed in humans, which suggests similar underlying motion processing mechanisms, whereby low spatial frequency local motion is pooled over a larger visual area than higher spatial frequency motion. By recording the mantis tracking response to computer-generated targets, I was able to investigate whether there are any benefits of background matching when prey are moving and whether pattern influences the predatory response of the mantis towards prey. I found that only prey with large pattern elements benefit from background matching during movement and, above all, that prey which remain unpatterned but match the mean luminance of the background receive the greatest survival advantage. Additionally, I examined the effects of background motion on the tracking response of the mantis towards moving prey. By using a computer-generated target as prey, I investigated the benefits associated with matching background motion as a protective strategy to reduce the risk of detection by predators. I found the mantis was able to successfully track a moving target in the presence of background motion. My results suggest that although there are no overall benefits for prey to match background motion, it is costly to move out of phase with the background motion. Finally, I examined the contrast sensitivity of the mantis wide-field and small-target motion detection pathways. 
Using the mantis tracking response to small targets and the optomotor response to wide-field motion, I measured the distinct temporal and spatial signatures of each pathway. I found the mantis wide-field and small-target motion detection pathways are each tuned to a different set of spatial and temporal frequencies. The wide-field motion detection pathway has a high sensitivity to a broad range of spatio-temporal frequencies, making it sensitive to a broad range of velocities, whereas the small-target motion detection pathway has a high sensitivity to a narrow set of spatio-temporal combinations, with optimal sensitivity to targets with a low spatial frequency

    Augmented reality and scene examination

    The research presented in this thesis explores the impact of Augmented Reality on human performance, and compares this technology with Virtual Reality using a head-mounted video-feed for a variety of tasks that relate to scene examination. The motivation for the work was the question of whether Augmented Reality could provide a vehicle for training in crime scene investigation. The Augmented Reality application was developed using fiducial markers in the Windows Presentation Foundation, running on a wearable computer platform; Virtual Reality was developed using the Crytek game engine to present a photo-realistic 3D environment; and a video-feed was provided through a head-mounted webcam. All media were presented through head-mounted displays of similar resolution to provide the sole source of visual information to participants in the experiments. The experiments were designed to increase the amount of mobility required to conduct the search task, i.e., from rotation in the horizontal or vertical plane through to movement around a room. In each experiment, participants were required to find objects and subsequently recall their location. It is concluded that human performance is affected not merely by the medium through which the world is perceived but also by the constraints governing how movement in the world is controlled

    Vision-Guided Robot Hearing

    Natural human-robot interaction (HRI) in complex and unpredictable environments is important and has many potential applications. While vision-based HRI has been thoroughly investigated, robot hearing and audio-based HRI are emerging research topics in robotics. In typical real-world scenarios, humans are at some distance from the robot and hence the sensory (microphone) data are strongly impaired by background noise, reverberations and competing auditory sources. In this context, the detection and localization of speakers play a key role, enabling several tasks, such as improving the signal-to-noise ratio for speech recognition, speaker recognition, speaker tracking, etc. In this paper we address the problem of how to detect and localize people that are both seen and heard. We introduce a hybrid deterministic/probabilistic model. The deterministic component allows us to map 3D visual data onto a 1D auditory space. The probabilistic component of the model enables the visual features to guide the grouping of the auditory features in order to form audiovisual (AV) objects. The proposed model and the associated algorithms are implemented in real time (17 FPS) using a stereoscopic camera pair and two microphones embedded into the head of the humanoid robot NAO. We perform experiments with (i) synthetic data, (ii) publicly available data gathered with an audiovisual robotic head, and (iii) data acquired using the NAO robot. The results validate the approach and are an encouragement to investigate how vision and hearing could be further combined for robust HRI
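    One common way to map a 3D point onto a 1D auditory quantity with two microphones is the time difference of arrival between them. The sketch below assumes that this is the kind of mapping meant; the abstract does not name the auditory cue, so the function `itd`, the microphone positions, and the speed-of-sound constant are illustrative assumptions rather than the paper's model.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value


def itd(point, mic_left, mic_right, c=SPEED_OF_SOUND):
    """Map a 3D point to a 1D time difference of arrival (seconds).

    Positive values mean the sound reaches the right microphone first.
    All positions are (x, y, z) tuples in metres in a shared frame.
    """
    d_left = math.dist(point, mic_left)
    d_right = math.dist(point, mic_right)
    return (d_left - d_right) / c


# Hypothetical geometry: microphones 10 cm apart along the x axis.
MIC_LEFT = (-0.05, 0.0, 0.0)
MIC_RIGHT = (0.05, 0.0, 0.0)

# A point straight ahead is equidistant from both microphones.
front = itd((0.0, 0.0, 1.0), MIC_LEFT, MIC_RIGHT)
# A point far to the right is closer to the right microphone.
side = itd((1.0, 0.0, 0.0), MIC_LEFT, MIC_RIGHT)
```

    With such a mapping, every visually detected 3D point gets a predicted auditory value, which is what lets visual features guide the grouping of auditory observations.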

    Working and Learning with Knowledge in the Lobes of a Humanoid's Mind

    Humanoid-class robots must have sufficient dexterity to assist people and work in an environment designed for human comfort and productivity. This dexterity, in particular the ability to use tools, requires a cognitive understanding of self and the world that exceeds contemporary robotics. Our hypothesis is that the sense-think-act paradigm that has proven so successful for autonomous robots is missing one or more key elements that will be needed for humanoids to meet their full potential as autonomous human assistants. This key ingredient is knowledge. The presented work includes experiments conducted on the Robonaut system, a joint project of NASA and the Defense Advanced Research Projects Agency (DARPA), and includes collaborative efforts with a DARPA Mobile Autonomous Robot Software technical program team of researchers at NASA, MIT, USC, NRL, UMass and Vanderbilt. The paper reports on results in the areas of human-robot interaction (human tracking, gesture recognition, natural language, supervised control), perception (stereo vision, object identification, object pose estimation), autonomous grasping (tactile sensing, grasp reflex, grasp stability) and learning (human instruction, task level sequences, and sensorimotor association)

    The emergence of active perception - seeking conceptual foundations

    The aim of this thesis is to explain the emergence of active perception. It takes an interdisciplinary approach, by providing the necessary conceptual foundations for active perception research: the key notions that bridge the conceptual gaps remaining in understanding emergent behaviours of active perception in the context of robotic implementations. On the one hand, the autonomous agent approach to mobile robotics claims that perception is active. On the other hand, while explanations of emergence have been extensively pursued in Artificial Life, these explanations have not yet successfully accounted for active perception. The main question dealt with in this thesis is how active perception systems, as behaviour-based autonomous systems, are capable of providing relatively optimal perceptual guidance in response to environmental challenges, which are somewhat unpredictable. The answer is: task-level emergence on grounds of complicatedly combined computational strategies, but this notion needs further explanation. To study the computational strategies undertaken in active perception research, the thesis surveys twelve implementations. On the basis of the surveyed implementations, discussions in this thesis show that the perceptual task executed in support of bodily actions does not arise from the intentionality of a homunculus, but is identified automatically on the basis of the dynamic small modules of particular robotic architectures. The identified tasks are accomplished by quasi-functional modules and quasi-action modules, which maintain transformations of perceptual inputs, compute critical variables, and provide guidance of sensory-motor movements to the most relevant positions for fetching further needed information. 
Given the nature of these modules, active perception emerges in a different fashion from the global behaviour seen in other autonomous agent research. The quasi-functional modules and quasi-action modules cooperate by estimating the internal cohesion of various sources of information in support of the envisaged task. Specifically, such modules basically reflect various computational facilities for a species to single out the most important characteristics of its ecological niche. These facilities help to achieve internal cohesion, by maintaining a stepwise evaluation over the previously computed information, the required task, and the most relevant features presented in the environment. Apart from the above exposition of active perception, the process of task-level emergence is understood with certain principles extracted from four models of life origin. First, the fundamental structure of active perception is identified as the stepwise computation. Second, stepwise computation is promoted from baseline to elaborate patterns, i.e. from a simple system to a combinatory system. Third, a core requirement for all stepwise computational processes is the comparison between collected and needed information in order to ensure the contribution to the required task. Interestingly, this point indicates that active perception has an inherent pragmatist dimension. The understanding of emergence in the present thesis goes beyond the distinction between external processes and internal representations, which some current philosophers argue is required to explain emergence. The additional factors are links of various knowledge sources, in which the role of conceptual foundations is two-fold. On the one hand, those conceptual foundations elucidate how various knowledge sources can be linked. On the other, they make possible an interdisciplinary view of emergence. Given this two-fold role, this thesis shows the unity of task-level emergence. 
Thus, the thesis demonstrates a cooperation between science and philosophy for the purpose of understanding the integrity of emergent cognitive phenomena

    Development and testing of an interactive environment for enhancing visuo-motor coordination in children with severe low vision

    In a modern society based on the ability to see, vision plays a critical role in every moment and stage of a person's life. Unfortunately, not everyone "sees" in the same way. 
With a multidisciplinary team including computer engineers and therapists from the Robert Hollman Foundation, a series of digital mini-games, explicitly aimed at children with visual impairment, was designed and developed with the aim of improving their cognitive and/or motor-sensory skills. This thesis analyses the requirements of the games, which called for careful and detailed design that took into account the characteristics and needs of the operators (therapists) and players. It also details the implementation of three games based on a large-scale interactive environment that are played by projecting the field onto the floor. A motion capture system placed above this area tracks the position of the children. The players' movements within the field are used to make them interact with the game elements, producing appropriate visual and auditory outputs. Finally, the usability and functionality of the system are discussed through the analysis of data collected during a pilot study. Four therapists and eleven children took part, using the system in a specially prepared environment. The results allowed the team to improve the product and to define a set of guidelines useful for therapists, designers, and developers