
    Study of Human Hand-Eye Coordination Using Machine Learning Techniques in a Virtual Reality Setup

    Theories of visually guided action are characterized as closed-loop control in the presence of reliable sources of visual information, and as predictive control to compensate for visuomotor delay and temporary occlusion. However, prediction is not well understood. To investigate it, a series of studies was designed to characterize the role of predictive strategies in humans as they perform visually guided actions, and to guide the development of computational models that capture these strategies. During data collection, subjects were immersed in a virtual reality (VR) system and tasked with using a paddle to intercept a virtual ball. To force subjects into a predictive mode of control, the ball was occluded or made invisible for a portion of its 3D parabolic trajectory. The subjects' gaze, hand, and head movements were recorded during the performance. To improve the quality of gaze estimation, new algorithms were developed for the measurement and calibration of spatial and temporal errors of an eye tracking system. Analysis of the subjects' gaze and hand movements revealed that, when the temporal constraints of the task did not allow the subjects to use closed-loop control, they utilized a short-term predictive strategy. Insights gained through behavioral analysis were formalized into computational models of visual prediction using machine learning techniques. In one study, LSTM recurrent neural networks were utilized to explain how information is integrated and used to guide predictive movement of the hand and eyes. In a subsequent study, subject data were used to train an inverse reinforcement learning (IRL) model that captures the full spectrum of strategies from closed-loop to predictive control of gaze and paddle placement. A comparison of recovered reward values between occlusion and no-occlusion conditions revealed a transition from online to predictive control strategies within a single course of action. This work has shed new light on the predictive strategies that guide our eye and hand movements.
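
    The LSTM component of the abstract lends itself to a short illustration. Below is a minimal sketch, not the authors' code, of a recurrent network that maps a window of recent 3D ball positions to predicted gaze and hand positions; the architecture, shapes, and hyperparameters are illustrative assumptions.

    ```python
    # Hypothetical sketch: an LSTM that predicts 3D gaze + 3D hand position
    # from a short history of (possibly occluded) ball positions.
    import torch
    import torch.nn as nn

    class GazeHandPredictor(nn.Module):
        def __init__(self, in_dim=3, hidden=64, out_dim=6):  # assumed sizes
            super().__init__()
            self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, out_dim)  # 3D gaze + 3D hand

        def forward(self, ball_xyz):            # (batch, time, 3)
            h, _ = self.lstm(ball_xyz)
            return self.head(h[:, -1])          # predict at the last time step

    # Example: predict from 30 frames of ball positions for a batch of 8 trials.
    model = GazeHandPredictor()
    pred = model(torch.randn(8, 30, 3))         # -> shape (8, 6)
    ```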

    High Fidelity Immersive Virtual Reality


    Estimating motion and time to contact in 3D environments: Priors matter

    Until now, an extensive amount of research has examined how humans estimate motion or task parameters, such as the time-to-contact, in simple scenarios. However, most of it avoids asking how we extract 3D information from 2D optic information. A Bayesian approach, combining optic information with prior knowledge about statistical regularities of the environment, would resolve the ambiguity in translating 2D cues into 3D estimates. This dissertation analyses whether the estimation of motion and time-to-contact in complex 3D environments is compatible with a combination of visual and prior information. In the first study, we analyse the predictions of a Bayesian model with a preference for slow speeds when estimating the direction of an object. The information available for judging motion in depth is much less precise than the information about lateral motion. When both sources of information are combined with a prior favouring low speeds, estimates of motion in depth are attracted proportionally more toward low speeds than estimates of lateral motion, so the perceived direction would depend on stimulus speed. Our experimental results showed that the bias in perceived direction increased at higher speeds, congruent with increasingly imprecise motion estimates (consistent with Weber's law). In the second study, we analyse the existing evidence on the use of a priori knowledge of the Earth's gravitational acceleration and of object size to estimate the time-to-contact of parabolic trajectories. We then simulate predictions of the GS model, which predicts the time-to-contact from a combination of prior variables (gravity and ball size) and optic variables. We compare the accuracy of its time-to-contact predictions with an alternative using only optic variables, showing that relying on priors for gravity and ball size resolves the ambiguity in the estimation of the time-to-contact. Finally, we describe scenarios in which the GS model would make systematic errors, which we test in the following studies. In the third study, we created trajectories for which the GS model gives accurate predictions of the time-to-contact at particular flight times but systematic errors at any other time. We hypothesized that if the ball's visibility were restricted to a short time window, participants would prefer to see the ball during the windows in which the model's predictions are accurate. Our results showed that observers preferred a relatively constant ball viewing time. However, the direction of the errors participants made for the different trajectories tested corresponded to the direction predicted by the GS model. In the fourth and final study, we investigated the role of a priori knowledge of the Earth's gravitational acceleration and ball size in estimating the flight time and the direction of motion of an observer toward the interception point. Participants were placed in an environment where both gravitational acceleration and ball size were randomized trial-to-trial. The observers' task was to move toward the interception point and predict the remaining flight time after a short occlusion. Our results provide evidence for the use of prior knowledge of gravity and ball size to estimate the time-to-contact. We also find evidence that gravitational acceleration may play a role in guiding locomotion toward the interception point. In summary, this thesis contributes to answering a fundamental question in perception: how we interpret information to act in the world. We show evidence that humans apply their knowledge of environmental regularities, in the form of a priori knowledge of the Earth's gravitational acceleration, the size of the ball, or the assumption that objects stand still in the world, when interpreting visual information.
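
    The first study's slow-speed prior admits a compact illustration. The sketch below, under simple Gaussian assumptions with illustrative numbers (not the dissertation's model or data), shows how combining a noisy in-depth speed estimate and a more precise lateral estimate with a zero-mean prior shrinks the in-depth component more, biasing the perceived direction laterally.

    ```python
    # Minimal Bayesian shrinkage sketch: Gaussian likelihood x zero-mean
    # Gaussian prior gives a posterior mean pulled toward zero (slow speeds).
    import numpy as np

    def posterior_mean(measured, sigma_like, sigma_prior):
        w = sigma_prior**2 / (sigma_prior**2 + sigma_like**2)
        return w * measured

    v_lateral, v_depth = 2.0, 2.0      # true speeds (m/s): a 45-degree direction
    sig_lat, sig_depth = 0.2, 1.0      # in-depth measurement is far less precise
    sig_prior = 1.5                    # width of the slow-speed prior (assumed)

    est_lat = posterior_mean(v_lateral, sig_lat, sig_prior)
    est_dep = posterior_mean(v_depth, sig_depth, sig_prior)
    true_dir = np.degrees(np.arctan2(v_depth, v_lateral))
    perc_dir = np.degrees(np.arctan2(est_dep, est_lat))
    print(f"true {true_dir:.1f} deg, perceived {perc_dir:.1f} deg")  # biased laterally
    ```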

    Technology

    Session WA3 includes short reports concerning: (1) Physiolab: A Cardiovascular Laboratory; (2) MEDEX: A Flexible Modular Physiological Laboratory; (3) A Sensate Liner for Personnel Monitoring Applications; (4) Secure Remote Access to Physiological Data; (5) DARA Vestibular Equipment Onboard MIR; (6) The Kinelite Project: A New Powerful Motion Analysis System for Spacelab Mission; (7) The Technical Evolution of the French Neurosciences Multipurpose Instruments Onboard the MIR Station; (8) Extended Ground-Based Research in Preparation for Life Sciences Experiments; and (9) the MEDES Clinical Research Facility as a Tool to Prepare ISSA Space Flights.

    Directing Attention in an Augmented Reality Environment: An Attentional Tunneling Evaluation

    Augmented Reality applications use explicit cuing to support visual search. Explicit cues can improve visual search performance, but they can also cause perceptual issues such as attentional tunneling. An experiment was conducted to evaluate the relationship between directing attention and attentional tunneling in a dual-task structure: one task was tracking a moving target, and the other was detecting non-target elements. Three conditions were tested: a baseline without cuing the target, cuing the target with the average scene color, and cuing with red. Different cue colors were used to vary the level of attentional tunneling. The results show that directing attention induced attentional tunneling only in the red condition, and that this effect is attributable to the color used for the cue.

    Gravity and known size calibrate visual information to time parabolic trajectories

    Catching a ball in parabolic flight is a complex task in which the time and area of interception are strongly coupled, making interception possible only for a short period. Although this makes the estimation of time-to-contact (TTC) from visual information in parabolic trajectories very useful, previous attempts to explain our precision in interceptive tasks circumvent the need to estimate TTC to guide action. Obtaining TTC from optical variables alone in parabolic trajectories would imply very complex transformations from 2D retinal images to a 3D layout. Based on previous work, we propose, and show using simulations, that exploiting prior distributions of gravity and known physical size makes these transformations much simpler, enabling predictive capacities from minimal early visual information. Optical information is inherently ambiguous, and this is where prior information comes into play: it could help interpret and calibrate visual information to yield meaningful predictions of the remaining TTC. The objectives of this work are: (1) to describe the primary sources of information available to the observer in parabolic trajectories; (2) to unveil how prior information can be used to disambiguate the sources of visual information within a Bayesian encoding-decoding framework; (3) to show that such predictions might be robust against complex dynamic environments; and (4) to indicate future lines of research to scrutinize the role of prior knowledge in calibrating visual information and prediction for action control.
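
    As one way to make the argument concrete, the sketch below simulates how the two priors could disambiguate optics: known physical size converts angular size into distance, and a gravity prior converts two such height estimates into a predicted remaining TTC. The trajectory, observer position, and equations are illustrative assumptions, not the GS model's exact formulation.

    ```python
    # Illustrative 2D simulation: size prior recovers distance (d ~ s/theta),
    # gravity prior turns two height estimates into a remaining-TTC prediction.
    import numpy as np

    G, S = 9.81, 0.22                       # gravity prior (m/s^2), known ball diameter (m)
    OBS = np.array([12.0, 0.0])             # assumed observer position (m)

    def ball_pos(t, v0=(6.0, 9.0)):
        """Parabolic flight launched from the origin at t = 0."""
        return np.array([v0[0] * t, v0[1] * t - 0.5 * G * t**2])

    def optics(t):
        """Angular size and elevation angle of the ball seen by the observer."""
        p = ball_pos(t)
        d = np.linalg.norm(p - OBS)
        return S / d, np.arcsin((p[1] - OBS[1]) / d)  # small-angle size, elevation

    def predicted_ttc(t, dt=0.05):
        (th1, g1), (th2, g2) = optics(t), optics(t + dt)
        y1, y2 = (S / th1) * np.sin(g1), (S / th2) * np.sin(g2)  # heights via size prior
        vy = (y2 - y1) / dt - 0.5 * G * dt              # vertical velocity at t + dt
        return (vy + np.sqrt(vy**2 + 2 * G * y2)) / G   # root of y2 + vy*T - G*T^2/2 = 0

    t_view = 0.4
    true_ttc = 2 * 9.0 / G - (t_view + 0.05)            # remaining flight after t + dt
    print(f"predicted {predicted_ttc(t_view):.2f} s vs true {true_ttc:.2f} s")
    ```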

    The information for catching fly balls: judging and intercepting virtual balls in a CAVE

    Visually guided action implies the existence of information as well as a control law relating that information to movement. For ball catching, the Chapman Strategy, keeping the rate of change of the tangent of the elevation angle (d(tan(α))/dt) constant, leads a catcher to the right location at the right time to intercept a fly ball. Previous studies showed the ability to detect this information and the consistency of running patterns with the use of the strategy; however, only direct manipulation of the information can show its use. Participants were asked to intercept virtual balls in a Cave Automatic Virtual Environment (CAVE) or to judge whether balls would pass behind or in front of them. Catchers in the CAVE successfully intercepted virtual balls with their forehead. Furthermore, the timing of judgments was related to the patterns of changing d(tan(α))/dt. The advantages and disadvantages of a CAVE as a tool for studying interceptive action are discussed.
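
    The quantity at the heart of the Chapman Strategy is easy to verify numerically. The sketch below, with illustrative launch parameters, shows that for a catcher already standing at the landing point, tan(α) grows linearly in time, so d(tan(α))/dt is constant (equal to g/(2u) for horizontal launch speed u).

    ```python
    # Verify the Chapman Strategy geometry for a catcher at the landing point.
    import numpy as np

    g, u, v = 9.81, 8.0, 12.0              # gravity, horizontal and vertical launch speed
    T = 2 * v / g                          # total flight time
    catcher_x = u * T                      # landing point

    t = np.linspace(0.05, T - 0.05, 50)    # avoid division by zero at landing
    ball_x, ball_y = u * t, v * t - 0.5 * g * t**2
    tan_alpha = ball_y / (catcher_x - ball_x)

    rate = np.gradient(tan_alpha, t)       # d(tan(alpha))/dt
    print(f"rate min {rate.min():.4f}, max {rate.max():.4f}")  # ~constant
    print(f"g/(2u) = {g / (2 * u):.4f}")
    ```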

    Role of feedback in the accuracy of perceived direction of motion-in-depth and control of interceptive action

    We quantified the accuracy of the perception of the absolute direction of motion-in-depth (MID) of a simulated approaching object using a perceptual task, and compared those data with the accuracy of estimating the passing distance measured by means of a simulated catching task. For the simulated catching task, movements of the index finger and thumb of the observer's hand were tracked as participants tried to "catch" the simulated approaching object. A sensation of MID was created by providing monocular and/or binocular retinal image information. Visual stimuli were identical for the perceptual and simulated catching tasks. We confirm previous reports that in the perceptual task, observers judged the object to pass wider of the head than indicated by the visual information provided. Although accuracy improved when auditory feedback was added to the perceptual (button pressing) task, consistent overestimates were still recorded. For the no-feedback simulated catching task, observers consistently overreached, i.e., the hand was further from the midline than the simulated object at the time of hand closure. When auditory feedback was added to the simulated catching task, successful catching was achieved. The relative accuracy in binocular and monocular conditions for individual observers could be partially explained by individual differences in sensitivity to unidirectional changes in angular size and changes in relative disparity. We conclude that catching an approaching ball requires that errors in the perceived direction of MID be corrected by feedback-driven learning in the motor system, and that this learning is more easily achieved for the catching action than for button pressing.

    Investigation of vision strategies used in a dynamic visual acuity task

    Purpose: Dynamic visual acuity (DVA), the ability to resolve fine details of a moving target, requires spatial resolution and accurate oculomotor control. Individuals who engage in activities in highly dynamic visual environments are thought to have superior dynamic visual acuity and to utilize different gaze behaviours (fixations, smooth pursuits, and saccades). This study was designed to test the hypothesis that athletes and video game players (VGPs) have superior DVA to controls, and to investigate why DVA may differ between groups. Methods: A pre-registered, cross-sectional study examined static visual acuity (SVA), DVA, smooth pursuit gains, and gaze behaviours (fixations, smooth pursuits, and saccades) in 46 emmetropic participants (15 athletes, 11 VGPs, and 20 controls). Athletes were members of varsity teams (or equivalent) who had played dynamic sports (such as hockey, soccer, and baseball) for more than 1 year, with current participation of more than 6 hours per week. VGPs played action video games four times per week for a minimum of one hour per day. Controls did not play sports or video games. SVA (LogMAR) was tested with an Early Treatment Diabetic Retinopathy Study (ETDRS) chart. DVA (LogMAR; mov&, V&mp Vision Suite) was tested with Tumbling E optotypes that moved either horizontally (left to right) or randomly (Brownian motion) at 5°/s, 10°/s, 20°/s, or 30°/s. Task response time was measured by averaging the time taken to respond to each letter per trial (e.g., random 30°/s, horizontal 10°/s), indicating the time required for a motor response to occur. Smooth pursuit gains were tested with an El-Mar eye tracker while participants completed a step-ramp task at the same velocities as the DVA task; a one-way independent-measures ANOVA was used to analyze smooth pursuits. Relative duration of gaze behaviours was measured with an Arrington eye tracker while participants performed the DVA task. A one-way independent-measures ANOVA was used to test for group differences in SVA, a one-way ANOVA was used to test for group and speed differences in DVA, and a repeated-measures two-way ANOVA was used to compare gaze behaviours over the first five and last five letters at the 30°/s velocity. Results: SVA was not significantly different between groups (p=0.595). Random motion DVA at 30°/s was significantly different between groups (p=0.039), specifically between athletes and controls (p=0.030); thus, athletes were better than controls at random 30°/s. Horizontal motion DVA at 30°/s was also significantly different between groups (p=0.031); post-hoc analysis revealed a significant difference between athletes and VGPs (p=0.046), suggesting that athletes were better than VGPs at horizontal 30°/s. DVA task response time per letter was not significantly different between groups for horizontal motion at 30°/s (p=0.707) or random motion at 30°/s (p=0.723); therefore, motor response times were similar between groups for both motion types. Smooth pursuit gains were not significantly different between groups at 30°/s (p=0.100), indicating similar physiological eye movements. Gaze behaviours for horizontal motion at 30°/s did not differ significantly between groups for fixations (p=0.598), smooth pursuits (p=0.226), or saccades (p=0.523). Similarly, there was no significant difference in gaze behaviours for random motion at 30°/s between groups for fixations (p=0.503), smooth pursuits (p=0.481), or saccades (p=0.507). Thus, gaze behaviours for horizontal and random motion were similar for all groups. Conclusion: Athletes exhibited superior DVA for randomly moving targets compared to controls, and superior DVA for horizontally moving targets compared to VGPs. Task response times, gaze behaviours, and smooth pursuit gains did not differ significantly between groups and therefore cannot explain the superior DVA displayed by the athletes. Further research is required to determine why DVA in athletes is superior at 30°/s.
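
    For readers unfamiliar with the main test used above, the sketch below runs a one-way independent-measures ANOVA on synthetic data (not the study's): group sizes match the abstract, but the scores are hypothetical.

    ```python
    # One-way independent-measures ANOVA across three groups, on synthetic data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Hypothetical LogMAR DVA scores at random 30 deg/s for each group.
    athletes = rng.normal(0.35, 0.08, 15)
    vgps     = rng.normal(0.42, 0.08, 11)
    controls = rng.normal(0.45, 0.08, 20)

    f, p = stats.f_oneway(athletes, vgps, controls)
    print(f"F = {f:.2f}, p = {p:.3f}")  # p < .05 would motivate post-hoc tests
    ```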

    Novel haptic interface for viewing 3D images

    In recent years there has been an explosion of devices and systems capable of displaying stereoscopic 3D images. While these systems provide an improved experience over traditional two-dimensional displays, they often fall short on user immersion, usually improving depth perception only by relying on the stereopsis phenomenon. We propose a system that improves user experience and immersion through position-dependent rendering of the scene and the ability to touch the scene. The system uses depth maps to represent the geometry of the scene. Depth maps can be easily obtained during the rendering process or derived from binocular-stereo images by calculating their horizontal disparity. This geometry is then used as input to render the scene on a 3D display, to perform the haptic rendering calculations, and to produce a position-dependent rendering of the scene. The author presents two main contributions. First, since haptic devices have a finite workspace and limited resolution, we use what we call detail mapping algorithms. These algorithms compress the geometry information contained in a depth map, by reducing the contrast among pixels, in such a way that it can be rendered on a limited-resolution display medium without losing detail. Second, the unique combination of a depth camera as a motion-capture system, a 3D display, and a haptic device to enhance the user experience. Throughout development we paid special attention to the cost and availability of the hardware, deciding to use only off-the-shelf, mass-consumer hardware so our experiments can be easily implemented and replicated. As an additional benefit, the total cost of the hardware did not exceed the one-thousand-dollar mark, making it affordable for many individuals and institutions.
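
    The detail-mapping idea can be sketched briefly. The code below compresses a depth map's global range to fit a small haptic workspace while retaining local relief, using log compression as one illustrative choice; the thesis' actual algorithms may differ, and the function name and parameters here are assumptions.

    ```python
    # Hypothetical detail-mapping sketch: compress a depth map's global range
    # so it fits a limited haptic workspace without flattening local detail.
    import numpy as np

    def compress_depth(depth, out_range=0.05, k=50.0):
        """Map raw depths into [0, out_range] metres with log compression."""
        d = (depth - depth.min()) / (np.ptp(depth) + 1e-9)  # normalize to [0, 1]
        d = np.log1p(k * d) / np.log1p(k)                   # compress large steps
        return d * out_range                                # fit the workspace

    depth = np.random.rand(480, 640) * 5.0                  # hypothetical 5 m scene
    print(compress_depth(depth).max())                      # <= 0.05 m workspace
    ```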