    Looking for genre: the use of structural features during search tasks with Wikipedia

    This paper reports on our task-based observational study, combining interaction logging, questionnaires and eye tracking, of how readers interact with the structural features of text in Wikipedia. We set natural and realistic search tasks on Wikipedia online, focusing on which features and which strategies (skimming or scanning) were most important for the participants to complete their tasks. Our research, carried out with a group of 30 participants, highlighted their interactions with the structural areas within Wikipedia articles and the visual cues and features perceived while searching the wiki text. We collected questionnaire and ocular behavior (fixation metrics) data to highlight the ways in which people view the features in the articles. We found that our participants interacted extensively with layout features such as tables, titles, bullet lists, contents lists, information boxes, and references. The eye tracking results showed that participants used the format and layout features, and the participants themselves also rated these features as important. Participants were able to navigate to useful information consistently, and the layout features were, with some success, an effective means of locating relevant information for the completion of their tasks. This work presents results which contribute to the long-term goals of studying these features for genre and theoretical perception research.
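    As an illustration of the kind of fixation-metric analysis described above, the sketch below aggregates fixation counts and total dwell time per layout feature (area of interest). It is a minimal, hypothetical example: the field names (feature, duration_ms) and the records are assumed for illustration and are not the study's data or analysis pipeline.

    from collections import defaultdict

    # Hypothetical fixation records: the layout feature (area of interest)
    # each fixation landed on, and its duration in milliseconds.
    fixations = [
        {"feature": "infobox", "duration_ms": 220},
        {"feature": "contents_list", "duration_ms": 180},
        {"feature": "infobox", "duration_ms": 340},
        {"feature": "table", "duration_ms": 410},
    ]

    def summarise_by_feature(fixations):
        """Return fixation count and total dwell time per layout feature."""
        summary = defaultdict(lambda: {"count": 0, "dwell_ms": 0})
        for fix in fixations:
            entry = summary[fix["feature"]]
            entry["count"] += 1
            entry["dwell_ms"] += fix["duration_ms"]
        return dict(summary)

    print(summarise_by_feature(fixations))
    # e.g. {'infobox': {'count': 2, 'dwell_ms': 560}, 'contents_list': ..., 'table': ...}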

    Understanding Eye Movements: Psychophysics and a Model of Primary Visual Cortex

    Humans move their eyes in order to learn visual representations of the world. These eye movements depend on distinct factors, driven either by the scene that we perceive or by our own decisions. Selecting what is relevant to attend to is part of our survival mechanisms and of the way we build reality, as we constantly react, both consciously and unconsciously, to all the stimuli that are projected onto our eyes. In this thesis we try to explain (1) how we move our eyes, (2) how to build machines that understand visual information and deploy eye movements, and (3) how to make these machines understand tasks in order to decide on eye movements.

    (1) We provide an analysis of eye movement behavior elicited by low-level feature distinctiveness, using a dataset of 230 synthetically generated image patterns. A total of 15 types of stimuli were generated (e.g. orientation, brightness, color, size), with 7 feature contrasts for each feature category. Eye-tracking data were collected from 34 participants during viewing of the dataset, under Free-Viewing and Visual Search task instructions. Results showed that saliency is predominantly and distinctively influenced by: 1. feature type, 2. feature contrast, 3. temporality of fixations, 4. task difficulty and 5. center bias. From this dataset (SID4VAM), we computed a benchmark of saliency models by testing their performance on psychophysical patterns. Model performance was evaluated considering model inspiration and consistency with human psychophysics. Our study reveals that state-of-the-art deep learning saliency models do not perform well on synthetic pattern images; instead, models with spectral/Fourier inspiration outperform the others on saliency metrics and are more consistent with human psychophysical experimentation.

    (2) Computations in the primary visual cortex (area V1, or striate cortex) have long been hypothesized to be responsible, among several visual processing mechanisms, for bottom-up visual attention (also named saliency). To validate this hypothesis, images from eye tracking datasets have been processed with a biologically plausible model of V1 (the Neurodynamic Saliency Wavelet Model, or NSWAM). Following Li's neurodynamic model, we define V1's lateral connections with a network of firing-rate neurons sensitive to visual features such as brightness, color, orientation and scale. Early subcortical processes (i.e. retinal and thalamic) are functionally simulated. The resulting saliency maps are generated from the model output, representing the neuronal activity of V1 projections towards brain areas involved in eye movement control. We stress that our unified computational architecture is able to reproduce several visual processes (i.e. brightness and chromatic induction, and visual discomfort) without applying any type of training or optimization and while keeping the same parametrization. The model has been extended (NSWAM-CM) with an implementation of the cortical magnification function to define the retinotopic projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return and selection mechanisms) are also proposed to predict attention under Free-Viewing and Visual Search conditions. Results show that our model outperforms other biologically inspired models at saliency prediction as well as at predicting visual saccade sequences, specifically for natural and synthetic images. We also show how temporal and spatial characteristics of inhibition of return can improve the prediction of saccades, and how distinct search strategies (in terms of feature-selective or category-specific inhibition) predict attention in distinct image contexts.

    (3) Although previous scanpath models have been able to efficiently predict saccades during Free-Viewing, it is well known that stimulus and task instructions can strongly affect eye movement patterns. In particular, task priming has been shown to be crucial to the deployment of eye movements, involving interactions between brain areas related to goal-directed behavior, working memory and long-term memory, in combination with stimulus-driven eye movement neuronal correlates. In our latest study we proposed an extension of the Selective Tuning Attentive Reference Fixation Controller model based on task demands (STAR-FCT), describing novel computational definitions of Long-Term Memory, Visual Task Executive and Task Working Memory. With these modules we are able to use textual instructions to guide the model to attend to specific categories of objects and/or places in the scene. We have designed our memory model by processing a visual hierarchy of low- and high-level features. The relationship between the executive task instructions and the memory representations is specified using a tree of semantic similarities between the learned features and the object category labels. Results reveal that with this model the resulting object localization maps and predicted saccades have a higher probability of falling inside the salient regions relevant to the distinct task instructions than saliency alone.
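    Benchmarks like the one described in part (1) score a model's saliency map by how well it separates fixated from non-fixated image locations. The sketch below computes a simple AUC-style score for a single image using a rank comparison between fixated pixels and randomly sampled pixels; it is a generic, illustrative metric with assumed inputs (a 2D saliency array and (row, col) fixation coordinates), not the exact metric suite used for SID4VAM.

    import numpy as np

    def saliency_auc(saliency_map, fixation_points, n_random=10000, seed=0):
        """AUC-style score: probability that a fixated pixel has higher
        saliency than a randomly sampled pixel (0.5 = chance level)."""
        rng = np.random.default_rng(seed)
        fix_vals = np.array([saliency_map[r, c] for (r, c) in fixation_points])
        flat = saliency_map.ravel()
        rand_vals = flat[rng.integers(0, flat.size, n_random)]
        # Pairwise comparison; ties count as half a win.
        wins = (fix_vals[:, None] > rand_vals[None, :]).mean()
        ties = (fix_vals[:, None] == rand_vals[None, :]).mean()
        return wins + 0.5 * ties

    # Toy usage: a map peaked exactly at the fixations scores close to 1.0,
    # while a flat or random map scores around 0.5.
    sal = np.zeros((64, 64))
    fixations = [(10, 12), (40, 33), (25, 50)]
    for (r, c) in fixations:
        sal[r, c] = 1.0
    print(round(saliency_auc(sal, fixations), 3))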

    Measuring gaze and pupil in the real world: object-based attention, 3D eye tracking and applications

    This dissertation contains studies on visual attention, as measured by gaze orientation, and on the use of mobile eye-tracking and pupillometry in applications. It combines the development of methods for mobile eye-tracking (studies II and III) with experimental studies on gaze guidance and pupillary responses in patients (studies IV and VI) and healthy observers (studies I and V).

    Object-based attention / Study I. What is the main factor guiding fixations in natural scenes: low-level features or objects? We developed a fixation-predicting model which models preferred viewing locations (PVL) per object and combines these distributions over all objects in the scene. Object-based fixation predictions for natural scene viewing perform on par with the best early saliency models, which are based on low-level features. However, when stimuli are manipulated so that low-level features and objects are dissociated, the prediction power of the saliency models diminishes. Thus, we dare to claim that highly developed saliency models implicitly capture object-hood, and that fixation selection is influenced mainly by objects and much less by low-level features. Consequently, attention guidance in natural scenes is object-based.

    3D tracking / Study II. The second study focussed on improving calibration procedures for eye-in-head positions with a mobile eye-tracker. We used a mobile eye-tracker prototype, the EyeSeeCam, which offers a high video-oculography (VOG) sampling rate and a rotatable camera that follows the user's gaze direction instantaneously. For better accuracy in eye positioning, we explored a refinement of the eye-in-head calibration that yields a measure of fixation distance, which led to a 3D calibration for the mobile eye-tracker. Additionally, by developing the analytical mechanics for parametrically reorienting the gaze-centred camera, the 3D calibration could be applied to reliably record gaze-centred videos. Such videos are suitable as stimuli for investigating gaze behaviour during object manipulation or object recognition from a real-world point-of-view (PoV) perspective. In fact, the 3D calibration produces higher accuracy in positioning the gaze-centred camera over the whole 3D visual range.

    Study III, eye-tracking methods. With a further development of the EyeSeeCam, we were able to record gaze-in-world data by superposing eye-in-head and head-in-world coordinates. This novel approach combines a few absolute head positions, extracted manually from the PoV video, with relative head shifts obtained by integrating the angular velocities and translational accelerations given by an inertial measurement unit (IMU) synchronized with the VOG data. Gaze-in-world data consist of room-referenced gaze directions and their origins within the environment. They allow fixation targets to be assigned easily using a 3D model of the measurement environment, which greatly streamlines fixation analysis.

    Applications / Study III. Daylight is an important perceptual factor for visual comfort, but it can also create discomfort glare during office work, so we developed a way to measure its behavioural influences. We are able to compare luminance distributions and fixations in a real-world setting by additionally recording indoor luminance variations, time-resolved, using luminance maps of a scene spanning 3π sr. Luminance evaluations in the workplace environment yield a well-controlled categorisation of different lighting conditions as well as a localisation and brightness measure of glare sources. We used common tasks such as reading, typing on a computer, making a phone call and thinking about a subject. The 3D model makes it possible to test for gaze distribution shifts in the presence of glare patches and for variations between lighting conditions. Here, a low-contrast lighting condition with no sun inside and a high-contrast lighting condition with direct sunlight inside were compared. When participants are not engaged in any visually focused task and task support is minimal, the dominant view directions are inclined towards the view outside the window under the low-contrast lighting condition, but this tendency is less apparent and sways more towards the inside of the room under the high-contrast lighting condition. This result indicates an avoidance of glare sources in gaze behaviour. In a second, more extensive series of experiments, the participants' subjective assessments of the lighting conditions will be included. Thus, the influence of glare can be analysed in more detail, and it can be tested whether visual discomfort judgements correlate with differences in gaze behaviour.

    Study IV. The advanced eye-tracker calibration found application in several subsequent projects; included in this dissertation is an investigation of patients suffering either from idiopathic Parkinson's disease or from progressive supranuclear palsy (PSP) syndrome. PSP's key symptom is a decreased ability to carry out vertical saccades, which is thus the main diagnostic feature for differentiating between the two forms of Parkinson's syndrome. By measuring ocular movements during a rapid (< 20 s) procedure with a standardized fixation protocol, we could successfully differentiate pre-diagnosed patients with idiopathic Parkinson's disease from those with PSP, and likewise PSP patients from healthy controls (HCs). In PSP patients, the EyeSeeCam detected prominent impairment of both saccade velocity and amplitude. Furthermore, we show the benefits of a mobile eye-tracking device for application in clinical practice.

    Study V. Decision-making is one of the basic cognitive processes of human behaviour and, as such, also evokes a pupil dilation. Since this dilation marks the temporal occurrence of the decision, we wondered whether individuals can read decisions from another person's pupil and thus become a mentalist. For this purpose, a modified version of the childhood game rock-paper-scissors was played with 3 prototypical opponents while their eyes were videotaped. These videos served as stimuli for further participants, who competed against them in rock-paper-scissors. Our results show that reading decisions from a competitor's pupil is possible and that players can raise their winning probability significantly above chance. This ability does not require training, only the instruction that the time of maximum pupil dilation is indicative of the opponent's choice. We therefore conclude that people could use the pupil to detect cognitive decisions in another individual if they are given explicit knowledge of the pupil's utility.

    Study VI. For patients with severe motor disabilities, a robust means of communication is a crucial factor for well-being. Locked-in syndrome (LiS) patients suffer from quadriplegia and lack the ability to articulate their voice, although their consciousness is fully intact. While classic and incomplete LiS allows at least voluntary vertical eye movements or blinks to be used for communication, total LiS patients are unable to perform such movements. What remains are involuntarily evoked muscle reactions, as is the case with the pupillary response. The pupil dilation reflects enhanced cognitive or emotional processing, which we successfully observed in LiS patients. Furthermore, we created a communication system based on yes-no questions, combined with the task of solving arithmetic problems during the answer interval matching the intended answer, which evokes a pupil dilation solid enough to be used on a trial-by-trial basis for decoding yes or no as the answer. Applied to HCs and to patients with various severe motor disabilities, this provides a proof of principle that pupil responses allow communication for all tested HCs and for 4 of 7 typical LiS patients.

    Résumé. Together, the methods established within this thesis are promising advances in measuring the allocation of visual attention with 3D eye-tracking in the real world and in the use of pupillometry as an on-line measurement of cognitive processes. The two most outstanding findings are the possibility of communicating with complete LiS patients and the conclusive evidence that objects are the primary unit of fixation selection in natural scenes.
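    The gaze-in-world composition described in Study III comes down to rotating the eye-in-head gaze direction by the current head-in-world orientation and attaching it to the head position in room coordinates. The following minimal sketch, with assumed variable names and a SciPy rotation, illustrates that idea only; it is not the dissertation's actual pipeline, which also integrates IMU angular velocities and translational accelerations to obtain the head pose.

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    def gaze_in_world(gaze_dir_head, head_orientation_quat, head_pos_world):
        """Superpose eye-in-head and head-in-world coordinates.

        gaze_dir_head:         unit gaze direction in head coordinates (from VOG).
        head_orientation_quat: head-in-world orientation as a quaternion [x, y, z, w]
                               (e.g. obtained from integrated IMU data).
        head_pos_world:        head (eye) position in room coordinates.
        Returns the gaze ray (origin, direction) in world coordinates.
        """
        r_head_to_world = R.from_quat(head_orientation_quat)
        gaze_dir_world = r_head_to_world.apply(gaze_dir_head)
        return np.asarray(head_pos_world), gaze_dir_world

    # Toy usage: looking straight ahead (+z in head coordinates) while the head
    # is rotated 90° about the vertical axis.
    origin, direction = gaze_in_world(
        gaze_dir_head=[0.0, 0.0, 1.0],
        head_orientation_quat=R.from_euler("y", 90, degrees=True).as_quat(),
        head_pos_world=[1.0, 1.5, 0.0],
    )
    print(origin, np.round(direction, 3))  # direction now points along +x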

    Investigating the effectiveness of an efficient label placement method using eye movement data

    This paper focuses on improving the efficiency and effectiveness of dynamic and interactive maps in relation to the user. A label placement method with improved algorithmic efficiency is presented. Since this algorithm influences the actual placement of the name labels on the map, we tested whether the more efficient algorithm also creates more effective maps, that is, how well the information is processed by the user. We tested 30 participants while they were working on a dynamic and interactive map display. Their task was to locate geographical names on each of the presented maps. Their eye movements were registered together with the time at which a given label was found. The gathered data reveal no difference in the users' response times, nor in the number and duration of fixations, between the two map designs. The results of this study show that the efficiency of label placement algorithms can be improved without disturbing the user's cognitive map. Consequently, we created a more efficient map without affecting its effectiveness for the user.
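    The abstract does not spell out the placement algorithm itself, but point-feature label placement is commonly framed as choosing, for each map feature, one of a few candidate label positions that does not overlap labels already placed. The greedy sketch below illustrates that general framing with hypothetical data structures and example feature names; it is not the method evaluated in the paper.

    from dataclasses import dataclass

    @dataclass
    class Box:
        """Axis-aligned label rectangle (lower-left corner, width, height)."""
        x: float
        y: float
        w: float
        h: float

        def overlaps(self, other: "Box") -> bool:
            return not (self.x + self.w <= other.x or other.x + other.w <= self.x or
                        self.y + self.h <= other.y or other.y + other.h <= self.y)

    def candidates(px, py, w, h):
        """Four standard candidate positions around a point feature:
        upper-right, upper-left, lower-right, lower-left of the point."""
        return [Box(px, py, w, h), Box(px - w, py, w, h),
                Box(px, py - h, w, h), Box(px - w, py - h, w, h)]

    def place_labels(features, w=40, h=12):
        """Greedy placement: for each feature, keep the first candidate
        position that does not overlap any label placed so far."""
        placed = []
        for name, px, py in features:
            for box in candidates(px, py, w, h):
                if all(not box.overlaps(other) for _, other in placed):
                    placed.append((name, box))
                    break
        return placed

    # Toy usage with three point features (name, x, y); crowded labels may be skipped.
    print(place_labels([("Gent", 100, 100), ("Brugge", 120, 105), ("Ieper", 300, 200)]))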

    The feasibility of capturing learner interactions based on logs informed by eye-tracking and remote observation studies

    Two small studies, one an eye-tracking study and the other a remote observation study, have been conducted to investigate ways to identify two kinds of online learner interactions: users flicking through the web pages in "browsing" action, and users engaging with the content of a page in "learning" action. The video data from four participants of the two small studies using the OpenLearn open educational resource materials offers some evidence for differentiating between 'browsing' and 'learning'. Further analysis of the data has considered possible ways of identifying similar browsing and learning actions based on automatic user logs. This research provides a specification for researching the pedagogical value of capturing and transforming logs of user interactions into external forms of representations. The paper examines the feasibility and challenge of capturing learner interactions giving examples of external representations such as sequence flow charts, timelines, and table of logs. The objective users information these represent offer potential for understanding user interactions both to aid design and improve feedback means that they should be given greater consideration alongside other more subjective ways to research user experience
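    One simple way to approximate the browsing/learning distinction from automatic logs, as discussed above, is to look at how long a user dwells on each page: short visits resemble flicking through, long visits resemble engagement with the content. The sketch below applies such a heuristic to a hypothetical page-view log, with an assumed 30-second threshold; the actual studies relied on richer video and log evidence, so this is illustrative only.

    # Hypothetical page-view log: (page, visit start time in seconds), ordered by time.
    log = [
        ("unit1/intro", 0),
        ("unit1/section2", 12),
        ("unit1/section3", 20),
        ("unit1/activity", 35),
        ("unit1/summary", 160),
    ]

    def classify_visits(log, learning_threshold_s=30):
        """Label each visit 'learning' if the user stayed at least
        learning_threshold_s seconds before moving on, else 'browsing'.
        The final visit has no successor, so its dwell time is unknown and skipped."""
        labelled = []
        for (page, start), (_, next_start) in zip(log, log[1:]):
            dwell = next_start - start
            label = "learning" if dwell >= learning_threshold_s else "browsing"
            labelled.append((page, dwell, label))
        return labelled

    for page, dwell, label in classify_visits(log):
        print(f"{page:18s} {dwell:4d}s  {label}")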