204 research outputs found

    An audiovisual attention model for natural conversation scenes

    Classical visual attention models consider neither social cues, such as faces, nor auditory cues, such as speech. However, faces are known to capture visual attention more than any other visual feature, and recent studies showed that speech turn-taking affects the gaze of non-involved viewers. In this paper, we propose an audiovisual saliency model able to predict the eye movements of observers viewing other people having a conversation. Thanks to a speaker diarization algorithm, our audiovisual saliency model increases the saliency of the speakers compared to the addressees. We evaluated our model with eye-tracking data and found that it significantly outperforms visual attention models that use an equal and constant saliency value for all faces.
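    The key mechanism described here, boosting the saliency of the current speaker's face relative to the addressees', can be sketched roughly as follows. Face boxes and the active-speaker label are assumed to come from upstream face detection and speaker diarization; all names and weight values are illustrative, not the authors' implementation.

        import numpy as np

        def audiovisual_face_map(frame_shape, faces, person_ids, active_speaker,
                                 speaker_gain=2.0, listener_gain=1.0):
            """Build a face saliency map in which the current speaker's face
            (as labelled by a diarization step) is weighted more than the
            addressees' faces. Weights and box format are illustrative."""
            h, w = frame_shape
            sal = np.zeros((h, w), dtype=np.float32)
            for (x, y, bw, bh), pid in zip(faces, person_ids):
                gain = speaker_gain if pid == active_speaker else listener_gain
                sal[y:y + bh, x:x + bw] += gain
            if sal.max() > 0:
                sal /= sal.max()          # normalize to [0, 1]
            return sal

        # toy usage: two faces, person 0 is currently speaking
        faces = [(40, 60, 50, 70), (200, 65, 48, 66)]
        m = audiovisual_face_map((240, 320), faces, person_ids=[0, 1], active_speaker=0)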

    How saliency, faces, and sound influence gaze in dynamic social scenes

    Conversation scenes are a typical example in which classical models of visual attention dramatically fail to predict eye positions. Indeed, these models rarely consider faces as particular gaze attractors and never take into account the important auditory information that always accompanies dynamic social scenes. We recorded the eye movements of participants viewing dynamic conversations taking place in various contexts. Conversations were seen either with their original soundtracks or with unrelated soundtracks (unrelated speech and abrupt or continuous natural sounds). First, we analyze how auditory conditions influence the eye movement parameters of participants. Then, we model the probability distribution of eye positions across each video frame with a statistical method (Expectation-Maximization), allowing the relative contributions of different visual features, such as static low-level visual saliency (based on luminance contrast), dynamic low-level visual saliency (based on motion amplitude), faces, and center bias, to be quantified. Through experimental and modeling results, we show that regardless of the auditory condition, participants look more at faces, and especially at talking faces. Hearing the original soundtrack makes participants follow the speech turn-taking more closely. However, we do not find any difference between the different types of unrelated soundtracks. These eye-tracking results are confirmed by our model, which shows that faces, and particularly talking faces, are the features that best explain the recorded gaze positions, especially in the original soundtrack condition. Low-level saliency is not a relevant feature to explain eye positions in social scenes, even dynamic ones. Finally, we propose groundwork for an audiovisual saliency model.
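    The abstract describes weighting feature maps (static saliency, dynamic saliency, faces, center bias) with Expectation-Maximization. Below is a minimal sketch of how such mixture weights can be estimated with EM, assuming each feature map has already been normalized into a probability density over pixels; function and variable names are illustrative, not the authors' code.

        import numpy as np

        def em_map_weights(fixations, maps, n_iter=50):
            """Estimate mixture weights for feature maps explaining fixations.
            `maps` is a list of 2-D arrays, each normalized to sum to 1 (a density);
            `fixations` is an integer array of (row, col) pixel coordinates.
            Returns one weight per map. A generic EM sketch, not the paper's code."""
            K = len(maps)
            # likelihood of each fixation under each map, shape (N, K)
            lik = np.stack([m[fixations[:, 0], fixations[:, 1]] for m in maps], axis=1)
            lik = np.maximum(lik, 1e-12)               # avoid zero likelihoods
            w = np.full(K, 1.0 / K)                    # uniform initialization
            for _ in range(n_iter):
                r = lik * w                            # E-step: responsibilities
                r /= r.sum(axis=1, keepdims=True)
                w = r.mean(axis=0)                     # M-step: update weights
            return w

    The fitted weights then quantify the relative contribution of each feature to the recorded fixations, which is how faces can be compared against low-level saliency.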

    Presentation and evaluation of an audiovisual attention model on a set of dynamic conversation scenes

    Classical saliency models do not take the "social" aspects of visual perception into account, and give unsatisfactory results as soon as faces are present on screen. Indeed, these models rarely consider faces, and never the auditory information, which is nevertheless critical for modeling visual attention in scenes of conversations between several people. In this study, we propose an audiovisual saliency model to predict the regions most likely to attract the gaze of people freely exploring dynamic conversation scenes. We show that our model gives better results than previous ones that do not take auditory information into account and assign an equal and constant saliency value to all faces.

    Interaction of viruses with the cellular hypoxia-response pathway

    Recent in vitro studies show that oxygen depletion of the culture medium (hypoxia) up-regulates the replication of human parvovirus B19 and of the Kaposi's sarcoma and human immunodeficiency viruses, as well as the expression of oncogenic viral proteins. The mechanisms of this regulation most often involve the major cellular hypoxia-response factor, HIF-1 (hypoxia-inducible factor 1). Misregulation of this transcription factor also contributes to the oncogenic potential of some of these viruses.

    Using natural versus artificial stimuli to perform calibration for 3D gaze tracking

    The presented study tests which type of stereoscopic image, natural or artificial, is better suited to performing an efficient and reliable calibration for tracking the gaze of observers in 3D space with a classical 2D eye tracker. We measured horizontal disparities, i.e., the differences between the x coordinates of the two eyes obtained with a 2D eye tracker. This disparity was recorded for each observer and for several target positions that they had to fixate. Target positions were equally distributed in 3D space: some on the screen (null disparity), some behind the screen (uncrossed disparity), and others in front of the screen (crossed disparity). We tested different regression models (linear and non-linear) to explain either the true disparity or the depth from the measured disparity. Models were tested and compared on their prediction error for new targets at new positions. First, we found that we obtained more reliable disparity measures when using natural stereoscopic images rather than artificial ones. Second, we found that a non-linear model was overall more efficient. Finally, we discuss the fact that our results were observer-dependent, with variability between observers' behaviors when looking at 3D stimuli. Because of this variability, we propose computing observer-specific models to accurately predict gaze position when observers explore 3D stimuli.
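    As a rough illustration of this calibration step, the sketch below fits linear and quadratic mappings from measured eye-tracker disparity to depth and compares their prediction errors on held-out targets. The data and the degree choices are synthetic and illustrative; the paper's actual regressors and stimuli are not reproduced here.

        import numpy as np

        def fit_poly(measured, target, degree):
            """Least-squares polynomial mapping from measured disparity
            (difference in x coordinates between the two eyes) to depth."""
            return np.polyfit(measured, target, degree)

        def prediction_error(coef, measured, target):
            """Mean absolute error of the fitted mapping on new targets."""
            return np.mean(np.abs(np.polyval(coef, measured) - target))

        # toy calibration: fit on some targets, test on held-out ones
        rng = np.random.default_rng(0)
        measured = rng.uniform(-30, 30, 40)                # measured disparity (px)
        depth = 0.8 * measured + 0.01 * measured**2 + rng.normal(0, 1, 40)
        train, test = slice(0, 30), slice(30, 40)
        for deg in (1, 2):                                 # linear vs non-linear model
            c = fit_poly(measured[train], depth[train], deg)
            print(deg, prediction_error(c, measured[test], depth[test]))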

    Model of Cortical Cell Processing to Estimate Binocular Disparity

    Starting from physiological and psychophysical studies of 3D vision, we attempt to build a model of stereoscopic vision. We used 2D Gabor filters to model the simple and complex cells sensitive to horizontal binocular disparity (Barlow 1967, Daugman 1985). Each of these cells has a preferred disparity and is sensitive to spatial frequency and orientation. Prince et al. (2002) showed that the range of preferred disparities depends on the spatial frequency. We designed a bank of filters in which the distribution of preferred disparities follows the same principle. Moreover, since the stereo-threshold increases with the magnitude of disparity inside each spatial frequency channel, the disparity distribution is not uniform. We took the energy model of Ohzawa et al. (1986) as a basis, since it has been demonstrated to fit the responses of disparity-sensitive V1 cells well for most stimuli. We modified the classical model by normalizing the complex binocular response by the monocular complex response. We took different measures to reduce false matches, such as a pooling procedure and an orientation averaging already used by Chen and Qian (2004). As already demonstrated for 2D vision, a coarse-to-fine process seems to be the best way to deal with multiple spatial frequency channels in stereoscopic vision (Smallman 1995, Menz and Freeman 2003). A first estimate based on the low spatial frequency channels is then refined by higher spatial frequency channels, provided it falls within the disparity range of the higher spatial frequency channel.
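    A minimal one-dimensional sketch of the disparity-energy mechanism described here: a quadrature pair of binocular Gabor units, with the right-eye receptive field shifted by the preferred disparity, and the binocular energy normalized by the monocular energy. The parameters, names, and the 1-D simplification are all illustrative assumptions, not the authors' implementation (which is 2-D, multi-scale, and coarse-to-fine).

        import numpy as np

        def gabor(x, freq, sigma, phase):
            """1-D Gabor receptive field profile."""
            return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

        def energy_response(left, right, x, freq, sigma, pref_disp):
            """Disparity-energy unit: quadrature pair of binocular simple cells,
            right-eye field shifted by the preferred disparity, with the binocular
            energy normalized by the monocular energy (rough analogue of the
            normalization mentioned in the abstract)."""
            binoc, monoc = 0.0, 0.0
            for phase in (0.0, np.pi / 2):                 # quadrature pair
                fl = gabor(x, freq, sigma, phase)
                fr = gabor(x - pref_disp, freq, sigma, phase)
                sl, sr = fl @ left, fr @ right             # monocular simple responses
                binoc += (sl + sr) ** 2                    # complex-cell energy
                monoc += sl ** 2 + sr ** 2
            return binoc / (monoc + 1e-9)

        # toy test: the right image is the left one shifted by the true disparity,
        # so the response should peak near pref_disp = 0.3
        x = np.linspace(-2, 2, 257)
        left = np.random.default_rng(1).normal(size=x.size)
        right = np.interp(x - 0.3, x, left)
        disps = np.linspace(-1, 1, 41)
        best = disps[np.argmax([energy_response(left, right, x, 2.0, 0.5, d)
                                for d in disps])]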

    Statistical modeling of the influence of a visual distractor on the following eye-fixations

    We examined the influence of a visual distractor appearing during a fixation on the following fixations during natural exploration. It is known that new objects appearing during a fixation, whether congruent or incongruent with the scene, are fixated more often than chance [Brockmole, J. R., & Henderson, J. M. (2008). Prioritizing new objects for eye fixation in real-world scenes: Effects of object-scene consistency. Vis. Cog., 16(2-3), 375-390]. In this study, we replicated this result using a Gabor patch as the appearing object, called a distractor because it was artificial and unrelated to the scenes. In addition, we wanted to quantify its influence on exploration. A statistical model of the fixation density function was designed to analyze how exploration was disrupted from the onset of the distractor onward. The model was composed of a linear weighted combination of different maps modeling three independent factors influencing gaze positions. We asked whether the observed fixation locations were due to the distractor or to the saliency of the scenes. As expected, at the beginning of the exploration, fixation locations were not randomly chosen but influenced by the saliency of the scene and by the distractor. The distractor onset strongly influenced fixations, and this influence decreased with time.
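    The fixation density model is a weighted sum of normalized maps. A small sketch of how such component maps can be built and combined is given below, with a Gaussian "distractor map" centered on the distractor location; the map forms and the weights are illustrative assumptions (in practice the weights would be fitted, e.g. with EM as in the sketch further above).

        import numpy as np

        def gaussian_map(shape, center, sigma):
            """2-D Gaussian map (e.g. a distractor map centered on the
            distractor, or a center bias), normalized to sum to 1."""
            h, w = shape
            yy, xx = np.mgrid[0:h, 0:w]
            g = np.exp(-((yy - center[0])**2 + (xx - center[1])**2) / (2 * sigma**2))
            return g / g.sum()

        def fixation_density(maps, weights):
            """Linear weighted combination of normalized feature maps
            (e.g. scene saliency, distractor map, center bias)."""
            d = sum(w * m for w, m in zip(weights, maps))
            return d / d.sum()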

    How a distractor influences fixations during the exploration of natural scenes

    The distractor effect is a well-established means of studying different aspects of fixation programming during the exploration of visual scenes. In this study, we present a task-irrelevant distractor to participants during the free exploration of natural scenes. We investigate the control and programming of fixations by analyzing fixation durations and locations, and the link between the two. We also propose a simple mixture model, evaluated using the Expectation-Maximization algorithm, to test the distractor effect on fixation locations, including fixations which did not land on the distractor. The model allows us to quantify the influence of a visual distractor on fixation location relative to scene saliency for all fixations, at distractor onset and during all subsequent exploration. The distractor effect is not limited to the current fixation; it continues to influence fixations during subsequent exploration. An abrupt change in the stimulus not only increases the duration of the current fixation, it also influences the location of the fixation which occurs immediately afterwards and, to some extent, as a function of the length of the change, the duration and location of any subsequent fixations. Overall, results from the eye movement analysis and the statistical model suggest that fixation durations and locations are both controlled by direct and indirect mechanisms.

    Influence of soundtrack on eye movements during video exploration

    Models of visual attention rely on visual features such as orientation, intensity, or motion to predict which regions of complex scenes attract the gaze of observers. So far, sound has never been considered as a possible feature that might influence eye movements. Here, we evaluate the impact of non-spatial sound on the eye movements of observers watching videos. We recorded the eye movements of 40 participants watching assorted videos with and without their related soundtracks. We found that sound affects eye position, fixation duration, and saccade amplitude. The effect of sound is not constant across time but becomes significant around one second after the beginning of video shots.
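    One simple way to examine such an effect over time, sketched below under assumptions of our own rather than as the authors' analysis, is to compute the per-frame dispersion of eye positions across observers and compare it between the with-sound and without-sound groups, aligned on shot onsets.

        import numpy as np

        def dispersion(positions):
            """Mean distance of eye positions to their centroid for one frame
            (one row per observer, columns x/y). Lower dispersion means the
            observers' gazes are more clustered."""
            c = positions.mean(axis=0)
            return np.linalg.norm(positions - c, axis=1).mean()

        def dispersion_over_time(gaze):
            """gaze: array of shape (n_frames, n_observers, 2). Returns the
            per-frame dispersion, which can be compared between auditory
            conditions as a function of time since shot onset."""
            return np.array([dispersion(frame) for frame in gaze])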

    Color Information in a Model of Saliency

    Bottom-up saliency models have been developed to predict the location of gaze according to low-level features of visual scenes, such as intensity, color, frequency, and motion. In this paper, we investigate the contribution of color features to computing bottom-up saliency. We incorporated a chrominance pathway into a luminance-based model (Marat et al.). We evaluated the performance of the model with and without the chrominance pathway. We added an efficient multi-GPU implementation of the chrominance pathway to the parallel implementation of the luminance-based model proposed by Rahman et al., preserving the real-time solution. Results show that color information improves the performance of the saliency model in predicting eye positions.
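    A rough sketch of what a chrominance pathway can look like: simple red-green and blue-yellow opponent channels, a center-surround (difference-of-Gaussians) conspicuity step, and a weighted fusion with an existing luminance saliency map. The color space, filter scales, and fusion rule are illustrative assumptions, not the specific pathway of the paper.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def opponent_channels(rgb):
            """Simple red-green and blue-yellow opponent channels from an RGB
            frame (floats in [0, 1]); an illustrative stand-in for the model's
            chrominance pathway."""
            r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
            return r - g, b - (r + g) / 2

        def conspicuity(channel, sigma_c=2, sigma_s=8):
            """Center-surround contrast via difference of Gaussians."""
            return np.abs(gaussian_filter(channel, sigma_c)
                          - gaussian_filter(channel, sigma_s))

        def saliency_with_color(luminance_map, rgb, w_chrom=0.5):
            """Fuse a precomputed luminance saliency map with the chrominance
            conspicuity; the additive fusion rule is an assumption."""
            rg, by = opponent_channels(rgb)
            chrom = conspicuity(rg) + conspicuity(by)
            chrom /= chrom.max() + 1e-9
            return luminance_map + w_chrom * chrom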