    Tracking the visual focus of attention for a varying number of wandering people

    In this article, we define and address the problem of finding the visual focus of attention for a varying number of wandering people (VFOA-W): determining where a person is looking when their movement is unconstrained. VFOA-W estimation is a new and important problem with implications in behavior understanding and cognitive science, as well as real-world applications. One such application, presented in this article, monitors the attention passers-by pay to an outdoor advertisement using a single video camera. In our approach to the VFOA-W problem, we propose a multi-person tracking solution based on a dynamic Bayesian network that simultaneously infers the number of people in a scene, their body locations, their head locations, and their head pose. For efficient inference in the resulting variable-dimensional state-space, we propose a Reversible Jump Markov Chain Monte Carlo (RJMCMC) sampling scheme, as well as a novel global observation model which determines the number of people in the scene and their locations. To determine whether a person is looking at the advertisement, we propose a Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM)-based VFOA-W model which uses head pose and location information. Our models are evaluated for tracking performance and ability to recognize people looking at an outdoor advertisement, with results indicating good performance on sequences where up to three people pass in front of an advertisement.
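The abstract above describes an HMM-based step that decides whether a person is looking at the advertisement. As a rough illustration only, and not the authors' model, the sketch below decodes a two-state HMM with the Viterbi algorithm. It assumes a single Gaussian emission per state over head pan angle alone (the abstract mentions a GMM and also uses head location), and every name and parameter value here (`viterbi_focus`, `means`, `stds`, `stay_prob`) is hypothetical.

```python
import math

STATES = ("not_looking", "looking")


def gaussian_logpdf(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_focus(pan_angles_deg, means=(45.0, 0.0), stds=(20.0, 10.0), stay_prob=0.9):
    """Most likely looking / not-looking sequence for a series of head pan angles.

    All parameters are illustrative assumptions: state 0 ("not_looking") emits
    pan angles around 45 degrees, state 1 ("looking") around 0 degrees, and
    each state persists from one frame to the next with probability stay_prob.
    """
    if not pan_angles_deg:
        return []
    n = len(STATES)
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)

    # Initialise with a uniform prior over the two states.
    score = [math.log(1.0 / n) + gaussian_logpdf(pan_angles_deg[0], means[s], stds[s])
             for s in range(n)]
    backpointers = []

    for x in pan_angles_deg[1:]:
        new_score, pointers = [], []
        for s in range(n):
            # Best predecessor for state s at this frame.
            candidates = [score[p] + (log_stay if p == s else log_switch) for p in range(n)]
            best = max(range(n), key=candidates.__getitem__)
            pointers.append(best)
            new_score.append(candidates[best] + gaussian_logpdf(x, means[s], stds[s]))
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=score.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


if __name__ == "__main__":
    print(viterbi_focus([50, 48, 52, 3, -2, 1, 47, 55]))
```

The transition term is what separates this from thresholding each frame independently: with these assumed parameters, a single off-axis frame inside a run of frontal frames is not enough to flip the decoded state.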

    Tracking Attention for Multiple People: Wandering Visual Focus of Attention Estimation

    The problem of finding the visual focus of attention of multiple people free to move in an unconstrained manner is defined here as the wandering visual focus of attention (WVFOA) problem. Estimating the WVFOA for multiple unconstrained people is a new and important problem with implications for human behavior understanding and cognitive science, as well as real-world applications. One such application, which we present in this article, monitors the attention passers-by pay to an outdoor advertisement. In our approach to the WVFOA problem, we propose a multi-person tracking solution based on a hybrid Dynamic Bayesian Network that simultaneously infers the number of people in a scene, their body locations, their head locations, and their head pose. It is defined in a joint state-space formulation that allows for the modeling of interactions between people. For inference in the resulting high-dimensional state-space, we propose a trans-dimensional Markov Chain Monte Carlo (MCMC) sampling scheme, which not only handles a varying number of people, but also efficiently searches the state-space by allowing person-part state updates. Our model was rigorously evaluated for tracking quality and ability to recognize people looking at an outdoor advertisement, and the results indicate good performance for these tasks.

    Tracking the Multi Person Wandering Visual Focus of Attention

    Estimating the wandering visual focus of attention (WVFOA) for multiple people is an important problem with many applications in human behavior understanding. One such application, addressed in this paper, monitors the attention of passers-by to outdoor advertisements. To solve the WVFOA problem, we propose a multi-person tracking approach based on a hybrid Dynamic Bayesian Network that simultaneously infers the number of people in the scene, their body and head locations, and their head pose, in a joint state-space formulation that is amenable to person interaction modeling. The model exploits both global measurements and individual observations for the VFOA. For inference in the resulting high-dimensional state-space, we propose a trans-dimensional Markov Chain Monte Carlo (MCMC) sampling scheme, which not only handles a varying number of people, but also efficiently searches the state-space by allowing person-part state updates. Our model was rigorously evaluated, using a realistic data set, for tracking and for its ability to recognize when people look at an outdoor advertisement.

    Rapid Serial Visual Presentation. Degradation of inferential reading comprehension as a function of speed

    There is increasing interest in the readability of text presented on small digital screens. Designers have come up with novel text presentation methods, such as moving text from right to left, line-stepping, or showing successive text segments such as phrases or single words in an RSVP format. Comparative studies have indicated that RSVP is perhaps the best method of presenting text in a limited space. We tested the method using 209 participants divided into six groups. The groups included traditional reading, and RSVP reading at rates of 250, 300, 350, 400, and 450 wpm. No significant differences were found in comprehension for normal reading and RSVP reading at rates of 250, 300 and 350 wpm. However, higher rates produced significantly lower comprehension scores. It remains to be determined if, with additional practice and improved methods, good levels of reading comprehension at high rates can be achieved with RSVP.
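To make the presentation rates in the abstract above concrete, the following sketch converts a words-per-minute rate into a per-word display time. It assumes one word per frame shown at a constant rate, which the abstract does not specify (it also mentions phrase-sized segments); the function names are hypothetical.

```python
from typing import List, Tuple


def rsvp_word_duration_ms(wpm: float) -> float:
    """Display time per word in milliseconds at a constant words-per-minute rate."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return 60_000.0 / wpm


def rsvp_schedule(text: str, wpm: float) -> List[Tuple[str, float]]:
    """Pair each whitespace-separated word with its onset time in milliseconds."""
    step = rsvp_word_duration_ms(wpm)
    return [(word, i * step) for i, word in enumerate(text.split())]


# Rates reported in the abstract.
RATES_WPM = (250, 300, 350, 400, 450)

if __name__ == "__main__":
    for rate in RATES_WPM:
        print(f"{rate} wpm -> {rsvp_word_duration_ms(rate):.0f} ms per word")
```

Under this one-word-per-frame assumption, the slowest rate tested corresponds to 240 ms per word and the fastest to roughly 133 ms per word.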

    Differential Impact of Interference on Internally- and Externally-Directed Attention.

    Attention can be oriented externally to the environment or internally to the mind, and can be derailed by interference from irrelevant information originating from either external or internal sources. However, few studies have explored the nature and underlying mechanisms of the interaction between different attentional orientations and different sources of interference. We investigated how externally- and internally-directed attention was impacted by external distraction, how this modulated internal distraction, and whether these interactions were affected by healthy aging. Healthy younger and older adults performed both an externally-oriented visual detection task and an internally-oriented mental rotation task, performed with and without auditory sound delivered through headphones. We found that the addition of auditory sound induced a significant decrease in task performance in both younger and older adults on the visual discrimination task, and this was accompanied by a shift in the type of distractions reported (from internal to external). On the internally-oriented task, auditory sound only affected performance in older adults. These results suggest that external distractions differentially affect performance on tasks with internal, as opposed to external, attentional orientations. Further, internal distractibility is affected by the presence of external sound and increased suppression of internal distraction.

    SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

    Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., a cocktail party) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. However, analyzing social scenes involving FCGs is also highly challenging, because crowdedness and extreme occlusions make it difficult to extract behavioral cues such as target locations, speaking activity and head/body pose. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under the poster presentation and cocktail party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) to alleviate these problems we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising microphone, accelerometer, Bluetooth and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head, body orientation and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa. Comment: 14 pages, 11 figures

    Physiological Indicators of Task Demand, Fatigue, and Cognition in Future Digital Manufacturing Environments

    As Digital Manufacturing transforms traditionally physical work into more system-monitoring tasks, new methods are required for understanding people's mental workload and prolonged capacity for focused attention. Many physiological measures have shown promise for detecting changes in cognitive state, and recent advances in sensor technology offer minimally-invasive ways to monitor our cognitive activity. Previous research in functional near-infrared spectroscopy, for example, has observed changes in cerebral hemodynamic response during periods of high demand within tasks. This work investigated the relationships among task demand, fatigue, and attention degradation in a sustained attention task, and their effect on heart rate, breathing rate, nose temperature and hemodynamic response in the prefrontal cortex and middle temporal gyrus. Analysis revealed a small but significant effect of fatigue on heart rate relative to baseline, breathing rate and hemodynamic response. Task demand had a small but significant effect on breathing rate and nose temperature, both relative to baseline, but no difference between levels of demand was observed in heart rate or hemodynamic response. Our results provide insight into what physiological data can tell us about cognitive state, ability to focus, and the impact of fatigue over time.

    F-formation Detection: Individuating Free-standing Conversational Groups in Images

    Detection of groups of interacting people is a very interesting and useful task in many modern technologies, with application fields spanning from video-surveillance to social robotics. In this paper we first furnish a rigorous definition of group considering the background of the social sciences: this allows us to specify many kinds of group, so far neglected in the Computer Vision literature. On top of this taxonomy, we present a detailed state of the art on the group detection algorithms. Then, as a main contribution, we present a brand new method for the automatic detection of groups in still images, which is based on a graph-cuts framework for clustering individuals; in particular, we are able to codify in a computational sense the sociological definition of F-formation, which is very useful for encoding a group using only proxemic information: position and orientation of people. We call the proposed method Graph-Cuts for F-formation (GCFF). We show how GCFF definitely outperforms all the state-of-the-art methods in terms of different accuracy measures (some of them brand new), also demonstrating strong robustness to noise and versatility in recognizing groups of various cardinality. Comment: 32 pages, submitted to PLOS ONE