Tracking the visual focus of attention for a varying number of wandering people
In this article, we define and address the problem of finding the visual focus of attention for a varying number of wandering people (VFOA-W) -- determining where a person is looking when their movement is unconstrained. VFOA-W estimation is a new and important problem with implications in behavior understanding and cognitive science, as well as real-world applications. One such application, presented in this article, monitors the attention passers-by pay to an outdoor advertisement using a single video camera. In our approach to the VFOA-W problem, we propose a multi-person tracking solution based on a dynamic Bayesian network that simultaneously infers the number of people in a scene, their body locations, their head locations, and their head pose. For efficient inference in the resulting variable-dimensional state-space we propose a Reversible Jump Markov Chain Monte Carlo (RJMCMC) sampling scheme, as well as a novel global observation model which determines the number of people in the scene and their locations. To determine if a person is looking at the advertisement or not, we propose a Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM)-based VFOA-W model which uses head pose and location information. Our models are evaluated for tracking performance and ability to recognize people looking at an outdoor advertisement, with results indicating good performance on sequences where up to three people pass in front of an advertisement
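The HMM-based step described above, deciding per frame whether a person is looking at the advertisement, can be illustrated with a much-simplified sketch. It assumes a two-state HMM with a single 1-D Gaussian per state over the head pan angle relative to the advertisement; the article's model also uses a GMM and head location. All names and parameter values below are hypothetical and are not taken from the article.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Illustrative (assumed) parameters, not values from the article.
STATES = ("looking", "not_looking")
# (mean, std) of head pan angle in degrees, measured relative to the advertisement.
EMISSION = {"looking": (0.0, 15.0), "not_looking": (60.0, 30.0)}
TRANSITION = {
    "looking": {"looking": 0.9, "not_looking": 0.1},
    "not_looking": {"looking": 0.1, "not_looking": 0.9},
}
PRIOR = {"looking": 0.5, "not_looking": 0.5}

def forward_filter(angles):
    """Return P(state | observations up to frame t) for every frame t."""
    beliefs = []
    prev = PRIOR
    for t, angle in enumerate(angles):
        if t == 0:
            predicted = prev
        else:
            # Propagate the previous belief through the transition model.
            predicted = {s: sum(prev[r] * TRANSITION[r][s] for r in STATES) for s in STATES}
        # Weight by the emission likelihood and normalise.
        unnorm = {s: predicted[s] * gaussian_pdf(angle, *EMISSION[s]) for s in STATES}
        total = sum(unnorm.values())
        prev = {s: unnorm[s] / total for s in STATES}
        beliefs.append(prev)
    return beliefs

# A passer-by who gradually turns their head toward the advertisement.
angles = [70.0, 55.0, 20.0, 5.0, -3.0, 2.0]
beliefs = forward_filter(angles)
labels = [max(b, key=b.get) for b in beliefs]
print(labels)
```

The transition matrix favours staying in the same state, so a single noisy head-pose estimate is less likely to flip the decision than it would be with a per-frame threshold.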
Tracking Attention for Multiple People: Wandering Visual Focus of Attention Estimation
The problem of finding the visual focus of attention of multiple people free to move in an unconstrained manner is defined here as the wandering visual focus of attention (WVFOA) problem. Estimating the WVFOA for multiple unconstrained people is a new and important problem with implications for human behavior understanding and cognitive science, as well as real-world applications. One such application, which we present in this article, monitors the attention passers-by pay to an outdoor advertisement. In our approach to the WVFOA problem, we propose a multi-person tracking solution based on a hybrid Dynamic Bayesian Network that simultaneously infers the number of people in a scene, their body locations, their head locations, and their head pose. It is defined in a joint state-space formulation that allows for the modeling of interactions between people. For inference in the resulting high-dimensional state-space, we propose a trans-dimensional Markov Chain Monte Carlo (MCMC) sampling scheme, which not only handles a varying number of people, but also efficiently searches the state-space by allowing person-part state updates. Our model was rigorously evaluated for tracking quality and ability to recognize people looking at an outdoor advertisement, and the results indicate good performance for these tasks
Tracking the Multi Person Wandering Visual Focus of Attention
Estimating the wandering visual focus of attention (WVFOA) for multiple people is an important problem with many applications in human behavior understanding. One such application, addressed in this paper, monitors the attention of passers-by to outdoor advertisements. To solve the WVFOA problem, we propose a multi-person tracking approach based on a hybrid Dynamic Bayesian Network that simultaneously infers the number of people in the scene, their body and head locations, and their head pose, in a joint state-space formulation that is amenable to person interaction modeling. The model exploits both global measurements and individual observations for the VFOA. For inference in the resulting high-dimensional state-space, we propose a trans-dimensional Markov Chain Monte Carlo (MCMC) sampling scheme, which not only handles a varying number of people, but also efficiently searches the state-space by allowing person-part state updates. Our model was rigorously evaluated for tracking and its ability to recognize when people look at an outdoor advertisement using a realistic data set
Rapid Serial Visual Presentation. Degradation of inferential reading comprehension as a function of speed
There is increasing interest in the readability of text presented on small digital screens. Designers have come up with novel text presentation methods, such as moving text from right to left, line-stepping, or showing successive text segments such as phrases or single words in an RSVP format. Comparative studies have indicated that RSVP is perhaps the best method of presenting text in a limited space. We tested the method using 209 participants divided into six groups. The groups included traditional reading, and RSVP reading at rates of 250, 300, 350, 400, and 450 wpm. No significant differences were found in comprehension for normal reading and RSVP reading at rates of 250, 300 and 350 wpm. However, higher rates produced significantly lower comprehension scores. It remains to be determined if, with additional practice and improved methods, good levels of reading comprehension at high rates can be achieved with RSVP
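To give a sense of what the tested rates mean in practice, the short sketch below converts each rate into a per-word display time. It assumes single-word presentation with a fixed duration per word; the abstract notes that segments may also be phrases and does not state the timing scheme used. The helper name is hypothetical.

```python
def ms_per_word(wpm: float) -> float:
    """Milliseconds each word stays on screen at a given words-per-minute rate."""
    return 60_000 / wpm

# RSVP rates tested in the study, in words per minute.
rates = [250, 300, 350, 400, 450]
durations = {wpm: round(ms_per_word(wpm), 1) for wpm in rates}

for wpm, ms in durations.items():
    print(f"{wpm} wpm -> {ms} ms per word")
```

Under this assumption, the 350 wpm rate corresponds to roughly 171 ms per word and the 400 wpm rate to 150 ms per word.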
Differential Impact of Interference on Internally- and Externally-Directed Attention.
Attention can be oriented externally to the environment or internally to the mind, and can be derailed by interference from irrelevant information originating from either external or internal sources. However, few studies have explored the nature and underlying mechanisms of the interaction between different attentional orientations and different sources of interference. We investigated how externally- and internally-directed attention was impacted by external distraction, how this modulated internal distraction, and whether these interactions were affected by healthy aging. Healthy younger and older adults performed both an externally-oriented visual detection task and an internally-oriented mental rotation task, performed with and without auditory sound delivered through headphones. We found that the addition of auditory sound induced a significant decrease in task performance in both younger and older adults on the visual discrimination task, and this was accompanied by a shift in the type of distractions reported (from internal to external). On the internally-oriented task, auditory sound only affected performance in older adults. These results suggest that the impact of external distractions differentially impacts performance on tasks with internal, as opposed to external, attentional orientations. Further, internal distractibility is affected by the presence of external sound and increased suppression of internal distraction
SALSA: A Novel Dataset for Multimodal Group Behavior Analysis
Studying free-standing conversational groups (FCGs) in unstructured social
settings (e.g., a cocktail party) is gratifying due to the wealth of information
available at the group (mining social networks) and individual (recognizing
native behavioral and personality traits) levels. However, analyzing social
scenes involving FCGs is also highly challenging due to the difficulty in
extracting behavioral cues such as target locations, their speaking activity
and head/body pose due to crowdedness and presence of extreme occlusions. To
this end, we propose SALSA, a novel dataset facilitating multimodal and
Synergetic sociAL Scene Analysis, and make two main contributions to research
on automated social interaction analysis: (1) SALSA records social interactions
among 18 participants in a natural, indoor environment for over 60 minutes,
under the poster presentation and cocktail party contexts presenting
difficulties in the form of low-resolution images, lighting variations,
numerous occlusions, reverberations and interfering sound sources; (2) To
alleviate these problems we facilitate multimodal analysis by recording the
social interplay using four static surveillance cameras and sociometric badges
worn by each participant, comprising the microphone, accelerometer, bluetooth
and infrared sensors. In addition to raw data, we also provide annotations
concerning individuals' personality as well as their position, head, body
orientation and F-formation information over the entire event duration. Through
extensive experiments with state-of-the-art approaches, we show (a) the
limitations of current methods and (b) how the recorded multiple cues
synergetically aid automatic analysis of social interactions. SALSA is
available at http://tev.fbk.eu/salsa.
Comment: 14 pages, 11 figures
Physiological Indicators of Task Demand, Fatigue, and Cognition in Future Digital Manufacturing Environments
As Digital Manufacturing transforms traditionally physical work into more system-monitoring tasks, new methods are required for understanding people's mental workload and prolonged capacity for focused attention. Many physiological measures have shown promise for detecting changes in cognitive state, and recent advances in sensor technology offer minimally-invasive ways to monitor our cognitive activity. Previous research in functional near-infrared spectroscopy, for example, has observed changes in cerebral hemodynamic response during periods of high demand within tasks. This work investigated the relationships among task demand, fatigue, and attention degradation in a sustained attention task, and their effect on heart rate, breathing rate, nose temperature and hemodynamic response in the prefrontal cortex and middle temporal gyrus. Analysis revealed a small but significant effect of fatigue on heart rate relative to baseline, breathing rate and hemodynamic response. Task demand had a small but significant effect on breathing rate and nose temperature, both relative to baseline, but no difference between levels of demand was observed in heart rate or hemodynamic response. Our results provide insight into what physiological data can tell us about cognitive state, ability to focus, and the impact of fatigue over time
F-formation Detection: Individuating Free-standing Conversational Groups in Images
Detection of groups of interacting people is a very interesting and useful
task in many modern technologies, with application fields spanning from
video-surveillance to social robotics. In this paper we first furnish a
rigorous definition of group considering the background of the social sciences:
this allows us to specify many kinds of group, so far neglected in the Computer
Vision literature. On top of this taxonomy, we present a detailed state of the
art on the group detection algorithms. Then, as a main contribution, we present
a brand new method for the automatic detection of groups in still images, which
is based on a graph-cuts framework for clustering individuals; in particular we
are able to codify in a computational sense the sociological definition of
F-formation, that is very useful to encode a group having only proxemic
information: position and orientation of people. We call the proposed method
Graph-Cuts for F-formation (GCFF). We show how GCFF definitely outperforms all
the state of the art methods in terms of different accuracy measures (some of
them are brand new), demonstrating also a strong robustness to noise and
versatility in recognizing groups of various cardinality.
Comment: 32 pages, submitted to PLOS ONE
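The abstract states that an F-formation can be encoded from proxemic information alone, namely the position and orientation of each person. The sketch below illustrates only that geometric cue: each person "votes" for a shared-space centre a fixed distance in front of them, and two people are treated as compatible when their votes nearly coincide. This is not the GCFF graph-cuts clustering itself. The stride distance, the tolerance, and all function names are assumptions made for illustration.

```python
import math

STRIDE = 1.0  # assumed distance (metres) from a person to the shared-space centre

def o_space_vote(x, y, theta, stride=STRIDE):
    """Point a fixed distance in front of a person standing at (x, y) and facing theta (radians)."""
    return (x + stride * math.cos(theta), y + stride * math.sin(theta))

def same_group(p, q, tol=0.5):
    """Heuristic: two people are compatible if their voted centres lie within tol metres."""
    return math.dist(o_space_vote(*p), o_space_vote(*q)) <= tol

# Each person is (x, y, orientation in radians).
alice = (0.0, 0.0, 0.0)      # at the origin, facing +x
bob = (2.0, 0.0, math.pi)    # two metres away, facing alice
carol = (2.0, 0.0, 0.0)      # same position as bob, but facing away from alice

print(same_group(alice, bob))    # alice and bob face each other
print(same_group(alice, carol))  # carol faces away
```

A full method would cluster all individuals jointly rather than test pairs, which is the part the paper addresses with graph cuts.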