1,598 research outputs found

    SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

    Get PDF
    Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., cocktail party ) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. However, analyzing social scenes involving FCGs is also highly challenging due to the difficulty in extracting behavioral cues such as target locations, their speaking activity and head/body pose due to crowdedness and presence of extreme occlusions. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under the poster presentation and cocktail party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) To alleviate these problems we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising the microphone, accelerometer, bluetooth and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head, body orientation and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa.Comment: 14 pages, 11 figure

    Cognitive visual tracking and camera control

    Get PDF
    Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario show the effectiveness of our approach and its potential for real applications of cognitive vision

    Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy

    Full text link
    In this paper we shall consider the problem of deploying attention to subsets of the video streams for collating the most relevant data and information of interest related to a given task. We formalize this monitoring problem as a foraging problem. We propose a probabilistic framework to model observer's attentive behavior as the behavior of a forager. The forager, moment to moment, focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The approach proposed here is suitable to be exploited for multi-stream video summarization. Meanwhile, it can serve as a preliminary step for more sophisticated video surveillance, e.g. activity and behavior analysis. Experimental results achieved on the UCR Videoweb Activities Dataset, a publicly available dataset, are presented to illustrate the utility of the proposed technique.Comment: Accepted to IEEE Transactions on Image Processin
    • ā€¦
    corecore