17 research outputs found

    Human weapon-activity recognition in surveillance videos using structural-RNN

    In today’s world, video surveillance systems play a vital role in commercial and industrial environments. An important goal of surveillance is to observe suspicious behavior of humans and objects in a scene using cameras or other sensors. Most current surveillance systems do this by identifying persons and tracking their individual paths independently, not in conjunction with the objects in the scene. However, in a real-world surveillance scenario, the behavior of people and their interactions with objects need to be modeled to reason about suspicious activities. Our contribution in this work is to use the state-of-the-art Structural Recurrent Neural Network (SRNN) method to model the complex spatio-temporal human-object interactions in surveillance. Our best results have a final F₁ score of 87.3 on the human sub-activity recognition task and 82.7 on the object affordance recognition task. Our work considered weapons as objects of interest.

    An Opportunity to Investigate the Role of Specific Nonverbal Cues and First Impression in Interviews using Deepfake Based Controlled Video Generation.

    The study of nonverbal cues in a dyadic interaction, such as a job interview, mostly relies on videos and does not allow the role of specific cues to be disentangled. It is thus not clear whether, for instance, an interviewee who smiles while listening to an interviewer would be perceived more favorably than an interviewee who only gazes at an interviewer. While a similar analysis in naturalistic situations requires careful curation of interview recordings, it still does not allow the effect of specific nonverbal cues on first impressions to be disentangled. Deepfake technology provides the opportunity to address this challenge by creating highly standardized videos of interviewees manifesting a predetermined behavior (i.e., a combination of specific nonverbal cues). Accordingly, we created a set of deepfake videos enabling us to manipulate the occurrence of three classes of nonverbal attributes (i.e., eye contact, nodding, and smiling). The deepfake videos showed interviewees manifesting one of four behaviors while listening to the interviewer: eye contact with smiling and nodding, eye contact with only nodding, eye contact alone, and looking distracted. We then tested whether these combinations of nonverbal cues influenced how the interviewees were perceived with respect to personality, confidence, and hireability. Our work reveals the potential of deepfake technology for generating behaviorally controlled videos that are useful for psychology experiments.

    More than words: inference of socially relevant information from nonverbal vocal cues in speech

    This paper presents two examples of how nonverbal communication can be automatically detected and interpreted in terms of social phenomena. In particular, the presented approaches use simple prosodic features to distinguish between journalists and non-journalists in media, and extract social networks from turn-taking to recognize roles in different interaction settings (broadcast data and meetings). Furthermore, the article outlines some of the most interesting perspectives in this line of research.

    A generative score space for statistical dialog characterization in social signalling

    The analysis of human conversations from a social signalling perspective has recently attracted the joint attention of pattern recognition and psychology researchers. In particular, dialog classification is an appealing recent application whose aim is to go beyond the meaning of the spoken words, focusing instead on the way the sentences are pronounced by capturing natural (or hidden) characteristics, such as the mood of the conversation. An effective strategy for this problem is to encode the turn-taking dynamics in a generative model whose structure is composed of conditional dependencies among first-order Markov processes. In this paper, we follow this strategy, investigating how to boost the classification performance of this model and of the related higher-order Markov extensions through the definition of a novel generative score space. Generative score spaces are employed to improve generative classification in a discriminative way, while also allowing a deep understanding of the processed data through the use of standard pattern recognition strategies. Experiments on real data confirm our intuition.

    Conversation analysis at work: detection of conflict in competitive discussions through semi-automatic turn-organization analysis

    This study proposes a semi-automatic approach aimed at detecting conflict in conversations. The approach is based on statistical techniques capable of identifying turn-organization regularities associated with conflict. The only manual step of the process is the segmentation of the conversations into turns (time intervals during which only one person talks) and overlapping speech segments (time intervals during which several persons talk at the same time). The rest of the process takes place automatically, and the results show that conflictual exchanges can be detected with Precision and Recall around 70% (the experiments were performed over 6 h of political debates). The approach brings two main benefits: the first is the possibility of analyzing potentially large amounts of conversational data with limited effort; the second is that the model parameters indicate which turn regularities are most likely to account for the presence of conflict.