12,590 research outputs found

    Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking

    Get PDF
    Public speaking is an important aspect of human communication and interaction. The majority of computational work on public speaking concentrates on analyzing the spoken content and the verbal behavior of the speakers. While the success of public speaking largely depends on the content of the talk and the verbal behavior, non-verbal (visual) cues, such as gestures and physical appearance, also play a significant role. This paper investigates the importance of visual cues by estimating their contribution towards predicting the popularity of a public lecture. For this purpose, we constructed a large database of more than 1800 TED talk videos. As a measure of the popularity of the TED talks, we leverage the corresponding (online) viewers' ratings from YouTube. Visual cues related to facial and physical appearance, facial expressions, and pose variations are extracted from the video frames using convolutional neural network (CNN) models. Thereafter, an attention-based long short-term memory (LSTM) network is proposed to predict the video popularity from the sequence of visual features. The proposed network achieves state-of-the-art prediction accuracy, indicating that visual cues alone contain highly predictive information about the popularity of a talk. Furthermore, our network learns a human-like attention mechanism, which is particularly useful for interpretability: it shows how attention varies with time and across different visual cues, indicating their relative importance.
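
    The abstract describes an attention-based LSTM that pools per-frame CNN features into a popularity prediction. Below is a minimal PyTorch sketch of that general idea; the feature dimension, the single-layer additive attention, and the scalar rating head are illustrative assumptions, not the authors' exact architecture.
```python
import torch
import torch.nn as nn


class AttentionLSTMPopularity(nn.Module):
    """Attention-weighted LSTM over a sequence of per-frame visual features."""

    def __init__(self, feat_dim=512, hidden_dim=128, num_outputs=1):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)            # scores each time step
        self.head = nn.Linear(hidden_dim, num_outputs)  # popularity prediction

    def forward(self, frames):                    # frames: (batch, time, feat_dim)
        h, _ = self.lstm(frames)                  # (batch, time, hidden_dim)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over time
        context = (weights * h).sum(dim=1)        # attention-weighted summary
        return self.head(context), weights        # weights support interpretability


# Usage: 8 videos, 100 frames each, 512-d CNN features per frame (toy tensors).
model = AttentionLSTMPopularity()
preds, attn = model(torch.randn(8, 100, 512))
```
    Returning the attention weights alongside the prediction mirrors the interpretability point in the abstract: one can inspect how weight shifts over time for a given talk.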

    Exploiting Group Structures to Infer Social Interactions From Videos

    Get PDF
    In this thesis, we consider the task of inferring social interactions between humans by analyzing multi-modal data. Specifically, we attempt to solve some of the problems in interaction analysis, such as long-term deception detection, political deception detection, and impression prediction. In this work, we emphasize the importance of using knowledge about the group structure of the analyzed interactions. Previous works on the matter mostly neglected this aspect and analyzed a single subject at a time. Using the new Resistance dataset, collected by our collaborators, we approach the problem of long-term deception detection by designing a class of histogram-based features and a novel class of meta-features we call LiarRank. We develop a LiarOrNot model to identify spies in Resistance videos. We achieve AUCs of over 0.70, outperforming our baselines by 3% and human judges by 12%. For the problem of political deception, we first collect a dataset of videos and transcripts of 76 politicians from 18 countries making truthful and deceptive statements. We call it the Global Political Deception Dataset. We then show how to analyze the statements in a broader context by building a Video-Article-Topic graph. From this graph, we create a novel class of features called Deception Score that captures how controversial each topic is and how it affects the truthfulness of each statement. We show that our approach achieves 0.775 AUC, outperforming competing baselines. Finally, we use the Resistance data to solve the problem of dyadic impression prediction. Our proposed Dyadic Impression Prediction System (DIPS) contains four major innovations: a novel class of features called emotion ranks, sign imbalance features derived from signed graph theory, a novel method to align the facial expressions of subjects, and a multilayered stochastic network we call the Temporal Delayed Network. Our DIPS architecture beats eight baselines from the literature, yielding statistically significant improvements of 19.9-30.8% in AUC.
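
    The abstract mentions sign imbalance features derived from signed graph theory but does not spell out a formula. As one plausible reading, a classic imbalance measure is the fraction of triangles whose edge-sign product is negative; the sketch below computes that on a toy interaction graph. The edge signs, graph construction, and the function name are assumptions for illustration only.
```python
from itertools import combinations

import networkx as nx


def unbalanced_triangle_fraction(graph):
    """Fraction of triangles whose edge-sign product is negative (structural imbalance)."""
    unbalanced, total = 0, 0
    for a, b, c in combinations(graph.nodes, 3):
        if graph.has_edge(a, b) and graph.has_edge(b, c) and graph.has_edge(a, c):
            product = graph[a][b]["sign"] * graph[b][c]["sign"] * graph[a][c]["sign"]
            total += 1
            unbalanced += product < 0
    return unbalanced / total if total else 0.0


# Toy signed graph: +1 = positive impression between players, -1 = negative.
G = nx.Graph()
G.add_edge("p1", "p2", sign=+1)
G.add_edge("p2", "p3", sign=-1)
G.add_edge("p1", "p3", sign=+1)
print(unbalanced_triangle_fraction(G))  # 1.0: the single triangle is unbalanced
```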

    Towards Inferring Users' Impressions of Robot Performance in Navigation Scenarios

    Full text link
    Human impressions of robot performance are often measured through surveys. As a more scalable and cost-effective alternative, we study the possibility of predicting people's impressions of robot behavior using non-verbal behavioral cues and machine learning techniques. To this end, we first contribute the SEAN TOGETHER Dataset, consisting of observations of an interaction between a person and a mobile robot in a Virtual Reality simulation, together with impressions of robot performance provided by users on a 5-point scale. Second, we contribute analyses of how well humans and supervised learning techniques can predict perceived robot performance based on different combinations of observation types (e.g., facial, spatial, and map features). Our results show that facial expressions alone provide useful information about human impressions of robot performance, but in the navigation scenarios we tested, spatial features are the most critical piece of information for this inference task. Also, when results are evaluated as binary classification (rather than multiclass classification), the F1 score of human predictions and machine learning models more than doubles, showing that both are better at telling the directionality of robot performance than at predicting exact performance ratings. Based on our findings, we provide guidelines for implementing these prediction models in real-world navigation scenarios.
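
    To make the binary-versus-multiclass contrast concrete, the sketch below scores the same set of 5-point impression predictions both as exact ratings and as a binarized "better/worse than neutral" label. The toy ratings and the midpoint threshold are assumptions for illustration, not the paper's evaluation protocol.
```python
from sklearn.metrics import f1_score

true_ratings = [1, 2, 3, 4, 5, 4, 2, 5, 3, 1]
pred_ratings = [2, 2, 4, 4, 5, 3, 1, 4, 3, 2]

# Multiclass: predictions must match the exact 1-5 rating.
f1_multi = f1_score(true_ratings, pred_ratings, average="macro")


def to_binary(ratings, midpoint=3):
    """Collapse a 5-point rating into directionality: above the midpoint or not."""
    return [int(r > midpoint) for r in ratings]


# Binary: only the directionality of the impression must be correct.
f1_binary = f1_score(to_binary(true_ratings), to_binary(pred_ratings))

print(f"macro F1 (exact rating):   {f1_multi:.2f}")
print(f"F1 (directionality only):  {f1_binary:.2f}")
```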

    Pressing Matter

    Get PDF
    This thesis document is divided into eight chapters, each one representative of a step and/or necessary component within traditional printmaking processes. Serving as both didactic terms and metaphoric interpretations, the steps are essential parts of my explorative and reactionary print process, methodically formed through intuition and response. Shedding light on the underlying workings of my research-based practice, influences, and inspirations, each section adopts the traditional vocabulary of print as a strategy to validate the historically under-appreciated single impression (monotype) print and its ability to decelerate viewership. Using the method of “Registration”, I situate my practice in correlation to the land on which I work and live. My “First impressions” of the unique print are aligned with its infinitely distinctive and mysterious characteristics, which support the formulation of the core questions that drive my creative process. The most vital element of my practice, “Pressure”, is an entry point to write more specifically about the (physical) work, which shifts from the outside world to my body, and to the printing press. Chance is unavoidable: it is through “The Reveal” that the diverse potentialities of mediums and materials, as well as the occasionally unexpected variations in the process, display their meaningful impact. “The Proof” is the unique print in all its various states and inclusivity. Inevitably, there are many “Future Editions” to come. Their dissemination and display are dependent on the cyclical elements that enkindle their creation: when one series of work ends, new understandings, gestures, and formations are unearthed, informing the next state. Lastly, the mirroring that occurs when a plate is printed and revealed in reverse acts as a reflection. While not knowing exactly how events and decisions during the making process will eventuate, the resolution comes to light through rumination.

    Craniofacial Growth Series Volume 56

    Full text link
    https://deepblue.lib.umich.edu/bitstream/2027.42/153991/1/56th volume CF growth series FINAL 02262020.pdf
    Description of 56th volume CF growth series FINAL 02262020.pdf : Proceedings of the 46th Annual Moyers Symposium and 44th Moyers Presymposium

    Preliminary Study on Haptics of Textile Surfaces via Digital Visual Cues

    Get PDF
    Humans perceive the world through various sensory impressions, including the five senses. Not only is the number of different stimuli in everyday life increasing, but so is the effort required to assess which information is urgent and which is irrelevant. Online, however, it is not possible for the customer to physically perceive and assess the haptics of a product. This paper focuses on the questions of whether it is possible for humans to perceive and identify surface properties without using their sense of touch, and whether humans can judge and classify the haptics of textile materials via digital channels through purely visual perception.