Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking
Public speaking is an important aspect of human communication and
interaction. The majority of computational work on public speaking concentrates
on analyzing the spoken content and the verbal behavior of the speakers. While
the success of public speaking largely depends on the content of the talk and
on verbal behavior, non-verbal (visual) cues, such as gestures and physical
appearance, also play a significant role. This paper investigates the importance
of visual cues by estimating their contribution towards predicting the
popularity of a public lecture. For this purpose, we constructed a large
database of TED talk videos. As a measure of the popularity of the
TED talks, we leverage the corresponding (online) viewers' ratings from
YouTube. Visual cues related to facial and physical appearance, facial
expressions, and pose variations are extracted from the video frames using
convolutional neural network (CNN) models. Thereafter, an attention-based long
short-term memory (LSTM) network is proposed to predict the video popularity
from the sequence of visual features. The proposed network achieves
state-of-the-art prediction accuracy, indicating that visual cues alone carry
highly predictive information about the popularity of a talk. Furthermore, our
network learns a human-like attention mechanism that aids interpretability: it
reveals how attention varies over time and across the different visual cues,
showing their relative importance.
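The core idea of attention over a sequence of per-frame visual features can be sketched in a few lines. The sketch below is a deliberately simplified stand-in, not the authors' architecture: it uses plain NumPy, random features, and a single hypothetical score vector `w` in place of the CNN features and attention-based LSTM described above.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_pool(frame_features, w):
    """Collapse a (T, D) sequence of per-frame features into one (D,)
    video descriptor via attention weights; returns the pooled vector
    and the per-frame weights (useful for interpretability)."""
    scores = frame_features @ w        # (T,) one scalar score per frame
    alpha = softmax(scores)            # attention weights, sum to 1
    pooled = alpha @ frame_features    # attention-weighted average over time
    return pooled, alpha

rng = np.random.default_rng(0)
feats = rng.normal(size=(10, 4))       # 10 frames, 4-dim "visual" features
w = rng.normal(size=4)                 # hypothetical learned score vector
pooled, alpha = attention_pool(feats, w)
print(alpha.round(3), pooled.shape)
```

Inspecting `alpha` over time is exactly the kind of interpretability the abstract refers to: frames with large weights are the ones the model attends to.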
Exploiting Group Structures to Infer Social Interactions From Videos
In this thesis, we consider the task of inferring social interactions between humans by analyzing multi-modal data. Specifically, we attempt to solve several problems in interaction analysis: long-term deception detection, political deception detection, and impression prediction. Throughout this work, we emphasize the importance of using knowledge about the group structure of the analyzed interactions; previous work on the matter mostly neglected this aspect and analyzed a single subject at a time. Using the new Resistance dataset, collected by our collaborators, we approach the problem of long-term deception detection by designing a class of histogram-based features and a novel class of meta-features we call LiarRank. We develop a LiarOrNot model to identify spies in Resistance videos, achieving AUCs of over 0.70 and outperforming our baselines by 3% and human judges by 12%. For the problem of political deception, we first collect a dataset of videos and transcripts of 76 politicians from 18 countries making truthful and deceptive statements; we call it the Global Political Deception Dataset. We then show how to analyze the statements in a broader context by building a Video-Article-Topic graph. From this graph, we derive a novel class of features called the Deception Score, which captures how controversial each topic is and how that affects the truthfulness of each statement. We show that our approach achieves 0.775 AUC, outperforming competing baselines. Finally, we use the Resistance data to solve the problem of dyadic impression prediction. Our proposed Dyadic Impression Prediction System (DIPS) contains four major innovations: a novel class of features called emotion ranks, sign-imbalance features derived from signed graph theory, a novel method to align the facial expressions of subjects, and a multilayered stochastic network we call the Temporal Delayed Network. Our DIPS architecture beats eight baselines from the literature, yielding statistically significant improvements of 19.9-30.8% in AUC.
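Rank-style meta-features such as LiarRank and emotion ranks share one idea: score each subject relative to the rest of their group rather than in isolation. The actual constructions are defined in the thesis; the toy sketch below only illustrates that rank-within-group idea on a hypothetical per-subject feature.

```python
def rank_within_group(values):
    """Toy rank meta-feature: for each group member, the rank (1 = lowest)
    of their feature value within the group. The real LiarRank construction
    is more involved; this illustrates only the group-relative idea."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

# hypothetical per-subject histogram feature for one Resistance game
gaze_aversion = [0.31, 0.12, 0.55, 0.40]
print(rank_within_group(gaze_aversion))  # -> [2, 1, 4, 3]
```

The design point is that a rank is invariant to how expressive a particular group happens to be, which is one motivation for exploiting group structure rather than analyzing a single subject at a time.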
Towards Inferring Users' Impressions of Robot Performance in Navigation Scenarios
Human impressions of robot performance are often measured through surveys. As
a more scalable and cost-effective alternative, we study the possibility of
predicting people's impressions of robot behavior using non-verbal behavioral
cues and machine learning techniques. To this end, we first contribute the SEAN
TOGETHER Dataset consisting of observations of an interaction between a person
and a mobile robot in a Virtual Reality simulation, together with impressions
of robot performance provided by users on a 5-point scale. Second, we
contribute analyses of how well humans and supervised learning techniques can
predict perceived robot performance based on different combinations of
observation types (e.g., facial, spatial, and map features). Our results show
that facial expressions alone provide useful information about human
impressions of robot performance; but in the navigation scenarios we tested,
spatial features are the most critical piece of information for this inference
task. Also, when evaluating results as binary classification (rather than
multiclass classification), the F1-Score of human predictions and machine
learning models more than doubles, showing that both are better at telling the
directionality of robot performance than predicting exact performance ratings.
Based on our findings, we provide guidelines for implementing these prediction
models in real-world navigation scenarios.
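The binary evaluation mentioned above amounts to collapsing the 5-point ratings into a direction label and scoring that instead. The sketch below illustrates this with a hypothetical threshold (ratings above 3 count as "good") and made-up ratings; neither the threshold nor the data come from the paper.

```python
def f1(y_true, y_pred, positive=1):
    """Plain F1 score for one positive class (no external dependencies)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def to_binary(rating, threshold=3):
    """Collapse a 5-point rating into good (> threshold) vs bad."""
    return 1 if rating > threshold else 0

truth = [5, 4, 2, 1, 3, 4]
pred = [4, 5, 1, 2, 2, 3]  # close in direction, rarely the exact rating
binary_f1 = f1([to_binary(t) for t in truth], [to_binary(p) for p in pred])
print(binary_f1)
```

Here the predictions match the exact rating only rarely, yet the binary F1 is high, which mirrors the paper's observation that humans and models are better at telling the directionality of robot performance than the exact rating.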
Identifying interior spatial dimensions according to user preference: an associative concept network analysis
This study proposed a fundamental technique for evaluating the preferences of interior space users by capturing their verbally expressed preferences and then determining word associations. To accomplish this, the Pajek visualization software for large network analysis was employed in conjunction with the USF Word Association dictionary to visualize the structures and network depths of the derived associative meanings. The generated associative words were then qualitatively categorized into taxonomic word groups to reveal 13 dimensions of perceived interior-environmental quality, as follows: House-related, Territorial, Impression, Activity, Active Element of Nature, Nature, Building Materials, Companion, Household Basics, Color, Location, Composition, and Time Period. A factor analysis was then conducted to sort the generated associative words according to their Out-Degree Centrality (ODC) scores. These were validated into five factors that appeared to influence the comfort levels of interior space users. These five factors and 13 dimensions are useful as objective bases for determining the composition of adjectival pairs in the Semantic Differential (SD) method, which helps designers and architects evaluate interior space preferences.
Keywords: interior spatial dimensions, user comfort, associative concepts, network analysis.
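Out-Degree Centrality on a directed word-association graph can be computed in a few lines. The sketch below uses a handful of hypothetical association edges (cue word to associated word); the normalisation by the number of other nodes is one common convention and is an assumption here, as the study does not spell out its formula.

```python
def out_degree_centrality(edges, nodes=None):
    """Out-Degree Centrality (ODC) of each node in a directed graph:
    outgoing edge count divided by the number of other nodes (an assumed
    normalisation; conventions vary across tools)."""
    if nodes is None:
        nodes = {u for u, _ in edges} | {v for _, v in edges}
    out = {n: 0 for n in nodes}
    for u, _ in edges:
        out[u] += 1
    n = len(nodes)
    return {word: deg / (n - 1) for word, deg in out.items()}

# hypothetical association edges: cue word -> associated word
edges = [("home", "warm"), ("home", "family"), ("home", "wood"),
         ("wood", "warm"), ("family", "home")]
odc = out_degree_centrality(edges)
print(sorted(odc.items(), key=lambda kv: -kv[1]))
```

Words with high ODC are the ones that spread association most widely, which is why the study uses the score to select candidate words before factor analysis.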
Pressing Matter
This thesis document is divided into eight chapters, each one representative of a step and/or necessary component within traditional printmaking processes. Serving as both didactic terms and metaphoric interpretations, the steps are essential parts of my explorative and reactionary print process, methodically formed through intuition and response. Shedding light on the inner workings of my research-based practice, influences, and inspirations, each section adopts the traditional vocabulary of print as a strategy to validate the historically under-appreciated single impression (monotype) print and its ability to decelerate viewership. Using the method of “Registration”, I situate my practice in correlation to the land on which I work and live. My “First Impressions” of the unique print are aligned with its infinitely distinctive and mysterious characteristics, which support the formulation of the core questions that drive my creative process. The most vital element of my practice, “Pressure”, is an entry point to write more specifically about the (physical) work, which shifts from the outside world to my body, and to the printing press. Chance is unavoidable: it is through “The Reveal” that the diverse potentialities of mediums and materials, as well as the occasionally unexpected variations in the process, display their meaningful impact. “The Proof” is the unique print in all its various states and inclusivity. Inevitably, there are many “Future Editions” to come. Their dissemination and display depend on the cyclical elements that enkindle their creation: when one series of work ends, new understandings, gestures, and formations emerge, informing the next state. Lastly, the mirroring that occurs when a plate is printed and revealed in reverse acts as a reflection. While I do not know exactly how events and decisions during the making process will eventuate, the resolution comes to light through rumination.
Craniofacial Growth Series Volume 56
https://deepblue.lib.umich.edu/bitstream/2027.42/153991/1/56th volume CF growth series FINAL 02262020.pdf
Proceedings of the 46th Annual Moyers Symposium and 44th Moyers Presymposium.
Preliminary Study on Haptics of Textile Surfaces via Digital Visual Cues
Humans perceive the world through various sensory impressions, including the five senses. Not only is the number of different stimuli in everyday life increasing, but so is the burden of assessing which information is urgent and which is irrelevant. Online, however, it is not possible for the customer to physically perceive and assess the haptics of a product. This paper focuses on two questions: whether it is possible for humans to perceive and identify surface properties without using their sense of touch, and whether humans can judge and classify the haptics of textile materials via digital channels through purely visual perception.