12 research outputs found

    Multimodal Human Group Behavior Analysis

    Human behaviors in a group setting involve a complex mixture of multiple modalities: audio, visual, linguistic, and human interactions. With the rapid progress of AI, automatic prediction and understanding of these behaviors is no longer a dream. In a negotiation, discovering human relationships and identifying the dominant person can be useful for decision making. In security settings, detecting nervous behaviors can help law enforcement agents spot suspicious people. In adversarial settings such as national elections and court defense, identifying persuasive speakers is a critical task. It is beneficial to build accurate machine learning (ML) models to predict such human group behaviors. There are two elements for successful prediction of group behaviors. The first is to design domain-specific features for each modality. Social and psychological studies have uncovered various factors, including both individual cues and group interactions, which inspire us to extract relevant features computationally. In particular, the group interaction modality plays an important role, since human behaviors influence each other through interactions in a group. The second is effective multimodal ML models that align and integrate the different modalities for accurate predictions. However, most previous work ignored the group interaction modality. Moreover, it adopted only early fusion or late fusion to combine different modalities, which is not optimal. This thesis presents methods to train models that take multimodal inputs from group interaction videos and predict human group behaviors. First, we develop an ML algorithm to automatically predict human interactions from videos, which is the basis for extracting interaction features and modeling group behaviors. Second, we propose a multimodal method to identify dominant people in videos from multiple modalities. 
Third, we study nervousness in human behavior by developing a hybrid method: group interaction feature engineering combined with individual facial embedding learning. Last, we introduce a multimodal fusion framework that enables us to predict how persuasive speakers are. Overall, we develop one algorithm to extract group interactions and build three multimodal models to identify three kinds of human behavior in videos: dominance, nervousness and persuasion. The experiments demonstrate the efficacy of the methods and analyze the modality-wise contributions.

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

    Robust Methods for the Automatic Quantification and Prediction of Affect in Spoken Interactions

    Emotional expression plays a key role in interactions as it communicates the necessary context needed for understanding the behaviors and intentions of individuals. Therefore, a speech-based Artificial Intelligence (AI) system that can recognize and interpret emotional expression has many potential applications with measurable impact to a variety of areas, including human-computer interaction (HCI) and healthcare. However, there are several factors that make speech emotion recognition (SER) a difficult task; these factors include: variability in speech data, variability in emotion annotations, and data sparsity. This dissertation explores methodologies for improving the robustness of the automatic recognition of emotional expression from speech by addressing the impacts of these factors on various aspects of the SER system pipeline. For addressing speech data variability in SER, we propose modeling techniques that improve SER performance by leveraging short-term dynamical properties of speech. Furthermore, we demonstrate how data augmentation improves SER robustness to speaker variations. Lastly, we discover that we can make more accurate predictions of emotion by considering the fine-grained interactions between the acoustic and lexical components of speech. For addressing the variability in emotion annotations, we propose SER modeling techniques that account for the behaviors of annotators (i.e., annotators' reaction delay) to improve time-continuous SER robustness. For addressing data sparsity, we investigate two methods that enable us to learn robust embeddings, which highlight the differences that exist between neutral speech and emotionally expressive speech, without requiring emotion annotations. In the first method, we demonstrate how emotionally charged vocal expressions change speaker characteristics as captured by embeddings extracted from a speaker identification model, and we propose the use of these embeddings in SER applications. 
In the second method, we propose a framework for learning emotion embeddings using audio-textual data that is not annotated for emotion. The unification of the methods and results presented in this thesis helps enable the development of more robust SER systems, making key advancements toward an interactive speech-based AI system that is capable of recognizing and interpreting human behaviors. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/166106/1/aldeneh_1.pd

    Bending Opinion

    With communication playing an increasingly important role in contemporary society, rhetoric appears to have gained in influence and importance. The ancients knew all along: power belongs to those who know how to use their words. Nowadays, we know that rhetoric pervades all discourse. There is no communication without rhetoric. In a society with ever-increasing amounts of information, and with media whose significance cannot be overestimated, we need to know all the mechanisms playing a role in the gathering, making and reporting of information and opinions, and its processing by an audience. Rhetoric is, from both a practical and a theoretical perspective, essential to the conduct, analysis and evaluation of public debates. After all, the idea of democracy is closely intertwined with the ideal of transparent decision-making on the basis of open, informed discussions in the public domain, in political, organizational and journalistic discourse. Bending Opinion cites a host of relevant examples, from Barack Obama to Geert Wilders, as well as compelling case studies.

    Bending Opinion : Essays on Persuasion in the Public Domain

    With communication playing an increasingly important role in contemporary society, rhetoric appears to have gained in influence and importance. The ancients knew all along: power belongs to those who know how to use their words. Nowadays, we know that rhetoric pervades all discourse. There is no communication without rhetoric. In a society with ever-increasing amounts of information, and with media whose significance cannot be overestimated, we need to know all the mechanisms playing a role in the gathering, making and reporting of information and opinions, and its processing by an audience. Rhetoric is, from both a practical and a theoretical perspective, essential to the conduct, analysis and evaluation of public debates. After all, the idea of democracy is closely intertwined with the ideal of transparent decision-making on the basis of open, informed discussions in the public domain, in political, organizational and journalistic discourse. Bending Opinion cites a host of relevant examples, from Barack Obama to Geert Wilders, as well as compelling case studies. eISBN: 9789400600201

    A Socio-Cognitive Approach To Political Interaction: An Analysis of Candidates' Discourses in U.S. Political Campaign Debates.

    The present research focuses on politeness in candidates' discourses in U.S. political campaign debates of the 2000 elections from a socio-cognitive approach to social interaction. This approach entails an eclectic perspective on communication that intends to account for its cognitive, linguistic, relational and socio-cultural aspects in a determinate communicative encounter. This eclectic perspective is based on Brown and Levinson's (1987) Politeness Theory on the one hand, and Sperber and Wilson's Relevance Theory (1986/1995) on the other hand, with the latter constituting a cognitive complement to the former on theoretical grounds. From this eclectic approach, politeness has been conceived as the context-sensitive cognitive-based linguistic instantiation of social bonds. Therefore, politeness constitutes the linguistic enactment of social relationships in a specific communicative situation, and the internal knowledge on what is appropriate or inappropriate therein underlying such enactment. Politeness may thus consist of a) mitigating behaviour, whereby the speaker (S) attends to his/her own and/or the hearer's (H) face or social image one wants for him/herself in a specific society (Brown & Levinson, 1987), or b) aggravating behaviour, that is, damage of one's own and/or H's face. In view of this, the following research questions were posited in this study: 1) what are the main features of politicians' face mitigating and aggravating sequences in terms of: type of politeness prevailing in these (if any), recurrent linguistic elements (if any), and typical location of these sequences in the whole discourse debates themselves constitute (if any)?; 2) what are the specific forms face mitigating and aggravating sequences adopt (if any), and which are their features? In order to provide an answer to these questions, a total of 89 North-American electoral debates were collected together with other secondary data (e.g. newspaper articles, television programmes, etc.). 
These debates were organised into Corpus of Analysis (Corpus A) and Corpus of Reference (Corpus B), out of which the former consists of 16 debates corresponding to a total of 20 hours of on-going talk, and the latter contains the rest of the debates collected. Corpus A was transcribed in its entirety and analysed according to the units of analysis of the pragmatic sequence and the micro strategy. Overall, face mitigation, which is commonly directed towards the audience, appeared to be the predominant shape of politeness in candidates' discourses in debates, more specifically, mitigation of the non-pure type oriented towards H's positive face or his/her desire to be approved of (ibid.). This positive face attention characteristic of mitigating sequences in these events was found to be principally based on the strategies presuppose/raise/assert common ground, assert/presuppose S's knowledge of and concern for H's needs and wants, and offer and promise. These results back one of the main claims of this study, namely, that political campaign debates are essentially persuasive discourses besides antagonistic exchanges as the debate literature has commonly shown. Face aggravation, which is typically targeted at the opponent, was observed to primarily consist of aggravation of the pure sort oriented towards H's negative face or his/her desire to be unimpeded upon (ibid.). Negative face aggravation was found to usually lie in the strategies increase imposition weight, refer to rights, duties and rules not respected, fulfilled or complied with respectively, and challenge. A possible explanation for the predominance of this variety of aggravation over others is that a) pure aggravation leaves no doubt as for a politician's intent to discredit the adversary, and b) negative face aggravation is not as hostile as positive face attack in debates and political discourse (cf., e.g. 
Lakoff, 2001), and enables the speaker to attack the rival without a potential boomerang effect on S's own image.
SUMMARY This work explores the phenomenon of linguistic politeness in candidates' discourses in electoral debates of the North-American elections of 2000. To this end, it adopts an eclectic approach to the study of social interaction based on Brown and Levinson's (1987) Politeness Theory, which is social in nature, and Sperber and Wilson's (1986/1995) Relevance Theory, which is cognitive in nature, with the latter constituting a cognitive complement to the former at the theoretical level. On the basis of this approach, politeness is conceived in this research as the linguistic manifestation of social relationships in context, grounded in individuals' cognition. Politeness therefore comprises linguistic behaviours or attitudes that attend to an interlocutor's social image (mitigating) and linguistic behaviours or attitudes that damage that image (aggravating). In order to find possible trends or defining patterns of politeness in North-American political campaign debates, a data corpus of 89 debates was collected, of which 16 (i.e. 20 hours of talk) were transcribed in full and analysed according to the units of analysis of the pragmatic sequence and the micro strategy. In general, mitigation, which is commonly directed towards the audience, proved to be the predominant form of politeness in these events, specifically non-pure mitigation oriented towards the hearer's positive face or his/her desire to be approved of by others (Brown and Levinson, 1987). On the other hand, aggravation, which is normally directed at the opponent, proved less frequent than mitigation and appeared mostly as aggravation of the hearer's negative face or his/her desire for freedom of action. These results broadly support one of the main arguments of this research, namely, that electoral debates are fundamentally persuasive discourses rather than antagonistic encounters, as the literature on debates has focused on showing.

    Post Rio Communication Styles for Deliberation: between individualization and collective action
