
    VICA, a visual counseling agent for emotional distress

    We present VICA, a Visual Counseling Agent designed to create an engaging multimedia face-to-face interaction. VICA is a human-friendly agent equipped with high-performance voice conversation, designed to help psychologically stressed users offload their emotional burden; such users specifically include non-computer-savvy elderly persons or clients. Our agent builds its replies by exploiting the interlocutor’s utterances that express wishes, obstacles, emotions, and the like, adding statements that ask for confirmation, details, an emotional summary, or relations among such expressions. We claim that VICA is suitable for positive counseling scenarios in which multimedia, specifically high-performance voice communication, is instrumental in helping even elderly or digitally divided users continue a dialogue towards self-awareness. To support this claim, VICA’s effect is evaluated against a previous text-based counseling agent, CRECA, and against ELIZA, including its successors. An experiment involving 14 subjects shows the following effects of VICA: (i) dialogue continuation (CPS: conversation-turns per session) for the older half of the subjects (age > 40) improved substantially, by 53% relative to CRECA and 71% relative to ELIZA; (ii) VICA’s capability to foster peace of mind and other positive feelings was rated by the older subjects mostly at 5 or 6 on a 7-point Likert scale. On average, this capability of VICA for the older subjects is 5.14, while CRECA (all subjects young students, age < 25) scores 4.50, ELIZA scores 3.50, and the best of ELIZA’s successors for the older subjects (> 25) scores 4.41.

    SID 04, Social Intelligence Design: Proceedings of the Third Workshop on Social Intelligence Design


    Gesture in Automatic Discourse Processing

    Computers cannot fully understand spoken language without access to the wide range of modalities that accompany speech. This thesis addresses the particularly expressive modality of hand gesture, and focuses on building structured statistical models at the intersection of speech, vision, and meaning. My approach is distinguished in two key respects. First, gestural patterns are leveraged to discover parallel structures in the meaning of the associated speech. This differs from prior work that attempted to interpret individual gestures directly, an approach that was prone to a lack of generality across speakers. Second, I present novel, structured statistical models for multimodal language processing, which enable learning about gesture in its linguistic context, rather than in the abstract. These ideas find successful application in a variety of language processing tasks: resolving ambiguous noun phrases, segmenting speech into topics, and producing keyframe summaries of spoken language. In all three cases, the addition of gestural features -- extracted automatically from video -- yields significantly improved performance over a state-of-the-art text-only alternative. This marks the first demonstration that hand gesture improves automatic discourse processing.
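The fusion idea in the abstract above — augmenting a text-only discourse model with gesture features extracted from video, behind a shared classifier — can be sketched as simple feature-vector concatenation. The feature names, values, and weights below are illustrative assumptions, not the thesis's actual model:

```python
# Toy sketch: fuse text and gesture features for a binary discourse
# decision (e.g., "do these two noun phrases corefer?").
# All feature names, values, and weights are invented for illustration.

def fuse(text_feats, gesture_feats):
    """Concatenate text-only and gesture feature vectors."""
    return text_feats + gesture_feats

def linear_score(feats, weights, bias=0.0):
    """Linear decision function over the fused feature vector."""
    return sum(f * w for f, w in zip(feats, weights)) + bias

# Hypothetical per-pair features:
# [string match, mention distance] from the transcript,
# [hand-position similarity] extracted automatically from video.
text_feats = [1.0, 0.2]
gesture_feats = [0.8]
weights = [1.5, -0.7, 1.1]   # learned jointly over both modalities

fused = fuse(text_feats, gesture_feats)
print(linear_score(fused, weights) > 0)   # positive score -> predict coreference
```

The point of the sketch is the architecture, not the scorer: both modalities feed one model, so the learner can weigh gesture evidence in its linguistic context rather than interpreting gestures in isolation.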

    ACII 2009: Affective Computing and Intelligent Interaction. Proceedings of the Doctoral Consortium 2009


    Dimensions of communication


    Let’s lie together: Co-presence effects on children’s deceptive skills


    Relationship-building through embodied feedback: Teacher-student alignment in writing conferences

    Over the last two decades, an impressive amount of work has been done on the interaction that takes place during writing conferences (Ewert, 2009). However, most previous studies focused on the instructional aspects of conference discourse, without considering its affective components. Yet conferences are by no means emotionally neutral (Witt & Kerssen-Griep, 2011), as they involve evaluation of student work, correction, directions for improvement, and even criticism—that is, they involve potentially face-threatening acts. Therefore, it is important for teachers to know how to conference with students in non-threatening and affiliative ways. The present study examines 1) the interactional resources, including talk and embodied action (e.g., gaze, facial expression, gesture, body position), that one experienced writing instructor used in writing conferences to respond to student writers and their writing in affiliative ways, and 2) the interactional resources that the teacher used to repair disaffiliative actions—either her own or those of the students—in conference interaction. The data for the study comprise 14 video recordings of conference interaction between one instructor and two students, collected over a 16-week semester in an introductory composition course for international students at a large U.S. university. Data were analyzed using methods from conversation analysis (Jefferson, 1988; Sacks, Schegloff, & Jefferson, 1974; Schegloff, 2007; Schegloff & Sacks, 1973) and multimodal interaction analysis (Nishino & Atkinson, 2015; Norris, 2004, 2013). The conceptual framework adopted in this study is based on the notions of embodied interaction (Streeck, Goodwin, & LeBaron, 2011a, 2011b), embodied participation frameworks (Goodwin, 2000a), and alignment (Atkinson, Churchill, Nishino, & Okada, 2007).
Findings indicate that the instructor was responsive to the potential for face-threatening acts during conference interaction, and she effectively employed various interactional resources not only in responding to student writing in affiliative and non-threatening ways, but also in repairing the disruption in alignment caused by disaffiliative actions of either of the participants. This study demonstrates the value of teachers’ embodied actions not only as tools that facilitate instruction but also as resources that can be used to maintain a positive atmosphere in writing conferences. The findings contribute to the existing body of research on writing conferences, feedback, embodied practices in teacher-student interaction, and teacher-student relationships and rapport. The study also has implications for general classroom pedagogy, second language teaching, and second language writing instruction.

    Automatic social role recognition and its application in structuring multiparty interactions

    Automatic processing of multiparty interactions is a research domain with important applications in content browsing, summarization, and information retrieval. In recent years, several works have been devoted to finding regular patterns that speakers exhibit in a multiparty interaction, also known as social roles. Most of the research in the literature has focused on recognition of scenario-specific formal roles. More recently, role coding schemes based on informal social roles have been proposed, defining roles based on the behavior speakers exhibit in the functioning of a small-group interaction. Informal social roles represent a flexible classification scheme that can generalize across different scenarios of multiparty interaction. In this thesis, we focus on automatic recognition of informal social roles and exploit the influence of informal social roles on speaker behavior for structuring multiparty interactions. To model speaker behavior, we systematically explore various verbal and nonverbal cues extracted from turn-taking patterns, vocal expression, and linguistic style. The influence of social roles on the behavior cues exhibited by a speaker is modeled using a discriminative approach based on conditional random fields. Experiments performed on several hours of meeting data reveal that classification using conditional random fields improves the role recognition performance. We demonstrate the effectiveness of our approach by evaluating it on previously unseen scenarios of multiparty interaction. Furthermore, we also consider whether formal roles and informal roles can be automatically predicted by the same verbal and nonverbal features. We exploit the influence of social roles on turn-taking patterns to improve speaker diarization under the distant-microphone condition.
Our work extends the Hidden Markov model (HMM)–Gaussian mixture model (GMM) speaker diarization system, and is based on jointly estimating both the speaker segmentation and social roles in an audio recording. We modify the minimum duration constraint in the HMM-GMM diarization system by using role information to model the expected duration of a speaker's turn. We also use social role n-grams as prior information to model speaker interaction patterns. Finally, we demonstrate the application of social roles to the problem of topic segmentation in meetings. We exploit our finding that social roles can dynamically change in conversations and use this information to predict topic changes in meetings. We also present an unsupervised method for topic segmentation which combines social roles and lexical cohesion. Experimental results show that social roles improve the performance of both speaker diarization and topic segmentation.
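The final idea in the abstract above — combining lexical cohesion with social-role changes to score candidate topic boundaries — can be illustrated with a small sketch. The cosine-over-word-counts cohesion measure is standard; the additive role-change bonus and its weight are illustrative assumptions, not the thesis's actual model:

```python
import math

def bow(utterances):
    """Bag-of-words counts over a list of utterance strings."""
    counts = {}
    for u in utterances:
        for w in u.lower().split():
            counts[w] = counts.get(w, 0) + 1
    return counts

def cosine(a, b):
    """Cosine similarity between two bag-of-words dicts."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def boundary_score(utts, roles, i, alpha=0.5):
    """Score a candidate topic boundary between utterances i-1 and i.

    Low lexical cohesion across the boundary and a change in the
    current speaker's social role both raise the score; alpha (the
    weight on the role cue) is an illustrative constant.
    """
    cohesion = cosine(bow(utts[:i]), bow(utts[i:]))
    role_change = 1.0 if roles[i] != roles[i - 1] else 0.0
    return (1.0 - cohesion) + alpha * role_change

utts = ["the budget looks fine", "budget approved then",
        "next the hiring plan", "hiring needs two engineers"]
roles = ["protagonist", "protagonist", "neutral", "neutral"]
# The true topic shift (index 2) coincides with a role change and a
# drop in lexical cohesion, so it should outscore the non-boundary.
print(boundary_score(utts, roles, 2) > boundary_score(utts, roles, 1))
```

In the unsupervised setting the abstract describes, such a combined score would be computed at every candidate position and peaks taken as topic boundaries; here the role sequence simply supplies a second cue alongside vocabulary shift.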

    Proceedings of the LREC 2018 Special Speech Sessions

    LREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-0
