15,661 research outputs found

    Automatic social role recognition and its application in structuring multiparty interactions

    Automatic processing of multiparty interactions is a research domain with important applications in content browsing, summarization and information retrieval. In recent years, several works have been devoted to finding the regular patterns that speakers exhibit in a multiparty interaction, also known as social roles. Most research in the literature has focused on recognizing scenario-specific formal roles. More recently, role coding schemes based on informal social roles have been proposed, defining roles by the behavior speakers exhibit in the functioning of a small-group interaction. Informal social roles represent a flexible classification scheme that can generalize across different scenarios of multiparty interaction. In this thesis, we focus on automatic recognition of informal social roles and exploit the influence of informal social roles on speaker behavior for structuring multiparty interactions. To model speaker behavior, we systematically explore various verbal and nonverbal cues extracted from turn-taking patterns, vocal expression and linguistic style. The influence of social roles on the behavior cues exhibited by a speaker is modeled using a discriminative approach based on conditional random fields. Experiments performed on several hours of meeting data reveal that classification using conditional random fields improves role recognition performance. We demonstrate the effectiveness of our approach by evaluating it on previously unseen scenarios of multiparty interaction. Furthermore, we also consider whether formal roles and informal roles can be automatically predicted by the same verbal and nonverbal features. We exploit the influence of social roles on turn-taking patterns to improve speaker diarization under distant microphone conditions. Our work extends the hidden Markov model (HMM)/Gaussian mixture model (GMM) speaker diarization system and is based on jointly estimating both the speaker segmentation and the social roles in an audio recording. We modify the minimum duration constraint in the HMM-GMM diarization system by using role information to model the expected duration of a speaker's turn. We also use social role n-grams as prior information to model speaker interaction patterns. Finally, we demonstrate the application of social roles to the problem of topic segmentation in meetings. We exploit our finding that social roles can change dynamically in conversations and use this information to predict topic changes in meetings. We also present an unsupervised method for topic segmentation which combines social roles and lexical cohesion. Experimental results show that social roles improve the performance of both speaker diarization and topic segmentation.

    Automatic role recognition

    The computing community is making significant efforts towards the development of automatic approaches for the analysis of social interactions. The way people interact depends on the context, but there is one aspect that all social interactions seem to have in common: humans behave according to roles. Therefore, recognizing the roles of participants is an essential step towards understanding social interactions and constructing socially aware computers. This thesis addresses the problem of automatically recognizing the roles of participants in multi-party recordings. The objective is to assign a role to each participant. All the proposed approaches use a similar strategy. They start by segmenting the audio into turns, which are used as the basic analysis units. The next step is to extract features accounting for the organization of turns. The more sophisticated approaches extend these features with features drawn from either prosody or semantics. Finally, the mapping of people or turns to roles is done using statistical models. The goal of this thesis is to gain a better understanding of role recognition, and we investigate three aspects that can influence the performance of the system: the impact of modelling the dependency between the roles, the contribution of different modalities to the effectiveness of the role recognition approach, and the effectiveness of the approach in different scenarios. Three models are proposed and tested on three different corpora totaling more than 90 hours of audio. The first contribution of this thesis is to investigate the combination of turn-taking features and semantic information for role recognition, improving the accuracy of role recognition from a baseline of 46.4% to 67.9% on the AMI meeting corpus. The second contribution is to use features extracted from prosody to assign roles. The performance of this model is 89.7% on broadcast news and 87.0% on talk-shows. Finally, the third contribution is the development of a model robust to changes in the social setting. This model achieved an accuracy of 86.7% on a database composed of a mixture of broadcast news and talk-shows.

    Non-Verbal Communication Analysis in Victim-Offender Mediations

    In this paper we present a non-invasive ambient intelligence framework for the semi-automatic analysis of non-verbal communication applied to the restorative justice field. In particular, we propose the use of computer vision and social signal processing technologies in real scenarios of Victim-Offender Mediations, applying feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues from the fields of psychology and observational methodology. We test our methodology on data captured in real-world Victim-Offender Mediation sessions in Catalonia in collaboration with the regional government. We define the ground truth based on expert opinions when annotating the observed social responses. Using different state-of-the-art binary classification approaches, our system achieves recognition accuracies of 86% when predicting satisfaction, and 79% when predicting both agreement and receptivity. Applying a regression strategy, we obtain a mean deviation for the predictions between 0.5 and 0.7 in the range [1-5] for the computed social signals. Comment: Please find the supplementary video material at: http://sunai.uoc.edu/~vponcel/video/VOMSessionSample.mp

    Getting Into Networks and Clusters: Evidence on the GNSS composite knowledge process in (and from) Midi-Pyrénées

    This paper aims to contribute to the empirical identification of clusters by proposing a methodology based on network analysis. We start with the detection of a composite knowledge process rather than a territorial one stricto sensu. This allows us to avoid overestimating the role played by geographical proximity between agents, and to grasp its ambivalence in knowledge relations. Networks and clusters correspond to the complex aggregation of bilateral or n-lateral relations in which agents can play heterogeneous structural roles. Their empirical reconstitution thus requires gathering located relational data, whereas the analysis of their structural properties requires computing a set of indexes developed in the field of social network analysis. Our theoretical considerations are tested in the technological field of GNSS (Global Navigation Satellite Systems). We propose a sample of knowledge relations based on collaborative R&D projects and discuss how this sample is shaped and why we can assume its representativeness. The network we obtain allows us to show how the composite knowledge process gives rise to a structure with a peculiar combination of local and distant relations. Descriptive statistics and structural properties show the influence or the centrality of certain agents in the aggregate structure, and allow us to discuss the complementarities between their heterogeneous knowledge profiles. Quantitative results are completed and confirmed by an interpretative discussion based on a series of semi-structured interviews. Concluding remarks provide theoretical feedback. Keywords: Knowledge, Networks, Economic Geography, Cluster, GNSS

    Talk and Let Talk: The Effects of Language Proficiency on Speaking up and Competence Perceptions in Multinational Teams

    Collaboration within multinational teams necessitates the adoption of a common language, typically English, which often leads to significant differences in language proficiency across members. We develop and test a multilevel model of the effects of language proficiency within multinational teams. An experimental study of 51 teams (102 American and 102 Chinese participants) revealed that, at the individual level, members with higher levels of language proficiency were more likely to speak up, which led to more positive perceptions of their competence. At the team level, greater dispersion in language proficiency across members was associated with less accurate competence recognition, which, in turn, led to lower overall team performance. Moreover, communication medium moderated these relationships, such that the effects of language proficiency were more potent in face-to-face than in computer-mediated teams. We discuss the implications of these findings for future research and for managing participation, competence, and technology in multinational teams.

    The non-Verbal Structure of Patient Case Discussions in Multidisciplinary Medical Team Meetings

    Meeting analysis has a long theoretical tradition in social psychology, with established practical ramifications in computer science, especially in computer supported cooperative work. More recently, a good deal of research has focused on the issues of indexing and browsing multimedia records of meetings. Most research in this area, however, is still based on data collected in laboratories, under somewhat artificial conditions. This paper presents an analysis of the discourse structure and spontaneous interactions at real-life multidisciplinary medical team meetings held as part of the work routine in a major hospital. It is hypothesised that the conversational structure of these meetings, as indicated by sequencing and duration of vocalisations, enables segmentation into individual patient case discussions. The task of segmenting audio-visual records of multidisciplinary medical team meetings is described as a topic segmentation task, and a method for automatic segmentation is proposed. An empirical evaluation based on hand-labelled data is presented which determines the optimal length of vocalisation sequences for segmentation, and establishes the competitiveness of the method with approaches based on more complex knowledge sources. The effectiveness of Bayesian classification as a segmentation method, and its applicability to meeting segmentation in other domains are discussed.