1,854 research outputs found

    Listening between the Lines: Learning Personal Attributes from Conversations

    Full text link
    Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc). Experiments with various conversational texts including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.Comment: published in WWW'1

    Listening between the Lines: Learning Personal Attributes from Conversations

    No full text
    Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc). Experiments with various conversational texts including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines

    Computational Sociolinguistics: A Survey

    Get PDF
    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    Mining Behavior of Citizen Sensor Communities to Improve Cooperation with Organizational Actors

    Get PDF
    Web 2.0 (social media) provides a natural platform for dynamic emergence of citizen (as) sensor communities, where the citizens generate content for sharing information and engaging in discussions. Such a citizen sensor community (CSC) has stated or implied goals that are helpful in the work of formal organizations, such as an emergency management unit, for prioritizing their response needs. This research addresses questions related to design of a cooperative system of organizations and citizens in CSC. Prior research by social scientists in a limited offline and online environment has provided a foundation for research on cooperative behavior challenges, including \u27articulation\u27 and \u27awareness\u27, but Web 2.0 supported CSC offers new challenges as well as opportunities. A CSC presents information overload for the organizational actors, especially in finding reliable information providers (for awareness), and finding actionable information from the data generated by citizens (for articulation). Also, we note three data level challenges: ambiguity in interpreting unconstrained natural language text, sparsity of user behaviors, and diversity of user demographics. Interdisciplinary research involving social and computer sciences is essential to address these socio-technical issues. I present a novel web information-processing framework, called the Identify-Match- Engage (IME) framework. IME allows operationalizing computation in design problems of awareness and articulation of the cooperative system between citizens and organizations, by addressing data problems of group engagement modeling and intent mining. The IME framework includes: a.) Identification of cooperation-assistive intent (seeking-offering) from short, unstructured messages using a classification model with declarative, social and contrast pattern knowledge, b.) Facilitation of coordination modeling using bipartite matching of complementary intent (seeking-offering), and c.) Identification of user groups to prioritize for engagement by defining a content-driven measure of \u27group discussion divergence\u27. The use of prior knowledge and interplay of features of users, content, and network structures efficiently captures context for computing cooperation-assistive behavior (intent and engagement) from unstructured social data in the online socio-technical systems. Our evaluation of a use-case of the crisis response domain shows improvement in performance for both intent classification and group engagement prioritization. Real world applications of this work include use of the engagement interface tool during various recent crises including the 2014 Jammu and Kashmir floods, and intent classification as a service integrated by the crisis mapping pioneer Ushahidi\u27s CrisisNET project for broader impact
    corecore