    Listening between the Lines: Learning Personal Attributes from Conversations

    Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work, we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age, or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc.). Experiments with various conversational texts, including Reddit discussions, movie scripts, and a collection of crowdsourced personal dialogues, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
    Comment: published in WWW'1
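    The per-predicate ranking idea above can be sketched in a few lines: attention-pool an utterance's word embeddings, then rank candidate object values by similarity to the pooled representation. The tiny hand-made embeddings, vocabulary, and scoring below are illustrative assumptions, not the paper's actual Hidden Attribute Model.

    ```python
    # Minimal sketch of a per-predicate attribute ranker: attention-pool
    # the utterance, then rank candidate values for one predicate
    # ("profession"). All embeddings here are toy values for illustration.
    import math

    # Toy 3-dimensional word embeddings (assumed, not learned).
    EMB = {
        "patients":  [0.9, 0.1, 0.0],
        "emergency": [0.8, 0.2, 0.0],
        "rooms":     [0.3, 0.1, 0.1],
        "homework":  [0.0, 0.9, 0.1],
    }
    # Embeddings of candidate object values for the predicate "profession".
    VALUES = {
        "doctor":  [1.0, 0.0, 0.0],
        "nurse":   [0.8, 0.1, 0.0],
        "student": [0.0, 1.0, 0.0],
    }

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def rank_values(utterance_words, value_emb=VALUES):
        """Attention-pool the utterance, then rank values by similarity."""
        words = [w for w in utterance_words if w in EMB]
        # Attention weight per word: softmax over its best match to any value.
        scores = [max(dot(EMB[w], v) for v in value_emb.values()) for w in words]
        exps = [math.exp(s) for s in scores]
        z = sum(exps)
        pooled = [sum(e / z * EMB[w][i] for e, w in zip(exps, words))
                  for i in range(3)]
        # Rank candidate values by dot product with the pooled representation.
        return sorted(value_emb, key=lambda v: dot(pooled, value_emb[v]),
                      reverse=True)

    print(rank_values(["patients", "emergency", "rooms"]))
    # → ['doctor', 'nurse', 'student']
    ```

    On this toy input, medical terms dominate the pooled vector, so the medical professions outrank "student", mirroring the example in the abstract.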

    Creating More Personas Improves Representation of Demographically Diverse Populations: Implications Towards Interactive Persona Systems

    Personas represent distinct user types. However, while online user data can be demographically and behaviorally heterogeneous, most studies generate fewer than ten personas, regardless of how heterogeneous the data is. Because all persona creation efforts need to decide how many personas to create, assigning this number evokes a fundamental question: how many personas should one create? To address this question, we apply data-driven persona creation to a dataset with 250 million YouTube views from a global news and media organization. We focus on a statistically optimal number of personas, namely, how far the distribution of demographic persona attributes deviates from the baseline user data. Varying the number of generated personas from 5 to 160 per set, we find that more personas cover more age groups and countries, thus improving the statistical correspondence with the raw user data and increasing the representation of demographic diversity by including more fringe user segments. While user representation continuously improved with more personas, the relative diversity gain was maximal with 40 personas, implying that, using our data, one ought to create more than four times as many personas as generally advocated. The results imply that organizations with heterogeneous online audiences benefit from many personas in terms of more inclusive user representation. We further demonstrate how an interactive persona system can help stakeholders navigate many personas with possibly smaller cognitive effort.
    © Author/ACM 2022. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Participative Computing for Sustainable Futures: Proceedings of the 12th Nordic Conference on Human-Computer Interaction (NordiCHI'22), http://dx.doi.org/10.1145/3546155.3546654
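    The "statistical correspondence" notion above can be illustrated by comparing the demographic distribution of a persona set against the baseline user data. The toy distributions and the use of total variation distance below are illustrative assumptions, not the authors' exact measure.

    ```python
    # Hedged sketch: how far does a persona set's attribute distribution
    # deviate from the baseline audience? More personas can cover fringe
    # segments (here, the 45-54 age group) and shrink the deviation.
    from collections import Counter

    def tv_distance(baseline, personas):
        """Total variation distance between two attribute distributions."""
        counts = Counter(personas)
        n = len(personas)
        keys = set(baseline) | set(counts)
        return 0.5 * sum(abs(baseline.get(k, 0.0) - counts.get(k, 0) / n)
                         for k in keys)

    # Baseline share of each age group in the raw audience data (assumed).
    baseline = {"18-24": 0.4, "25-34": 0.3, "35-44": 0.2, "45-54": 0.1}

    small_set = ["18-24", "18-24", "25-34", "25-34", "35-44"]        # 5 personas
    large_set = (["18-24"] * 8 + ["25-34"] * 6 + ["35-44"] * 4
                 + ["45-54"] * 2)                                    # 20 personas

    print(tv_distance(baseline, small_set))  # nonzero: 45-54 group is missing
    print(tv_distance(baseline, large_set))  # 0.0: matches the baseline
    ```

    The small set cannot represent the 45-54 segment at all, so its deviation is strictly positive, while the larger set reproduces the baseline exactly, echoing the paper's finding that more personas improve correspondence with the raw data.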

    Wearing Many (Social) Hats: How Different are Your Different Social Network Personae?

    This paper investigates whether, when users create profiles in different social networks, these profiles are redundant expressions of the same persona or are adapted to each platform. Using the personal webpages of 116,998 users on About.me, we identify and extract matched user profiles on several major social networks, including Facebook, Twitter, LinkedIn, and Instagram. We find evidence for distinct site-specific norms, such as differences in the language used in the text of the profile self-description and the kind of picture used as profile image. By learning a model that robustly identifies the platform given a user's profile image (0.657--0.829 AUC) or self-description (0.608--0.847 AUC), we confirm that users do adapt their behaviour to individual platforms in an identifiable and learnable manner. However, different genders and age groups adapt their behaviour differently from each other, and these differences are, in general, consistent across different platforms. We show that differences in social profile construction correspond to differences in how formal or informal the platform is.
    Comment: Accepted at the 11th International AAAI Conference on Web and Social Media (ICWSM17)
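    The platform-identification result above is evaluated with AUC. As a hedged illustration (not the paper's actual classifier or features), the sketch below scores self-descriptions with a toy keyword model for how "LinkedIn-like" they sound, then computes AUC directly from its rank-based definition.

    ```python
    # AUC = probability that a randomly chosen positive example receives a
    # higher score than a randomly chosen negative one (ties count as 0.5).
    def auc(labels, scores):
        pos = [s for l, s in zip(labels, scores) if l == 1]
        neg = [s for l, s in zip(labels, scores) if l == 0]
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    # Toy model: count "formal" keywords in a self-description (assumed
    # word list, purely illustrative).
    FORMAL = {"experienced", "professional", "manager", "results-driven"}

    def linkedin_score(text):
        return sum(w in FORMAL for w in text.lower().split())

    profiles = [
        ("Experienced manager and results-driven professional", 1),  # LinkedIn
        ("Professional photographer, coffee lover", 1),
        ("just here for the memes lol", 0),                          # Twitter
        ("cat pics + hot takes", 0),
    ]
    labels = [l for _, l in profiles]
    scores = [linkedin_score(t) for t, _ in profiles]
    print(auc(labels, scores))  # → 1.0 on this toy data
    ```

    On real profile text the paper reports AUCs between 0.608 and 0.847, i.e., well above the 0.5 chance level of this metric but far from the perfect separation of the toy example.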

    Toxic Text in Personas: An Experiment on User Perceptions

    When algorithms create personas from social media data, the personas can become noxious by automatically including toxic comments. To investigate how users perceive such personas, we conducted a 2 × 2 user experiment with 496 participants, showing each participant toxic and non-toxic versions of data-driven personas. We found that participants gave higher credibility, likability, empathy, similarity, and willingness-to-use scores to non-toxic personas. Also, gender affected toxicity perceptions in that female toxic data-driven personas scored lower in likability, empathy, and similarity than their male counterparts. Female participants gave higher perception scores to non-toxic personas and lower scores to toxic personas than male participants. We discuss implications from our research for designing data-driven personas.

    Proceedings of ACM Woodstock conference (WOODSTOCK’18). ACM, New York, NY, USA, 2 pages

    We generate player personas from game preference survey data using the system and methodology of automatic persona generation (APG). The purpose is to demonstrate the potential of data-driven technologies for segmenting players by their game preferences. The resulting prototype personas are particularly intended for game marketing purposes, e.g., targeting gamers with social media advertising. The personas can also be enhanced with additional data to provide deeper insights.

    Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

    To recognize and mitigate harms from large language models (LLMs), we need to understand the prevalence and nuances of stereotypes in LLM outputs. Toward this end, we present Marked Personas, a prompt-based method to measure stereotypes in LLMs for intersectional demographic groups without any lexicon or data labeling. Grounded in the sociolinguistic concept of markedness (which characterizes explicitly linguistically marked categories versus unmarked defaults), our proposed method is twofold: 1) prompting an LLM to generate personas, i.e., natural language descriptions, of the target demographic group alongside personas of unmarked, default groups; 2) identifying the words that significantly distinguish personas of the target group from corresponding unmarked ones. We find that the portrayals generated by GPT-3.5 and GPT-4 contain higher rates of racial stereotypes than human-written portrayals using the same prompts. The words distinguishing personas of marked (non-white, non-male) groups reflect patterns of othering and exoticizing these demographics. An intersectional lens further reveals tropes that dominate portrayals of marginalized groups, such as tropicalism and the hypersexualization of minoritized women. These representational harms have concerning implications for downstream applications like story generation.
    Comment: To appear at ACL 2023, 9 pages, 3 figures, 3 tables
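    Step 2 of the method above can be sketched as a word-level comparison between marked and unmarked persona texts. Here, plain log count-ratios with add-one smoothing stand in for whatever significance test the paper actually uses, and the example sentences are invented for illustration.

    ```python
    # Hedged sketch: rank words by how much more often they appear in
    # personas of the marked group than in personas of the unmarked group.
    import math
    from collections import Counter

    def distinguishing_words(marked_texts, unmarked_texts, top_k=2):
        m = Counter(w for t in marked_texts for w in t.lower().split())
        u = Counter(w for t in unmarked_texts for w in t.lower().split())
        vocab = set(m) | set(u)
        # Smoothed log count-ratio; positive values favor the marked group.
        ratio = {w: math.log((m[w] + 1) / (u[w] + 1)) for w in vocab}
        return sorted(vocab, key=lambda w: ratio[w], reverse=True)[:top_k]

    # Invented persona snippets, purely for illustration.
    marked = ["an exotic vibrant dancer", "a vibrant exotic spirit"]
    unmarked = ["a hardworking engineer", "a friendly engineer"]
    print(distinguishing_words(marked, unmarked))
    ```

    On this toy data the top-ranked words are exactly the "exoticizing" terms concentrated in the marked personas, which is the kind of pattern the paper surfaces at scale from GPT-generated portrayals.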