Listening between the Lines: Learning Personal Attributes from Conversations
Open-domain dialogue agents must be able to converse about many topics while
incorporating knowledge about the user into the conversation. In this work we
address the acquisition of such knowledge, for personalization in downstream
Web applications, by extracting personal attributes from conversations. This
problem is more challenging than the established task of information extraction
from scientific publications or Wikipedia articles, because dialogues often
give merely implicit cues about the speaker. We propose methods for inferring
personal attributes, such as profession, age or family status, from
conversations using deep learning. Specifically, we propose several Hidden
Attribute Models, which are neural networks leveraging attention mechanisms and
embeddings. Our methods are trained on a per-predicate basis to output rankings
of object values for a given subject-predicate combination (e.g., ranking the
doctor and nurse professions high when speakers talk about patients, emergency
rooms, etc.). Experiments with various conversational texts including Reddit
discussions, movie scripts and a collection of crowdsourced personal dialogues
demonstrate the viability of our methods and their superior performance
compared to state-of-the-art baselines.
Comment: published in WWW'1
Using Text Similarity to Detect Social Interactions not Captured by Formal Reply Mechanisms
In modeling social interaction online, it is important to understand when
people are reacting to each other. Many systems have explicit indicators of
replies, such as threading in discussion forums or replies and retweets in
Twitter. However, it is likely these explicit indicators capture only part of
people's reactions to each other; thus, computational social science approaches
that use them to infer relationships or influence are likely to miss the mark.
This paper explores the problem of detecting non-explicit responses, presenting
a new approach that uses tf-idf similarity between a user's own tweets and
recent tweets by people they follow. Based on a month's worth of posting data
from 449 ego networks in Twitter, this method demonstrates that it is likely
that at least 11% of reactions are not captured by the explicit reply and
retweet mechanisms. Further, these uncaptured reactions are not evenly
distributed between users: some users, who create replies and retweets without
using the official interface mechanisms, are much more responsive to followees
than they appear. This suggests that detecting non-explicit responses is an
important consideration in mitigating biases and building more accurate models
when using these markers to study social interaction and information diffusion.
Comment: A final version of this work was published in the 2015 IEEE 11th
International Conference on e-Science (e-Science
- …
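
The tf-idf comparison described in the second abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the smoothed idf formula, the 0.3 threshold, and the function names (`tfidf_vectors`, `cosine`, `candidate_reactions`) are all assumptions chosen for illustration, and idf is computed over the handful of tweets being compared rather than over a full corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens (an assumed, simplistic tokenizer).
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(docs):
    # Build sparse tf-idf vectors (dicts) for a small list of documents.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf; this particular formula is an assumption.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def candidate_reactions(own_tweet, followee_tweets, threshold=0.3):
    # Return (similarity, tweet) pairs for followee tweets that resemble
    # the user's own tweet. The threshold is a hypothetical cutoff.
    vectors = tfidf_vectors([own_tweet] + followee_tweets)
    own = vectors[0]
    scored = [(cosine(own, v), t) for v, t in zip(vectors[1:], followee_tweets)]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


own = "new e-science conference deadline extended to friday"
followees = [
    "the e-science conference deadline was extended",
    "had great coffee this morning",
]
for score, tweet in candidate_reactions(own, followees):
    print(f"{score:.2f}  {tweet}")
```

In this toy input only the first followee tweet shares vocabulary with the user's tweet, so it is the only one flagged as a possible non-explicit response. A real study would also need to restrict candidates to recent tweets and calibrate the threshold against labeled data.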