5,618 research outputs found
Listening between the Lines: Learning Personal Attributes from Conversations
Open-domain dialogue agents must be able to converse about many topics while
incorporating knowledge about the user into the conversation. In this work we
address the acquisition of such knowledge, for personalization in downstream
Web applications, by extracting personal attributes from conversations. This
problem is more challenging than the established task of information extraction
from scientific publications or Wikipedia articles, because dialogues often
give merely implicit cues about the speaker. We propose methods for inferring
personal attributes, such as profession, age or family status, from
conversations using deep learning. Specifically, we propose several Hidden
Attribute Models, which are neural networks leveraging attention mechanisms and
embeddings. Our methods are trained on a per-predicate basis to output rankings
of object values for a given subject-predicate combination (e.g., ranking the
doctor and nurse professions high when speakers talk about patients, emergency
rooms, etc.). Experiments with various conversational texts including Reddit
discussions, movie scripts and a collection of crowdsourced personal dialogues
demonstrate the viability of our methods and their superior performance
compared to state-of-the-art baselines. Comment: published in WWW'1
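The ranking idea in this abstract (attention over a speaker's words, then scoring candidate object values for one predicate) can be illustrated with a minimal sketch. This is not the authors' Hidden Attribute Model: the two-dimensional embeddings, the attention query vector, and the function name `rank_objects` are all hypothetical values chosen for illustration, whereas a real model would learn them from data.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical 2-d word embeddings: axis 0 ~ "medical", axis 1 ~ "culinary".
WORD_VECS = {
    "patients": [1.0, 0.0],
    "emergency": [0.9, 0.1],
    "room": [0.3, 0.3],
    "kitchen": [0.0, 1.0],
    "the": [0.1, 0.1],
}
# Hypothetical embeddings for object values of the predicate "profession".
PROFESSION_VECS = {"doctor": [1.0, 0.0], "nurse": [0.9, 0.1], "chef": [0.0, 1.0]}
# Hypothetical attention query; in a trained model this would be learned.
ATTENTION_QUERY = [1.0, 1.0]

def rank_objects(words, object_vecs):
    """Rank object values for one predicate given the speaker's words."""
    vecs = [WORD_VECS[w] for w in words if w in WORD_VECS]
    if not vecs:
        # No known cue words: fall back to an arbitrary but stable order.
        return sorted(object_vecs)
    # Attention: informative words receive larger weights than filler words.
    weights = softmax([dot(v, ATTENTION_QUERY) for v in vecs])
    speaker = [sum(w * v[i] for w, v in zip(weights, vecs)) for i in range(2)]
    # Score every candidate object value against the speaker representation.
    scores = {obj: dot(speaker, vec) for obj, vec in object_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

ranking = rank_objects(["the", "patients", "emergency", "room"], PROFESSION_VECS)
print(ranking)
```

With these toy vectors, talk about patients and emergency rooms places the medical professions above "chef", mirroring the example given in the abstract.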
Academic Performance and Behavioral Patterns
Identifying the factors that influence academic performance is an essential
part of educational research. Previous studies have documented the importance
of personality traits, class attendance, and social network structure. Because
most of these analyses were based on a single behavioral aspect and/or small
sample sizes, there is currently no quantification of the interplay of these
factors. Here, we study the academic performance among a cohort of 538
undergraduate students forming a single, densely connected social network. Our
work is based on data collected using smartphones, which the students used as
their primary phones for two years. The availability of multi-channel data from
a single population allows us to directly compare the explanatory power of
individual and social characteristics. We find that the most informative
indicators of performance are based on social ties and that network indicators
result in better model performance than individual characteristics (including
both personality and class attendance). We confirm earlier findings that class
attendance is the most important predictor among individual characteristics.
Finally, our results suggest the presence of strong homophily and/or peer
effects among university students.
Temporal word embeddings for dynamic user profiling in Twitter
This paper explores user profiling, a nascent and contentious technology that
has attracted growing interest from the research community as its potential for
providing personalised digital services is realised. A review of the related
literature revealed that little research has examined how temporal aspects of
users can be captured with user profiling techniques. Together with the notable
lack of research into word embedding techniques that capture temporal variance
in language, this revealed an opportunity to extend the Random Indexing word
embedding technique so that users' interests could be modelled from their use
of language. To this end, the work extends an existing implementation of
Temporal Random Indexing to model Twitter users across multiple granularities
of time. The result is a novel technique for temporal user profiling in which a
set of vectors describes the evolution of a Twitter user's interests over time
through their use of language. The vectors were evaluated against a temporal
implementation of another state-of-the-art word embedding technique, the
Word2Vec Dynamic Independent Skip-gram model; Temporal Random Indexing
outperformed Word2Vec in the generation of temporal user profiles.
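As a rough illustration of how a set of per-period vectors could describe a user's changing interests, the sketch below builds one profile vector per month by summing sparse random index vectors of the words a user wrote. It is a simplified take on the Random Indexing idea, not the paper's implementation: the dimensionality, the number of non-zero entries, the monthly granularity, and the example tweets are all assumptions. Each word's index vector is fixed across time slices, which is what keeps the monthly profiles comparable.

```python
import math
import random

DIM = 256      # dimensionality of the reduced space (assumed value)
NONZERO = 4    # number of +1/-1 entries per index vector (assumed value)

def index_vector(word):
    # Seeding with the word itself gives the same sparse ternary vector
    # for that word in every time slice.
    rng = random.Random(word)
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice([-1.0, 1.0])
    return vec

def profile(tweets):
    # A user's profile for one time slice: the sum of the index vectors
    # of all words they used in that slice.
    vec = [0.0] * DIM
    for tweet in tweets:
        for word in tweet.lower().split():
            for i, x in enumerate(index_vector(word)):
                vec[i] += x
    return vec

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

# Hypothetical tweets grouped by month (one possible time granularity).
tweets_by_month = {
    "2019-01": ["football match tonight", "great football goal"],
    "2019-02": ["football season goal", "match tonight"],
    "2019-03": ["baking sourdough bread", "sourdough starter recipe"],
}
profiles = {month: profile(tw) for month, tw in tweets_by_month.items()}

sim_jan_feb = cosine(profiles["2019-01"], profiles["2019-02"])
sim_jan_mar = cosine(profiles["2019-01"], profiles["2019-03"])
print(sim_jan_feb, sim_jan_mar)
```

Coarser granularities (for example, quarters or years) could be obtained under the same assumptions by summing the monthly vectors, since they all live in the same random space.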
Extracting personal information from conversations
Personal knowledge is a versatile resource that is valuable for a wide range of downstream applications. Background facts about users can allow chatbot assistants to produce more topical and empathic replies. In the context of recommendation and retrieval models, personal facts can be used to customize the ranking results for individual users. A Personal Knowledge Base, populated with personal facts such as demographic information, interests and interpersonal relationships, is a unique endpoint for storing and querying personal knowledge. Such knowledge bases are easily interpretable and can give users full control over their own personal knowledge, including revising stored facts and managing access by downstream services for personalization purposes. To spare users the extensive manual effort of building such a personal knowledge base, we can apply automated extraction methods to the users' textual content, such as dialogue transcripts or social media posts. Mainstream extraction methods specialize in well-structured data, such as biographical texts or encyclopedic articles, which are rare for most people. Conversational data, in turn, is abundant but challenging to process and requires specialized methods for extracting personal facts.
In this dissertation we address the acquisition of personal knowledge from conversational data. We propose several novel deep learning models for inferring speakers’ personal attributes:
• Demographic attributes (age, gender, profession and family status) are inferred by HAMs, hierarchical neural classifiers with an attention mechanism. Trained HAMs can be transferred between different types of conversational data and provide interpretable predictions.
• Long-tailed personal attributes (hobby and profession) are predicted with CHARM, a zero-shot learning model that overcomes the lack of labeled training samples for rare attribute values. By linking conversational utterances to external sources, CHARM is able to predict attribute values that it never saw during training.
• Interpersonal relationships are inferred with PRIDE, a hierarchical transformer-based model. To accurately predict fine-grained relationships, PRIDE leverages personal traits of the speakers and the style of conversational utterances.
Experiments with various conversational texts, including Reddit discussions and movie scripts, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
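The zero-shot idea attributed to CHARM above, predicting attribute values never seen in training by linking utterances to external sources, can be sketched with plain term overlap. This is only an illustration under strong assumptions: the external descriptions, the stopword list, and the function names `cue_terms` and `rank_values` are hypothetical, and simple word matching stands in for whatever learned term scoring and retrieval the actual model uses.

```python
from collections import Counter

# Hypothetical external descriptions of hobby values, standing in for
# documents retrieved from an external source.
EXTERNAL_DOCS = {
    "beekeeping": "bees hive honey queen swarm smoker",
    "rock climbing": "rope harness belay crag chalk boulder",
    "pottery": "clay kiln wheel glaze ceramic",
}
STOPWORDS = {"i", "my", "the", "a", "to", "and", "was", "of", "in", "for", "this"}

def cue_terms(utterances):
    # Collect candidate cue words from the speaker's utterances.
    words = []
    for utterance in utterances:
        for word in utterance.lower().split():
            word = word.strip(".,!?")
            if word and word not in STOPWORDS:
                words.append(word)
    return Counter(words)

def rank_values(utterances, docs):
    # Score each attribute value by how many cue words its external
    # description contains; no labeled training data for the value is needed.
    cues = cue_terms(utterances)
    scores = {}
    for value, text in docs.items():
        doc_terms = set(text.split())
        scores[value] = sum(n for term, n in cues.items() if term in doc_terms)
    return sorted(scores, key=scores.get, reverse=True)

utterances = [
    "I checked the hive this morning.",
    "The queen was fine and we got honey!",
]
ranking = rank_values(utterances, EXTERNAL_DOCS)
print(ranking[0])
```

Because the candidate values come from the external collection rather than from training labels, adding a new hobby under this sketch only requires adding a description for it.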
Hierarchical Attention Network for Visually-aware Food Recommendation
Food recommender systems play an important role in helping users identify
food they want to eat. Deciding what to eat is a complex, multi-faceted process
influenced by many factors, such as the ingredients, the appearance of the
recipe, the user's personal food preferences, and context such as what was
eaten in past meals. In this work, we formulate food recommendation as
predicting user preference for recipes based on three key factors that
determine a user's choice of food: 1) the user's (and other users') history;
2) the ingredients of a recipe; and 3) the descriptive image of a recipe. To
address this challenging problem, we develop a dedicated neural-network-based
solution, Hierarchical Attention based Food Recommendation (HAFR), which is
capable of: 1) capturing the collaborative filtering effect, such as what
similar users tend to eat; 2) inferring a user's preference at the ingredient
level; and 3) learning user preference from the recipe's visual images. To
evaluate our proposed method, we construct a large-scale dataset consisting of
millions of ratings from AllRecipes.com. Extensive experiments show that our
method outperforms several competing recommender solutions, such as
Factorization Machine and Visual Bayesian Personalized Ranking, with an average
improvement of 12%, offering promising results in predicting user preference
for food. Code and dataset will be released upon acceptance.
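To make the three factors concrete, here is a minimal sketch of how a preference score might combine a recipe embedding (collaborative signal), attention-pooled ingredient embeddings, and image features. It is not the HAFR architecture: the two-dimensional vectors, the fusion by simple summation, and the function name `predict_preference` are assumptions for illustration, and a real model would learn all of these from rating data.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def predict_preference(user_vec, recipe_vec, ingredient_vecs, image_vec):
    """Score one (user, recipe) pair from three sources of evidence."""
    # Ingredient-level attention: ingredients that match the user's taste
    # contribute more to the pooled ingredient representation.
    weights = softmax([dot(user_vec, ing) for ing in ingredient_vecs])
    pooled = [
        sum(w * ing[i] for w, ing in zip(weights, ingredient_vecs))
        for i in range(len(user_vec))
    ]
    # Fuse the recipe embedding, pooled ingredients and image features.
    # Summation is an assumption here; a trained model would learn the fusion.
    item = [r + p + v for r, p, v in zip(recipe_vec, pooled, image_vec)]
    return dot(user_vec, item)

# Hypothetical 2-d space: axis 0 ~ "spicy/savoury", axis 1 ~ "sweet".
user = [1.0, 0.0]  # a user whose history suggests a taste for spicy food

score_curry = predict_preference(
    user,
    recipe_vec=[0.5, 0.0],
    ingredient_vecs=[[1.0, 0.0], [0.2, 0.1]],
    image_vec=[0.3, 0.1],
)
score_cake = predict_preference(
    user,
    recipe_vec=[0.0, 0.5],
    ingredient_vecs=[[0.0, 1.0], [0.1, 0.8]],
    image_vec=[0.0, 0.4],
)
print(score_curry, score_cake)
```

Ranking recipes for a user then amounts to sorting candidates by this score, which is the quantity a pairwise objective such as Bayesian Personalized Ranking would compare.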