A Data-Driven Approach to Improving the Performance of Neural Dialogue Response Generation
Tohoku University 乾健太郎 課
Extracting personal information from conversations
Personal knowledge is a versatile resource that is valuable for a wide range of downstream applications. Background facts about users can allow chatbot assistants to produce more topical and empathic replies. In the context of recommendation and retrieval models, personal facts can be used to customize the ranking results for individual users. A Personal Knowledge Base, populated with personal facts such as demographic information, interests and interpersonal relationships, is a unique endpoint for storing and querying personal knowledge. Such knowledge bases are easily interpretable and can provide users with full control over their own personal knowledge, including revising stored facts and managing access by downstream services for personalization purposes. To spare users the extensive manual effort of building such a personal knowledge base, we can leverage automated extraction methods applied to the textual content of the users, such as dialogue transcripts or social media posts. Mainstream extraction methods specialize in well-structured data, such as biographical texts or encyclopedic articles, which are rare for most people. Conversational data, in turn, is abundant but challenging to process and requires specialized methods for extracting personal facts. In this dissertation we address the acquisition of personal knowledge from conversational data. We propose several novel deep learning models for inferring speakers' personal attributes:
• Demographic attributes (age, gender, profession and family status) are inferred by HAMs, hierarchical neural classifiers with an attention mechanism. Trained HAMs can be transferred between different types of conversational data and provide interpretable predictions.
• Long-tailed personal attributes (hobby and profession) are predicted with CHARM, a zero-shot learning model that overcomes the lack of labeled training samples for rare attribute values. By linking conversational utterances to external sources, CHARM is able to predict attribute values which it never saw during training.
• Interpersonal relationships are inferred with PRIDE, a hierarchical transformer-based model. To accurately predict fine-grained relationships, PRIDE leverages personal traits of the speakers and the style of conversational utterances.
Experiments with various conversational texts, including Reddit discussions and movie scripts, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
Utilizing Review Summarization in a Spoken Recommendation System
In this paper we present a framework for spoken recommendation
systems. To provide reliable recommendations
to users, we incorporate a review summarization
technique which extracts informative opinion
summaries from grass-roots users' reviews. The dialogue
system then utilizes these review summaries to
support both quality-based opinion inquiry and
feature-specific entity search. We propose a probabilistic
language generation approach to automatically creating
recommendations in spoken natural language
from the text-based opinion summaries. A user study
in the restaurant domain shows that the proposed approaches
can effectively generate reliable and helpful
recommendations in human-computer conversations.
T-Party Project; Quanta Computer (Firm)
IMAD: IMage-Augmented multi-modal Dialogue
Currently, dialogue systems have achieved high performance in processing
text-based communication. However, they have not yet effectively incorporated
visual information, which poses a significant challenge. Furthermore, existing
models that incorporate images in dialogue generation focus on discussing the
image itself. Our proposed approach presents a novel perspective on multi-modal
dialogue systems, which interprets the image in the context of the dialogue. By
doing so, we aim to expand the capabilities of current dialogue systems and
transition them from single modality (text) to multi-modality. However, there
is a lack of validated English datasets that contain both images and dialogue
contexts for this task. Thus, we propose a two-stage approach to automatically
construct a multi-modal dialogue dataset. In the first stage, we utilize
text-to-image similarity and sentence similarity to identify which utterances
could be replaced with an image. In the second stage, we replace those
utterances by selecting a subset of relevant images and filtering them with a
visual question answering model. We used this approach, along with additional
labeling, to create the IMage Augmented multi-modal Dialogue dataset (IMAD),
which can serve as a validated dataset for this task. Furthermore, we propose a
baseline model trained on this dataset, which outperforms both a model trained
on the same data without images and BlenderBot.
Comment: Main part contains 6 pages, 4 figures. It was accepted at AINL. We
await the publication and DO
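The first stage described in the IMAD abstract (using text-to-image similarity and sentence similarity to find utterances that could be replaced with an image) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy embedding vectors, the use of image captions for the sentence-similarity score, the rule that both scores must pass a threshold, and the threshold values themselves. The abstract does not specify how the authors compute or combine the two scores.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_candidates(utterance_vecs, image_vecs, caption_vecs,
                      image_threshold=0.9, sentence_threshold=0.9):
    """Return indices of utterances that might be replaced with an image.

    Hypothetical rule: an utterance qualifies when its best text-to-image
    similarity AND its best sentence similarity (against image captions,
    an assumed stand-in) both reach their thresholds.
    """
    selected = []
    for i, utt in enumerate(utterance_vecs):
        image_score = max(cosine(utt, img) for img in image_vecs)
        sentence_score = max(cosine(utt, cap) for cap in caption_vecs)
        if image_score >= image_threshold and sentence_score >= sentence_threshold:
            selected.append(i)
    return selected


# Toy 2-D "embeddings"; real ones would come from text and image encoders.
utterances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
images = [[1.0, 0.1]]
captions = [[1.0, 0.0]]
candidates = select_candidates(utterances, images, captions)
```

In this toy setup only the first utterance points in nearly the same direction as the image and caption vectors, so it is the only candidate. The second stage from the abstract (filtering images with a visual question answering model) is not sketched here.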
YARBUS: Yet Another Rule Based belief Update System
Jérémy Fix, Hervé Frezza-Buet
We introduce a new rule-based system for belief tracking in dialog systems. Despite the simplicity of the rules being considered, the proposed belief tracker ranks favourably compared to the previous submissions on the second and third Dialog State Tracking challenges. The results of this simple tracker invite a reconsideration of the performance of previous submissions that use more elaborate techniques.
TOWARDS BUILDING INTELLIGENT COLLABORATIVE PROBLEM SOLVING SYSTEMS
Historically, Collaborative Problem Solving (CPS) systems have focused mainly on Human Computer Interaction (HCI) issues, such as providing a good communication experience among participants, whereas Intelligent Tutoring Systems (ITS) address HCI issues and also leverage Artificial Intelligence (AI) techniques in their intelligent agents. This dissertation seeks to narrow the gap between CPS systems and ITS by adopting methods used in ITS research. To move towards this goal, we analyze interactions with textual inputs in online learning systems such as DeepTutor and Virtual Internships (VI) to understand their semantics and underlying intents. To address the problem of assessing student-generated short text, this research first explores data-driven machine learning models coupled with expert-generated as well as general text analysis features. Second, it explores a method that uses knowledge graph embeddings to assess student answers in ITS. Finally, it explores a method that uses only standard reference examples generated by a human teacher. Such a method is useful when a new system has been deployed and no student data are available yet.
To handle negation in tutorial dialogue, this research explored a Long Short Term Memory (LSTM) based method. The advantage of this method is that it requires no human-engineered features and performs comparably to other models that use them.
Another important analysis in this research is finding speech acts in the conversation utterances of multiple players in VI. Among various models, a neural network model trained with noisy labels performed better than the alternatives at categorizing the speech acts of the utterances.
The learners' professional skill development in VI is characterized by the distribution of SKIVE elements, the components of epistemic frames. Inferring the population distribution of these elements could help to assess the learners' skill development. This research sought a Markov method to infer the population distribution of SKIVE elements, namely the stationary distribution of the elements.
While studying various aspects of interactions in our targeted learning systems, we motivate our research towards replacing the human mentor or tutor with an intelligent agent. Introducing an intelligent agent in place of a human helps to reduce cost as well as scale up the system.
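The stationary distribution mentioned above can be illustrated with a minimal sketch. This is not the dissertation's method, whose details the abstract does not give. It assumes a first-order Markov chain whose transition probabilities are estimated from a coded sequence of elements and then solved by power iteration. The function names, the uniform fallback for states with no observed transitions, and the toy labels are all hypothetical.

```python
from collections import Counter


def transition_matrix(sequence, states):
    """Estimate row-stochastic transition probabilities from one coded sequence."""
    counts = {s: Counter() for s in states}
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    matrix = {}
    for s in states:
        total = sum(counts[s].values())
        # Assumption: a state with no observed outgoing transitions gets a uniform row.
        matrix[s] = {t: (counts[s][t] / total if total else 1.0 / len(states))
                     for t in states}
    return matrix


def stationary_distribution(matrix, states, max_iterations=10_000, tolerance=1e-12):
    """Power iteration: apply pi <- pi * P repeatedly until pi stops changing."""
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(max_iterations):
        updated = {t: sum(pi[s] * matrix[s][t] for s in states) for t in states}
        if max(abs(updated[s] - pi[s]) for s in states) < tolerance:
            return updated
        pi = updated
    return pi


# Toy two-state chain with hand-set probabilities (illustrative labels only).
toy_states = ["A", "B"]
toy_matrix = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}
toy_pi = stationary_distribution(toy_matrix, toy_states)
```

For the toy chain, balancing the flow between the two states gives a stationary probability of 5/6 for A and 1/6 for B, which is what the iteration converges to. Power iteration is only guaranteed to converge to a unique answer when the chain is irreducible and aperiodic, so a real analysis would need to check that first.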