1,095 research outputs found

    MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue

    Task-oriented dialogue (TOD) systems have been widely deployed in many industries as they deliver more efficient customer support. These systems are typically constructed for a single domain or language and do not generalise well beyond this. To support work on Natural Language Understanding (NLU) in TOD across multiple languages and domains simultaneously, we constructed MULTI3NLU++, a multilingual, multi-intent, multi-domain dataset. MULTI3NLU++ extends the English-only NLU++ dataset to include manual translations into a range of high, medium, and low resource languages (Spanish, Marathi, Turkish and Amharic), in two domains (BANKING and HOTELS). Because of its multi-intent property, MULTI3NLU++ represents complex and natural user goals, and therefore allows us to measure the realistic performance of TOD systems in a varied set of the world's languages. We use MULTI3NLU++ to benchmark state-of-the-art multilingual models for the NLU tasks of intent detection and slot labelling for TOD systems in the multilingual setting. The results demonstrate the challenging nature of the dataset, particularly in the low-resource language setting, offering ample room for future experimentation in multi-domain multilingual TOD setups. Comment: ACL 2023 (Findings) Camera Read

    Neural approaches to dialog modeling

    This thesis by article consists of four articles that contribute to the field of deep learning, specifically to understanding and learning neural approaches to dialog systems. The first article takes a step towards understanding whether commonly used neural dialog architectures effectively capture the information present in the conversation history. Through a series of perturbation experiments on popular dialog datasets, we find that commonly used neural dialog architectures, such as recurrent and transformer-based seq2seq models, are rarely sensitive to most input context perturbations such as missing or reordered utterances, shuffled words, etc. The second article introduces a simple and cost-effective way to collect large-scale datasets for modeling task-oriented dialog systems. This approach avoids the requirement of a complex argument annotation schema. The initial release of the dataset includes 13,215 task-based dialogs comprising six domains and around 8k unique named entities, almost 8 times more than the popular MultiWOZ dataset. The third article proposes to improve response generation quality in open-domain dialog systems by jointly modeling the utterances with the dialog attributes of each utterance. Dialog attributes of an utterance refer to discrete features or aspects associated with an utterance, such as dialog acts, sentiment, emotion, speaker identity, speaker personality, etc. The final article introduces an embedding-free method to compute word representations on the fly. This approach significantly reduces the memory footprint, which facilitates deployment on devices with memory constraints. Apart from being independent of the vocabulary size, we find this approach to be inherently resilient to common misspellings.

    Multi3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue

    Task-oriented dialogue (ToD) systems have been widely deployed in many industries as they deliver more efficient customer support. These systems are typically constructed for a single domain or language and do not generalise well beyond this. To support work on Natural Language Understanding (NLU) in ToD across multiple languages and domains simultaneously, we constructed Multi3NLU++, a multilingual, multi-intent, multi-domain dataset. Multi3NLU++ extends the English-only NLU++ dataset to include manual translations into a range of high, medium, and low resource languages (Spanish, Marathi, Turkish and Amharic), in two domains (banking and hotels). Because of its multi-intent property, Multi3NLU++ represents complex and natural user goals, and therefore allows us to measure the realistic performance of ToD systems in a varied set of the world's languages. We use Multi3NLU++ to benchmark state-of-the-art multilingual models for the NLU tasks of intent detection and slot labeling for ToD systems in the multilingual setting. The results demonstrate the challenging nature of the dataset, particularly in the low-resource language setting, offering ample room for future experimentation in multi-domain multilingual ToD setups.

    Customisable chatbot as a research instrument

    Chatbots are proliferating rapidly online for a variety of purposes. This thesis presents a customisable chatbot that was designed and developed as a research instrument for online customer interaction research. The developed chatbot provides tools for creating different bot personas, data management tools, and a fully functional online chat user interface. Customer-facing bots in the system are rule-based, with basic input processing and text response selection based on best match. The system uses its own database to store user-chatbot dialogue history. Further, bots can be assigned unique dialogue scripts, and their profiles can be customised with respect to name, description and profile image. In the presented validation studies, participants first completed a task by taking part in a conversation with different bots, as hosted by the system and invoked through distinct URL parameters. The participants then filled in a questionnaire on their experience with the bot, designed to reveal differences in how the bots were perceived. Our results suggest that the chatbot's personality impacted how customers experienced the interactions. Therefore, the developed system can facilitate research scenarios that investigate participant responses to different chatbot personas. Future work is necessary for a wider range of applications and enhanced response control.

    A web-based AI assistant Application using Python and JavaScript

    Our research focuses on a chatbot powered by Artificial Intelligence. Artificial Intelligence assistants such as Apple's Siri, Google Now and Amazon's Alexa are growing fast and are widely integrated with many smart devices. These assistants are built with the primary purpose of being personal assistants for individual users in certain contexts. In this research, we highlight the development process of chatbots, their features, problems, case studies and limitations. This research provides information that helps developers build answer bots and integrate chatbots with business accounts. The aim is to assist users and to allow transactions between client companies and their customers. As a result, users can get answers to their queries and clients can grow their business.

    CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels

    Utilizing language models (LMs) without internal access is becoming an attractive paradigm in the field of NLP as many cutting-edge LMs are released through APIs and boast a massive scale. The de facto method in this type of black-box scenario is known as prompting, which has shown progressive performance enhancements in situations where data labels are scarce or unavailable. Despite their efficacy, prompting methods still fall short in comparison to fully supervised counterparts and are generally brittle to slight modifications. In this paper, we propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves text classification accuracy with a very weak supervision signal (i.e., the names of the labels). Our framework draws a precise decision boundary without accessing the weights or gradients of the LM or any data labels. The core ideas of CELDA are twofold: (1) extracting a refined pseudo-labeled dataset from an unlabeled dataset, and (2) training a lightweight and robust model on top of the LM, which learns an accurate decision boundary from the extracted noisy dataset. Through in-depth investigations on various datasets, we demonstrate that CELDA reaches a new state of the art in weakly supervised text classification and narrows the gap with a fully supervised model. Additionally, our proposed methodology can be applied universally to any LM and has the potential to scale to larger models, making it a more viable option for utilizing large LMs. Comment: ACL 202