948 research outputs found

    Sequence to Sequence Learning for Query Expansion

    Sequence-to-sequence algorithms have not yet been explored for query expansion in either the Information Retrieval or the Question Answering literature. We tried to fill this gap with a custom Query Expansion engine trained and tested on open datasets. Starting from these datasets, we built a Query Expansion training set using keyword extraction based on sentence embeddings. We then assessed the ability of sequence-to-sequence neural networks to capture expanding relations in the word-embedding space. Comment: 8 pages, 2 figures, AAAI-19 Student Abstract and Poster Program

    Comparing How a Chatbot References User Utterances from Previous Chatting Sessions: An Investigation of Users' Privacy Concerns and Perceptions

    Chatbots are capable of remembering and referencing previous conversations, but does this enhance user engagement or infringe on privacy? To explore this trade-off, we investigated how the format in which a chatbot references previous conversations with a user affects that user's perceptions and privacy concerns. In a three-week longitudinal between-subjects study, 169 participants talked about their dental flossing habits to a chatbot that either (1-None) did not explicitly reference previous user utterances, (2-Verbatim) referenced previous utterances verbatim, or (3-Paraphrase) used paraphrases to reference previous utterances. Participants perceived the Verbatim and Paraphrase chatbots as more intelligent and engaging. However, the Verbatim chatbot also raised privacy concerns among participants. To gain insight into why people preferred certain conditions or had privacy concerns, we conducted semi-structured interviews with 15 participants. We discuss implications of our findings that can help designers choose an appropriate format for referencing previous user utterances and inform the design of longitudinal dialogue scripting. Comment: 10 pages, 3 figures, to be published in Proceedings of the 11th International Conference on Human-Agent Interaction (ACM HAI'23)

    A Survey of Available Corpora For Building Data-Driven Dialogue Systems: The Journal Version

    During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss appropriate choice of evaluation metrics for the learning objective

    Spoken Language Learning System: an online conversational spoken language learning system

    Thesis: M. Eng., Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (leaves 75-77). The Spoken Language Learning System (SLLS) is intended to be an engaging, educational, and extensible spoken language learning system showcasing the multilingual capabilities of the Spoken Language Systems Group's (SLS) systems. The motivation behind SLLS is to satisfy both the demand for spoken language learning in an increasingly multi-cultural society and the desire for continued development of the multilingual systems at SLS. SLLS is an integration of an Internet presence with augmentations to SLS's Mandarin systems built within the Galaxy architecture, focusing on the situation of an English speaker learning Mandarin. We offer language learners the ability to listen to spoken phrases and simulated conversations online, engage in interactive dynamic conversations over the telephone, and review audio and visual feedback of their conversations. We also provide a wide array of administration and maintenance features online for teachers and administrators to facilitate continued system development and user interaction, such as lesson plan creation, vocabulary management, and a requests forum. User studies have shown that there is an appreciation for the potential of the system and that the core operation is intuitive and entertaining. The studies have also helped to illuminate the vast array of future work necessary to further polish the language learning experience and reduce the administrative burden. The focus of this thesis is the creation of the first iteration of SLLS; we believe we have taken the first step down the long but hopeful path towards helping people speak a foreign language. By Tien-Lok Jonathan Lau.

    Arabic goal-oriented conversational agents using semantic similarity techniques

    Conversational agents (CAs) are computer programs that interact with humans in conversation. Goal-oriented conversational agents (GO-CAs) are programs that interact with humans to serve a specific domain of interest; their importance has increased recently, and they now cover fields in technology, science and marketing. Several types of CAs are used in industry: some are simple with limited usage, others are sophisticated. Most CAs were built to serve English speakers and only a few were built for Arabic, owing to the complexity of the Arabic language and the lack of researchers working across both linguistics and computing. This thesis covers two types of GO-CAs: the traditional pattern-matching goal-oriented CA (PMGO-CA) and the semantic goal-oriented CA (SGO-CA). PMGO-CA techniques are widely used in industry due to their flexibility and high performance. However, they are labour intensive, difficult to maintain or update, and need continuous housekeeping to manage users’ utterances (especially when instructions or knowledge change). In addition, they lack any machine intelligence. SGO-CA techniques utilise human-constructed knowledge bases such as WordNet to measure word and sentence similarity. Such measurement has been researched extensively for English but very little for Arabic. In this thesis, the researcher developed a new methodology for Arabic conversational agents (using both pattern-matching and semantic CAs), covering scripting, knowledge engineering, architecture, implementation and evaluation. New tools to measure word and sentence similarity were also constructed. To test the performance of these CAs, a domain representing the Iraqi passport services was built. Both CAs were evaluated and tested by domain experts using special evaluation metrics. The evaluation showed very promising results and the viability of the system for real life

    Diagnosing Reading strategies: Paraphrase Recognition

    Paraphrase recognition is a form of natural language processing used in tutoring, question answering, and information retrieval systems. The context of the present work is an automated reading strategy trainer called iSTART (Interactive Strategy Trainer for Active Reading and Thinking). The ability to recognize the use of paraphrase (a complete, partial, or inaccurate paraphrase, with or without extra information) in the student's input is essential if the trainer is to give appropriate feedback. I analyzed the most common patterns of paraphrase and developed a means of representing the semantic structure of sentences. Paraphrases are recognized by transforming sentences into this representation and comparing them. To construct a precise semantic representation, it is important to understand the meaning of prepositions. Adding preposition disambiguation to the original system improved its accuracy by 20%. The preposition sense disambiguation module itself achieves about 80% accuracy for the top 10 most frequently used prepositions. The main contributions of this work to the research community are the preposition classification and generalized preposition disambiguation processes, which are integrated into the paraphrase recognition system and are shown to be quite effective. The recognition model also forms a significant part of this contribution. The present effort includes the modeling of the paraphrase recognition process, featuring the Syntactic-Semantic Graph as a sentence representation, the implementation of a significant portion of this design demonstrating its effectiveness, the modeling of an effective preposition classification based on prepositional usage, the design of the generalized preposition disambiguation module, and the integration of the preposition disambiguation module into the paraphrase recognition system so as to gain significant improvement

    Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009)


    A Conversational Bot Expert in TCP/IP

    When studying for a telecommunications degree, it can sometimes be hard to remember every concept or to memorize in detail how certain protocols work. To address this problem, this project studied how to create a bot that answers simple questions about the TCP/IP protocols. First, it was necessary to analyse general information about conversational bots and programming tools in order to choose the best possible implementation. We then proposed the design alternatives needed to develop the bot. These alternatives included the creation of a new algorithm that analyses text from users and extracts the main concepts and ideas used to create the bot's answers. Finally, we divided TeCePe’s implementation into programming modules, each performing one of its functions separately, to simplify programming and ease integration with the main code. Results from users suggest that bots like TeCePe could provide some benefits to students studying a particular subject. Users usually prefer realistic, human-like interactions and want additional features beyond the bot’s main functionality before they are motivated to use conversational bots, which are not yet very popular in education. The main results of this project are generally favourable, as the bot fulfilled most requirements using all the proposed algorithms. TeCePe is fast when searching and can correctly detect users’ intentions in order to output the best possible answer in each case. Ingeniería Telemática.