
    Usefulness, localizability, humanness, and language-benefit: additional evaluation criteria for natural language dialogue systems

    Human–computer dialogue systems interact with human users using natural language. We used the ALICE/AIML chatbot architecture as a platform to develop a range of chatbots covering different languages, genres, text types, and user groups, to illustrate qualitative aspects of natural language dialogue system evaluation. We present some of the evaluation techniques used for natural language dialogue systems, including black-box and glass-box, comparative, quantitative, and qualitative evaluation. Four aspects of NLP dialogue system evaluation are often overlooked: “usefulness” in terms of a user’s qualitative needs, “localizability” to new genres and languages, “humanness” or “naturalness” compared to human–human dialogues, and “language benefit” compared to alternative interfaces. We illustrate these aspects with respect to our work on machine-learnt chatbot dialogue systems; we believe these aspects are worthwhile for impressing potential new users and customers.

    Parallel corpus multi stream question answering with applications to the Qu'ran

    Question Answering (QA) is an important research area concerned with developing an automated process that answers questions posed by humans in a natural language. QA is a shared task for the Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP) communities. A technical review of different QA system models and methodologies reveals that a typical QA system consists of different components that accept a natural language question from a user and deliver its answer(s) back to the user. Existing systems have usually been aimed at structured/unstructured data collected from everyday English text, i.e. text collected from television programmes, news wires, conversations, novels and other similar genres. Despite all up-to-date research in the subject area, none of the existing QA systems has been tested on a parallel corpus of religious text with the aim of question answering. Religious text has peculiar characteristics and features which make it more challenging for traditional QA methods than other kinds of text. This thesis proposes the PARMS (Parallel Corpus Multi Stream) methodology: a novel method applying existing advanced IR techniques and combining them with NLP methods and additional semantic knowledge to implement QA for a parallel corpus. A parallel corpus involves multiple forms of the same corpus, where each form differs from the others in a certain aspect, e.g. translations of a scripture from one language to another by different translators. Additional semantic knowledge can be referred to as a stream of information related to a corpus. PARMS uses multiple streams of semantic knowledge, including a general ontology (WordNet) and domain-specific ontologies (QurTerms, QurAna, QurSim). This additional knowledge has been used in embedded form for query expansion, corpus enrichment and answer ranking.
The PARMS methodology has wider applications; this thesis applies it to the Quran (the core text of Islam) as a first case study. The PARMS method uses a parallel corpus comprising ten different English translations of the Quran. An individual Quranic verse is treated as an answer to questions asked in natural language (English). This thesis also implements a PARMS QA application as a proof of concept for the methodology. The PARMS methodology aims to evaluate the streams of semantic knowledge separately and in combination, and also to evaluate alternative subsets of the data source: QA from one stream vs. the parallel corpus. Results show that use of a parallel corpus and multiple streams of semantic knowledge has clear advantages. To the best of my knowledge, this method is developed here for the first time, and it is expected to serve as a benchmark for further research in this area.
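The core idea of combining evidence across translation streams can be sketched in a few lines; the corpus structure, verse texts, and term-overlap scoring below are illustrative assumptions, not the PARMS implementation.

```python
# Illustrative sketch of answer ranking over a parallel corpus: each
# candidate verse carries several translation "streams", and evidence
# from the streams is combined. Verse texts and the scoring function
# are assumed for illustration only.

def stream_score(question_terms, text):
    """Fraction of question terms appearing in one translation stream."""
    words = set(text.lower().split())
    return len(question_terms & words) / len(question_terms)

def rank_verses(question, parallel_corpus):
    """parallel_corpus maps verse id -> list of translations of that verse."""
    q_terms = set(question.lower().split())
    scores = {}
    for verse_id, translations in parallel_corpus.items():
        # Averaging per-stream scores rewards verses whose relevance
        # is confirmed by several independent translations.
        per_stream = [stream_score(q_terms, t) for t in translations]
        scores[verse_id] = sum(per_stream) / len(per_stream)
    return sorted(scores, key=scores.get, reverse=True)

corpus = {
    "2:255": ["allah there is no god but he the living",
              "god there is no deity except him the ever living"],
    "1:1":   ["in the name of god the most gracious the most merciful",
              "in the name of allah the beneficent the merciful"],
}
print(rank_verses("who is the living god", corpus))  # "2:255" ranks first
```

A verse that matches the question in several independent translations accumulates more evidence than one that matches in a single stream, which is the intuition behind using a parallel corpus for QA.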

    Chatbot development to assist patients in health care services

    Integrated master's dissertation in Informatics Engineering. High-quality data on medical treatments and facility-level information has become accessible, creating new eHealth opportunities for the recovery of patients. Machine learning implementation in these solutions has proven to be essential and effective in building user-centred applications that relieve the burden on the healthcare sector. Nowadays, many patient interactions with healthcare services are handled via phone calls and text message exchange.
Conversational agents can provide answers to these queries, promoting fast patient interaction. The underlying aim of this dissertation is to assist patients by providing a reliable source of information with which they can educate themselves and clarify any doubts about the procedures and implications of their health issues. This purpose was achieved not only through an intuitive and accessible web platform with frequently asked questions, but also by integrating an intelligent conversational agent to answer questions. To this end, it was necessary to research, implement and assess the feasibility of closed-domain conversational agents for healthcare. The result is a valuable contribution to the chatbot development community, assembling the latest innovations and findings as well as the current challenges of machine learning, and raising awareness of this field.

    Mining question-answer pairs from web forum: a survey of challenges and resolutions

    Internet forums, also known as discussion boards, are popular web applications. Members of a board discuss issues and share ideas, forming a community within the board, and as a result generate a huge amount of content on different topics on a daily basis. Interest in information extraction and knowledge discovery from such sources has been increasing in the research community. A number of factors limit the potential of mining knowledge from forums: the lexical chasm, or lexical gap, that renders some Natural Language Processing (NLP) techniques less effective; an informal tone that creates noisy data; drifting of the discussion topic, which prevents focused mining; and the asynchronous nature of posting, which makes it difficult to establish post–reply relationships. This survey introduces these challenges within the framework of question answering. It describes the problems, cites and explores useful publications for further examination, and provides an overview of resolution strategies and findings relevant to the challenges.

    The effect of component recognition on flexibility and speech recognition performance in a spoken question answering system

    A spoken question answering system that recognizes questions as full sentences performs well when users ask one of the questions defined. A system that recognizes component words and finds an equivalent defined question might be more flexible, but is likely to have decreased speech recognition performance, leading to a loss in overall system success. The research described in this document compares the advantage in flexibility to the loss in recognition performance when using component recognition. Questions posed by participants were processed by a system of each type. As expected, the component system made frequent recognition errors while detecting words (word error rate of 31%). In comparison, the full system made fewer errors while detecting full sentences (sentence error rate of 10%). Nevertheless, the component system succeeded in providing proper responses to 76% of the queries posed, while the full system responded properly to only 46%. Four variations of the traditional tf-idf weighting method were compared as applied to the matching of short text strings (fewer than 10 words). The general approach was found to be successful in finding matches, and all four variations compensated for the loss in speech recognition performance to a similar degree. No significant difference due to the variations in weighting was detected in the results.
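Matching a recognized word sequence to the closest defined question via tf-idf can be sketched as below; the question set and the particular weighting (log idf with +1 smoothing, cosine similarity) are assumptions for illustration, not the study's exact four variations.

```python
# Minimal sketch (not the study's code) of tf-idf matching of short
# text strings: a noisy recognized phrase is mapped to the most
# similar defined question via cosine similarity over tf-idf weights.
import math

def doc_freq(docs):
    """Document frequency of each term across the defined questions."""
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    return df

def weight(doc, df, n):
    """tf-idf weights for one token list (log idf with +1 smoothing)."""
    w = {}
    for term in doc:
        tf = doc.count(term) / len(doc)
        idf = math.log(n / df.get(term, 1)) + 1.0
        w[term] = tf * idf
    return w

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical defined questions; a real system would hold many more.
questions = ["what are your opening hours",
             "where is the nearest branch",
             "how do i reset my password"]
docs = [q.split() for q in questions]
df, n = doc_freq(docs), len(docs)
vecs = [weight(d, df, n) for d in docs]

query = "what are the opening hours".split()  # noisy recognizer output
qvec = weight(query, df, n)
best = max(range(n), key=lambda i: cosine(qvec, vecs[i]))
print(questions[best])  # -> what are your opening hours
```

Because the strings are so short (fewer than 10 words), a single shared content word can dominate the match, which is why the choice among weighting variations mattered little in the study's results.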

    Designing for Conversational System Trustworthiness: The Impact of Model Transparency on Trust and Task Performance

    Designing for system trustworthiness promises to address the challenges of opaqueness and uncertainty introduced by Machine Learning (ML)-based systems by allowing users to understand and interpret the systems’ underlying working mechanisms. However, empirical exploration of trustworthiness measures and their effectiveness is scarce and inconclusive. We investigated how varying model confidence (70% versus 90%) and making confidence levels transparent to the user (explanatory statement versus no explanatory statement) may influence perceptions of trust and performance in an information retrieval task assisted by a conversational system. In a field experiment with 104 users, our findings indicate that neither model confidence nor transparency seems to impact trust in the conversational system. However, users’ task performance is positively influenced by both transparency and trust in the system. While this study considers the complex interplay of system trustworthiness, trust, and subsequent behavioral outcomes, our results call into question the relation between system trustworthiness and user trust.